Rank: Newbie Groups: Member
Joined: 5/28/2009 Posts: 1 Points: 3 Location: Indiana
|
I was wondering if someone could help me out with an issue. I am trying to create a sitemap for a website that has around 80,000 to 100,000 pages. For some reason, the sitemap spider is finding almost 1.5 million different pages, and I am trying to figure out why. Has anyone else had a similar problem with their website? At this rate the crawl is going to take forever. The website itself basically just describes the services that we provide. Is there a way to prevent the spider from going into subdomains?
Thanks for the help. I read through the starting guide and I think I understand the functions, but I couldn't figure out how to restrict subdomains.
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 590 Points: 396 Location: Chicago, IL
|
There are 3 main reasons the software is finding more URLs than you'd expect:
1. Subdomains are being crawled.
2. Your pages support querystring parameters. Every URL with different parameters is a unique page. So if your URLs have a sort parameter, each value of it produces a new URL. The page and content might be the same, but it's still a unique URL. If you want to ignore querystring parameters, set that in the Preferences.
3. You have more URLs than you thought.
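To see how the first two reasons inflate the count, here is a rough Python sketch. It is only an illustration and not how the sitemap software works internally: the `example.com` URLs are made up, and the helper names `normalize`, `same_host`, and `count_unique` are hypothetical. It counts the distinct URLs in a crawl list as-is, then again after dropping other hosts and stripping querystrings.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Drop the querystring and fragment so parameter variants collapse to one URL."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


def same_host(url: str, allowed_host: str) -> bool:
    """True only when the host matches exactly, so other subdomains are excluded."""
    return (urlsplit(url).hostname or "") == allowed_host.lower()


def count_unique(urls, allowed_host):
    """Return (distinct URLs as crawled, distinct URLs after filtering)."""
    raw = set(urls)
    filtered = {normalize(u) for u in urls if same_host(u, allowed_host)}
    return len(raw), len(filtered)


# Made-up crawl output: one page reached through several parameter variants,
# plus the same path on a subdomain.
crawled = [
    "http://www.example.com/services",
    "http://www.example.com/services?sort=asc",
    "http://www.example.com/services?sort=desc",
    "http://www.example.com/services?sort=asc&page=2",
    "http://blog.example.com/services",
]

raw_count, filtered_count = count_unique(crawled, "www.example.com")
print(raw_count, filtered_count)
```

Note that stripping the whole querystring also merges URLs whose parameters really do select different content (a page number, for example), so check a few of your own URLs before ignoring parameters entirely.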
If you're unsure, please PM us your URL and we'll take a look.
Thanks, KeyLimeTie
|