1,400,000 pages!!!
Karusune
Posted: Thursday, May 28, 2009 10:01:20 AM
Rank: Newbie
Groups: Member

Joined: 5/28/2009
Posts: 1
Points: 3
Location: Indiana
I was wondering if someone could help me out with an issue. I am trying to create a sitemap for a website that has around 80,000 to 100,000 pages. For some reason, the sitemap spider is finding almost 1.5 million different pages, and I am trying to figure out why. Has anyone else had a similar problem with their website? At this rate the crawl is going to take forever. The website is basically just a listing of the services that we provide. Is there a way to prevent the spider from going into subdomains?

Thanks for the help. I read through the starting guide and I think I understand the functions, but I couldn't figure out how to restrict subdomains.
KeyLimeTie
Posted: Thursday, May 28, 2009 11:03:11 AM
Rank: Administration
Groups: Administration

Joined: 1/31/2007
Posts: 590
Points: 396
Location: Chicago, IL
There are 3 main reasons the software is finding more URLs than you'd expect:
1. Subdomains are being crawled.
2. Your pages support querystring parameters. Every URL with different parameters counts as a unique page. So if your URLs have a sort parameter, each sort value produces a new URL. The page and content might be the same, but it's still a unique URL. If you want to ignore querystring parameters, set that in the Preferences.
3. You have more URLs than you thought.
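
If you can export the list of crawled URLs, a rough way to see which of these reasons applies is to count URLs per host and then count how many stay unique once the querystring is dropped. The script below is only a sketch using Python's standard library, not part of Sitemap Generator; the function names and the example.com URLs are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its querystring and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


def summarize(urls, main_host):
    """Break a crawled URL list down by host and by querystring duplicates."""
    # Reason 1: URLs on hosts other than main_host come from subdomains.
    hosts = Counter((urlsplit(u).hostname or "") for u in urls)
    on_main = [u for u in urls if (urlsplit(u).hostname or "") == main_host]
    return {
        "total": len(urls),
        "by_host": dict(hosts),
        "on_main_host": len(on_main),
        # Reason 2: URLs that differ only by querystring collapse here.
        "unique_without_query": len({strip_query(u) for u in on_main}),
    }


if __name__ == "__main__":
    # Hypothetical sample; replace with the URLs your crawl produced.
    crawled = [
        "http://www.example.com/services",
        "http://www.example.com/services?sort=name",
        "http://www.example.com/services?sort=date",
        "http://blog.example.com/post/1",
        "http://www.example.com/contact",
    ]
    print(summarize(crawled, "www.example.com"))
```

If "unique_without_query" lands near the page count you expected, querystring parameters are the likely cause. If most of the total sits on other hosts in "by_host", subdomains are. If neither explains the gap, you probably do have more URLs than you thought.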

If you're unsure, please PM us your URL and we'll take a look.

Thanks,
KeyLimeTie