
Prepurchase questions
somewebguy
Posted: Thursday, April 01, 2010 4:18:29 AM
Rank: Newbie
Groups: Member

Joined: 4/1/2010
Posts: 3
Points: 9
Location: US
I am considering an immediate purchase of the software, but I have a few questions I cannot answer using the trial because of its 100-URL limit (it won't index the URLs I need to test, which appear after the first 100).

Question 1:
Can you please tell me whether I can filter out URLs by their parts? For example, I have a series of URLs that contain:
*/index.php?dispatch=*
/store/*?sort_by=*
/store/*/function.array-merge

The asterisks in the above URL patterns are wildcards and represent a part of the pattern that is unique to each URL, so I cannot filter for them individually. Will your software let me use patterns to exclude URLs from spidering? Or can I eliminate the URLs (regardless of what's in the asterisked portion) by filtering for one of the unique words in the preferences tab's "ignore URL strings?" field (i.e., dispatch,sort_by,function.array-merge would remove anything matching the above patterns)?

Question 2:
Can I manually add URLs the crawler missed to the URL list used to generate the sitemap? If so, how? I was unable to find any way of adding URLs to the results, only filtering them out using "edit results".

Question 3:
Can you please explain in detail how "ignore query string" works? When I enter a query string parameter (e.g., dispatch, as in the URL patterns in question 1 above), will the entire URL that contains dispatch be ignored, or will the query string simply be stripped from the URL?

Question 4:
For "ignore folders", is there a way to ignore a folder and its subfolders, or must I enter the path to each individually, separating each path with a comma?

Question 5:
Can wildcards or regex be used in any of the program's fields, in particular within the ignore fields?

Question 6:
Does KeyLimeTie offer a money-back guarantee if the program cannot be set up to effectively spider my website? My biggest hesitation is that I have a large site with many dynamic URLs to be filtered out, and I cannot test this on your trial because of the limited number of URLs allowed. I've tried a few programs, and they either only partially spidered the site or did not allow filtering out the pages we would like to eliminate from the sitemap.

KeyLimeTie
Posted: Thursday, April 01, 2010 8:48:44 AM
Rank: Administration
Groups: Administration

Joined: 1/31/2007
Posts: 586
Points: 384
Location: Chicago, IL
Question 1:
Asterisks/wildcards will not work.

Question 2:
No. If the crawler didn't find them, Google and other search engines probably won't either. It would be much better to make the links accessible from your website.

Question 3:
Only the specified query string parameters will be stripped from the URL. The other parameters will remain.

Question 4:
If a folder is ignored, all subfolders will also be ignored.

Question 5:
No

Question 6:
Sorry, no. Because this is a software download, we cannot offer a refund. Unfortunately, there have been people who have spidered their entire site(s) and generated sitemaps, and then asked for a refund.
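
To make the answers to Questions 3 and 4 above concrete, here is a minimal Python sketch of the behavior as described. It is not the Sitemap Generator's actual code; the function names `strip_query_params` and `in_ignored_folder` are hypothetical, and the exact matching rules the program uses are an assumption based only on the replies in this thread.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_query_params(url, ignored):
    """Remove only the named query parameters; keep the rest of the URL.

    Assumption: mirrors the Question 3 answer (listed parameters are
    stripped, other parameters remain).
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def in_ignored_folder(url, ignored_folders):
    """Return True if the URL path is an ignored folder or lies below one.

    Assumption: mirrors the Question 4 answer (ignoring a folder also
    ignores all of its subfolders).
    """
    path = urlsplit(url).path
    for folder in ignored_folders:
        prefix = "/" + folder.strip("/") + "/"
        if path.startswith(prefix) or path == prefix.rstrip("/"):
            return True
    return False


# Example: sort_by is stripped, sort_order stays.
cleaned = strip_query_params(
    "http://www.xyz.com/store/page-2/?sort_by=product&sort_order=desc",
    {"sort_by"},
)

# Example: a subfolder of an ignored folder is ignored too.
skipped = in_ignored_folder(
    "http://www.xyz.com/store/baby-toys/wooden-toys/page-2/",
    ["/store/baby-toys"],
)
```

Note the difference from the "ignore URL strings?" field: stripping a parameter keeps the page in the crawl under a shorter URL, while an ignored folder drops the page entirely.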
somewebguy
Posted: Thursday, April 01, 2010 2:49:41 PM
Regarding question 1 above, can you please clarify? You said asterisks/wildcards will not work. However, you did not answer the second part of the question, about filtering on unique words and whether that would remove the URLs and not crawl them. For example, assume I have the following URLs:
http://www.xyz.com/store/baby-toys/wooden-toys/page-2/index.php?category_id=198&mipp=18
http://www.xyz.com/store/baby-toys/wooden-toys/page-2/?sort_by=product&sort_order=desc
http://www.xyz.com/store/baby-toys/wooden-toys/page-2/?features_hash=V43
http://www.xyz.com/store/baby-toys/wooden-toys/index.php?return_url=index.php?dispatch=products.view&product_id=30408

To keep the above from being crawled, could I use the following in the preferences tab's "ignore URL strings?" field:
category_id,sort_by,features_hash,dispatch

Please confirm that doing the above would let me omit any URLs that contain those terms, so they are not crawled or indexed into the sitemap the software creates.
KeyLimeTie
Posted: Thursday, April 01, 2010 2:58:48 PM
Yes, that would work.
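
For readers who want to check their own term list before buying, the confirmed behavior can be approximated with a plain substring test. This is only a sketch under the assumption that "ignore URL strings?" skips any URL containing one of the comma-separated terms; `should_skip` is a hypothetical helper, not part of the product, and the last URL in the list is an added example that contains none of the terms.

```python
def should_skip(url, ignore_strings):
    """Return True if any ignore term appears anywhere in the URL."""
    return any(term in url for term in ignore_strings)


# The comma-separated value proposed for the "ignore URL strings?" field.
ignore_strings = "category_id,sort_by,features_hash,dispatch".split(",")

urls = [
    "http://www.xyz.com/store/baby-toys/wooden-toys/page-2/index.php?category_id=198&mipp=18",
    "http://www.xyz.com/store/baby-toys/wooden-toys/page-2/?sort_by=product&sort_order=desc",
    "http://www.xyz.com/store/baby-toys/wooden-toys/page-2/?features_hash=V43",
    "http://www.xyz.com/store/baby-toys/wooden-toys/index.php?return_url=index.php?dispatch=products.view&product_id=30408",
    # Hypothetical clean URL (not from the thread) that should survive.
    "http://www.xyz.com/store/baby-toys/wooden-toys/",
]

kept = [url for url in urls if not should_skip(url, ignore_strings)]
```

Because the match is a simple substring test, no wildcards are needed: a term such as `dispatch` matches wherever it occurs in the URL, including inside a nested `return_url` value.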
somewebguy
Posted: Thursday, April 01, 2010 3:20:48 PM
Great, thank you for the fast reply. Can you tell me during what hours forum support is available?
KeyLimeTie
Posted: Tuesday, May 18, 2010 1:56:33 PM
Forum messages are automatically emailed to our support team and answered Monday through Friday, 9am to 5pm CST.