|
|
Rank: Newbie Groups: Member
Joined: 2/23/2007 Posts: 9 Points: 27 Location: Mi
|
Since I installed and ran the program, I have noticed a large number of hits. I know the program loads as a service and runs in the background. Even at times of low internet usage, I have noticed heavy bandwidth use on my router.
At first I thought it was because of the way the scheduling is set up. No matter what you do, once you open the program it respiders the site the following day and then again at the specified interval. So I made sure not to open the program, so that it would not be triggered into running again.
Sure enough, some pages were spidered 47 times.
I know I have a large site, but 647,000 hits in one day is out of the norm. The page views do not indicate that anyone was viewing the pages. I have racked up 4,500,000 hits to the site since installing the program.
I am afraid Google is going to think this is excessive and drop me in the page rankings. I have successfully gone from about 2 thousand listed pages to about 67 since installing the program and setting it up. I still cannot load the .gz file that Google requests for large sites. This has been an issue since the first time I uploaded it to the site. You told me you could download it and read it and were going to get back to me.
You asked that I post the requests in the forum so others could learn, so here you go.
Myck
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
Please disable scheduling, as it sounds like it may have a bug. I am investigating this now and will have a resolution tonight.
Also, I just checked your site on Google and you have 2,110 URLs indexed. The software has not affected your status on Google.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
I have resolved the issue and will be posting an updated version tonight. I will email you when it's available to download.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
iArchitect Sitemap Generator v4.1.1 available: http://www.iarchitect.net/Blog/2007/3/13/Sitemap-Generator-v411-Released/
This should resolve the excessive page hits on your website. As for the gzip issue, I did respond to your email immediately and said I could open it with no problem. I have searched Google's documentation and it does not provide any helpful information. I will look a little more and see if you can open a request with them to investigate.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
From Google:
Compression error "Google encountered an error when trying to uncompress your compressed Sitemap file. Recompress your Sitemap (using a tool such as gzip), upload it to your site, and resubmit it. If you continue to have trouble, try submitting an uncompressed version of the file."
Have you tried uploading your sitemap uncompressed?
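Google's message suggests recompressing the file, so one thing worth ruling out locally is whether the .gz file actually decompresses back to the same bytes as the uncompressed sitemap. Below is a minimal Python sketch of that check. The function names and the file name `sitemap.xml` are made up for illustration and have nothing to do with how the Sitemap Generator itself writes its output.

```python
import gzip
import os
import tempfile
import zlib


def gzip_sitemap(xml_path, gz_path):
    """Compress xml_path into gz_path; return the uncompressed byte count."""
    with open(xml_path, "rb") as src:
        data = src.read()
    with gzip.open(gz_path, "wb") as dst:
        dst.write(data)
    return len(data)


def gzip_round_trips(xml_path, gz_path):
    """Return True if gz_path decompresses to exactly the bytes of xml_path."""
    with open(xml_path, "rb") as src:
        original = src.read()
    try:
        with gzip.open(gz_path, "rb") as packed:
            return packed.read() == original
    except (OSError, EOFError, zlib.error):
        # Not a gzip stream at all, or a truncated or corrupted one.
        return False


if __name__ == "__main__":
    # Hypothetical file names, written to a scratch directory for the demo.
    workdir = tempfile.mkdtemp()
    xml_path = os.path.join(workdir, "sitemap.xml")
    gz_path = xml_path + ".gz"
    with open(xml_path, "wb") as f:
        f.write(b"<urlset></urlset>")
    gzip_sitemap(xml_path, gz_path)
    print("round trip ok:", gzip_round_trips(xml_path, gz_path))
```

If the local copy passes this check but Google still reports a compression error, the next thing to compare would be the bytes of the file as served from the site against the local file.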
|
|
Rank: Newbie Groups: Member
Joined: 2/23/2007 Posts: 9 Points: 27 Location: Mi
|
It is uploaded as an uncompressed file. It has been verified by Google Webmaster Tools. It will not verify using a .gz file, though, and gives me the error. The only reason I care is that my site is so large that a .gz file is recommended. I would like to see how you came up with the number of indexed pages for my site. If you use the site:url.com command it comes up different. Myck
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
I do the site:url.com and the site:www.url.com searches. But run the search a few times. You will get different results as each Google index might have a different number of pages cached. One might have 1,000 while another has 5,000.
|
|
Rank: Newbie Groups: Member
Joined: 2/23/2007 Posts: 9 Points: 27 Location: Mi
|
Errors
Line: 1, Parsing error: We were unable to read your Sitemap. It may contain an entry we are unable to recognize. Please validate your Sitemap before resubmitting.
Line: 1, Invalid URL: This is not a valid URL. Please correct it and resubmit.
This is the error I am getting with the xml sitemap with Google.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
I checked your sitemap out and it's perfect. Google is having a problem for some strange reason. I will investigate it and get back to you ASAP.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
Here's where I'm at:
1. I'm going to download your sitemaps.
2. I need to add a few namespace headers to your sitemaps. (see https://www.google.com/webmasters/tools/docs/en/protocol.html#sitemapValidation)
3. I can then validate your sitemaps against the Google schema.
They look perfect to me and this is very strange. I will get back to you ASAP.
|
|
Rank: Newbie Groups: Member
Joined: 2/23/2007 Posts: 9 Points: 27 Location: Mi
|
OK, I have been waiting on a reply on this. The current situation is this. The .gz file still errors out on Google. When I spider the site I get 30,000 lines of 404 errors. The top-ranking page on my site is now a 404. The program appears to be truncating the URL (for example, http://domain.name/directory/folder/file becomes http://domain.name/directory/file). The RSS feed page is too big to be useful, probably because the program does not mark only newly published files as new; instead, every file is treated as new each time the site is spidered. Other than that things are going just great.
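For anyone trying to reproduce the truncation described above: one common way a crawler ends up dropping a path segment like this is by resolving a relative link against a base URL that is missing its trailing slash. Whether that is what happens inside the program is not confirmed anywhere in this thread, so the Python sketch below only illustrates the standard URL resolution rule using the placeholder URLs from the post.

```python
from urllib.parse import urljoin

# Placeholder URLs taken from the example in the post above.
base_without_slash = "http://domain.name/directory/folder"
base_with_slash = "http://domain.name/directory/folder/"

# Without the trailing slash, "folder" is treated as a file name and is
# replaced by the relative link, so the segment disappears.
truncated = urljoin(base_without_slash, "file")

# With the trailing slash, "folder" is treated as a directory and is kept.
intact = urljoin(base_with_slash, "file")

print(truncated)  # http://domain.name/directory/file
print(intact)     # http://domain.name/directory/folder/file
```

Comparing a few of the 404 URLs against the links as they appear in the page source would show whether this pattern matches.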
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
MrMyckster, I apologize for the delay, but I was out of town on business the last few days and just got back tonight. I will have an answer for you tomorrow. Brian
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
MrMyckster, I pulled down your sitemap files again and am validating them against the schema. It will take a little time as you have a lot of URLs. I will have an update for you in the morning. Thanks, Brian
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
MrMyckster, I ran all 5 of your sitemap files through an XML validator and they are all well-formed and comply with the schema. You will need to report your issue to Google. I will try to find a way to do that.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
From Google.com:
Who can I contact if I need help?
If you need help with any technical issues, or if you want to discuss the program in general with other webmasters, please visit our discussion forum for webmasters page. You'll find answers to most of your questions, and information will continue to collect there as more people join in. We'll also be reading the discussions, and may offer assistance if required.
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
I took some time to write an application to validate your sitemap XML against the schema. To verify it works perfectly, I changed a few things, like the changefreq from "weekly" to "weeekly". It immediately reported the bad data/format. If you would like to try it out as well, I made it available at: http://www.iarchitect.net/Blog/2007/3/24/Simple-XML-Validator/
I know this doesn't really help your situation, but if the XML is correct and Google is having problems with it, then they need to tell you why.
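For readers who want a rough idea of what such a check involves, here is a minimal Python sketch. It is not the Simple XML Validator linked above and it does not perform full schema validation. It only checks well-formedness, the root element, the `<loc>` values, and the `changefreq` values the sitemap protocol allows. The default namespace is the sitemaps.org 0.9 one, which is an assumption: files produced by the generator may declare a different namespace, in which case pass that one in.

```python
import xml.etree.ElementTree as ET

# Assumption: sitemaps.org 0.9 namespace; pass another one if the file differs.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def check_sitemap(xml_text, namespace=SITEMAP_NS):
    """Return a list of problems found in a sitemap; an empty list means none."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != f"{{{namespace}}}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    for i, url in enumerate(root.findall(f"{{{namespace}}}url"), start=1):
        loc = url.findtext(f"{{{namespace}}}loc")
        if not loc or not loc.strip().startswith(("http://", "https://")):
            problems.append(f"url #{i}: missing or non-absolute <loc>")
        freq = url.findtext(f"{{{namespace}}}changefreq")
        if freq is not None and freq.strip() not in ALLOWED_CHANGEFREQ:
            problems.append(f"url #{i}: invalid changefreq {freq.strip()!r}")
    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://www.example.com/</loc>"
    "<changefreq>weeekly</changefreq></url>"
    "</urlset>"
)
print(check_sitemap(sample))
```

The sample deliberately repeats the "weeekly" test from the post, so the returned list contains one entry for the bad changefreq value.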
|
|
Rank: Administration Groups: Administration
Joined: 1/31/2007 Posts: 440 Points: 646 Location: Chicago, IL
|
Locking topic since the issue has been explained.
|
|