Wednesday, May 30, 2012

How a Sitemap Is Beneficial to Search Engines (Information from a Google Employee)

I found a great answer about Sitemaps that John Mueller posted on Stack Exchange.

A Sitemap file helps search engines discover new and updated URLs on your website. In particular, if your website is fairly large, it lets them focus on the new and updated content instead of having to blindly crawl through everything to see if anything has changed. That can result in new content being found much faster, which can be quite noticeable, especially if the site is larger or more complex.
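To make the "new and updated URLs" idea concrete, here is a minimal sketch of generating a Sitemap file in Python with the standard library. It follows the sitemaps.org 0.9 format; the `build_sitemap` helper, the example.com URLs, and the dates are illustrative assumptions and do not come from the original answer.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Return Sitemap XML as a string for (url, last_modified) pairs.

    build_sitemap is a hypothetical helper written for this example.
    """
    ET.register_namespace("", SITEMAP_NS)  # emit xmlns="..." without a prefix
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # lastmod tells crawlers when the page last changed (W3C date format)
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Placeholder pages; a real site would pull these from its CMS or database
pages = [
    ("https://www.example.com/", "2012-05-30"),
    ("https://www.example.com/new-article", "2012-05-29"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting string would typically be saved as something like sitemap.xml on the site and then submitted to the search engine.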
  • Find the number of indexed URLs for your website: These statistics are recalculated daily and are very accurate. You can find them on the Sitemaps detail page.
  • Discover canonicalization issues: If the numbers there don't match up, that's frequently a sign that you're specifying URLs in the Sitemap file that don't match what we find during our crawling. That usually means you need to work on canonicalization.
  • Help with canonicalization: When we find multiple URLs on your site that show identical content, we will give any URL that's listed in a Sitemap an extra edge, even if you don't use other canonicalization methods.
  • Find badly-indexed parts of your site: These counts are supplied per Sitemap file, so you can create separate Sitemap files for logical sections of your site to discover areas where Google isn't indexing as much as you'd like.
  • Prioritize crawl errors: In the crawl errors section, URLs that were specified in Sitemap files are listed separately. Since you specifically supplied these URLs, we assume that you want them indexed and that any crawl errors there are important.
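The tip about separate Sitemap files for logical sections can be sketched as follows. This example groups URLs by their first path segment and builds a sitemap index (the `sitemapindex` element from the sitemaps.org protocol) that points at one file per section. The grouping rule, the helper names, and the `sitemap-<section>.xml` naming scheme are assumptions made for illustration, not something the original answer prescribes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def group_by_section(urls):
    """Group URLs by first path segment, e.g. /blog/post-1 -> 'blog'.

    Pages with at most one path segment go into an assumed 'root' bucket.
    """
    sections = defaultdict(list)
    for url in urls:
        segments = [s for s in urlparse(url).path.split("/") if s]
        name = segments[0] if len(segments) > 1 else "root"
        sections[name].append(url)
    return dict(sections)

def build_sitemap_index(sitemap_urls):
    """Return a sitemap index (as a string) listing the given Sitemap files."""
    ET.register_namespace("", SITEMAP_NS)
    index = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc in sitemap_urls:
        sitemap = ET.SubElement(index, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(index, encoding="unicode")

# Placeholder URLs for a hypothetical site
urls = [
    "https://www.example.com/about",
    "https://www.example.com/blog/post-1",
    "https://www.example.com/blog/post-2",
    "https://www.example.com/products/widget",
]
sections = group_by_section(urls)
index_xml = build_sitemap_index(
    f"https://www.example.com/sitemap-{name}.xml" for name in sorted(sections)
)
print(index_xml)
```

Each section's URL list would then be written to its own Sitemap file, so the per-file indexed counts show which section is lagging.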
Additionally, you can use several extensions in Sitemap files (e.g., for images, video, News, or internationalization), should you choose to do so. These extensions are all optional.
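As one illustration of an extension, the sketch below adds image entries to a Sitemap using Google's image namespace (`http://www.google.com/schemas/sitemap-image/1.1`). The `build_image_sitemap` helper and the URLs are hypothetical, and only the `image:image` and `image:loc` elements are shown.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

def build_image_sitemap(pages):
    """Return Sitemap XML for a {page_url: [image_url, ...]} mapping.

    build_image_sitemap is a hypothetical helper written for this example.
    """
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)  # serialized as xmlns:image="..."
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            # One <image:image> entry per image that appears on the page
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")

# Placeholder page with two images
image_sitemap_xml = build_image_sitemap({
    "https://www.example.com/gallery": [
        "https://www.example.com/img/photo-1.jpg",
        "https://www.example.com/img/photo-2.jpg",
    ],
})
print(image_sitemap_xml)
```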
For most websites, the most visible benefit of Sitemap files is that you can see the indexed URL count. It can take a day or so to appear, so if you have just submitted a Sitemap for the first time, you may need to be a bit patient. While other methods (e.g., a site: query) give only very rough approximations, this count is extremely accurate.