Google Robots And Why They Are Important
Robots? No, this post isn’t about sci-fi. From my own research, and from the work I do at my web design agency OwenDevelopment, Google robots are very real and play an important part in getting your website indexed in search results.
Also known as ‘spiders’, ‘crawlers’ and, in Google’s case, ‘Googlebot’, these programs crawl the internet constantly, from page to page and site to site, reading the content on each page and reporting back to Google and other search engines about what your site is about and which keywords or phrases would make it relevant to display in their results.
Every day hundreds of them go out and scour the web, whether it’s Google trying to index the entire web or a spam bot collecting any email address it can find for less than honorable intentions. As site owners, what little control we have over what robots are allowed to do when they visit our sites exists in a magical little file called “robots.txt.”
“Robots.txt” is a regular text file that, through its name, has special meaning to the majority of “honorable” robots on the web. By defining a few rules in this text file, you can instruct robots not to crawl and index certain files or directories within your site, or not to crawl the site at all. For example, you may not want Google to crawl the /images directory of your site, as it’s both meaningless to you and a waste of your site’s bandwidth. “Robots.txt” lets you tell Google just that.
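If you’d like to see how a well-behaved robot reads those rules, here is a minimal sketch using Python’s standard `urllib.robotparser` module. The file contents, the `is_allowed` helper and the example URLs are my own assumptions for illustration, and real crawlers may interpret rules slightly differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every robot is asked to stay out of /images/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The image sits under the disallowed directory, the page does not.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.yourdomain.com/images/logo.png"))
    print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.yourdomain.com/welcome.html"))
```

With these rules the first check should come back `False` and the second `True`. Keep in mind that this only describes robots that choose to obey the file; a spam bot can simply ignore it.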
You can restrict which web pages and directories robots are allowed to index. By default, most users will want to allow everything except their /cgi-bin/ directory, which commonly holds scripts, and their /images/ directory. To allow all web pages, select Yes for “Enable All Webpages,” then enter each web page or directory path you want to exclude in the exclusion box, one per line.
Example: “http://www.yourdomain.com/images/” (Excludes the /images/ directory)
Example: “http://www.yourdomain.com/welcome.html” (Excludes the /welcome.html web page)
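For the curious, the two exclusions above could be turned into a file by hand with something like the following. This is only a rough sketch of the idea, not the code behind any particular generator, and the `build_robots_txt` name and its parameters are hypothetical. It assumes each exclusion is either a full URL or a bare path, and keeps only the path part, since that is what a `Disallow` rule expects.

```python
from urllib.parse import urlparse


def build_robots_txt(excluded_urls, user_agent="*"):
    """Build robots.txt text that disallows the path of each excluded URL."""
    lines = [f"User-agent: {user_agent}"]
    for url in excluded_urls:
        # Keep only the path, e.g. "/images/" from "http://www.yourdomain.com/images/".
        path = urlparse(url).path or "/"
        lines.append(f"Disallow: {path}")
    if not excluded_urls:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    exclusions = [
        "http://www.yourdomain.com/images/",
        "http://www.yourdomain.com/welcome.html",
    ]
    print(build_robots_txt(exclusions), end="")
```

Run as-is, this should print a `User-agent: *` line followed by `Disallow: /images/` and `Disallow: /welcome.html`. Save that output as `robots.txt` in the root of your site so robots can find it.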
For my readership (because I love each and every one of you), I have included below a generator to create a robots.txt file to upload and use on your own sites: