Robots.txt Example for Blogger and Websites (SEO-Optimized)
SEO-friendly robots.txt file for a Website
The robots.txt file is an important part of search engine optimization (SEO) because it tells search engine crawlers which pages or files on your website they may or may not crawl. Note that it controls crawling rather than indexing: a blocked URL can still appear in search results if other pages link to it. Here's an example of an SEO-friendly robots.txt file:
User-agent: *
Disallow:

User-agent: Googlebot
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/
Disallow: /uploads/

User-agent: Mediapartners-Google
User-agent: Adsbot-Google
User-agent: Googlebot-News
User-agent: Googlebot-Image
User-agent: Googlebot-Video
User-agent: Googlebot-Mobile
Disallow:

Sitemap: https://example.blogspot.com/sitemap.xml
In this example, the User-agent: * group applies to every crawler that has no more specific group of its own. A Disallow: directive with no value means that all pages and files may be crawled.
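Next, we specify the rules for Googlebot. Here we disallow access to certain directories: /admin/, /private/, /tmp/, and /uploads/. This keeps Googlebot from crawling private or duplicate content. It does not by itself guarantee that those URLs stay out of search results, so protect truly sensitive content with authentication instead. The other Google crawlers (Mediapartners-Google, Adsbot-Google, and so on) are listed with an empty Disallow:, so they may crawl everything.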
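To see how a crawler picks its group, here is a minimal Python sketch that feeds a shortened version of the example to the standard library's urllib.robotparser. The helper name allowed and the example.blogspot.com host are only assumptions for illustration. Keep in mind that this parser is simpler than Google's own (for instance, it does not understand * or $ inside paths), so treat the result as a rough check.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the example: a catch-all group and a Googlebot group.
ROBOTS_TXT = """\
User-agent: *
Disallow:

User-agent: Googlebot
Disallow: /admin/
Disallow: /private/
Disallow: /tmp/
Disallow: /uploads/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent, path):
    """Hypothetical helper: may `agent` crawl `path` on the example site?"""
    return parser.can_fetch(agent, "https://example.blogspot.com" + path)


print(allowed("Googlebot", "/admin/settings"))    # False: Googlebot group blocks /admin/
print(allowed("Googlebot", "/blog/post-title/"))  # True: no Googlebot rule matches
print(allowed("Bingbot", "/admin/settings"))      # True: falls back to the * group
```

Bingbot has no group of its own, so it falls back to User-agent: *, where the empty Disallow: permits everything.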
Finally, we include a Sitemap directive to tell search engines where to find our XML sitemap file. This can help search engines discover new content on your site and index it more quickly.
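If you want to confirm that your Sitemap line is readable, the following Python sketch extracts it with urllib.robotparser. The function name sitemap_urls is made up for this example, and site_maps() requires Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser


def sitemap_urls(robots_txt):
    """Return the Sitemap URLs declared in a robots.txt string (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


ROBOTS_TXT = """\
User-agent: *
Disallow:
Sitemap: https://example.blogspot.com/sitemap.xml
"""

print(sitemap_urls(ROBOTS_TXT))
# ['https://example.blogspot.com/sitemap.xml']
```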
It's important to note that the robots.txt file is a public file, so don't include any sensitive information in it. Additionally, be careful not to accidentally block important pages or directories that you want search engines to crawl and index. Always test your robots.txt file using Google's robots.txt tester tool to ensure that it's configured correctly.
More robots.txt directive examples (a reference list of sample lines, not a single file to copy as-is):
User-agent: *
User-agent: yeti
User-agent: psbot
User-agent: Bot1
User-agent: Slurp
User-agent: Teoma
User-agent: Nutch
User-agent: Bingbot
User-agent: MSNBot
User-agent: Gigabot
User-agent: naverbot
User-agent: Robozilla
User-agent: Googlebot
User-agent: baiduspider
User-agent: Adsbot-Google
User-agent: Googlebot-News
User-agent: ia_archiver
User-agent: Mediapartners-Google
User-agent: Googlebot-Image
User-agent: Googlebot-Video
User-agent: Googlebot-Mobile
User-agent: yahoo-mmcrawler
User-agent: yahoo-blogs/v3.9
User-agent: *
Allow:
Allow: /
Allow: /gpt/
Allow: /blog/
Allow: /local/
Allow: /learn/
Allow: /tag/js/
Allow: /products/
Allow: /Index/
Allow: /blog/post-title/
Allow: /folder/page.html
Allow: /link-intersect/$
Allow: /static/glade/
Allow: /static/glade.js
Allow: /ads/preferences/
Allow: /pagead/show_ads.js
Allow: /pagead/js/adsbygoogle.js
Allow: /pagead/js/*/show_ads_impl.js
Allow: /products/content/
Allow: /site-explorer/$
Allow: /plugins/system/jch_optimize/assets/
Allow: /researchtools/ose/
Allow: /researchtools/ose/$
Allow: /researchtools/ose/anchors$
Allow: /researchtools/ose/dotbot$
Allow: /researchtools/ose/dotbots
Allow: /researchtools/ose/links$
Allow: /researchtools/ose/just-discovered
Allow: /researchtools/ose/pages$
Allow: /researchtools/ose/domains$
Allow: /directory/
Allow: /marketplace/
Allow: /keywords/
Allow: /local/search/
Allow: /content/search/*
Allow: /site-explorer/*
Allow: /site-explorer/ajax/
Allow: /admin/assests/
Allow: /admin/admin-ajax/
Allow: /admin/admin-ajax.js
Allow: /admin/admin-ajax.js/*/show_ads_impl
Allow: /services/
Allow: /page-strength/*
Allow: /script/
User-agent: *
Disallow: /
User-agent: Googlebot-image
Disallow:
User-agent: *
Disallow: /staging/
User-agent: *
Allow: /myfolder/
User-agent: *
Disallow:
Disallow: /
Allow: /content/uploads/
Allow: /content/page/
Disallow: /content/
Disallow: /client/
Disallow: /admin/
Disallow: /includes/
Disallow: /content/plugins/
Disallow: /content/themes/
Disallow: /feed/
Disallow: /trackback/
Disallow: */feed/
Disallow: */trackback/
Disallow: /cgi-bin/
Disallow: /wp-login/
Disallow: /wp-register/
Disallow: /20*
Disallow: /static/glade/
Disallow: /tag/js/
Disallow: /folder1/
Disallow: /folder2/
Disallow: /assests/
Disallow: /folder3/
Disallow: /Googlebot/
Disallow: /Adsbot-Google/
Disallow: /Googlebot-Image/
Disallow: /Googlebot-Mobile/
Disallow: /example-page/
Disallow: /content/audit/*
Disallow: /folder/
Disallow: /link-intersect/*
Disallow: /local/enterprise/confirm
Disallow: /local/details/
Disallow: /cpresources/
Disallow: /thumbs/*
Disallow: /researchtools/ose/
Disallow: /something/
Disallow: /something-else/
Disallow: /start
Disallow: /vendor/
User-agent: *
Sitemap: https://example.com/sitemap.xml
Sitemap: https://example.com/sitemap_index.xml
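Several of the patterns above use * and $, and some paths are covered by both an Allow and a Disallow line (for example /content/uploads/ and /content/). As a rough illustration of how such conflicts are commonly resolved, here is a minimal Python sketch of the longest-match precedence described in RFC 9309, with ties going to Allow. The function names pattern_to_regex and is_allowed are made up for this example, and the sketch ignores user-agent groups and percent-encoding, so treat it as an approximation rather than how any particular crawler is implemented.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern: '*' matches any run of
    characters and a trailing '$' anchors the end of the URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path):
    """rules: list of (directive, pattern) pairs, directive being
    'allow' or 'disallow'. The longest matching pattern wins, and on
    equal length Allow wins. No matching rule means allowed."""
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if not pattern:  # an empty value matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), directive == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# A few lines taken from the list above.
RULES = [
    ("allow", "/content/uploads/"),
    ("disallow", "/content/"),
    ("disallow", "*/feed/"),
    ("allow", "/link-intersect/$"),
    ("disallow", "/link-intersect/*"),
]

print(is_allowed(RULES, "/content/uploads/logo.png"))  # True: longer Allow wins
print(is_allowed(RULES, "/content/plugins/x.js"))      # False
print(is_allowed(RULES, "/blog/feed/"))                # False: matched by */feed/
print(is_allowed(RULES, "/link-intersect/report"))     # False: $ rule does not match
```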
The robots.txt file is a plain text file that website owners use to tell web robots (such as search engine crawlers) which pages or sections of their site may be crawled and which should be excluded from crawling. It must be located in the root directory of a website and can be accessed by adding "/robots.txt" to the site's root URL (for example, https://example.com/robots.txt).
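Because the file always sits at the root, its address can be derived from any page URL on the same site. A small Python sketch, with robots_url as a made-up helper name:

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Build the robots.txt URL for the site that hosts `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://example.com/blog/post-title/?ref=home"))
# https://example.com/robots.txt
```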
To test your robots.txt file with the Google robots.txt Tester, visit the Google robots.txt Tester page.
The tool analyzes the robots.txt file and reports any potential issues or errors it detects. It also shows you which URLs are allowed or disallowed under the rules in the file.
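For a quick local check before you publish, a similar allowed/blocked report can be approximated in Python. This is only a sketch: blocked_urls is a hypothetical helper, the example.com URLs are placeholders, and urllib.robotparser does not handle * or $ wildcards, so rules that rely on them still need to be verified with Google's tool.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, agent, urls):
    """Return the URLs from `urls` that `agent` may not crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


ROBOTS_TXT = """\
User-agent: *
Disallow: /staging/
Disallow: /admin/
"""

IMPORTANT_URLS = [
    "https://example.com/blog/post-title/",
    "https://example.com/staging/new-design/",
    "https://example.com/products/",
]

for url in blocked_urls(ROBOTS_TXT, "Googlebot", IMPORTANT_URLS):
    print("Blocked:", url)
# Blocked: https://example.com/staging/new-design/
```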