What Is robots.txt and How Does It Work?
Allowing Crawlers to Index All Web Page Content
The default robots.txt file allows search engine crawlers to access all pages and posts on the website. Here is the default robots.txt:
User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: [Your-Blog-URL]/sitemap.xml
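To see how a crawler interprets these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The blog address https://example.blogspot.com and the post path are placeholders; substitute your own URLs.

from urllib.robotparser import RobotFileParser

# The default robots.txt shown above; the blog URL is a placeholder.
rules = """User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://example.blogspot.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Ordinary crawlers may fetch posts and pages, but not /search URLs.
print(parser.can_fetch("*", "https://example.blogspot.com/2024/01/sample-post.html"))  # True
print(parser.can_fetch("*", "https://example.blogspot.com/search/label/SEO"))          # False

# Mediapartners-Google (AdSense) has an empty Disallow, so nothing is blocked for it.
print(parser.can_fetch("Mediapartners-Google", "https://example.blogspot.com/search/label/SEO"))  # True

In practice a crawler fetches your live [Your-Blog-URL]/robots.txt file itself; parser.set_url() followed by parser.read() performs the same check against the published file.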
The robots.txt file shown above allows specific user-agents to access different types of content on the website. Here is a breakdown of the directives in the file:
Robots.txt Details
The user-agents and directives that robots.txt uses:
User-agent: * (applies to all other user-agent crawlers)
Allow: / (allows crawling of all other pages on the website)
Disallow: /search (disallows crawling of pages under the /search directory)
User-agent: Mediapartners-Google (used for Google AdSense)
User-agent: Googlebot (used for regular web crawling)
Disallow: /nogooglebot/ (tells a crawler which parts of the website it should not crawl)
User-agent: Adsbot-Google (used for AdWords campaigns)
User-agent: Googlebot-News (used for crawling news content)
User-agent: Googlebot-Image (used for crawling images)
User-agent: Googlebot-Video (used for crawling video content)
User-agent: Googlebot-Mobile (used for crawling mobile content)
For the Google-specific user-agents listed above, no disallow rules are provided by default, which means they are allowed to crawl all content on the website.
This robots.txt file allows specific Google user-agents to access all content on the website, while other user-agents are allowed to access all content except for pages under the /search directory.
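Putting these user-agents together, a robots.txt file that sets rules for individual Google crawlers could look like the sketch below. The /nogooglebot/ path is only an example of a section you might block, and [Your-Blog-URL] should be replaced with your own blog address.

User-agent: Mediapartners-Google
Disallow:

User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: Adsbot-Google
Disallow:

User-agent: Googlebot-News
Disallow:

User-agent: Googlebot-Image
Disallow:

User-agent: Googlebot-Video
Disallow:

User-agent: Googlebot-Mobile
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: [Your-Blog-URL]/sitemap.xml

An empty Disallow line means nothing is blocked for that crawler, so each named Google bot keeps full access while every other crawler still skips the /search pages.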