# robots.txt - Blocking Problematic Bots
# Generated on: 01/Jul/2025

# ==========================================
# BLOCKING BOTS IDENTIFIED IN THE LOGS
# ==========================================

# Microsoft Bing Bot - causing 500 errors
# (User-agent matching is case-insensitive, so one entry per bot name is enough)
User-agent: bingbot
Disallow: /

User-agent: msnbot
Disallow: /

# ==========================================
# OTHER PROBLEMATIC BOTS
# ==========================================

# Scrapers and malicious bots
User-agent: imageSpider
Disallow: /

User-agent: ImagesiftBot
Disallow: /

User-agent: imagesift
Disallow: /

User-agent: scrapedia
Disallow: /

User-agent: scrapedia-receive
Disallow: /

User-agent: trendictionbot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

# ==========================================
# ADDITIONAL COMMON PROBLEMATIC BOTS
# ==========================================

# Bots that frequently cause overload
User-agent: SemrushBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: YandexBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: PetalBot
Disallow: /

# ==========================================
# ALLOWED BOTS (Important for SEO)
# ==========================================

# Google (always allow; Googlebot does not honor Crawl-delay)
User-agent: Googlebot
Disallow: /admin/
Disallow: /private/

User-agent: Googlebot-Image
Disallow: /admin/
Disallow: /private/

# Facebook (for link sharing)
User-agent: facebookexternalhit
Disallow: /admin/
Disallow: /private/

# Twitter (for cards)
User-agent: Twitterbot
Disallow: /admin/
Disallow: /private/

# LinkedIn
User-agent: LinkedInBot
Disallow: /admin/
Disallow: /private/

# ==========================================
# GENERAL SETTINGS
# ==========================================

# All other bots (general rule)
User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /cgi-bin/
Disallow: /*.php$
Disallow: /*?*
Crawl-delay: 2

# ==========================================
# SITEMAP
# ==========================================

# Main sitemap (replace with the real URL)
Sitemap: https://www.gauss.com.br/sitemap.xml
Sitemap: https://www.gauss.com.br/sitemap_index.xml

# ==========================================
# IMPORTANT NOTES
# ==========================================

# WARNING: robots.txt is only a "suggestion"
# Malicious bots can ignore these rules
# Always use .htaccess as a fallback for effective blocking
#
# To check that it is working:
# https://www.gauss.com.br/robots.txt
#
# Test with Google Search Console:
# https://search.google.com/search-console/robots-txt-tester
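#
# ==========================================
# EXAMPLE: .htaccess FALLBACK (SKETCH)
# ==========================================

# The .htaccess fallback mentioned in the notes above could look roughly like
# the commented-out fragment below. It is a minimal sketch, not a tested
# configuration: it assumes Apache with mod_rewrite enabled and .htaccess
# overrides allowed, and the bot names are only a sample taken from this file.
# These lines belong in .htaccess, not in robots.txt.
#
#   <IfModule mod_rewrite.c>
#     RewriteEngine On
#     # Match the User-Agent header case-insensitively ([NC])
#     RewriteCond %{HTTP_USER_AGENT} (SemrushBot|AhrefsBot|MJ12bot|DotBot) [NC]
#     # Answer every request from a matching client with 403 Forbidden
#     RewriteRule ^ - [F]
#   </IfModule>
#
# Unlike robots.txt, such a rule is enforced by the server, although a bot
# that spoofs its User-Agent header would still get through.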