robots.txt should steer search engines away from old docs #94

Closed
@smontanaro

Description

See https://mail.python.org/pipermail/pydotorg-www/2016-November/003921.html for original discussion.

When searching Google for "Python timeit" recently, the first hit was for

https://docs.python.org/2/library/timeit.html

The second hit, unfortunately, was for

https://docs.python.org/3.0/library/timeit.html

The first page of results didn't mention

https://docs.python.org/3/library/timeit.html

at all. It seems the robots.txt file should be tweaked to strongly discourage search engine crawlers from traversing outdated documentation, at least versions older than 3.2 and 2.6. It's been a long while since I messed with a robots.txt file (so I won't pretend I could submit a proper PR), but something like

User-agent: *
Disallow: /3.0/
Disallow: /3.1/
Disallow: /2.5/
Disallow: /2.4/
Disallow: /2.3/
Disallow: /2.2/
Disallow: /2.1/
Disallow: /2.0/

should steer well-behaved crawlers away from obsolete documentation.
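For anyone who wants to sanity-check the rules before proposing them, here is a sketch using the standard library's urllib.robotparser, with the proposed file content pasted inline rather than fetched from docs.python.org:

import urllib.robotparser

# The proposed rules, inlined for local testing.
ROBOTS_TXT = """\
User-agent: *
Disallow: /3.0/
Disallow: /3.1/
Disallow: /2.5/
Disallow: /2.4/
Disallow: /2.3/
Disallow: /2.2/
Disallow: /2.1/
Disallow: /2.0/
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Obsolete docs should be off-limits to a well-behaved crawler...
assert not rp.can_fetch("*", "https://docs.python.org/3.0/library/timeit.html")
# ...while the current docs remain crawlable.
assert rp.can_fetch("*", "https://docs.python.org/3/library/timeit.html")

This only verifies that the path prefixes match as intended; whether a given search engine actually drops the blocked pages from its index is up to that engine.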
