We've experienced some runaway growth of Gitea archive cache files
on one of our backends, which according to upstream is often caused
by web crawlers indexing the archive URLs. They recommended updating
our robots.txt to the current state of https://gitea.com/robots.txt
in order to help mitigate the issue.

Rules we had expressly commented out before remain commented out, as
do any new rules that look similar to them, on the assumption that the
original reasons still apply.

After some discussion on IRC, we also decided to disallow /avatars
and /user/* as gitea.com does.

Change-Id: I2b43b89de08c9a9d170e1ecbd14b1e6336fd2c84