Real Data

Posted: **Sun Feb 09, 2025 8:46 am**

But occasionally a well-known person might upload their car or something like that, and we don't want to block that page in robots.txt, because then we'd be wasting those external links. What we might do instead is link to that page internally and put a nofollow on that internal link. That means the page can still be crawled, but only if Google has found it through external links or some other route.
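To make the robots.txt side of this concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules, the `example.com` domain and the `/user/` path are made up for illustration; the point is only that a Disallow rule stops a compliant crawler from fetching the page at all, no matter how many external links point at it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, not taken from any real site.
rules = """
User-agent: *
Disallow: /user/
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Hypothetical URLs: a user-generated page and a normal category page.
blocked = parser.can_fetch("Googlebot", "https://example.com/user/celebrity-car")
allowed = parser.can_fetch("Googlebot", "https://example.com/cars/")

print(blocked)  # the /user/ page is disallowed for every user agent
print(allowed)  # the category page is not covered by any Disallow rule
```

Leaving the page out of robots.txt and nofollowing the internal link instead keeps that fetch possible when the crawler arrives from somewhere else.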

This is a middle ground. Technically, nofollow is now treated as a hint. In my experience, Google will not crawl a page that is only linked to through an internal nofollow. If it finds the page some other way, it will obviously still crawl it. But overall this is probably an effective way to save crawl budget. You do lose some PageRank through the nofollow link, because it still counts as a link.
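If you want to check which internal links on a page are nofollowed, something like the following rough sketch works with only the standard library. The sample HTML and paths are hypothetical, and this only reads the `rel` attribute; it says nothing about how Google actually treats the hint.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (href, is_nofollow) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow ugc".
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def split_links(html):
    """Return (followed, nofollowed) lists of hrefs found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    followed = [href for href, nofollow in collector.links if not nofollow]
    nofollowed = [href for href, nofollow in collector.links if nofollow]
    return followed, nofollowed


# Hypothetical listing page with one normal and one nofollowed internal link.
sample = """
<a href="/cars/">All cars</a>
<a href="/user/celebrity-car" rel="nofollow">Celebrity upload</a>
"""

followed, nofollowed = split_links(sample)
print(followed)
print(nofollowed)
```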

noindex, nofollow
Noindex, nofollow is apparently a very common solution for such pages on e-commerce websites.

But once Google gets to that page, it will see that it is noindexed, and over time it will crawl it less often, because there is less point in crawling noindexed pages.

Obviously, the page can't be indexed. PageRank is still passed to this page, but the page doesn't pass PageRank outward because of the nofollow. So this isn't a very good solution. Some compromise has to be reached here to save crawl budget.
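For auditing which of your pages carry these directives, a small sketch like this reads the robots meta tag out of a page. The sample markup is made up, and it only looks at `<meta name="robots">`, not at the `X-Robots-Tag` HTTP header or crawler-specific meta names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Gathers the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated; normalise case and whitespace.
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page head using the noindex, nofollow combination.
page = '<head><meta name="robots" content="noindex, nofollow"></head>'

directives = robots_directives(page)
print("noindex" in directives, "nofollow" in directives)
```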

noindex, follow
Many people in the past thought that the solution to this problem was to use noindex, follow as the best of both worlds. You put a noindex, follow tag in the head section of one of these pages, and you still get the same crawling benefits from it.

In this case, the page can be crawled.
