Modern search engines can detect duplicate and near-duplicate documents on the web. Among the
duplicates found, which one is the main copy? Which is the original document?

Choosing by a computed document weight (such as link popularity) gives fairly good results.
The choice is not always correct, though.

The idea suggested by Martijn Koster was to use the following tag:
<LINK rel=original href="some url">
(The value of the "rel" attribute is open to discussion; "original" is probably not the best choice.)
Also, I don't see any contradiction between the Host field in robots.txt and this tag.
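
To make the proposal concrete, here is a rough sketch of how a crawler might read such a tag. It is only an illustration: rel=original is the proposed value and not an existing standard, and the names OriginalLinkParser and find_original, as well as the example.com URL, are made up for this example.

```python
from html.parser import HTMLParser


class OriginalLinkParser(HTMLParser):
    """Collects href values of <link rel="original"> tags (hypothetical rel value)."""

    def __init__(self):
        super().__init__()
        self.originals = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <LINK REL=...> matches too.
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values; compare case-insensitively.
        rel_values = (attr.get("rel") or "").lower().split()
        if "original" in rel_values and attr.get("href"):
            self.originals.append(attr["href"])


def find_original(html):
    """Return the first declared original URL, or None if the page declares none."""
    parser = OriginalLinkParser()
    parser.feed(html)
    parser.close()
    return parser.originals[0] if parser.originals else None


page = '<html><head><LINK rel=original href="http://example.com/doc.html"></head></html>'
print(find_original(page))
```

A search engine could treat the returned URL as a hint when picking the main copy among detected duplicates, and fall back to document weights when the tag is absent.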

What do you think?
a) Will search engines support this tag, if offered?
b) Will webmasters support this tag?
c) What name should this tag have?

Received on Sunday, 26 January 2003 18:31:52 UTC