
RE: rel="nofollow" attribute (PR#7676)

From: Mark Birbeck <mark.birbeck@x-port.net>
Date: Fri, 21 Jan 2005 20:00:36 -0000
To: "'Beth Epperson'" <beppers2@cox.net>
Cc: <w3c-html-wg@w3.org>, <www-html@w3.org>
Message-ID: <00ab01c4fff3$e26a0030$6f01a8c0@W100>


> Again, I have to disagree - what may seem daft to you, may be 
> brilliant to someone else, ...

Well that's clearly the case! We agree on that.

> ... and if this option resolves a problem, then it isn't daft
> at all.

I don't see how that follows. Solving one problem by creating others is
pretty inefficient -- and I'd say that's daft.

> If you have a more elegant solution for them, then clearly define it, 
> and provide examples.

That's a very odd approach to take, Beth, and my first reply would be "why
should I?" We are being presented with a fait accompli, and a kludge to
boot. I don't recall the problem being brought to this list, or a
discussion taking place in an atmosphere of cooperation and problem-solving.
So please don't imply that the negativity is coming from those who disagree
with the proposal, whilst the poor search engine people are just trying to
make life better.

However, that doesn't mean I can't see ways to solve it ... it is not
exactly a difficult problem (always assuming that the problem as stated is
*really* the problem).

For example, whenever someone posts a comment to my blog on Blogger.com, I
get an email. Why not ask me to approve the comment before it's posted?
Alternatively, the blogger software could still make the post, but convert
all links to simple text until the blog owner has approved the post (the
URLs would appear in the text, but not as anchors, so an interested reader
could still go to the link if they wanted with a cut-and-paste).
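The link-to-text conversion is easy enough to sketch. Here is a rough
illustration in Python; the function name and the regular expression are
mine, for illustration only, and not anything the blogging software
actually provides:

```python
import re

def neutralise_links(comment_html):
    """Replace each anchor in a comment with its bare URL, so the
    address stays visible (and can be cut-and-pasted) but carries no
    hyperlink until the blog owner approves the comment."""
    return re.sub(
        r'<a\s+[^>]*href="([^"]+)"[^>]*>.*?</a>',  # an anchor and its text
        r'\1',                                      # keep only the URL
        comment_html,
        flags=re.IGNORECASE | re.DOTALL,
    )

print(neutralise_links('Try <a href="http://example.org/">this site</a>!'))
# the anchor becomes plain text: Try http://example.org/!
```

The blog software would run something like this over a comment at posting
time, then restore the original markup once the owner approves it.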

That obviously requires changes on the part of the blogger software
companies, but if comment spam really is the problem then they just have to
accept that the work needs to be done, and either of these solutions -- or
one of plenty of others -- could address it.

And if the problem is less to do with the presence of comment spam, and more
to do with the skewing of the search rankings -- which is actually all that
the "nofollow" proposal claims to solve anyway -- then a much better
solution would be for the search engines to give a lower rating to links
that come from comments in a page than to links in the main page. So if, on
my own site, I refer to the web-site of product A, then Google can do what
it does
now, which is give some weight to that. But if *someone else* links to the
web-site of product A by placing a comment on my site then that should be
given a lower weight. (And if we indicated the source of the comment then
Google could weight it accordingly anyway!)
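To make the weighting idea concrete, here is a toy scoring sketch in
Python; the 0.2 discount factor and the function name are arbitrary
illustrations of the principle, not a claim about how any search engine
actually ranks:

```python
# Toy illustration: links found in "imported" regions (comments, syndicated
# feeds) count for less than links in the page author's own text.
MAIN_WEIGHT = 1.0       # a link the site author wrote
IMPORTED_WEIGHT = 0.2   # a link placed by someone else (arbitrary factor)

def link_score(links):
    """links is a list of (url, is_imported) pairs; returns the total
    weight accumulated by each URL."""
    scores = {}
    for url, is_imported in links:
        weight = IMPORTED_WEIGHT if is_imported else MAIN_WEIGHT
        scores[url] = scores.get(url, 0.0) + weight
    return scores

print(link_score([
    ("http://product-a.example/", False),  # my own editorial link
    ("http://product-a.example/", True),   # the same link, in a comment
]))
```

The point is simply that the two links contribute different amounts, rather
than the comment link contributing nothing at all.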

This is something that should be looked at more generally anyway -- if both
of us link to an article on the BBC news web-site then Google should rightly
give that some sort of significance. But if both of us populate a corner of
our home page with links from the BBC's RSS news feed, have we really both
linked to the same article? Can Google really infer from the two links to
the same story that this story has some authority? If we could mark up the
area where the links are, we could tell Google that they come from an
external source.

So, how could you mark that up? Well it's not going to be that difficult to
indicate that some <div> contains content that was generated outside of the
site that is hosting it. In XHTML 2 we might use the new @role attribute,
and simply say:

  <div role="imported">

Alternatively we might use <link> and just indicate which sections contain
imported data:

    <title />
    <link rel="imported" href="#comments" />
    <link rel="imported" href="#rss" />
    <div id="comments">
      some comments not under my control
    </div>
    <div id="rss">
      some news sites, also not under my control
    </div>

The latter solution could also be used in XHTML 1. Another solution for
XHTML 1 might be to use the class attribute. Whatever solution is used, it
would be pretty easy to generate the mark-up automatically on most servers
that use XSLT-type publishing models.
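For completeness, the class-based variant for XHTML 1 might look something
like this (the class name "imported" is just an illustrative convention
that the search engines would have to agree to recognise):

```
  <div class="imported">
    <p>A comment containing a <a href="http://example.org/">link</a>
       that the site owner did not write.</p>
  </div>
```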

Anyway, whatever way it is done, the point is that we're not creating some
daft (yes, daft) value for an attribute that just doesn't mean anything
other than to one type of user agent.

And one final point -- if Google no longer indexes pages that are linked to by
comments in blogs then they are in effect not indexing a major part of the
web. It's related to the point I made earlier about the BBC feeds -- during
some large event like the latest war in Iraq, many people will link on their
home pages or even on corporate sites to the 'conventional' news sources.
However, at the next level down, the blogs and the comments on the blogs may
link to 'alternative' sources of news. Whilst I think it would be wrong to
give these second-level links the same weight as the top-level ones, it
would be even more inaccurate to ignore them altogether. In the future a
search for news on the conflict will be enormously skewed towards the
'established' and the 'conventional'. I'm not making an argument there for
their 'voices' to be heard, but simply saying that the search results will
be just plain inaccurate.



Mark Birbeck
x-port.net Ltd.

e: Mark.Birbeck@x-port.net
t: +44 (0) 20 7689 9232
w: http://www.formsPlayer.com/
b: http://internet-apps.blogspot.com/

Download our XForms processor from
Received on Friday, 21 January 2005 20:01:28 UTC
