[whatwg] Trying to work out the problems solved by RDFa from Dan Brickley on 2009-01-10 (public-whatwg-archive@w3.org from January 2009)

From: Dan Brickley <danbri@danbri.org>
Date: Sat, 10 Jan 2009 01:05:17 +0100
Message-ID: <4967E63D.2000107@danbri.org>

On 10/1/09 00:37, Ian Hickson wrote:
> On Fri, 9 Jan 2009, Ben Adida wrote:
>> Is inherent resistance to spam a condition (even a consideration) for
>> HTML5?
>
> We have to make sure that whatever we specify in HTML5 actually is going
> to be useful for the purpose it is intended for. If a feature intended for
> wide-scale automated data extraction is especially susceptible to spamming
> attacks, then it is unlikely to be useful for wide-scale automated data
> extraction.

I've been looking at such concerns a bit for RDFa. One issue (shared
with HTML in general, I think) is user-supplied content, e.g. blog
comments and 'rel=nofollow' scenarios. Is there any way in HTML5 to
indicate that a whole chunk of a Web page is from an (in some
to-be-defined sense) untrusted source?

I see http://www.whatwg.org/specs/web-apps/current-work/#link-type-nofollow

"The nofollow keyword indicates that the link is not endorsed by the 
original author or publisher of the page, or that the link to the 
referenced document was included primarily because of a commercial 
relationship between people affiliated with the two pages."

While I'm unsure whether the "commercial relationship" clause quite
captures what's needed, the basic idea seems sound. Is there any
provision (or plan) for applying this notion to entire blocks of
markup, rather than just to simple hyperlinks? This would be rather
useful for distinguishing embedded metadata that comes from the page
author from metadata that arrives via blog comments or similar.

Thanks for any pointers,

cheers,

Dan

--
http://danbri.org/

Received on Friday, 9 January 2009 16:05:17 UTC