[whatwg] Trying to work out the problems solved by RDFa from Charles McCathieNevile on 2009-01-02 (public-whatwg-archive@w3.org from January 2009)

From: Charles McCathieNevile <chaals@opera.com>
Date: Fri, 02 Jan 2009 17:12:55 +1100
Message-ID: <op.um38vt1gwxe0ny@widsithpro.lan>
On Fri, 02 Jan 2009 05:43:05 +1100, Andi Sidwell <andi at takkaria.org> wrote:

> On 2009-01-01 15:24, Toby A Inkster wrote:
>> The use cases for RDFa are pretty much the same as those for  
>> Microformats.
>
> Right, but microformats can be used without any changes to the HTML  
> language, whereas RDFa requires such changes.  If they fulfill the same  
> use cases, then there's not much point in adding RDFa.
...
>> So why RDFa and not Microformats?

(I think the question should be why RDFa is needed *as well as* µformats)

>> Firstly, RDFa provides a single unified parsing algorithm that
>> Microformats do not. ...

> This is not necessarily beneficial.  If you have separate parsing  
> algorithms, you can code in shortcuts for common use-cases and thus  
> optimise the authoring experience.

On the other hand, you cannot parse information until you know how it is  
encoded, and information encoded in RDFa can be parsed without knowing  
more.

And not only can you optimise your parsing for a given algorithm, you can  
also do so for a known vocabulary - or you can optimise the post-parsing  
treatment.

>  Also, as has been pointed out before in the distributed extensibility  
> debate, parsing is a very small part of doing useful things with content.

Yes. However, many of the use cases that I think justify the inclusion of  
RDFa are very small on their own, and become valuable when several  
vocabularies are combined. So being able to do off-the-shelf parsing is  
valuable, compared to working out how to parse a combination of formats  
together.
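
To illustrate the off-the-shelf point, here is a minimal sketch of a
parser that pulls statements out of markup without knowing anything about
the vocabularies involved. It is a deliberately simplified stand-in, not
the RDFa processing rules: it only looks at about, property and content
attributes, treats property values as full URIs (no prefix handling), and
does not resolve relative URIs. The two vocabularies and all example.org
URIs are made up for illustration.

```python
from html.parser import HTMLParser

# Elements that never get an end tag, so they must not open a scope.
VOID = {"meta", "link", "img", "br", "hr", "input"}


class TripleCollector(HTMLParser):
    """Collect (subject, property, value) triples, vocabulary-agnostic."""

    def __init__(self, base):
        super().__init__()
        self.subjects = [base]  # stack of subjects currently in scope
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        # An about attribute sets a new subject; otherwise inherit it.
        subject = a.get("about") or self.subjects[-1]
        if "property" in a and "content" in a:
            self.triples.append((subject, a["property"], a["content"]))
        if tag not in VOID:
            self.subjects.append(subject)

    def handle_endtag(self, tag):
        # Assumes well-formed markup: each end tag closes one scope.
        if tag not in VOID and len(self.subjects) > 1:
            self.subjects.pop()


def extract(html, base):
    parser = TripleCollector(base)
    parser.feed(html)
    parser.close()
    return parser.triples


# Two unrelated (hypothetical) vocabularies describing the same thing.
page = """
<div about="http://example.org/iguanas/42">
  <span property="http://example.org/iguana#species" content="Iguana iguana"></span>
  <span property="http://example.org/catalogue#shelf" content="B3"></span>
</div>
"""
triples = extract(page, "http://example.org/page")
```

The same extract() call handles both vocabularies; anything
vocabulary-specific happens afterwards, on the collected triples.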

>> Secondly, as the result of having one single parsing algorithm,
>> decentralised development is possible. If I want a way of marking up my
>> iguana collection semantically, I can develop that vocabulary without
>> having to go through a central authority.
>
> You can develop vocabularies without going through a central authority  
> already, via class or id, and many people already do.
>
>> Because URIs are used to
>> identify vocabulary terms, I can be sure that my vocabulary won't clash
>> with other people's vocabularies.
>
> Again, you can do this with class, by putting your domain name in the  
> class attribute.  It also depends on how much of an issue you think  
> clashes will be with an iguana collection-- I would suggest that due to  
> the specialised nature of the markup, clashes would be quite unlikely.

It depends how many people work on iguana collections - or Old Norse and  
Anglo-Saxon text, which was the use case that got me involved in the Web  
in the very early 90s. It turns out that people in the µformats world  
don't use unambiguous names, especially when they are privately  
developing their own information. By contrast, those who come from an RDF  
world do this by habit.

>> It can be argued that going through a
>> community to develop vocabularies is beneficial, as it allows the
>> vocabulary to be built by "many minds" - RDFa does not prevent this, it
>> just gives people alternatives to community development.
>
> RDFa does not give anything over what the class attribute does in terms  
> of community vs individual development, so this doesn't really speak in  
> RDFa's favour.

In principle no, but in real world usage the class attribute is considered  
something that is primarily local, whereas RDFa is generally used by  
people who have a broader outlook on the desirable permanence and  
re-usability of their data.

>> Lastly, there are a lot of parsing ambiguities for many Microformats.
>> One area which is especially fraught is that of scoping. The editors of
>> many current draft Microformats[1] would like to allow page authors to
>> embed licensing data - e.g. to say that a particular recipe for a pie is
>> licensed under a Creative Commons licence. However, it has been noted
>> that the current rel=license Microformat can not be re-used within these
>> drafts, because virtually all existing rel=license implementations will
>> just assume that the license applies to the whole page rather than just
>> part of it. RDFa has strong and unambiguous rules for scoping - a
>> license, for example, could apply to a section of the page, or one
>> particular image.
>
> Are there other cases where this granularity of scoping would be  
> genuinely helpful?  If not, it would seem better to work out a solution  
> for scoping licence information...

Yes.

Being able to describe accessibility of various parts of content, or point  
to potential replacement content for particular use cases, benefits  
enormously from such scoping (this is why people who do industrial-scale  
accessibility often use RDF as their infrastructure). ARIA has already  
taken the approach of looking for a special-purpose way to do this, which  
significantly bloats HTML but at least allows important users to satisfy  
their need to produce content with certain information included.

Governments and large enterprises produce content that needs to be  
maintained, and being able to include production, cataloguing, and similar  
metadata directly, scoped to the document, would be helpful. As a trivial  
example, it would be useful to me in working to improve the Web content we  
produce at Opera to have a nice mechanism for identifying the original  
source of various parts of a page.

> What would you do with scoped copyright information, anyway?  I can see  
> images being an issue, but ideally information about a resource should  
> be kept in that resource, and as such the licence should be embedded in  
> the image rather than given by a Web page.  In the case of particular  
> sections having particular licences, is there any practical use of  
> marking up different sections with different licences over just doing  
> that with text?

Mash-ups. If they have a use-case, and I think it is widely accepted that  
they do, then it would seem obvious that being able to identify the source  
of each part, and any conditions that vary between different sources, is a  
use case.
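
A rough sketch of what that could look like for a consumer of a mash-up,
assuming each part of the page is wrapped in an element carrying an about
attribute and that links inside it apply to that part only. This is a
simplification for illustration, not the RDFa processing rules; the
rel="source" value, the fragment identifiers and the example URIs are all
hypothetical.

```python
from html.parser import HTMLParser

# Elements that never get an end tag, so they must not open a scope.
VOID = {"meta", "link", "img", "br", "hr", "input"}


class ScopedLinks(HTMLParser):
    """Collect (subject, rel, href), scoped to the nearest about= ancestor."""

    def __init__(self, base):
        super().__init__()
        self.subjects = [base]  # the page itself is the outermost scope
        self.links = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        subject = a.get("about") or self.subjects[-1]
        if a.get("rel") and a.get("href"):
            for rel in a["rel"].split():
                self.links.append((subject, rel, a["href"]))
        if tag not in VOID:
            self.subjects.append(subject)

    def handle_endtag(self, tag):
        # Assumes well-formed markup: each end tag closes one scope.
        if tag not in VOID and len(self.subjects) > 1:
            self.subjects.pop()


def scoped_links(html, base):
    parser = ScopedLinks(base)
    parser.feed(html)
    parser.close()
    return parser.links


mashup = """
<div about="#recipe">
  <p>Pie recipe text</p>
  <a rel="license" href="http://example.org/licences/share-alike">licence</a>
  <a rel="source" href="http://example.org/pies">source</a>
</div>
<div about="#photo">
  <img src="pie.jpg" alt="a pie">
  <a rel="license" href="http://example.com/licences/all-rights-reserved">licence</a>
</div>
<a rel="license" href="http://example.net/licences/page-default">page licence</a>
"""
links = scoped_links(mashup, "http://example.net/mashup")

# One licence per part, plus the default that applies to the page itself.
licences = {subject: href for subject, rel, href in links if rel == "license"}
```

A parser that only understands page-level rel=license would report three
conflicting licences for the one page; with scoping, each licence stays
attached to the part it came with.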

cheers

Chaals

-- 
Charles McCathieNevile  Opera Software, Standards Group
     je parle français -- hablo español -- jeg lærer norsk
http://my.opera.com/chaals       Try Opera: http://www.opera.com
Received on Thursday, 1 January 2009 22:12:55 UTC