Re: Clarifying CURIEs [was Re: RDFa - Dublin Core Metadata - [Fwd: Draft of revised version of Expressing DC in X/HTML meta/link elements]] from Ivan Herman on 2007-08-10 (public-rdf-in-xhtml-tf@w3.org from August 2007)

From: Ivan Herman <ivan@w3.org>
Date: Fri, 10 Aug 2007 13:52:51 +0200
To: Mark Birbeck <mark.birbeck@formsPlayer.com>
CC: Dan Brickley <danbri@danbri.org>, Niklas Lindström <lindstream@gmail.com>, W3C RDFa task force <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <46BC5193.6060009@w3.org>
Mark,

what you propose is, in fact, orthogonal to the other issue.

- I fully agree that, for example, the 'next' should be used with the
xhtml namespace (or some similar one). I think that is true _regardless_
of the '.' issue, and I should have raised that before on the mailing list.

- I am not sure, however, how your proposal handles, say, the OpenID or
Dublin Core cases. Does it mean that we build in, as 'known', all the DC
terms? That would create many more problems, because every implementation
would have to know all the Dublin Core terms (and we are talking about a
moving target here!). What about OpenID? Would we pre-build those, too?
This is a bit of a mess, and it requires precise documentation (which may
become a pain in the back), etc.
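
To make the concern concrete, here is a minimal sketch of what a 'known
values' table might look like. It is only an illustration, not anybody's
actual implementation: the names KNOWN_REL_VALUES and resolve_known_rel are
hypothetical, the XHTML vocabulary URI is an assumption, and only the Dublin
Core namespace URI is taken from this thread.

```python
# Hypothetical table of @rel values a parser would have to "know" in advance.
# XHTML_VOCAB is an assumed namespace URI; DC is the Dublin Core elements
# namespace quoted elsewhere in this thread.
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"
DC = "http://purl.org/dc/elements/1.1/"

KNOWN_REL_VALUES = {
    "next": XHTML_VOCAB + "next",
    "dc.creator": DC + "creator",
    "dc.rights": DC + "rights",
}


def resolve_known_rel(value):
    """Return the predicate URI for a predefined @rel value, or None."""
    # The '.' carries no meaning here: the whole string is just a lookup key.
    return KNOWN_REL_VALUES.get(value)
```

With such a table, resolve_known_rel("dc.creator") yields a DC predicate, but
resolve_known_rel("dc.title") or resolve_known_rel("openid.server") yields
nothing until somebody extends the table, which is exactly the maintenance
burden I am worried about.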

Again, I agree and understand that the '.' notation is not nice to have
but, in this case, I believe pragmatism has a value. I think Dan's
formulation is the best one: we make it clear that we accept those in
the header part as deprecated, but still existing features. Not in the
body, though.
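
A minimal sketch of that rule, under stated assumptions: the function name
expand and the in_head flag are hypothetical, and the prefix dictionary is
simply handed in by the caller, because how a prefix gets bound for the dot
form is not settled in this thread. Values are split at ':' everywhere, and
at '.' only for elements in the header, with a deprecation warning.

```python
import warnings


def expand(value, prefixes, in_head=False):
    """Expand 'prefix:name' (or, in the head only, 'prefix.name') to a URI.

    Returns None when the value has no usable separator or when the
    prefix is not declared in the `prefixes` dictionary.
    """
    if ":" in value:
        prefix, local = value.split(":", 1)
    elif "." in value and in_head:
        # Legacy form, tolerated only for <link>/<meta> in the header.
        warnings.warn("dot notation is deprecated; use ':' instead",
                      DeprecationWarning)
        prefix, local = value.split(".", 1)
    else:
        return None
    namespace = prefixes.get(prefix)
    if namespace is None:
        return None
    # Plain concatenation of the namespace URI and the local part.
    return namespace + local
```

For example, with prefixes = {"dc": "http://purl.org/dc/elements/1.1/"},
expand("dc.creator", prefixes, in_head=True) gives the same URI as
expand("dc:creator", prefixes), whereas the dot form in the body
(in_head=False) is not expanded at all.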

Ivan

Mark Birbeck wrote:
> Hi Ivan,
> 
> I do think that Dan and Niklas have a point about whether we should be
> too accommodating to awkward syntax that has the negative consequence
> of making things more difficult to learn. I know the 'pragmatism'
> argument gets used a lot in these discussions, and I don't really want
> to go into it here, but I would just say that the 'pragmatic approach'
> (the idea that it is difficult to change people's habits) is what gave
> us the GRDDL approach, not the RDFa one. I don't know what others
> think, but reading this list at the moment, I'm getting a strong sense
> of a healthy willingness on the part of many people to look at how
> they can use RDFa to do things in an interoperable way.
> 
> Anyway, putting that aside, I have another proposal that might tread a
> middle ground; how about we don't create a generalised syntax such as
> one that uses the '.' notation, but instead have a set of predefined
> values that our parsers 'know' about? This is what we're having to do
> anyway for the predefined HTML @rel values, since if they were treated
> as CURIEs they would be relative to the document and not HTML. I.e.:
> 
>   <link rel="next" href="..." />
> 
> doesn't give a predicate of:
> 
>   <next>
> 
> but:
> 
>   xh:next
> 
> This works because we 'know' about the value, so why not just make a
> list of values that we recognise from day one, with the '.' notation
> not actually meaning anything in particular. We could still propose
> that we don't allow any more values after that, and that the
> recommended way to create new values is to use the new--i.e.,
> RDFa--syntax.
> 
> All of this could be done in a preprocessing step which might just be
> Ben's hGRDDL proposal from before.
> 
> Regards,
> 
> Mark
> 
> On 10/08/07, Ivan Herman <ivan@w3.org> wrote:
>>
>> Dan Brickley wrote:
>>> Ivan Herman wrote:
>>>> Niklas,
>>>>
>>>> I share your uneasiness, and I agree that the acceptance of the
>>>> rel="dc.creator" type of mechanism is less clean. But we have to face
>>>> realities (and, with all my respect to Ian and eRDF, this has nothing to
>>>> do with eRDF!). The fact of the matter is that the
>>>>
>>>> <link rel="dc.copyright" href="blabla"/>
>>>>
>>>> type of markup in HTML pages to include basic DC terms has become very
>>>> widespread indeed. To take an example, this is what the W3C home page
>>>> uses:-) Of course, those can be (and are) understood and interpreted by
>>>> GRDDL, but I think it would be a shame if an RDFa-processed XHTML file
>>>> did not understand those perfectly valid metadata or if, as in the DCMI
>>>> document and mail I forwarded, the DCMI recommendations decided to
>>>> ignore RDFa.
>>> I don't know. We still get a *lot* of grief for RDF/XML having so many
>>> syntactic variants, which were added in 97/8 each for fine reasons. RDFa
>>> is already complex; the more we add, the harder it is going to be to
>>> learn. We always also have GRDDL to fall back on. Could GRDDL be
>>> involved by citing the XSLT from the documents that the link/rel/href
>>> constructs point to?
>>>
>> Of course. As I say, the W3C home page uses that mechanism, as well as
>> an XSLT to produce an RSS feed...
>>
>>> Before changing RDFa here, I'd like to know: 1. just how many documents
>>> are we talking about
>> Well, this is what the DCMI currently proposes, ie, potentially a lot.
>>
>> http://dublincore.org/architecturewiki/DCXHTMLGuidelines/2007-07-27
>>
>> We could of course convince them to use the RDFa notation but I am not
>> sure that will happen.
>>
>>>                        2. how many of them are XHTML 3. whether the
>>> resulting RDF would be any good. For example, there is no such thing as
>>> dc:copyright or dc.copyright (dc:rights is the closest).
>>>
>> Dan, this was just the example I put in writing the mail! Forget about
>> the details, call it dc.blabla!:-)
>>
>>>> Dublin Core is not the only metadata that uses this mechanism. For
>>>> example, if I use OpenID, I am supposed to add the following <link>
>>>> elements into my HTML header (that I can then use as my OpenID URI):
>>>>
>>>> <link rel="openid.server" href="http://pip.verisignlabs.com/server" />
>>>> <link rel="openid.delegate"
>>>> href="http://ivanherman.pip.verisignlabs.com/" />
>>>>
>>>> It would be perfectly o.k. to have RDF statements on your page about
>>>> OpenID, but we hit the same traditional data format!
>>> It is (I guess, can't promise :) not too late for the OpenID community
>>> to adopt and evangelise a more modern syntax. But to be fair, RDFa isn't
>>> finished yet, so we can't blame them for using a more "traditional" format.
>>>
>> Yes, exactly:-)
>>
>>>> However, there is an alternative to make the pill less bitter. We can:
>>>> accept the 'dot notation' as an alternative to the 'colon notation' in
>>>> the <head> only, ie, essentially for <link> and <meta> elements only.
>>>> These would allow RDFa to include the traditional metadata formats.
>>> That seems healthy (perhaps also mark it explicitly "for backwards
>>> compatibility" or even deprecated?)
>> O.k. I can perfectly live with this; actually, I do think this is indeed
>> better and cleaner than what I originally said. I hereby change my
>> position:-) and keep it to the <head> only!
>>
>>>> None of these create any implementation problem. As I am in an
>>>> implementation mood these days:-) I could add these features to my
>>>> implementation in about 30 minutes (with testing) without any particular
>>>> problem, and a change to the alternative above would not be an issue
>>>> either. But I do not think ignoring the issue is good for us.
>>> Changes to parsers aren't the only cost. We also have to think about
>>> people learning to read this stuff. Any new rule is still a new rule.
>>>
>> Sure. But with the caveat of marking it as 'deprecated' I think we can
>> live with it...
>>
>> Ivan
>>
>>
>>> cheers,
>>>
>>> Dan
>>>
>>>> My two pence...
>>>>
>>>> Thanks
>>>>
>>>> Ivan
>>>>
>>>>
>>>> Niklas Lindström wrote:
>>>>> Hello!
>>>>>
>>>>> This is regarding the alternative mechanisms of a "dot-notation"
>>>>> shorthand and namespace binding by "schema.*" ("schema-dot"). It
>>>>> really bakes my noodle. :)
>>>>>
>>>>> Doesn't this amount to the incorporation of eRDF into RDFa? While that
>>>>> *may* be fortunate (since eRDF is in use today), something makes me
>>>>> feel very uneasy about it. I think it introduces a bit of
>>>>> arbitrariness at the core, which among other things (probably) makes
>>>>> it harder to learn.
>>>>>
>>>>> In my opinion, eRDF is a competent but less powerful alternative to
>>>>> RDFa, with different (though quite overlapping) sets of features and
>>>>> trade-offs. Although eRDF works with HTML 4.01 and RDFa does not
>>>>> (right?), which could merit this, I'd still like to stay on the XHTML
>>>>> side of things. Having the current syntax for CURIEs and namespace
>>>>> declarations as the only one is a big part of that IMHO.
>>>>>
>>>>> While having RDFa capable of coexisting with microformats (and eRDF)
>>>>> is a goal, I see this as legacy compatibility (I would never mix them
>>>>> if I could avoid it). If needed, GRDDL.
>>>>>
>>>>> What is left is how to handle e.g.:
>>>>>
>>>>>     <link rel="dc.creator" href="http://example.org/Fred" />
>>>>>
>>>>> I felt a little chill thinking about it (reminiscing the issues with
>>>>> mingling of values in @class when it was the proposed rdf:type
>>>>> shorthand). Would this lead to the generation of a nonsensical triple
>>>>> by today's rules? It is of course directly related to the issue of how
>>>>> to handle non-prefixed names. I have silently discarded it with a
>>>>> thought like "nah, non-prefixed names will be ignored", but that
>>>>> hasn't been resolved, has it? Nor what happens when undeclared
>>>>> prefixes are used?
>>>>>
>>>>> I think of RDFa as a pristine approach. If reasonable, as mentioned
>>>>> above, all kinds of things should peacefully coexist without any
>>>>> meaning to RDFa (hence the exclusion of @class value parsing). But
>>>>> @rel="dc.creator" is definitely too close to home to be left
>>>>> unconsidered. As is:
>>>>>
>>>>>     <link rel="schema.dc" href="http://purl.org/dc/elements/1.1/" />
>>>>>
>>>>> ; a mechanism which looks very odd to me in an RDFa context (too
>>>>> ad-hoc, for one; raising nesting/scope issues as well).
>>>>>
>>>>> I think I'd prefer if these constructs just aren't meaningful in the
>>>>> RDFa sense. eRDF is for GRDDL, as I see it.
>>>>>
>>>>> I am very open to more debate though (and this is a real issue). The
>>>>> thing to really consider is that if "dot-notation" is interpreted but
>>>>> the "schema-dot" namespace declaration is *not*, the handling of
>>>>> undeclared prefix usage is of utmost importance ('schema'..). And if
>>>>> none is of meaning, how shall non-prefixed values in @rel/@rev work?
>>>>>
>>>>> (I might actually end up *supporting* this for pragmatic reasons. But
>>>>> only with a desire to explicitly state that it would be for legacy
>>>>> reasons and discouraged when authoring new XHTML+RDFa. I'm in need of
>>>>> more convincing though.)
>>>>>
>>>>>
>>>>> [Side-note: If non-prefixed names in @rel/@rev *do* mean "default
>>>>> namespace + name", wouldn't that -- since that namespace is almost
>>>>> invariably <http://www.w3.org/1999/xhtml> -- lead to inadvertent
>>>>> references to an unmanaged set of URIs all beginning with
>>>>> "http://www.w3.org/1999/xhtml"? For this alone, I vote for ignoring
>>>>> them. Allowing ":somename" - perhaps. At least it would be (more)
>>>>> intentional. My guess is that many host language namespaces for RDFa
>>>>> will not also be RDF vocabularies. Predefineds like @rel="alternate"
>>>>> in (x)html is another thing I think. Are they even concepts prefixed
>>>>> by the xhtml syntax namespace? That's reasonably another debate
>>>>> though.]
>>>>>
>>>>>
>>>>> Best regards,
>>>>> Niklas
>>>>>
>>>>>
>>>>>
>>>>> On 8/9/07, Ivan Herman <ivan@w3.org> wrote:
>>>>>> O.k. That is actually in line with what I said and did. I always
>>>>>> referred to the value of @rev, @rel, @property, @instanceof as 'sort
>>>>>> of' qnames, and I think what I meant (and the way I implemented it) is
>>>>>> exactly what you say: I split the string at the ':' point, take the
>>>>>> left part as an index into an associative array ('dictionary' in
>>>>>> Python speak) and simply concatenate the value with what is on the
>>>>>> right hand side of ':'. Adding an alternative branch which does that
>>>>>> for '.' instead of ':' is a piece of cake (and I have already added it
>>>>>> to the code locally...:-)
>>>>>>
>>>>>> However. We clearly would need proper TF resolution on these issues.
>>>>>>
>>>>>> Thanks Mark
>>>>>>
>>>>>> I.
>>>>>>
>>>>>> Mark Birbeck wrote:
>>>>>>> Hi Ivan,
>>>>>>>
>>>>>>> We might as well start another thread. :)
>>>>>>>
>>>>>>> The CURIEs spec is not actually about the square bracket stuff--that's
>>>>>>> just a way to disambiguate between a CURIE (compact URI) and a URI.
>>>>>>>
>>>>>>> The idea of CURIEs is that instead of using QNames as a URI
>>>>>>> abbreviation syntax, we use something that is specifically designed
>>>>>>> for the job. The problem with the definition of 'QName' is that it is
>>>>>>> essentially about defining element and attribute names in XML:
>>>>>>>
>>>>>>>   <a:b c:d="x" />
>>>>>>>
>>>>>>> However, over the years QNames have been used to abbreviate URIs
>>>>>>> (RDF/XML) and to namespace various features like functions (XPath) and
>>>>>>> data types (XML Schema). At least those uses are inside XML documents,
>>>>>>> but the use of QNames in SPARQL is genuinely odd, since it has nothing
>>>>>>> to do with XML.
>> --
>>
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf
>>
>>
>>
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Friday, 10 August 2007 11:53:00 UTC