Re: ISSUE-143 (Prefixes too complicated): Use of prefixes is too complicated for a Web technology [RDFa 1.1 in HTML5] from Ivan Herman on 2012-11-06 (public-rdfa-wg@w3.org from November 2012)

From: Ivan Herman <ivan@w3.org>
Date: Tue, 6 Nov 2012 10:18:44 -0500
To: "Tab Atkins Jr." <jackalmage@gmail.com>
Cc: Manu Sporny <msporny@digitalbazaar.com>, RDFa Working Group <public-rdfa-wg@w3.org>, Stéphane Corlosquet <scorlosquet@gmail.com>
Message-Id: <D3AE04C1-20E4-40A1-9DD2-1F118ADF8010@w3.org>
Tab,

This is just to explain one bit of RDFa, related to Stéphane's question, that you may not be aware of.

RDFa has the concept of an 'initial context'. The core initial context, which defines a number of prefixes, is defined at:

http://www.w3.org/2011/rdfa-context/rdfa-1.1

Apart from the vocabularies defined by W3C, the prefixes in use are defined through a process documented at:

http://www.w3.org/2010/02/rdfa/profile/data/

and the plan is to collect new data for the list of accepted prefixes from time to time, to document the appearance of new, widely used vocabularies.

Indeed, the difference between what you propose and what is already in place is whether non-registered prefixes would be completely disallowed in your scheme.
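
To make that difference concrete, here is a minimal Python sketch. It is not taken from any RDFa processor: the function name expand_curie and the allow_override flag are made up for illustration, the prefix table is only a small subset of the initial context (check the document above for the authoritative list), and the "no rebinding" branch is just one possible reading of your option #1. It ignores terms, @vocab, absolute IRIs, and prefix case handling.

```python
# Illustrative subset of the RDFa 1.1 core initial context.
# The authoritative list is the initial context document itself.
INITIAL_CONTEXT = {
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "schema": "http://schema.org/",
}


def expand_curie(curie, local_prefixes, allow_override=True):
    """Expand 'prefix:reference' to a full IRI, or return None.

    allow_override=True  models the current behaviour: a prefix declared
                         in the document wins over the initial context.
    allow_override=False models a hypothetical reading of option #1:
                         predefined prefixes cannot be rebound locally.
    """
    prefix, sep, reference = curie.partition(":")
    if not sep:
        return None  # not a CURIE; terms are out of scope for this sketch

    if allow_override:
        # Later entries win, so local declarations shadow the defaults.
        mapping = {**INITIAL_CONTEXT, **local_prefixes}
    else:
        # Defaults win, so a local rebinding of a predefined prefix is ignored.
        mapping = {**local_prefixes, **INITIAL_CONTEXT}

    base = mapping.get(prefix)
    return None if base is None else base + reference


# Stéphane's example, with a trailing slash added to the namespace (an assumption).
local = {"dc": "http://mynamespace.org/mydc/"}
print(expand_curie("foaf:name", {}))                                # predefined, no declaration needed
print(expand_curie("dc:customprop", local))                         # local declaration wins
print(expand_curie("dc:customprop", local, allow_override=False))   # rebinding ignored
print(expand_curie("ex:thing", {}))                                 # unknown prefix: None
```

In both modes a prefix that is neither predefined nor declared yields nothing, and a locally declared, non-registered prefix still resolves; the two modes only disagree when a document rebinds a predefined prefix.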

Ivan

On Nov 6, 2012, at 10:06 , Stéphane Corlosquet wrote:

> Hi Tab,
> 
> On Mon, Nov 5, 2012 at 7:16 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
> On Wed, Oct 24, 2012 at 10:37 AM, Manu Sporny <msporny@digitalbazaar.com> wrote:
> > On 10/24/12 12:30, RDFa Working Group Issue Tracker wrote:
> >> ISSUE-143 (Prefixes too complicated): Use of prefixes is too
> >> complicated for a Web technology [RDFa 1.1 in HTML5]
> >>
> >> http://www.w3.org/2010/02/rdfa/track/issues/143
> >
> > Hi Tab,
> >
> > The RDFa WG has officially recorded your formal objection for the
> > HTML+RDFa 1.1 specification. We're tracking it in our issue tracker now.
> > Could you please outline one or more proposals that would result in the
> > withdrawal of your formal objection?
> 
> Yes.
> 
> As outlined in the original threads that introduced this issue, usage
> in the wild shows that authors very commonly author "invalid" markup
> which uses a common prefix without specifying the prefix.  Consumers
> have evolved to recognize these common prefixes without the
> declaration, and in some (most?) cases may actually ignore the
> declaration entirely and simply always assume that the common prefix
> translates to the common URL.
> 
> This presents us with several problems:
> 
> 1. Authors appear to usually use only a handful of common prefixes,
> and assign intrinsic meaning to these prefixes.  This suggests that
> the indirection of prefixes may be too complex and unnecessary in the
> first place, and we would be better served by just treating the
> prefixes themselves as meaningful, rather than as a shortener for the
> "real" meaningful things, the URLs.
> 
> 2. The developers of consumers either *also* share this
> misunderstanding, or just don't find it worthwhile to be correct when
> they can do just as well in practice by treating the prefix as
> meaningful.  This suggests that there may be a real interoperability
> danger if an author *properly* declares a prefix where the prefix is a
> common one, but the URL is to something other than what common use
> points to - in "correct" consumers the document will be interpreted as
> the author intended, but in many common consumers it will instead be
> misinterpreted to be using the common vocabulary rather than what the
> author intended.
> 
> 3. In addition to the theoretical interop problem above, we have a
> real interop problem already - many consumers will happily consume
> pages that don't declare their prefix, as long as they use a
> "well-known" prefix for it.  A conformant consumer, on the other hand,
> would *not* do so, and would find no valid data on the pages.  You
> have to reverse-engineer the web to find out which prefixes need to be
> supported without a declaration, and what URL they should be bound to.
>  This is an obvious failure mode of a standard.
> 
> There are two possible changes that would resolve my objection:
> 
> 1. Discover and document the common prefixes in use, define them to
> always be bound to the URL they're commonly bound to, even without an
> actual declaration, and don't allow them to be bound to a URL other
> than that predefined one.
> 
> In your suggestion above, to be crystal clear, are you implying that any "common" predefined prefix could not be overridden locally? For example, taking 'dc' as a common prefix which is usually bound to the Dublin Core namespace, would this markup be invalid then?
> 
> <html prefix="dc: http://mynamespace.org/mydc">
> ....
> <span property="dc:customprop">value</span>
> 
> Right? Or rather, would you simply ignore the override at the top?
> 
> 
> 2. Drop the indirection of prefixes entirely, and simply declare that
> prefixes themselves are meaningful.  Predefine the common prefixes in
> use.
> 
> What about the non-common prefixes that are either not yet known, not yet in existence, or too "niche" to be documented in such a spec? RDFa is designed to allow people to define their own vocabularies; some of them might not even be shared publicly if, say, they only apply to internal corporate schemas.
> 
> Steph.
>  
> 
> Either would be acceptable, though I greatly prefer #2.  I argue that
> #2 is perfectly acceptable for two reasons:
> 
> 1. If people adopted the convention of simply using their domain name
> (quite reasonable, I think, and likely more-or-less what people will
> naturally use anyway), it would convey the exact same meaning and
> uniqueness as a full URL, but with less typing - "http://foo.com" is
> 11 characters longer than "foo".
> 
> 2. This does not harm the ability of generic consumers to process
> data.  The URL that a prefix is bound to has no official meaning
> anyway - it's solely a uniquifying mechanism - so generic consumers can
> infer nothing from it in the general case.  They can do exactly as
> much with a non-URL prefix.  When a consumer *does* know what the URL
> means (it's a vocabulary it recognizes), it can do something special
> (inferring defaults, etc.), but it can do the exact same thing when it
> knows what a particular prefix means (which is what consumers do
> today).
> 
> However, if #2 is for whatever reason unacceptable, #1 is the *bare
> minimum* that needs to be done for the RDFa spec to document reality,
> such that a consumer can follow the spec and reasonably expect to
> correctly consume content already on the web.  If this is not done,
> the RDFa spec is vastly less useful, and shouldn't be pursued.
> 
> ~TJ
> 
> 
> 
> 
> -- 
> Steph.


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Tuesday, 6 November 2012 15:19:15 UTC