Re: varieties of datatyped tagged literals

On Wed, 2011-09-07 at 13:54 -0700, Gavin Carothers wrote:
> On Wed, Sep 7, 2011 at 12:06 PM, Sandro Hawke <sandro@w3.org> wrote:
> > On Wed, 2011-09-07 at 19:30 +0100, Andy Seaborne wrote:
> >>
> >> On 07/09/11 17:42, Pierre-Antoine Champin wrote:
> >> > Following today's discussion, let me rephrase the rationale of each
> >> > "family" of solution:
> >>
> >> Thanks. Pat gives the details; this is good to discuss the general
> >> intent of each approach.
> >>
> >> >
> >> > 1. Don't change anything: literals will have *either* a datatype or a
> >> > language tag.
> >> >
> >> > In the following options, we unify literals by ensuring that every
> >> > literal has a datatype.
> >> >
> >> > 2. The language tag is still "outside" the (lexical/value) mechanism of
> >> > the datatype; the various sub-options differ in how this
> >> > extra-information is introduced in the system.
> >> >
> >> > In the following options, we unify literals even more by making
> >> > language-tagged literals a special case of datatyped literal.
> >> >
> >> > 3. The language tag is attached to the datatype.
> >> >
> >> > 4. The language tag is attached to the lexical form.
> >>
> >> A RDF 1.0 literal has three parts:
> >>     (lexical form, language tag, datatype)
> >>
> >> with lang and datatype being optional.
> >>
> >> Options 2, 3 and 4 remove the optionality on datatype.
> >>
> >> Option 2 still has optional language tag; there is a single datatype for
> >> lang-tag literals.
> >>
> >> Option 3 removes the lang slot and encodes it into the URI.
> >> (or requires a dereference).
> >>
> >> Option 4 removes the lang slot and encodes it into the lexical form.
> >>
> >> For 3 vs 4, if you emphasize datatypes more than lexical forms, you like
> >> 3 and conversely, if you emphasize lexical forms, 4 is preferable to 3.
> >>
> >> Options 3 and 4 reduce the dimensionality to 2 by encoding.
> >>
> >> All options make language tags "special" in some way.  Option 2 does it
> >> by bypassing L2V; options 3 and 4 rely on micro-parsing (further parsing
> >> a string).
> >
> > Very, very nicely put.   I dislike 2 because it doesn't get us down to
> > two elements.
> 
> We have three elements today, so we don't get two in the future... meh.

Not sure I agree.  In some sense the datatype is already two elements,
since many people think of it as a namespace and an entry in that
namespace.   Option 3 adds more complexity to the datatypes, true, but
it seems to me the complexity is only there for people who need it,
instead of being in the way of people who don't need it.
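
To make the comparison concrete, here is a rough sketch of the three
encodings in Python.  The URIs and the "text@tag" syntax are made up for
illustration only; none of them is a proposal for actual names, and the
real designs may differ in detail.  The point is just where the tag
lives: in its own slot (2), in the datatype URI (3), or in the lexical
form (4).

```python
from typing import NamedTuple, Optional, Tuple

# Hypothetical identifiers, for illustration only; nothing here is agreed.
LANG_STRING = "http://example.org/LangString"  # option 2: one datatype for all tagged literals
LANG_PREFIX = "http://example.org/lang/"       # option 3: tag carried in the datatype URI
TAGGED_TEXT = "http://example.org/TaggedText"  # option 4: tag carried in the lexical form


class Literal(NamedTuple):
    lexical: str
    datatype: str
    lang: Optional[str] = None  # only option 2 keeps this third slot


def option2(text: str, tag: str) -> Literal:
    # Tags are case-insensitive, so normalise to lower case.
    return Literal(text, LANG_STRING, tag.lower())


def option3(text: str, tag: str) -> Literal:
    return Literal(text, LANG_PREFIX + tag.lower())


def option4(text: str, tag: str) -> Literal:
    # "text@tag" is an assumed syntax; the tag follows the last "@".
    return Literal(text + "@" + tag.lower(), TAGGED_TEXT)


def decode(lit: Literal) -> Tuple[str, Optional[str]]:
    """Recover (text, tag) from any of the three encodings."""
    if lit.datatype == LANG_STRING:
        return lit.lexical, lit.lang
    if lit.datatype.startswith(LANG_PREFIX):
        # Micro-parsing the datatype URI.
        return lit.lexical, lit.datatype[len(LANG_PREFIX):]
    if lit.datatype == TAGGED_TEXT:
        # Micro-parsing the lexical form.
        text, _, tag = lit.lexical.rpartition("@")
        return text, tag
    return lit.lexical, None  # any other datatype: no language tag
```

In this sketch only option 2 needs the third field; with 3 and 4 a
literal is a plain (lexical form, datatype) pair, and code that doesn't
care about language never has to look inside either string.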

> >  I prefer 3 over 4 because I think datatype URIs are a
> > better place to do the encoding than data values -- URIs are already
> > full of delimiters and parameters understood by different components.
> 
> http://www.w3.org/DesignIssues/Axioms.html#opaque
> 
> The only thing you can use an identifier for is to refer to an object.
> When you are not dereferencing, you should not look at the contents of
> the URI string to gain other information.
> 
> Recommending the use of non opaque URIs seems like a backwards step.

TimBL wrote that many years ago in response to the trend of people and
software making unwarranted assumptions about the structure of URLs.  In
this case, we're talking about a warranted assumption -- a standard,
even, so the situation is different.   It's more like a namespace, or
the .well-known/genid thing.

I'm fairly confident Tim prefers option 3 here, but he's traveling for
the next few weeks, so I'm not sure I can get a solid answer from him.
If his opinion on this is likely to change anyone's mind, I'm happy to
try to get his attention (or you can email him directly, of course).

   -- Sandro


> --Gavin
> 
> > Forcing the data values to also be parsed doesn't feel right, although I
> > concede it does work.
> >
> >     -- Sandro
> >
> >
> >>       Andy
> >>
> >>
> >
> >
> >
> >
> 

Received on Thursday, 8 September 2011 12:01:18 UTC