Re: varieties of datatyped tagged literals

On 9/8/2011 8:01 AM, Sandro Hawke wrote:
> On Wed, 2011-09-07 at 13:54 -0700, Gavin Carothers wrote:
>> On Wed, Sep 7, 2011 at 12:06 PM, Sandro Hawke <sandro@w3.org> wrote:
>>> On Wed, 2011-09-07 at 19:30 +0100, Andy Seaborne wrote:
>>>>
>>>> On 07/09/11 17:42, Pierre-Antoine Champin wrote:
>>>>> Following today's discussion, let me rephrase the rationale of each
>>>>> "family" of solutions:
>>>>
>>>> Thanks. Pat gives the details; it is good to discuss the general
>>>> intent of each approach.
>>>>
>>>>>
>>>>> 1. Don't change anything: literals will have *either* a datatype or a
>>>>> language tag.
>>>>>
>>>>> In the following options, we unify literals by ensuring that every
>>>>> literal has a datatype.
>>>>>
>>>>> 2. The language tag is still "outside" the (lexical/value) mechanism of
>>>>> the datatype; the various sub-options differ in how this
>>>>> extra-information is introduced in the system.
>>>>>
>>>>> In the following options, we unify literals even more by making
>>>>> language-tagged literals a special case of datatyped literal.
>>>>>
>>>>> 3. The language tag is attached to the datatype.
>>>>>
>>>>> 4. The language tag is attached to the lexical form.
>>>>
>>>> A RDF 1.0 literal has three parts:
>>>>      (lexical form, language tag, datatype)
>>>>
>>>> with lang and datatype being optional.
>>>>
>>>> Options 2, 3 and 4 remove the optionality on datatype.
>>>>
>>>> Option 2 still has optional language tag; there is a single datatype for
>>>> lang-tag literals.
>>>>
>>>> Option 3 removes the lang slot and encodes it into the URI.
>>>> (or requires a dereference).
>>>>
>>>> Option 4 removes the lang slot and encodes it into the lexical form.
>>>>
>>>> For 3 vs 4, if you emphasise datatypes more than lexical forms, you like
>>>> 3 and conversely, if you emphasise lexical forms, 4 is preferable to 3.
>>>>
>>>> Options 3 and 4 reduce the dimensionality to 2 by encoding.
>>>>
>>>> All options make language tags "special" in some way.  Option 2 does it
>>>> by bypassing L2V; options 3 and 4 rely on micro-parsing (further parsing a
>>>> string).
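
To make the three shapes concrete, here is a rough sketch of how one
language-tagged literal might be carried under options 2, 3 and 4. The
datatype URIs, the "@" separator in the lexical form, and all function names
are hypothetical, invented for this illustration; nothing in the thread fixes
them.

```python
from typing import NamedTuple, Optional

# Hypothetical datatype URIs, invented for this sketch only.
LANG_DT = "http://example.org/dt/langString"      # option 2: one shared datatype
LANG_DT_PREFIX = "http://example.org/dt/lang/"    # option 3: tag appended to the URI
TAGGED_DT = "http://example.org/dt/taggedString"  # option 4: tag inside the lexical form


class Literal(NamedTuple):
    """RDF 1.0 shape: lang and datatype are both optional."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


# The three converters below assume lit.lang is set (a language-tagged literal).

def as_option2(lit: Literal) -> tuple:
    # Still three slots; the datatype is now always present.
    return (lit.lexical, lit.lang, LANG_DT)


def as_option3(lit: Literal) -> tuple:
    # Two slots; the tag is encoded into the datatype URI.
    return (lit.lexical, LANG_DT_PREFIX + lit.lang)


def as_option4(lit: Literal) -> tuple:
    # Two slots; the tag is encoded into the lexical form.
    return (lit.lexical + "@" + lit.lang, TAGGED_DT)


def lang_from_option3(datatype: str) -> Optional[str]:
    # Micro-parsing the URI string to get the tag back.
    if datatype.startswith(LANG_DT_PREFIX):
        return datatype[len(LANG_DT_PREFIX):]
    return None


def split_option4(lexical: str) -> tuple:
    # Micro-parsing the lexical form; split on the last "@" because the
    # text itself may contain "@" while a language tag cannot.
    text, _, lang = lexical.rpartition("@")
    return (text, lang)
```

Under these assumptions, options 3 and 4 both need a second parsing step to
recover the tag; they differ only in which string gets parsed.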
>>>
>>> Very, very nicely put.   I dislike 2 because it doesn't get us down to
>>> two elements.
>>
>> We have three elements today, so we don't get two in the future... meh.
>
> Not sure I agree.  In some sense the datatype is already two elements,
> since many people think of it as a namespace and an entry in that
> namespace.

In what way do people use current datatypes like that?

>  Option 3 adds more complexity to the datatypes, true, but
> it seems to me the complexity is only there for people who need it,
> instead of being in the way of people who don't need it.

I think the sample code for checking if a literal is a string shows that 
the complexity comes through almost no matter what.
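
The sample code itself is not reproduced in this message. As a rough
illustration only, a check of that kind might look like the following. The
language datatype URIs are hypothetical, and treating plain literals as
xsd:string under options 2 to 4 is an assumption of this sketch, not something
settled in the thread.

```python
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Hypothetical datatype URIs, invented for this sketch only.
LANG_DT = "http://example.org/dt/langString"      # option 2
LANG_DT_PREFIX = "http://example.org/dt/lang/"    # option 3
TAGGED_DT = "http://example.org/dt/taggedString"  # option 4


def is_string_rdf10(datatype: Optional[str]) -> bool:
    # RDF 1.0: plain literals (with or without a language tag) have no
    # datatype at all; typed strings use xsd:string.
    return datatype is None or datatype == XSD_STRING


def is_string_option2(datatype: str) -> bool:
    # One extra datatype to compare against.
    return datatype in (XSD_STRING, LANG_DT)


def is_string_option3(datatype: str) -> bool:
    # One datatype per language tag, so the check needs a prefix match.
    return datatype == XSD_STRING or datatype.startswith(LANG_DT_PREFIX)


def is_string_option4(datatype: str) -> bool:
    # One extra datatype; the tag still has to be split out of the
    # lexical form before the text can be used.
    return datatype in (XSD_STRING, TAGGED_DT)
```

In every variant the caller has to know about the language case explicitly,
even if all it wants is "give me the strings".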

Lee

>>>   I prefer 3 over 4 because I think datatype URIs are a
>>> better place to do the encoding than data values -- URIs are already
>>> full of delimiters and parameters understood by different components.
>>
>> http://www.w3.org/DesignIssues/Axioms.html#opaque
>>
>> The only thing you can use an identifier for is to refer to an object.
>> When you are not dereferencing, you should not look at the contents of
>> the URI string to gain other information.
>>
>> Recommending the use of non-opaque URIs seems like a backwards step.
>
> TimBL wrote that many years ago in response to the trend of people and
> software making unwarranted assumptions about the structure of URLs.  In
> this case, we're talking about a warranted assumption -- a standard,
> even, so the situation is different.   It's more like a namespace, or
> the .well-known/genid thing.
>
> I'm fairly confident Tim prefers option 3 here, but he's traveling for
> the next few weeks, so I'm not sure I can get a solid answer from him.
> If his opinion on this is likely to change anyone's mind, I'm happy to
> try to get his attention (or you can email him directly, of course).
>
>     -- Sandro
>
>
>> --Gavin
>>
>>> Forcing the data values to also be parsed doesn't feel right, although I
>>> concede it does work.
>>>
>>>      -- Sandro
>>>
>>>
>>>>        Andy

Received on Thursday, 8 September 2011 12:38:07 UTC