Re: datatyping revised draft from Patrick Stickler on 2002-06-03 (w3c-rdfcore-wg@w3.org from June 2002)

From: Patrick Stickler <patrick.stickler@nokia.com>
Date: Mon, 03 Jun 2002 12:08:26 +0300
To: Pat Hayes <phayes@ai.uwf.edu>, RDF Core <w3c-rdfcore-wg@w3.org>, Sergey Melnik <melnik@db.stanford.edu>
Message-ID: <B9210CBA.15F57%patrick.stickler@nokia.com>
On 2002-06-01 20:18, "ext patrick hayes" <phayes@ai.uwf.edu> wrote:

> 
> Sorry this isn't quite finished, but as I will be travelling for a
> few days here's the current state of the re-edit of Patrick's
> datatypes document:
> 
> http://www.coginst.uwf.edu/~phayes/RDF_Datatyping060102_draft.html

Overall, looks good (insofar as it captures the stake-in-the-ground
proposal).

A few comments/questions:

1. Section 1.5: should the order of the 'triplet' as discussed here
reflect the order of its N-Triples structure, i.e., bit/string/lang
rather than bit/lang/string?

2. Section 2.1: Is the statement "(The model theory assumes that there is a
fixed global mapping L2V from datatypes to their datatype mappings,
corresponding to this external datatype-retrieval machinery.)" necessary
here?  Or at least, can "L2V" be omitted?
 
3. Section 2.2: The use of 'element' such as in
"RDF uses literals to indicate elements of the lexical space of datatypes"
may lead to confusion given the significance of that term to XML. I
suggest that 'member' be used instead. I.e.
"RDF uses literals to indicate members of the lexical space of datatypes".

4. Section 3: We may want to say "less precise" rather than "cruder".

5. Section 3.1: Should probably change

"RDF merely provides means for the designation of the datatyped literal
pairings upon which such validation would be performed."

to

"RDF merely provides means for the designation of the datatyped literal upon
which such validation would be performed."

to consistently remove the "pairing" terminology.

6. Section 3.2: "One can think of rdfd:lex as a ...
 superproperty of all other datatype properties."

Should we consider making this more explicit/formal, i.e. that all
datatype properties *are* subproperties of rdfd:lex?

{
   ?d rdf:type rdfd:Datatype .
}
log:implies
{
   ?d rdfs:subPropertyOf rdfd:lex .
}

such that it inherits a fixed range of rdfs:Literal
per the following 'bootstrapping' closure rules:

{}
log:implies
{
   ...
   rdfd:lex rdf:type rdf:Property .
   rdfd:lex rdfs:domain rdf:Resource .
   rdfd:lex rdfs:range rdfs:Literal .
}

Eh?
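
As a rough illustration only (not part of the draft), the two rules above
can be sketched in Python over a set of plain string triples. The tuple
representation and the name apply_lex_rules are my own assumptions, and the
statements elided by the "..." in the second rule are deliberately left out:

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# This is a toy sketch of the two rules, not an RDF implementation.

# Consequent of the 'bootstrapping' rule. Its antecedent is empty ({}),
# so these triples hold in every graph. Only the three statements written
# out in the rule are included; whatever the "..." stands for is not guessed.
BOOTSTRAP_TRIPLES = {
    ("rdfd:lex", "rdf:type", "rdf:Property"),
    ("rdfd:lex", "rdfs:domain", "rdf:Resource"),
    ("rdfd:lex", "rdfs:range", "rdfs:Literal"),
}


def apply_lex_rules(triples):
    """Return the input triples plus those added by the two rules."""
    closed = set(triples) | BOOTSTRAP_TRIPLES
    # Datatype rule: every ?d typed as rdfd:Datatype becomes a
    # subproperty of rdfd:lex.
    for subject, predicate, obj in list(closed):
        if predicate == "rdf:type" and obj == "rdfd:Datatype":
            closed.add((subject, "rdfs:subPropertyOf", "rdfd:lex"))
    return closed


graph = {("xsd:integer", "rdf:type", "rdfd:Datatype")}
for triple in sorted(apply_lex_rules(graph)):
    print(" ".join(triple), ".")
```

Given a graph that types xsd:integer as an rdfd:Datatype, the sketch adds
xsd:integer rdfs:subPropertyOf rdfd:lex alongside the three bootstrapping
triples.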

7. Section 3.3: I think we will encounter problems with the view suggested
in the last sentence "For applications which are primarily interested in
describing datatype values as property values, therefore, the in-line use of
literals is less useful, and the datatype property or lexical form idioms
should be used."  What this essentially is saying is that most folks should
change the idiom they are using, because most folks are using the inline
idiom with datatype value semantics. I.e., they don't care about the lexical
form -- they are interpreting the property value to be the datatype value
represented by the lexical form.

This is why I feel the present stake-in-the-ground proposal won't fly.

We are supposed to be clarifying and strengthening current and common
usage of RDF. The most prevalent usage of datatyped inline literals
is to denote the datatype values, and that is the semantics of interest
to the applications. The fact that the inline idiom cannot, in this
present proposal, provide the datatype value to the application is,
IMO, a fatal flaw which must be corrected.

Insofar as the present WD is concerned, I think it would be good to
add a little discussion here as to the full ramifications of this
proposal. I.e. to state explicitly that e.g. CC/PP applications will
not be able to consider RDF as providing datatype values, etc. and
that if that is a requirement, then CC/PP will need to change to
one of the bnode idioms. Likewise for much of the DC usage.

It's important that we make clear to the community exactly what
this will mean in practice for them.

I predict that the asymmetry between the bnode and inline idioms
as expressed by the statement "Note that although the association of
datatype with a literal defines a unique datatype value, this in-line use
of the literal says that the value of the property is the literal, not the
datatype value." in section 3.4.1 will confuse a lot of folks. After
all, if you've identified a unique value, why ignore/discard/hide it?
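
A toy Python sketch of that asymmetry, as I read the quoted sentence. The
L2V dictionary is a hypothetical stand-in for the draft's L2V mapping, and
the two function names are assumptions made purely for illustration:

```python
# Hypothetical stand-in for L2V: datatype name -> lexical-to-value mapping.
# Python's int() is used as a rough approximation of an integer datatype.
L2V = {"xsd:integer": int}


def inline_value(lexical_form, datatype):
    """In-line idiom: the property value is the literal itself.

    The datatype identifies a unique value, but that value is not
    what the property is said to have.
    """
    return lexical_form


def bnode_value(lexical_form, datatype):
    """Bnode idiom: the property value is the datatype value."""
    return L2V[datatype](lexical_form)


# Two lexical forms of the same integer.
for lexical_form in ("10", "010"):
    print(repr(inline_value(lexical_form, "xsd:integer")),
          repr(bnode_value(lexical_form, "xsd:integer")))
```

Under these assumptions the two lexical forms yield the same value via the
bnode idiom but remain distinct strings via the inline idiom, which is the
behaviour an application wanting datatype values would trip over.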

We'll see...

8. Section 4.2: The first sentence of the last paragraph is not
capitalized.

9. Section 5: What is the significance of the square brackets? Also,
is it really necessary to use NTriples? Why can't we use RDF/XML
throughout?

<soapbox>
For the record, I believe that NTriples have no place in any
normative sections of the RDF specifications. Users should not
have to learn NTriples in order to understand any normative
definition of RDF. NTriples are for the test cases, and the only
place you should see them and see a definition of the NTriples
syntax is in the test cases document. All other documents should
use RDF/XML exclusively for serializing RDF statements.

I don't even think the Primer should provide any coverage of
NTriples. No, I'm not anti-NTriples (or even anti-N3). NTriples
are very, very important. But they have a very specific and
most importantly non-normative role, and I'm afraid that if
we start peppering all the specs with NTriples, users will
feel they must also understand and use NTriples to use RDF,
and they don't. RDF/XML is the only official serialization of RDF,
so let's set a good example and use it.
</soapbox>

10. Appendix 7.1: Should the title be "FAQ"?

Here's a question:

Q: I've received a large body of RDF content that uses the inline idiom
exclusively and associates all properties with a specific datatype. My
application needs the datatype values represented by those literal values.
Can I assume that the content creator meant those specific datatype values
or must I simply reject the content as under-specified?


> The appendices are still unfinished

I think that the appendices should not be touched until after we
have had 1-2 public releases of the WD. We should focus our
energies first on the normative stuff, until that is nailed
down very well.

Of course, if someone has the extra time, OK, but I'd hate to see
needless edits/re-edits if/as the core datatyping spec emerges.

> ... Some of the 
> pictures are as before, some have been re-drawn.

I will do this once Sergey has taken his pass and we're pretty
much ready for first publication of the WD.

> Ive indulged in a little creativeness in the nomenclature, and
> re-christened  rdfd:datatype (small 'd') to be rdfd:dcv, called
> Datatype Constraint on Value,

Argh. Thirty lashes with a wet towel ;-)

So much for mnemonicity...

And, by the way, I don't see much difference between the pair
of class/property names rdfd:Datatype/rdfd:datatype and
rdf:Resource/rdf:resource. Eh?

Capitalized is a class. Lowercase is a property with a range of that class.

Anyway, it helps to keep the terminology stable, even if we think
we may change it; otherwise, examples and discussions are too difficult
to follow when everyone is using their own preferred names, etc.

> to make it as unlike 'range' as
> possible. 

Eh? rdfd:datatype suggests 'range'?

> This is a placeholder for a better name to be added later.

Well, that could have simply been said of rdfd:datatype.

> I will undertake to re-draw the pictures to use whatever new name we
> finish up with.

No worries. Updating the pictures is not a big task (presuming we
actually need to redraw them ;-)

Cheers,

Patrick 

--
               
Patrick Stickler              Phone: +358 50 483 9453
Senior Research Scientist     Fax:   +358 7180 35409
Nokia Research Center         Email: patrick.stickler@nokia.com
Received on Monday, 3 June 2002 05:07:01 UTC