RE: comments on http://www.w3.org/TR/2009/WD-rdf-text-20090421/ from Boris Motik on 2009-04-22 (public-owl-comments@w3.org from April 2009)

From: Boris Motik <boris.motik@comlab.ox.ac.uk>
Date: Wed, 22 Apr 2009 12:16:18 +0100
To: "'C. M. Sperberg-McQueen'" <cmsmcq@blackmesatech.com>, <public-owl-comments@w3.org>
Message-ID: <C3967792DE4F4D0D931C885F9003EBF3@wolf>

Hello,

Thank you very much for your comments. Please find my answers inline. These are
my personal answers and explanations, and they don't necessarily reflect the
viewpoints of other editors, the OWL WG, or the RIF WG.

Regards,

	Boris

> -----Original Message-----
> From: public-owl-comments-request@w3.org [mailto:public-owl-comments-
> request@w3.org] On Behalf Of C. M. Sperberg-McQueen
> Sent: 22 April 2009 01:06
> To: public-owl-comments@w3.org
> Cc: C. M. Sperberg-McQueen
> Subject: comments on http://www.w3.org/TR/2009/WD-rdf-text-20090421/
> 
> [Speaking for myself and not for any organization or working group]
> 
> I've just read "rdf:text: A Datatype for Internationalized Text"
> in the version of 21 April 2009.  Nice work.
> 
> I do have a few questions or comments.
> 
> (1) Typo in two namespace names?
> 
> In section 2, you define conventional meanings for several
> namespace prefixes, including
> 
>    xs for http://www.w3.org/2001/XMLSchema#
>    fn for http://www.w3.org/2005/xpath-functions#
> 
> I realize that for reasons I think I once understood (but do not
> now recall -- explain if you like, but I don't mind if you spare
> yourself the effort) RDF users often create namespace names with
> trailing hash marks.  But I'm pretty sure that there is no
> trailing hash mark in the XML Schema namespace defined by the XML
> Schema spec at
> 
>    http://www.w3.org/TR/xmlschema-1/
>    http://www.w3.org/TR/xmlschema-2/
> 
> or, for XSD 1.1, by
> 
>    http://www.w3.org/TR/xmlschema11-1/
>    http://www.w3.org/TR/xmlschema11-2/
> 
> If you are endeavoring to refer to that namespace, you have a
> typo and should (I think) remove the hash mark.  Simple-minded
> readers who copy and paste the namespace name into (say) a schema
> document will be disappointed, perhaps, to find that most XSD
> validators don't recognize the form with the hash mark.  And a
> quick test reveals that some of them are fairly nasty about it.
> 
> If on the other hand you are endeavoring to refer not to that
> namespace but to a different one, related conceptually to the
> first (thus motivating the mnemonic of having a similar
> spelling), it would probably be helpful to the reader to mention
> that fact.
> 
> From uses of the xs: prefix later in the document (e.g. the
> reference in 5.1.1 to xs:string), I think the former more likely.
> 
> It may be a mistake on the part of the XML Schema WG not to have
> provided our namespace with a hash mark, but if so, it's a
> mistake we've made (note the past tense here) and cannot now
> unmake.
> 
> Similar remarks apply to the fn namespace.
> 

The term "namespace" is a misnomer, and we are not really using it in either OWL
2 or rdf:text. Please let me explain.

The XML Namespaces specification uses QNames, which are pairs of the form
(namespace,localName). Thus, when one writes

<a:B>

one actually says that the element's name is a pair whose first element is the
URI associated with a, and whose second element is B.

RDF (and consequently OWL as well), however, works only with URIs. That is,
resources on the Web are not QNames (i.e., they are not pairs), but are URIs
(i.e., strings of characters). Now RDF/XML says that, when you write <a:B>, you
should generate the URI by concatenating the namespace URI associated with a
with B. Thus, RDF suggests that it is using the XML Namespaces specification for
URI abbreviation; however, it is actually using its own mechanism based on
concatenation. This mechanism was later sanctioned in the CURIE
specification.

Because of all that, we are not using the term "namespace" anywhere in either
OWL 2 or rdf:text. Instead, we speak of prefix URIs, which seem like namespaces,
and of prefix names, which seem like local names. Consequently, to obtain a
proper URI after the prefix URI is concatenated with the prefix name, you need
to terminate the prefix URI with #. After all this, you do get a proper URI that
matches Section 3 of XML Schema Datatypes 1.1.

http://www.w3.org/TR/xmlschema11-2/#built-in-datatypes
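
To make the difference concrete, here is a minimal sketch in Python. The
function names and the prefix table are hypothetical and chosen only for
illustration; the only value taken from the document under discussion is the
xs prefix URI from its Section 2.

```python
# Hypothetical prefix table; the xs prefix URI is the one quoted from
# Section 2 of the rdf:text draft (note the trailing '#').
PREFIXES = {"xs": "http://www.w3.org/2001/XMLSchema#"}

# The XML Schema namespace name as used in schema documents (no '#').
XML_NAMESPACES = {"xs": "http://www.w3.org/2001/XMLSchema"}


def expand_as_qname(abbrev, namespaces):
    """XML Namespaces view: the name is a (namespace, localName) pair."""
    prefix, local = abbrev.split(":", 1)
    return (namespaces[prefix], local)


def expand_as_uri(abbrev, prefixes):
    """RDF view: the name is a single URI built by plain concatenation."""
    prefix, local = abbrev.split(":", 1)
    return prefixes[prefix] + local


pair = expand_as_qname("xs:string", XML_NAMESPACES)
uri = expand_as_uri("xs:string", PREFIXES)
# Concatenating with the namespace name that lacks '#' runs the two parts
# together and does not yield the datatype URI.
no_hash = expand_as_uri("xs:string", XML_NAMESPACES)

print(pair)     # ('http://www.w3.org/2001/XMLSchema', 'string')
print(uri)      # http://www.w3.org/2001/XMLSchema#string
print(no_hash)  # http://www.w3.org/2001/XMLSchemastring
```

The pair form never needs a separator because the two components stay
separate; the concatenated form needs the # to be part of the prefix URI.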


> (2) Should XSD 1.1 refer to rdf:text?
> 
> As you may know, XSD 1.1 differs from XSD 1.0 in allowing
> conforming validators to accept primitives, and facets,
> additional to those defined by the XSD 1.1 spec itself.  It
> occurs to me that it might be helpful to refer, from the XSD 1.1
> spec, to the rdf:text spec as an example of a published
> definition of such an additional primitive datatype, with
> (voila!) a facet defined for it.  Would the OWL and RIF working
> groups have any objection to my suggesting this to the XML Schema
> wg?
> 

I don't believe that XSD 1.1 needs to worry about rdf:text at all. The main
motivation behind rdf:text was to provide adequate names for the corresponding
sets of plain literals of RDF (I'll elaborate more on this below). Thus, the
rdf:text specification is RDF-centric and should not be of much concern to the
general XML datatype architecture.

> (3) Required export to plain literals
> 
> In section 4, you require that all RDF tools translate rdf:text
> values into plain literals before exporting data to exchange with
> another RDF tool.  This seems likely to have the effect that some
> toolmakers, at least, will argue that there is no need to support
> rdf:text because no one is using it, they never see any instances
> of it.  (The rules in XML 1.1 which encourage users of XML 1.1 to
> label their data as XML 1.0 whenever possible have led to similar
> arguments that there is no XML 1.1 data anywhere, nor any XML 1.1
> processors, both of which are falsehoods but apparently cannot be
> rooted out.)
> 
> I wonder if it would be better just to encourage, or require,
> that RDF tools which support rdf:text provide user control over
> whether to export to plain literals or not.  It's your decision,
> of course: since rdf:text and plain literals are semantically
> interchangeable, I suppose it may not matter as much as I
> imagine.
> 

The main goal of rdf:text is to provide names for the set of literals you
already have, not to introduce new types of literals. As the document's
introduction states, names for various sets of literals are often needed in OWL
(and to some extent in RDFS as well) if you want, for example, to place
appropriate range restrictions on data properties. Consequently, an OWL tool
vendor will need to support rdf:text. I can't see how the RDF export limitation
might dissuade the vendor from supporting rdf:text: with or without the export
restriction, the tool vendor is not gaining any additional expressivity. 

> (4) rtfn:length function
> 
> In section 5.3.1 you define an rtfn:length function.  To avoid
> confusion or error, it might be helpful to remind the reader and
> implementor explicitly here that what are counted are characters,
> not 16-bit code units or octets.
> 
> Otherwise, it seems inevitable that someone is just going to
> implement the length function with a call to strlen(), oblivious
> to the havoc that shortcut will wreak later on.
> 

The definition says "the number of characters" -- I can't see how this could be
misunderstood. Note that we never talk about the various Unicode encodings, such
as UTF-8, and mentioning them at this point might come a bit out of the blue.
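
For reference, the three counts mentioned in the comment can differ for the
same string. The following is a small Python sketch, under the assumption that
"characters" means Unicode code points; the sample string and variable names
are illustrative only and are not taken from the specification.

```python
# 'a' followed by U+1D11E (MUSICAL SYMBOL G CLEF), which lies outside the
# Basic Multilingual Plane.
text = "a\U0001D11E"

# Python's len() on str counts Unicode code points.
characters = len(text)

# UTF-16 uses a surrogate pair (two 16-bit code units) for U+1D11E.
utf16_units = len(text.encode("utf-16-le")) // 2

# UTF-8 uses one octet for 'a' and four octets for U+1D11E.
utf8_octets = len(text.encode("utf-8"))

print(characters, utf16_units, utf8_octets)  # 2 3 5
```

A length function built on a byte- or code-unit-oriented call would return 5
or 3 here, whereas the number of characters is 2.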

> (5) Internationalization issues
> 
> From the fact that rdf:text values are pairs of UCS strings and
> language tags, I infer that the type is intended to handle
> natural-language text.
> 
> But if I understand correctly, some authorities strongly
> recommend the use of explicit XML markup both for bidirectional
> text (which, n.b., is not necessarily polyglot text) and for text
> with ruby-style annotations.
> 
> I assume that one reason you don't allow internal XML markup is
> that that would break compatibility with plain literals.
> 
> I think your document would be the stronger if you explained what
> is to be done with Japanese text with ruby annotations, or with
> Hebrew or Arabic text for which the Unicode bidi algorithm does
> not suffice (and which therefore appears to need internal XML
> markup to be handled reliably).
> 

I agree that these might be important issues; however, they clearly exceed the
scope of rdf:text. The main goal of this specification was to provide adequate
names for the sets of plain literals in RDF, and not to solve all
internationalization problems one might have.

> 
> Good luck with the spec.
> 
> --
> ****************************************************************
> * C. M. Sperberg-McQueen, Black Mesa Technologies LLC
> * http://www.blackmesatech.com
> * http://cmsmcq.com/mib
> * http://balisage.net
> ****************************************************************
>
Received on Wednesday, 22 April 2009 11:17:35 UTC