review of rdf:text, dated 2008-11-04

I reviewed the current draft of the rdf:text specification [1].
I subdivided my comments into criticism on the content, criticism on the
structure, errors in the document, and editorial issues.

Criticism on the content
====
- to ensure maximum compatibility with current and future versions of
XML Schema datatypes, the string parts of both the lexical and value
spaces should be based on the respective spaces of the XML Schema
datatype string.
- the set of characters is finite, so the specification cannot assume
that it is infinite. That some OWL 2 implementations might have trouble
with the finiteness of this set is of no concern to this datatype per
se. In fact, the XML Schema string datatype is also based on a finite
set of characters, so OWL 2 implementations will run into the same
problems with that datatype.
If a problem with implementations of OWL 2 is really to be expected, it
should be dealt with in the OWL 2 specification, and not in the
specification of this datatype.
- concerning the definition of fn:text-length: It is not obvious that
this function should return the length of only the string part of the
text. A user might expect the language tag, and perhaps even the
separator used in the lexical space, to be taken into account when
computing the length.
Therefore, I believe no text-length function should be provided.
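
To illustrate the ambiguity, here is a minimal Python sketch. It assumes
that the lexical form is the string part, the separator "@", and the
language tag concatenated in that order. The language tag "es" and all
variable names are illustrative assumptions and are not taken from the
draft.

```python
# Hypothetical rdf:text value: a string part plus a language tag.
string_part = "Padre de familia"
lang_tag = "es"      # assumed tag, chosen only for illustration
separator = "@"      # separator assumed for the lexical form

lexical_form = string_part + separator + lang_tag

# Three lengths a user could plausibly expect from a "text-length" function.
length_string_only = len(string_part)               # string part only
length_with_tag = len(string_part) + len(lang_tag)  # string part plus tag
length_lexical = len(lexical_form)                  # whole lexical form

print(length_string_only, length_with_tag, length_lexical)  # 16 18 19
```

All three results are defensible readings of "length of a text", which
is why a single text-length function is likely to surprise some users.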

Criticism on the structure
====
- sections 3.1 and 3.2 are not logically part of the definition of
the datatype, and so should not be included in section 3.

Errors in the document
====
- In the example in section 3.2 it is claimed that the string "Padre de
familia" is mapped to the same value as the text "Padre de familia@".
This is clearly not true.
- In the definition of text-from-string-lang, $arg2 must be a language
tag as specified in BCP 47; otherwise an error must be raised.
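
The behaviour requested for text-from-string-lang could be sketched as
follows in Python. This is only an illustration under stated
assumptions: the function name, the pair representation of the result,
and the use of ValueError are hypothetical, and the regular expression
is a simplified well-formedness check that approximates, but does not
fully implement, the BCP 47 grammar.

```python
import re

# Simplified check: hyphen-separated subtags of 1 to 8 ASCII
# alphanumerics, the first subtag alphabetic only. This is an
# approximation of BCP 47, not the full grammar. Note that the
# empty string does not pass this check.
_LANG_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def text_from_string_lang(arg1: str, arg2: str) -> tuple[str, str]:
    """Return a (string, language tag) pair, rejecting malformed tags."""
    if not _LANG_TAG.fullmatch(arg2):
        raise ValueError(f"not a well-formed language tag: {arg2!r}")
    return (arg1, arg2)
```

With this sketch, text_from_string_lang("Padre de familia", "es")
returns the pair ("Padre de familia", "es"), while a second argument
such as "not a tag" raises an error instead of producing a value.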

Editorial issues
====
- abstract: "both in" => "in both"
- introduction: the text about how this document came to be and about
the collaboration between the working groups might be interesting for
the "purpose of this document" section, but not for the specification
document itself. However, I guess that for the first public working
draft it's not really an issue.
- the references of the form [1],... are awkward. Please use the same
style for all references.
- some of the references are italicized, and some are not; see, e.g.,
the second sentence of section 2.
- sections 4.1.3 and 4.1.4: please specify the return values;
"extraction" describes a process, not a result.
- the text and summaries in sections 4.2.1 and 4.2.2 are not entirely
clear. Please use symbols to refer to the individual parts of the
arguments and to state properties about them, as in sections 4.1.3 and
4.1.4.
- there is a question mark in the signature declaration in section
4.3.2. It is not clear what it means.


[1] http://www.w3.org/2007/OWL/draft/ED-owl2-rdf-text-20081104/

Received on Thursday, 6 November 2008 18:43:15 UTC