rdfms-literal-is-xml-structure: Why?

Dan Connolly raised this issue, stating it as:

   A statement with a parseType of 'Literal' has as its object
   an XML structure, not a simple string. For example, the first
   character of the literal <foo>bar</foo> is not '<'.

This is an interesting suggestion. It raises several questions.
I'll confine myself to one (at least for now)...

1) What evidence is there that this was the intent of the
M&S 1.0 specification?


Searching through the archives of the w3c-rdf-syntax-wg
list for 'infoset' turns up VERY few messages. Other than
the indexes into the lists, I see only three:

http://lists.w3.org/Archives/Member/w3c-rdf-syntax-wg/1998Oct/0089.html

http://lists.w3.org/Archives/Member/w3c-rdf-syntax-wg/1998Oct/0093.html

http://lists.w3.org/Archives/Public/www-rdf-comments/1998OctDec/0030.html

In fact, these are all in the same thread, meaning that in
all the email generated by the rdf-syntax WG, 'infoset' came up
in exactly one discussion.

Re-reading those messages, IMHO, supports a very different
interpretation of the WG intent - that parseType="Literal" was a
stop-gap measure to let us deal with embedded XML content through
the simple expedient of turning off RDF parsing of that content.
In fact, the phrase "generates no tuples" is used in the emails
above in a manner that seems to indicate that the WG wanted to
completely ignore the content and markup in the Literal, and treat it
as a simple string. Later applications might do something with the
markup. 
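
To make the "generates no tuples" reading concrete, here is a minimal
sketch of a processor that stops RDF parsing at parseType="Literal" and
keeps the embedded markup as one string. It is an illustration only, not
the algorithm from M&S 1.0: the ex: namespace, the ex:body property, the
http://example.org/doc resource and the helper literal_string() are all
made up for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical input: the ex: namespace and ex:body property are invented here.
doc = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/doc">
    <ex:body rdf:parseType="Literal"><foo>bar</foo></ex:body>
  </rdf:Description>
</rdf:RDF>"""


def literal_string(prop):
    """Re-serialize the content of a property element as one string."""
    parts = [prop.text or ""]
    for child in prop:
        # tostring() emits the child's markup followed by any trailing text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


tuples = []
root = ET.fromstring(doc)
for desc in root.findall(f"{{{RDF_NS}}}Description"):
    subject = desc.get(f"{{{RDF_NS}}}about")
    for prop in desc:
        if prop.get(f"{{{RDF_NS}}}parseType") == "Literal":
            # No descent into <foo>: the embedded markup yields no tuples
            # of its own, only a single string-valued object.
            tuples.append((subject, prop.tag, literal_string(prop)))

for t in tuples:
    print(t)
```

Under this reading the document yields a single tuple whose object is
the string "<foo>bar</foo>"; nothing is generated for the foo element
itself.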

If that is the case, then the clarification document can't say
that M&S 1.0 requires the generation of tuples for the infoset of
the embedded content. That seems the opposite of the intent.


Dan's suggestion could be within the scope of a 2.0 revision
of M&S, but it seems to exceed our currently chartered tasks.

(At that time, there may be an approach we can take which reconciles
the views. We might say that in 1.0, a Literal is just a String,
but that in 2.0, we have some extra info in the model so that
we not only have the string, we have a URI for it. (We should also
agree on just what those URIs are). That URI can be used as the
subject for all sorts of statements. We could use it in statements
which have a predicate called something like 'rdf2:hasInfoset'. The
rest is left as an exercise for the future.)


But for now, I think that as far as RDF 1.0 processors are concerned,
Literals are just strings, and the first character of a string
like "<foo>bar</foo>" would be '<'.
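
The two readings can be put side by side in a few lines. This is only a
sketch, using Python's standard ElementTree parser as a stand-in for
"an XML structure"; it is not an infoset implementation, and the
variable names are illustrative.

```python
import xml.etree.ElementTree as ET

literal_text = "<foo>bar</foo>"

# Reading 1 (argued for above): the literal is a plain string,
# so its first character is simply the first character of the text.
first_char = literal_text[0]

# Reading 2 (the issue as Dan stated it): the literal is an XML structure,
# so the first thing in it is an element, not a character.
root = ET.fromstring(literal_text)
first_item = (root.tag, root.text)

print(repr(first_char))
print(first_item)
```

With the string reading the first item is the character '<'; with the
structure reading it is the element foo with content "bar", and '<'
never appears as data at all.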


Ron Daniel Jr.
Standards Architect
Tel: +1 415 778 3113
Fax: +1 415 778 3131
Email: rdaniel@interwoven.com 


Received on Wednesday, 11 July 2001 23:05:46 UTC