RE: rdfms-literal-is-xml-structure: Why?

Dan Connolly wrote:
> Ron Daniel wrote:
> > Dan Connolly raised this issue, stating it as:
> > 
> >    A statement with a parseType of 'Literal' has as its object
> >    an XML structure, not a simple string. For example, the first
> >    character of the literal <foo>bar</foo> is not '<'.
> > 
> > 1) What evidence is there that this was the intent of the
> > M&S 1.0 specification?
> 
> [[[
> If the content of E contains no XML markup or if parseType="Literal"
> is specified in the start tag of E then v is the content of E (a
> literal).
> ]]]
> 
> -- Resource Description Framework (RDF) Model and Syntax Specification
> http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/
> Wed, 24 Feb 1999 14:45:07 GMT
> 
> 
> E refers to an XML element in that bit of the spec.
> The content of an XML element isn't (in general) a string; it's
> a sequence of data characters and/or elements, PIs, and comments:

It is also worth noting that the paragraph which follows and explains
the bit you quoted contains the sentence:
      The value 'Literal' specifies that the element content is to
      be treated as an RDF/XML literal; that is, the content must
      not be interpreted by an RDF processor. 

But let's concentrate on what we can agree to.

> I could live with encoding the structured thing as a string
> provided
> 	(1) namespace info isn't lost; in the example above,
> 	the resulting string must capture the namespace
> 	name associated with <apply/> etc.

Yes. M&S 1.0 was a very early application of namespaces. Implementation
experience clearly shows the necessity of this, which is not something
that I recall the group ever discussing at the time. So, the clarified
M&S must require namespace information to be available for XML Literals.

I note that 'Jena' is one implementation that already provides access
to a list of the declared namespaces. I think that is enough to meet
this requirement. But any namespace declarations provided
within the Literal itself are not RDF's problem. They would not appear
in the list of namespaces the implementations must make available.
It is for the subsequent XML processor to dig those out.
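
To make the division of labour concrete, here is a rough sketch of what
an application might do once it has the literal string plus the list of
declared namespaces. It assumes Python's standard
xml.etree.ElementTree as the "subsequent XML processor". The function
name parse_with_context, the wrapper element, and the use of the MathML
namespace for <apply/> are my own illustrations, not anything M&S or
Jena defines.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List
from xml.sax.saxutils import quoteattr


def parse_with_context(literal: str, namespaces: Dict[str, str]) -> List[ET.Element]:
    """Parse an XML literal string using namespace declarations that were
    in scope outside the literal (hypothetical helper, for illustration).

    Only the child elements are returned; any top-level text in the
    literal is ignored in this sketch.
    """
    decls = "".join(
        # An empty prefix stands for the default namespace.
        f" xmlns:{prefix}={quoteattr(uri)}" if prefix else f" xmlns={quoteattr(uri)}"
        for prefix, uri in namespaces.items()
    )
    # Re-establish the outer declarations on a throwaway wrapper element.
    wrapped = f"<wrapper{decls}>{literal}</wrapper>"
    return list(ET.fromstring(wrapped))


# Assumed context: the prefix "m" was declared outside the literal.
context = {"m": "http://www.w3.org/1998/Math/MathML"}
children = parse_with_context("<m:apply/>", context)
print(children[0].tag)
```

Without the context, ET.fromstring("<m:apply/>") fails with an unbound
prefix error, which is the information loss Dan is worried about.
Declarations made inside the literal need no help from the RDF side;
the XML parser picks those up by itself.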

> 	(2) it remains distinguishable from a string that
> 	happens to have the same characters.

Yes, the fact that parseType='Literal' was specified is an important
bit of info that must be preserved. The application which makes use
of the RDF info is then free to take the literal string value and
hand it off to an XML processor to obtain a structured thing, if that
is what it wants to do.
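
As a minimal sketch of what "preserving that bit" could look like in an
implementation (the Literal class and its field names are hypothetical,
and Python's xml.etree.ElementTree stands in for whatever XML processor
the application prefers):

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass(frozen=True)
class Literal:
    """An RDF literal held as a string, plus a flag recording whether it
    came from parseType='Literal' (hypothetical model, for illustration)."""
    lexical: str
    is_xml: bool = False

    def to_element(self) -> ET.Element:
        """Delayed parsing: hand the string to an XML processor on demand.

        Assumes the literal is a single well-formed element with no
        prefixes that depend on outside namespace declarations.
        """
        if not self.is_xml:
            raise ValueError("plain string literal; not marked as XML")
        return ET.fromstring(self.lexical)


markup = Literal("<foo>bar</foo>", is_xml=True)
plain = Literal("<foo>bar</foo>")

# Same characters, but the two values remain distinguishable.
print(markup == plain)
# Only the XML literal can be turned into a structured thing.
print(markup.to_element().tag)
```

The flag is part of the value's identity here, so the two literals
compare unequal even though their strings match.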

> i.e. I'm OK with "delaying" the parsing, so long as we don't
> lose information.
[...]
> As long as we somehow distinguish
> 	<prop parseType="Literal"><foo>bar</foo></prop>
> from
> 	<prop>&lt;foo&gt;bar&lt;/foo&gt;</prop>

Massive agreement.
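
For what it's worth, an ordinary XML parser already sees these two as
different things, so the information is there for RDF to preserve. A
quick check, assuming Python's standard xml.etree.ElementTree and
keeping the unprefixed parseType attribute exactly as written above:

```python
import xml.etree.ElementTree as ET

as_markup = ET.fromstring('<prop parseType="Literal"><foo>bar</foo></prop>')
as_text = ET.fromstring('<prop>&lt;foo&gt;bar&lt;/foo&gt;</prop>')

# First form: the content is one child element named foo, no leading text.
print(len(as_markup), as_markup[0].tag, as_markup.text)

# Second form: the content is plain character data that merely spells
# out something resembling markup; there are no child elements.
print(len(as_text), repr(as_text.text))
```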

Ron

Received on Thursday, 12 July 2001 12:31:07 UTC