- From: pat hayes <phayes@ihmc.us>
- Date: Thu, 3 Jul 2003 13:43:06 -0500
- To: Martin Duerst <duerst@w3.org>
- Cc: <w3c-rdfcore-wg@w3.org>
>> > [Pat:]4. Regarding Martin's other beef, that some XML without any
>> > markup in it is 'really' just plain text,
>>
>>[Patrick:]I'm not 100% sure that this is in fact Martin's position,
>>but if it is, or is anyone else's position,
>
>[Martin:]It indeed is.
>
>>then my reply is that this is simply wrong.
>>
>>The difference between a plain literal and an XML literal, regardless
>>of the presence of markup, is that an RDF application is free to
>>presume that an XML literal constitutes well-formed XML, whereas
>>a plain literal need not. Period. It's as simple as that.
>
>Wrong. You are confusing some particular representations with
>the abstraction behind it. A plain literal can always be represented
>as a well-formed XML fragment. In RDF/XML, it actually is represented
>that way. In an RDF store, you may choose a different representation,
>but that's an implementation detail, which should not affect the
>design.
Ah, this may reveal the source of some of the heat in this
discussion. IF one takes the RDF/XML to be the 'real' representation,
and the RDF graph to be simply an implementation - call this view X -
then certain identities appear obvious. However, if you take the RDF
graph to be the 'real' representation and the RDF/XML to be simply an
implementation (an interchange syntax for the graphs) - call this
view G - then different identities seem obvious. The entire RDF
design has for some time now been based on view G rather than view X,
and on that view the design seems much more intuitive since the XML
text inside XML literals is just a chunk of text being treated as
XML; on this view, there is virtually no connection between the XML
fragment inside a parseType="Literal" (which is just a hunk of text
being plonked into a literal node in the RDF graph and labelled 'XML
stuff') and the XML outside it (which is the XML encoding of the RDF
graph, used only as an interchange syntax). I can see, and empathize
with, Patrick's point of view here: on view G, there really is no
connection between them, and the XML inside the parseType="Literal" in
a sense isn't even really a part of the surrounding XML document at
all, and shouldn't inherit such things as lang tags by XML inclusion
rules. On the other hand, there is no doubt that an RDF/XML document
sure *looks* like XML, and I also can see Martin's point that it is
particularly weird from a position X point of view to have an XML
document in which the pieces of it which are explicitly labelled as
being RDF-XML are the only pieces that apparently violate normal XML
rules. The point of my suggestion was not to deny view G (looking at
Patrick here) but only to suggest that there might be a low-cost way
to arrange that the view-X vision of RDF/XML was somewhat less
peculiar.
.....
>
>>Furthermore, the comparison of XML literals is not
>>based on string-equality. These important distinctions from plain
>>strings are captured semantically in the definition of the datatype
>>rdf:XMLLiteral and in the RDF datatyping model.
>
>Is there any case where
> <foo:prop>Some text here</foo:prop>
>and
> <foo:prop rdf:parseType="Literal">Some text here</foo:prop>
>actually behave differently with respect to equality?
Well, yes, of course. For example suppose some app is manipulating
XML documents by taking them apart into pieces, checking various
properties of those pieces, maybe substituting other pieces for some
of them, and then reassembling the XML; and suppose it is using RDF
to encode the fragments in order to reason about them in some way. It
would be natural to do a kind of type checking, using rdf:XMLLiteral,
to ensure that only genuine XML fragments were included. Under these
circumstances, the distinction between text drawn from an XML source
and the 'same' text from some other source might be critical, and the
reasoner might want to carefully distinguish them and treat them as
distinct. The fact that some piece of XML text happens to not have
any XML markup in it would be largely irrelevant to this kind of
application; and to rely on that kind of internal examination of the
text would be infeasible as an implementation strategy and would
nullify the point of having the typing information in the first
place. This is the entire point of having datatypes, for many
purposes: it frees code from the need to check that its methods are
appropriate, by labelling the data with a 'type' which guarantees
that they are.
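
To make the type-checking point concrete, here is a minimal sketch in
Python. It is a toy model, not the RDF abstract syntax: the Literal
class and the is_xml_fragment helper are names made up for this
illustration, and only the rdf:XMLLiteral URI is taken from the RDF
namespace. It shows two literals with the 'same' text being treated as
distinct purely because of their type label, without the text ever
being inspected for markup.

```python
from dataclasses import dataclass
from typing import Optional

# Datatype URI for rdf:XMLLiteral in the RDF namespace.
RDF_XMLLITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


@dataclass(frozen=True)
class Literal:
    """Hypothetical toy literal: lexical text plus an optional type label."""
    lexical: str
    datatype: Optional[str] = None  # None stands for a plain literal
    lang: Optional[str] = None


def is_xml_fragment(lit: Literal) -> bool:
    """Type check by label only; the text is never scanned for markup."""
    return lit.datatype == RDF_XMLLITERAL


# The 'same' text, once as a plain literal and once typed as XML.
plain = Literal("Some text here")
xml = Literal("Some text here", datatype=RDF_XMLLITERAL)

same_text = plain.lexical == xml.lexical   # the character strings agree
same_literal = plain == xml                # but the literals differ by type
```

In this sketch same_text is true while same_literal is false, which is
the distinction the hypothetical XML-manipulating app would rely on.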
>I.e. is it possible to construct two text strings A and B,
>both conforming to XML #PCDATA, where
> <foo:prop>A</foo:prop> and
> <foo:prop>B</foo:prop> are equal
>but
> <foo:prop rdf:parseType="Literal">A</foo:prop> and
> <foo:prop rdf:parseType="Literal">B</foo:prop> are not
>(or the other way round)?
>
>I do not know of any such case. If you know of any, please tell us.
>If there is none, the above argument is moot.
>
>>Though this is not pointed out anywhere explicitly in the RDF specs
>>(which is a shame, but understandable, since it needs a bit more
>>testing to ensure there are no major dragons) the present RDF datatyping
>>solution, with XML Literals modelled as a datatype, allows us
>>to support the entire range of XML Schema types, including complex
>>types! And thereby, define property ranges to be e.g. xhtml:title,
>>asserting that all property values conform to the content model
>>constraining the lexical space of xhtml:title elements.
>
>First, xhtml:title is really a boring example, as it is currently
>only #PCDATA. But that should change with XHTML2, so let's assume
>we are there for the sake of this example.
>
>It is important to point out that there is a distinction between
>xhtml:title, and the content model of xhtml:title. The former
>looks like
> <foo:prop rdf:parseType="Literal"><xhtml:title
> >Your <xhtml:em>Title</xhtml:em> Here</xhtml:title></foo:prop>
>The latter looks like
> <foo:prop rdf:parseType="Literal"
> >Your <xhtml:em>Title</xhtml:em> Here</foo:prop>
>
>I am very convinced that the latter will be much more frequent
>in the context of RDF, because it is better to model the
>fact that this is a title as an RDF property. This is also
>the more important case for I18N.
>
>>Such benefits
>>simply disappear if XML Literals are not modeled as typed
>>literals
>
>Well, they may.
>
>>-- and we certainly don't want to go back to treating
>>rdf:XMLLiteral as a special case of datatype with lang tag.
>
>Why not? It is special, so why don't you treat it specially?
There are technical reasons (to do with identity substitution on
datatype names) why it is unworkable to have a 'special' datatype
which violates the structural assumptions of the datatyping model,
and it is not feasible or desirable to include lang tags in the
datatyping model. So *if* we treat XML literals as being typed by an
XML datatype, it is infeasible to include lang tags as part of the
literal structure. They could be included as part of the XML literal
string itself, by requiring all such literals to have a special
rdf-wrapper onto which the lang tag can be attached by normal XML
conventions; but then of course the actual XML literal string no
longer looks like the XML fragment included in the RDF/XML document.
But on the other hand, from view X, I guess that would just be an
implementation detail.
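
A rough sketch of the wrapper idea, in Python, assuming a wrapper
element literally named rdf-wrapper (no such element is defined
anywhere; the name and the wrap_with_lang helper are invented here
purely for illustration). It shows the consequence noted above: once
the lang tag is carried by a wrapper, the stored literal string is no
longer the fragment that appeared in the RDF/XML document.

```python
from xml.sax.saxutils import quoteattr


def wrap_with_lang(fragment: str, lang: str) -> str:
    """Attach a lang tag by enclosing the fragment in a wrapper element.

    The element name 'rdf-wrapper' is hypothetical, used only for this sketch.
    """
    # quoteattr returns the value already quoted and escaped for attribute use.
    return f"<rdf-wrapper xml:lang={quoteattr(lang)}>{fragment}</rdf-wrapper>"


# The fragment as it might appear inside parseType="Literal".
fragment = "Your <xhtml:em>Title</xhtml:em> Here"

# The string that would actually be stored as the XML literal.
wrapped = wrap_with_lang(fragment, "en")

# The stored string and the document fragment no longer coincide.
differs = wrapped != fragment
```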
Pat
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32501 (850)291 0667 cell
phayes@ihmc.us http://www.ihmc.us/users/phayes
Received on Thursday, 3 July 2003 14:43:10 UTC