- From: pat hayes <phayes@ihmc.us>
- Date: Thu, 3 Jul 2003 13:43:06 -0500
- To: Martin Duerst <duerst@w3.org>
- Cc: <w3c-rdfcore-wg@w3.org>
>> > [Pat:]4. Regarding Martin's other beef, that some XML without >>any markup in >> > it is 'really' just plain text, >> >>[Patrick:]I'm not 100% sure that this is in fact Martin's position, >>but if it is, or >>is >>anyone else's position, > >{Martin:]It indeed is. > >>then my reply is that this is simply wrong. >> >>The difference between a plain literal and an XML literal, regardless >>of the presence of markup, is that an RDF application is free to >>presume that an XML literal constitutes well-formed XML, whereas >>a plain literal need not. Period. It's as simple as that. > >Wrong. You are confusing some particular representations with >the abstraction behind it. A plain literal can always be represented >as a well-formed XML fragment. In RDF/XML, it actually is represented >that way. In an RDF store, you may choose a different representation, >but that's an implementation detail, which should not affect the >design. Ah, this may reveal the source of some of the heat in this discussion. IF one takes the RDF/XML to be the 'real' representation, and the RDF graph to be simply an implementation - call this view X - then certain identities appear obvious. However, if you take the RDF graph to be the 'real' representation and the RDF/XML to be simply an implementation (an interchange syntax for the graphs) - call this view G - then different identities seem obvious. The entire RDF design has for some time now been based on view G rather than view X, and on that view the design seems much more intuitive since the XML text inside XML literals is just a chunk of text being treated as XML; on this view, there is virtually no connection between the XML fragment inside a paseType="Literal" (which is just a hunk of text being plonked into a literal node in the RDF graph and labelled 'XML stuff') and the XML outside it (which is the XML encoding of the RDF graph, used only as an interchange syntax). I can see, and empathize with, Patrick's point of view here: on view G, there really is no connection between them, and the XML inside the paseType="Literal" in a sense isn't even really a part of the surrounding XML document at all, and shouldn't inherit such things as lang tags by XML inclusion rules. On the other hand, there is no doubt that an RDF/XML document sure *looks* like XML, and I also can see Martin's point that it is particularly weird from a position X point of view to have an XML document in which the pieces of it which are explicitly labelled as being RDF-XML are the only pieces that apparently violate normal XML rules. The point of my suggestion was not to deny view G (looking at Patrick here) but only to suggest that there might be a low-cost way to arrange that the view-X vision of RDF/XML was somewhat less peculiar. ..... > >>Furthermore, the comparison of XML literals is not >>based on string-equality. These important distinctions from plain >>strings are captured semantically in the definition of the datatype >>rdf:XMLLiteral and in the RDF datatyping model. > >Is there any case where > <foo:prop>Some text here</foo:prop> >and > <foo:prop rdf:parseType="Literal">Some text here</foo:prop> >actually behave differently with respect to equality? Well, yes, of course. For example suppose some app is manipulating XML documents by taking them apart into pieces, checking various properties of those pieces, maybe substituting other pieces for some of them, and then reassembling the XML; and suppose it is using RDF to encode the fragments in order to reason about them in some way. It would be natural to do a kind of type checking, using rdf:XMLLiteral, to ensure that only genuine XML fragments were included. Under these circumstances, the distinction between text drawn from an XML source and the 'same' text from some other source might be critical, and the reasoner might want to carefully distinguish them and treat them as distinct. The fact that some piece of XML text happens to not have any XML markup in it would be largely irrelevant to this kind of application; and to rely on that kind of internal examination of the text would be infeasable as an implementation strategy and would nullify the point of having the typing information in the first place. This is the entire point of having datatypes, for many purposes: it frees code from the need to check that is methods are appropriate, by labelling the data with a 'type' which guarantees that they are. >I.e. is it possible to construct two text strings A and B, >both conforming to XML #PCDATA, where > <foo:prop>A</foo:prop> and > <foo:prop>B</foo:prop> are equal >but > <foo:prop rdf:parseType="Literal">A</foo:prop> and > <foo:prop rdf:parseType="Literal">B</foo:prop> are not >(or the other way round)? > >I do not know of any such case. If you know of any, please tell us. >If there is none, the above argument is moot. > >>Though this is not pointed out anywhere explicitly in the RDF specs >>(which is a shame, but understandable, since it needs a bit more >>testing to ensure there are no major dragons) the present RDF datatyping >>solution, with XML Literals modelled as a datatype, allow for us >>to support the entire range of XML Schema types, including complex >>types! And thereby, define property ranges to be e.g. xhtml:title, >>asserting that all property values conform to the content model >>constraining the lexical space of xhtml:title elements. > >First, xhtml:title is really a boring example, as it is currently >only #PCDATA. But that should change with XHTML2, so let's assume >we are there for the sake of this example. > >It is important to point out that there is a distinction between >xhtml:title, and the content model of xhtml:title. The former >looks like > <foo:prop rdf:parseType="Literal"><xhtml:title > >Your <xhtml:em>Title</xhtml:em> Here</xhtml:title></foo:prop> >The later looks like > <foo:prop rdf:parseType="Literal" > >Your <xhtml:em>Title</xhtml:em> Here</foo:prop> > >I am very convinced that the later will be much more frequent >in the context of RDF, because it is better to model the >fact that this is a title as an RDF property. This is also >the more important case for I18N. > >>Such benefits >>simply dissappear if XML Literals are not modeled as typed >>literals > >Well, they may. > >>-- and we certainly don't want to go back to treating >>rdf:XMLLiteral as a special case of datatype with lang tag. > >Why not? It is special, so why don't you treat it specially? There are technical reasons (to do with identity substitution on datatype names) why it is unworkable to have a 'special' datatype which violates the structural assumptions of the datatyping model, and it is not feasible or desireable to include lang tags in the datatyping model. So *if* we treat XML literals as being typed by an XML datatype, it is infeasible to include lang tags as part of the literal structure. They could be included as part of the XML literal string itself, by requiring all such literals to have a special rdf-wrapper onto which the lang tag can be attached by normal XML conventions; but then of course the actual XML literal string no longer looks like the XML fragment included in the RDF/XML document. But on the other hand, from view X, I guess that would just be an implementation detail. Pat -- --------------------------------------------------------------------- IHMC (850)434 8903 or (650)494 3973 home 40 South Alcaniz St. (850)202 4416 office Pensacola (850)202 4440 fax FL 32501 (850)291 0667 cell phayes@ihmc.us http://www.ihmc.us/users/phayes
Received on Thursday, 3 July 2003 14:43:10 UTC