- From: pat hayes <phayes@ai.uwf.edu>
- Date: Thu, 21 Nov 2002 14:18:56 -0600
- To: "Jeremy Carroll" <jjc@hplb.hpl.hp.com>
- Cc: w3c-rdfcore-wg@w3.org
>Summary:
>
>For me, the move in namespace from rdfs:XMLLiteral to rdf:XMLLiteral may
>impact the order in which material is presented in the model theory, but
>should not impact *any* entailments.
>If it does, then I would want to revisit that decision.

Well, obviously it moves some things that were datatype entailments
into rdf-entailment. Only the XMLLiteral cases, of course.

>
>In particular, correct treatment of rdf:XMLLiteral (this canonicalization
>stuff) should not be required for rdf-entailment. Having it as an optional
>extra, supported in datatype aware systems, was part of the intent of moving
>it to be a datatype.
>
>I had two comments against the first concepts WD saying that XML Literal was
>too central in the abstract syntax. Both came from W3C people. I note that
>we have changed the namespace of the term (from rdfs to rdf) at the request
>of the W3C rep.
>If that has the side-effect of making XMLLiteral more central again, then I
>suggest we request either Dan to consult the community they represent. We
>may find that there isn't anyone proposing this change.
>
>======================
>
>
>> Well, make it more concrete.
>
>The current WD is very concrete! I will work through a simple example.
>
>> The value space consists of XML
>> thingies. Can there be cases of two different XML literal strings
>> denoting the same one of those thingies?
>
>Yes e.g. "a<em/>"^^rdf:XMLLiteral and "a<em></em>"^^rdf:XMLLiteral
>
>From the WD we use a language identifier of "" and create the two (Unicode)
>strings
>"<rdf-wrapper xml:lang=''>a<em/></rdf-wrapper>"
>and
>"<rdf-wrapper xml:lang=''>a<em></em></rdf-wrapper>"
>(Using the five part concatenation formula).
>
>These are then encoded in UTF-8 (in this case we can read them as ASCII).
>These UTF-8 strings form two 'XML documents', in the sense of
>http://www.w3.org/TR/REC-xml#sec-documents which also avoids the Platonic
>issues).
>
>
>We canonicalize both (as in the L2V mapping in the WD) and we get:
>"<rdf-wrapper xml:lang=\"\">a<em></em></rdf-wrapper>\n"
>(using N-triple escape notation - again this is a UTF-8 string).
>
>That is both strings map to the same canonical XML document.
>
>FYI: differences are:
> - the ' quotes for attribute values got mapped to " quotes
> - the "<em/>" empty tag got mapped to a start tag followed by an end tag
> - the whitespace outside the document element (rdf-wrapper) was
>normalized, to being a newline after the end-tag.
>
>Canonicalization will also make other changes.
>
>
>>That is, are there any cases
>> that ought to trigger the inference rule rdfD-2 for XMLLiteral?
>
>Yes (not that I've looked at that rule yet).
>
>> (Assume there are no lang tags.) If not, I propose that we just say
>> that the value space is the same as the lexical space.
>
>No, it's not.

OK.

>
>> But if there
>> are any rdf:XMLLiteral-datatyping entailments then I ought to say
>> what they are and incorporate them into rdf-entailment.
>
>I think not. I see datatyping as an optional layer on top. We have said that
>systems should (lowercase) support XSD (including rdf:XMLLiteral), that's
>good.

No, we have now said that this is built into the RDF namespace, and
RDFS includes all of RDF.

>Also when we went down this canonicalization route we were very aware that
>there are perfectly good RDF implementations that cannot tell when two XML
>literals are identical.

I don't think that position is now tenable.

>That is the canonicalization stuff is a cost to the
>cheap and cheerful implementor, and requiring them to even understand it
>before understanding the basics of the model theory seems mistaken.
>
>I would hope that the DPH can take away something from the model theory - I
>would be surprised if it were XML canonicalization.
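
The worked example above (two lexical forms, one canonical document) can
be tried out with a minimal sketch like the one below. It rests on
several assumptions: it uses Python's standard-library
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 rather
than whatever canonicalization the WD specifies, so details such as
whitespace after the end tag may differ from the string quoted above.
The helper names wrap and canonical are hypothetical, and the wrapper is
a simplified stand-in for the five part concatenation formula with an
empty language identifier.

```python
from xml.etree.ElementTree import canonicalize


def wrap(lexical_form: str, lang: str = "") -> str:
    """Hypothetical stand-in for the WD's wrapper construction:
    put the lexical form inside an rdf-wrapper element carrying xml:lang."""
    return f"<rdf-wrapper xml:lang='{lang}'>{lexical_form}</rdf-wrapper>"


def canonical(lexical_form: str, lang: str = "") -> str:
    """Canonicalize the wrapped document (C14N 2.0, as provided by Python)."""
    return canonicalize(wrap(lexical_form, lang))


# The two lexical forms from the example above.
first = canonical("a<em/>")
second = canonical("a<em></em>")

# Both lexical forms should map to the same canonical document.
print(first == second)
```

If the comparison holds, a datatype-aware system could treat the two
literals as denoting the same value, while a system that only compares
lexical forms would see two different literals.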
>
>>
>> If there are lang tags, does having different lang tags guarantee
>> that the canonical XML docs are distinct, or can there be cases where
>> the lang tags dissolve into nothing and leave the docs identical?
>
>Different lang tags means different documents (modulo case - since we now
>normalize language tags on input, I think)
>
>> If
>> the former, then the MT can treat XMLLiterals just like plain
>> literals but with an XML syntax check added, which would be very nice
>> and easy.
>>
>
>No - it doesn't work.
>
>The old version of the abstract syntax did require canonicalization for
>syntactic well-formedness - with that then yes they are quite like plain
>literals. Since we only really need it for equality, and we only really need
>equality for semantic reasons, moving the C14N into the L2V mapping has been
>an improvement.
>
>
>
>> >I don't know whether anyone would care to argue whether a document
>> >is or is not an XSD string.
>>
>> Let's agree that they are not, as far as we are concerned. After all,
>> literal strings are not XSD strings either. Saying what XSD-anything
>> is, is up to XMLS to do, not our job.
>
>*We* certainly won't say, but I think we are not saying that literal strings
>are not xsd:string either.

True, I mis-spoke.

>
>>
>> >I would think not ... an xsd:string is a sequence of unicode code
>> >points, whereas a document is a sequence of bytes (a canonical XML
>> >document is a sequence of bytes in the UTF-8 character encoding).
>> >
>
>
>Jeremy

-- 
---------------------------------------------------------------------
IHMC                      (850)434 8903   home
40 South Alcaniz St.      (850)202 4416   office
Pensacola                 (850)202 4440   fax
FL 32501                  (850)291 0667   cell
phayes@ai.uwf.edu         http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Thursday, 21 November 2002 15:19:01 UTC