- From: pat hayes <phayes@ihmc.us>
- Date: Fri, 1 Aug 2003 15:34:30 -0500
- To: "Patrick Stickler" <patrick.stickler@nokia.com>
- Cc: w3c-rdfcore-wg@w3.org
- Message-Id: <p06001a03bb50734db262@[10.0.100.23]>
>----- Original Message -----
>From: ext pat hayes <phayes@ihmc.us>
>To: Brian_McBride <bwm@hplb.hpl.hp.com>; Jeremy Carroll <jjc@hpl.hp.com>;
>Graham Klyne <gk@ninebynine.org>
>Cc: w3c-rdfcore-wg@w3.org; Peter F. Patel-Schneider
><pfps@research.bell-labs.com>
>Sent: 01 August, 2003 06:41
>Subject: XML literals
>
>Gentlemen:
>
>I am completely sick of all these debates about XML literals.
>
>
>Join the club ;-)
>
>Allow me to suggest a possible solution, along the lines suggested
>by Peter, which will serve to resolve them without making any
>substantial changes to the current RDF design and to everyone's
>general satisfaction. This is a wording change to the Concepts
>document; I do not believe it amounts to any real change in our
>current design, and may be easier to follow.
>
>1. Concepts section 5.1 modified as follows (change starts at ***)
>.....
>
>Such content is indicated in an RDF graph using a typed literal
>whose datatype is a special built-in datatype rdf:XMLLiteral,
>defined as follows.
>A URI reference for identifying this datatype
>is http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral .
>The lexical space is the set of all strings which:
>are well-balanced, self-contained XML data [XML];
>correspond to exclusive Canonical XML (with comments, with empty
>InclusiveNamespaces PrefixList) [XML-XC14N];
>when embedded between an arbitrary XML start tag and an end tag form
>a document conforming to XML Namespaces [XML-NS]
>***
>
>
>Fine so far (i.e. the unchanged bit ;-)
>
>
>The value space is some set of entities, called XML values, which is:
>disjoint from the lexical space
>
>
>OK.
>
>disjoint from the value space of any XML schema datatype (refer XSD)
>
>
>Debatable,

I'm saying we are *defining* it this way. No debate is possible.

>and we should be careful about making this claim. Better to say nothing
>than something that may end up biting us later.
>
>
>disjoint from the set of Unicode character strings (refer Unicode)
>
>
>Maybe. See above
>
>
>in 1:1 correspondence with the lexical space.
>
>Right.
>The exact nature of XML values is not specified.
>
>
>No. This bothers me. A lot.
>
>It is our responsibility to define what the values of XML Literals are.
>It's *our* datatype, and no one else should have to define it, or guess.
>

That's exactly what I am doing. It's OUR set, and we are saying that
it's not the same as any Unicode or XML Schema set.

However, it seems to me that we could say:

..... is not specified, but can be thought of as the XPath nodeset of
the XML literal.

>Taking this position is irresponsible, at best.
>
>We can either take the present position, whereby XML literals have
>lexical forms constituting canonical XML fragments encoded as
>Unicode strings and values constituting the lexical form in UTF-8.
>
>Or, if that bothers I18N, we could take a position whereby XML literals
>have lexical forms constituting canonical XML fragments encoded as
>Unicode strings and values corresponding to infosets.
>

The problem with saying that it IS almost anything defined by someone
else is that we get into arcane debates about the exact edges of the
identity criteria for those things. Peter and Brian and Jeremy and
Graham can legitimately disagree about issues like whether <br></br>
and <br /> indicate the same one of these things or not. The only way
to break out of this tangle is either to re-write the entire XML
document suite more coherently, as though it were written with a single
voice (not an option - CMSMcQ just quoted Saussure at me, for God's
sake; when it gets to Derrida I am out of here), or else to just say as
clearly as possible what assumptions WE are making about these things.
And my suggestion for doing the latter is to make them distinct from
everything else they can possibly be confused with.
Then later on, other spec writers are free to say that what they are
talking about is the same as what we are talking about, if they want
to do that.

>This latter option has always struck me as the correct solution, as
>after all, that's really what we're talking about, XML documents,
>not just uninterpreted sequences of characters, and is not the
>very point of the Infoset spec to provide a consistent interpretation
>of sequences of characters in terms of XML?

I have no idea. I am surprised that you would object to the above on
the grounds that we aren't being honest, but you are willing to swallow
the idea of an infoset. I have read this document more times than I
want to admit, and I still have absolutely no idea what infosets ARE.
The document never tells you: it just says they are an abstract set
satisfying certain rather complicated conditions. My proposal is to do
the same, but with much simpler conditions.

>
>I've never understood the opposition to having a value space
>consisting of infosets. I wish someone would tell me what significant
>problem or issue I'm missing...

My only worry is, how do I tell when two XML literals denote the same
infoset? And will Peter, you, me and uncle Tom Cobbley come to the
same conclusions?

Pat

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32501                (850)291 0667   cell
phayes@ihmc.us          http://www.ihmc.us/users/phayes
Received on Friday, 1 August 2003 16:34:33 UTC
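
Editorial illustration (not part of the original message): the question
of whether <br></br> and <br /> pick out "the same thing" can be made
concrete by comparing canonical forms, which is the idea behind putting
the value space in 1:1 correspondence with a canonical lexical space.
The sketch below is only an approximation under stated assumptions:
Python's standard library implements Canonical XML 2.0, not the
exclusive Canonical XML named in the proposal, and it needs a single
root element, whereas an XML literal may be an arbitrary well-balanced
fragment. The function names canonical_form and same_literal are
hypothetical.

```python
from xml.etree.ElementTree import canonicalize  # Python 3.8+


def canonical_form(xml_text: str) -> str:
    """Return a canonical serialization of a single-rooted XML string.

    Assumption: C14N 2.0 (what the standard library provides) stands in
    for the exclusive canonicalization discussed in the thread.
    """
    return canonicalize(xml_text)


def same_literal(a: str, b: str) -> bool:
    """Treat two XML strings as equal when their canonical forms match."""
    return canonical_form(a) == canonical_form(b)


if __name__ == "__main__":
    # Empty-element tag versus start/end tag pair.
    print(same_literal("<br />", "<br></br>"))
    # Attribute order is normalized by canonicalization.
    print(same_literal('<a y="2" x="1"/>', '<a x="1" y="2"></a>'))
    # Whitespace inside the element is content, so these differ.
    print(same_literal("<br/>", "<br> </br>"))
```

Under this scheme each canonical string stands for exactly one value, so
two literals are equal precisely when their canonical strings are equal,
without anyone having to say what the value "really is".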