- From: Martin Duerst <duerst@w3.org>
- Date: Wed, 30 Jul 2003 10:03:56 -0400
- To: Graham Klyne <gk@ninebynine.org>, Brian McBride <bwm@hplb.hpl.hp.com>
- Cc: rdf core <w3c-rdfcore-wg@w3.org>, i18n <w3c-i18n-ig@w3.org>
At 09:33 03/07/30 +0100, Graham Klyne wrote: >At 17:21 29/07/03 -0400, Martin Duerst wrote: >Moving on two your examples, assuming we're talking about the current >specifications, I was not speaking about the current specifications. I was trying to show how we can construct denotations where denotations of some plain Literals and some (appropriate!) XML Literals are the same. This is not so in the current spec, so it would be impossible for me to talk about the current spec. >I disagree with: > >>(3) >>Concrete Syntax: <eg:prop pt="L"><br/></eg:prop> >>(additional example) >> >>Abstract Syntax: "<br/>"^^rdf:XML >> >>Denotation: >> sequence(character('<'), character('b'), character('r'), >> character('/'), character('>')) > >(a) it's not a character sequence, but an octet sequence, In the spec a week ago. Not in my proposal. Not, as far as I understood from http://lists.w3.org/Archives/Public/www-rdf-comments/2003JulSep/0105.html. But maybe I misinterpreted that, and XML Literals are still defined as denoting octet sequences? That would be very unfortunate. >(b) in the canonical XML representation the '<' and '>' should be escaped. Yes, but if we explicitly say they are single characters, there is no need anymore to unescape them. >>(4) >>Concrete Syntax: <eg:prop pt="L"><br/></eg:prop> >> >>Abstract Syntax: "<br></br>"^^rdf:XML >> >>Denotation: >> sequence(markup('<br>'), markup('</br>')) > >I don't understand what you mean by markup(x). If you mean something like >C(x), I would agree, so we would have: > > UTF8("<br></br>") > >... <breaks off>: > >Now I see what you're doing, I think: using character() and markup() not >as mapping functions but as type designators. Yes, I guess that comes close. >If I may lapse into Haskell [2] for a moment, our current design has the >denotation of a literal being: > > type XMLLiteralDenotation = [Octet] > >What you are doing here is changing the type of the denotation to >something more like: > > data XMLAtom = Character Char | Markup String > type XMLLiteralDenotation = [XMLAtom] > >(The data statement defines a new datatype that is a kind of discriminated >union: a string labelled as "Character" or a string labelled as "Markup". >See [2] for more about the notation) That seems to be what I wanted, as far as I understand. >I think there is a possible approach here that satisfies your goals, but >it represents a fundamental redesign of the way that literals are handled >in the formal semantics: plain literals are no longer self denoting. > >Using Haskell type notation again, a plain literal is: > > type PlainLiteral = [Char] > type PlainLiteralDenotation = [Char] > >and the mapping function is: > > PlainLiteralL2V :: PlainLiteral -> PlainLiteralDenotation > PlainLiteralL2V = id -- identity function > >Your proposal changes this quite radically: > > type PlainLiteral = [Char] > type PlainLiteralDenotation = [XMLAtom] -- XMLAtom as above Well, that could as well stay as [Char], because the only 'XMLAtom's allowed in PlainLiteralDenotation are characters. Programming languages are not very good at handling such kinds of restrictions (the restriction of a sequence of an union of characters and markup-pieces to only characters is a sequence of characters) because that's not how implementations work. Haskell is most probably not as tightly tied to implementations as C or C++, but still there must have been such considerations, or such a case was just considered too rare to deal with. In a purely denotational framework, I don't see any problem with this, because obviously a sequence of characters is a sequence of characters, whether this is implicit (as in the case of plain literals) or explicit (as in the case of XML literals that don't contain any markup). In other words, the notation sequence(character('<'), character('b'), character('r'), character('/'), character('>')) is just a notation to make it clear how XML Literals and plain literals relate, not a notation to indicate an explicit type identifier. In still other words, in a denotational world, characters are characters, and markup is markup if we define that it is markup, in the same way as in the real world, apples are apples, and oranges are oranges, without the need for any of them to have an explicit flag "Hello, I'm an Orange". A box full of apples is a box full of apples, independently of whether some other boxes also contain oranges or not, and independently of whether this box is allowed to contain oranges or not. > PlainLiteralL2V :: PlainLiteral -> PlainLiteralDenotation > PlainLiteralL2V = map Character -- force character interpretation > >My earlier positing [1] on this topic was quite clear that the position I >was stating was one of proceeding on the basis of no such fundamental >change. I think there's a real danger that if we make a fundamental >design change this late in the process we'll inadvertently introduce some >more damaging error. I agree that we shouldn't make such a fundamental change. But I don't think there is any need to make it. Denotations are clearly restricted by logic, but they should not be restricted by any particular programming language, even a very logical one. Regards, Martin. >#g >-- > >[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2003Jul/0339.html > >[[ >The only way we're going to be able represent this kind of data, *and* to >handle markup in the same uniform framework, is to completely revisit the >design of RDF literal data so that a lexical form is not just a sequence of >Unicode characters, and is self-denoting. To change that would be a >late-stage fundamental change to the design with who-knows-what kinds of >repercussion. >]] > >[2] http://www.haskell.org/tutorial/ > (sections 2.2 and 2.3 cover the type notation used above.) > > >------------------- >Graham Klyne ><GK@NineByNine.org> >PGP: 0FAA 69FF C083 000B A2E9 A131 01B9 1C7A DBCA CB5E
Received on Wednesday, 30 July 2003 11:03:57 UTC