- From: Kevin D. Keck <kdkeck@lbl.gov>
- Date: Sat, 12 Jul 2003 22:55:29 -0700
- To: <www-archive@w3.org>
I remember being bothered by this when I first saw it, but it's only now that I'm starting to try to use it for an actual (open source!) product that I'm motivated enough to complain. And, happily, it seems that the natural evolution of the language has left an opening for a simple resolution of this issue. To refresh everyone's memory, the de facto spec. for N3 (http://www.w3.org/DesignIssues/Notation3) contains the following section: > Identifier munging > > This syntax does not allow minus signs in identifiers, whereas the XML > encoding for RDF does. > > The current solution is mapping sequences of "-" and "_" into sequences of "_" > by taking a contiguous sequence of - and _, replacing _ with 0 and - with 1, > then adding a leading "1", taking what you have as a binary number, > subtracting 1 from the result, and then writing that many _ signs. The mapping > is 1:1, and maps the simple case of - onto __ and _ onto _. The only > disadvantage is that those who go crazy with n consecutive occurrences of - > and/or _ in XML will pay for it in an even crazier 2**n long sequence in the > N3. (2001/12/4) > > A messy thing from the N3 point of view is that for content negotiation to > work between RDF/XML and N3, the fragment identifier sysntax must be the same, > and this would suggest that both use the "-" (XML) form. > It would be nice to be able to outlaw - in IDs. The professed reason for this is that Tim wanted to reserve '-' for [future] operators. First of all, proper accommodation for '-' as a non-initial character in identifiers does not preclude the use of '-' in operator names, nor require a look-ahead lexer to parse (as erroneously asserted in an earlier post). It only means that some whitespace will be required between identifiers and any operators which begin with '-'. However, this would seem generally required in order to avoid confusion, i.e., to prevent unwary people from misunderstanding the meaning of such a string, anyway. Besides which, N3 appears to have been pretty stable for quite some time now, and it hasn't included an operator which begins with - since '->' (and '<-') were dropped, apparently over a year ago. But most damningly, the escape mechanism devised to address this issue is not only unwieldy and unfriendly, it is not even complete: how do you escape "a-to-_"? My guess is cwm would currently escape it as a__to___, which collides with the encoding of "a-to__". Not to mention the fact that many N3 parsers don't seem to implement "identifier munging" at all (see http://infomesh.net/2002/n3qname.html). So here's hoping Tim is now ready to excise this wart from N3. If he were serious about banishing '-' from RDF IDs, he should have fought that battle a long time ago, back when RDF was being finalized. It would seem much too late for that now, so the sensible thing to do is just accept it, properly accommodate it, and move on. And while we're at it, the same thing goes for '.'. I expect some will reflexively protest that the use of '.' as the statement terminator makes this problematic. They might have a leg to stand on if it weren't for the fact that '.' is already also permitted within numeric (rational) literals. So what do you say? Can we finally add proper support for '-' and '.' as non-initial chars in indentifiers to N3? - Kevin D. Keck http://vimss.lbl.gov/~kdkeck/ 510-486-4856
Received on Sunday, 13 July 2003 01:59:27 UTC