RE: IDs - make them case sensitive

W. Eliot Kimber <eliot@isogen.com> writes:
> At 07:45 PM 6/27/97 -0700, Murray Altheim wrote:
> >Andrew Layman <andrewl@microsoft.com> writes:
> >> Lauren and Tim have some good points regarding IDs.  It is a benefit if
> >> an id can easily map between an element and either a real world entity
> >> or an object in another database system.  Unless there is some bad
> >> consequence to permitting a freer use of characters, relaxing the
> >> restrictions just seems to simplify document construction and
> >> processing.
> >
> >Well, the first thing that comes to mind would be that XML documents using
> >IDs not beginning with a name start character would not work in existing
> >SGML systems, and my understanding of the RCS is that it's not possible to
> >modify what constitutes a valid name start character to allow digits. This
> >would simply make XML documents and systems that create them incompatible
> >with SGML. I certainly stand to be corrected, but it seems pretty explicit
> >in 8879.
> 
> Not true, if I understand your point.  The reference concrete syntax does
> not specify the digits as name start characters, but XML is not using the
> reference concrete syntax and few useful SGML tools do not allow variant
> syntaxes (and therefore will allow you to redefine the name start
> characters) [The only one I can think of offhand is Panorama, which
> doesn't read SGML declarations at all and simply provides a longer name
> length automatically.].

Well, unless I was doing it incorrectly (which is certainly possible), I
couldn't get nsgmls to accept digits as name start characters. I changed
both LCNMSTRT and UCNMSTRT to "0123456789" in the DCL and it produced:

    nsgmls:html32.dcl:48:15:E: digits assigned to LCNMCHAR, UCNMCHAR,
        LCNMSTRT or UCNMSTRT: 48-57

If this is an error I'm making, a correction would be welcome. If I'm not,
and nsgmls is operating in conformance, and my last perusal of the SGML
Handbook was not in error, I believe this may be not only a requirement of
the RCS, but of 8879 in general. Again, I bow to those with greater experience
in these matters.
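
For reference, a minimal sketch of the name rule at issue. It assumes the
reference concrete syntax (letters only as name start characters; letters,
digits, "." and "-" as name characters), ASCII input, and it ignores NAMELEN
and any variant syntax a DCL might declare. The function name is_rcs_name is
hypothetical and is not part of nsgmls or any other SGML tool.

```python
import string

# Assumption: reference concrete syntax, ASCII only, NAMELEN not enforced.
NAME_START = set(string.ascii_letters)
NAME_CHARS = NAME_START | set(string.digits) | {".", "-"}


def is_rcs_name(token: str) -> bool:
    """Return True if token starts with a letter and contains only name characters."""
    if not token or token[0] not in NAME_START:
        return False
    return all(ch in NAME_CHARS for ch in token)
```

Under these assumptions an ID such as "a123" passes the check, while "123a"
fails because its first character is a digit.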

But I do think you understood my point, which was that XML is slated as
"SGML on the Web" (although I see now that some would prefer to ignore that
aspect) and that 'conformance with SGML' (XML-Lang sec.1.1) was not simply
a matter of raw conformance with 8879:1986, but with the intent that
existing tools would be able to process XML documents with little or no
modification, i.e., that XML was a 'subset' of SGML, not a 'variant' (to use
DocBook's terminology).

> Thus, adding digits (or anything else within reason) would not make XML
> documents incompatible with SGML systems in general, although it might with
> particular tools (for example, adding "_" to name start could thoroughly
> confuse ADEPT*Editor, which uses "_" as a "reserved tag prefix"--not a good
> implementation choice in retrospect). 

It still seems a mistake to deliberately break conformance with existing,
conformant SGML applications when it is so simple to add or strip an alpha
prefix in an ID. How far XML strays from conformance will affect the extent
to which existing tools must be modified, and how modifications to those
tools may break conformance with existing document sets.
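
As a rough illustration of how little work the prefix approach takes, here is
a sketch that adds an alpha prefix to keys beginning with a digit and strips
it again on the way out. The prefix "id" and the function names to_sgml_id
and from_sgml_id are hypothetical, chosen only for this example.

```python
DIGITS = set("0123456789")
PREFIX = "id"  # hypothetical prefix; any string of letters would do


def to_sgml_id(key: str, prefix: str = PREFIX) -> str:
    """Prepend the prefix when the key begins with a digit; otherwise return it unchanged."""
    return prefix + key if key[:1] in DIGITS else key


def from_sgml_id(ident: str, prefix: str = PREFIX) -> str:
    """Strip the prefix when it is followed by a digit; otherwise return the ID unchanged."""
    rest = ident[len(prefix):]
    if ident.startswith(prefix) and rest[:1] in DIGITS:
        return rest
    return ident
```

For example, a database key "12345" would be written as the ID "id12345" and
mapped back to "12345" on the way out, while "chap1" is left untouched in both
directions. The round trip assumes that no external key already consists of
the prefix followed by a digit (e.g. "id5"); such keys would need an escaping
rule that this sketch does not provide.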

Murray

...........................................................................
Murray Altheim, SGML Grease Monkey                  <altheim[@]eng.sun.com>
Member of Technical Staff, Tools Development & Support
Sun Microsystems, 2550 Garcia Ave., MS UMPK17-102, Menlo Park, CA 94043 USA
         "Give a monkey the tools and he'll build a typewriter."

Received on Tuesday, 1 July 1997 15:13:34 UTC