
Re: Shortrefs fatally flawed



At 9:54 AM 10/2/96, W. Eliot Kimber wrote:
>I think we can pick a name that is highly likely not to be used.
It must still be a special name, and explicitly documented, and (possibly)
part of the application interface (where explaining XYZZY elements will be,
at the least, picturesque).

And if I generate tags by some automated process (say, for distinct
formatting styles in a document conversion product), I might come up with
the "unlikely name" as the result of an algorithm (base-26 encoded
integers, anyone?).
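
To make the collision concrete, here is a minimal sketch in Python. It
assumes a hypothetical generator that names tags by encoding a running
counter in base 26 (A=0 ... Z=25); the function names are invented for
illustration, and XYZZY stands in for whatever "unlikely" name gets
reserved.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z', so A=0 and Z=25


def base26_name(n: int) -> str:
    """Encode a non-negative integer as an uppercase base-26 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    digits = []
    while True:
        n, remainder = divmod(n, 26)
        digits.append(ALPHABET[remainder])
        if n == 0:
            break
    return "".join(reversed(digits))


def base26_value(name: str) -> int:
    """Decode an uppercase base-26 string back to its integer."""
    value = 0
    for ch in name:
        value = value * 26 + ALPHABET.index(ch)
    return value


RESERVED = "XYZZY"  # hypothetical reserved pseudo-element name

if __name__ == "__main__":
    # The counter value at which the generator emits the reserved name.
    n = base26_value(RESERVED)
    print(n, base26_name(n))
```

Any such generator that runs long enough reaches the reserved name, so the
tool has to know about the name and skip it, which is exactly the special
case being objected to.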

This is a fragile, arbitrary, and ugly-to-describe "solution".

>  We could
>also use some sort of pseudo SGML declaration to rename both the
>pseudo-element element type and the delimiter pair used to quote data, e.g.:
>
><?XML PSEUDO="XXZZY" PEO="[" PEC="]">
><Mydoc>
> <foo>[This is quoted data]</foo>
></Mydoc>
Again, explaining the Pseudo-type is hard -- it's meaningless in terms of a
document user's needs. Just a bit of complication that must be adapted to,
not a new source of capability.

Redefining the delimiters makes the lexical analyzer hard to implement in
LEX, which is the standard tool for lexical analysis. It also raises the
issue of checking that the delimiters don't conflict with anything that
might be legitimate in some other syntactic context.

>Note also that "\" *can* be used with the RCS as a shortref delimiter, you
>just have to use a numeric character reference in the SGML Declaration, e.g.:
>
>         DELIM    GENERAL  SGMLREF
>                  SHORTREF SGMLREF  "&#092;<"
>                                    "&#092;&"
>                                    "&#092;["
>                                    "&#092;]"
>                                    "&#092;&#34;"
>                                    "&#092;&#39;"
>                                    "&#092;&#092;"
>
>These shortrefs provide "escapes" for the common delimiters, e.g. \<, \&,
>\[, \], \\, etc.

  This is a fine solution to the escaping problem, as it has _no_ impact on
the XML definition that's visible to the user, and it enables a convenient
and widely familiar quoting model to be used instead of the uglier and
less-familiar SGML ones (although entity references are pretty familiar
nowadays, thanks to HTML).
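
As a rough illustration of what the author of a document would see, the
Python sketch below rewrites the seven backslash escapes from the quoted
declaration into numeric character references. It is only a stand-in: a
real SGML parser would do this through the SHORTREF mechanism itself, and
the names ESCAPES and expand_escapes are invented for this example.

```python
import re

# The seven escapable characters from the quoted SHORTREF list:
# < & [ ] " ' and the backslash itself.
ESCAPES = {ch: f"&#{ord(ch)};" for ch in '<&[]"\'\\'}

# A backslash followed by any single character, scanned left to right.
_ESCAPE_RE = re.compile(r"\\(.)", re.DOTALL)


def expand_escapes(text: str) -> str:
    """Replace \\<, \\&, \\[, \\], \\", \\' and \\\\ with character references.

    A backslash before any other character is left untouched.
    """
    def replace(match: re.Match) -> str:
        return ESCAPES.get(match.group(1), match.group(0))

    return _ESCAPE_RE.sub(replace, text)


if __name__ == "__main__":
    print(expand_escapes(r"if a \< b \&\& c[0] == '\\'"))
```

Because the scan runs left to right, a doubled backslash is consumed as
one unit, so a backslash that is itself escaped never goes on to escape
the character after it.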

    -- David

RE delenda est.

--------------------------------------------+--------------------------
David Durand                  dgd@cs.bu.edu | david@dynamicDiagrams.com
Boston University Computer Science          | Dynamic Diagrams
http://www.cs.bu.edu/students/grads/dgd/    | http://dynamicDiagrams.com/


