Re: XHTML character entity support

James Graham scripsit:

> Why would one want to make the availability of one's site critically 
> dependent on not hitting those bugs given the availability of an 
> alternative?

That's just silly.  Why not avoid all libraries everywhere, all the time,
and write all your own code in assembly language so you don't have to
worry about compiler/interpreter bugs, either?

Because the probability that a widely used library has a bug is much
lower than the probability that a piece of bespoke code has a bug.
If you're Chuck Moore, go ahead and write all your own code.  If you're
a more humble programmer, use libraries whenever you can, and if you
find bugs, work around them or get them fixed.

FWIW, one of those bugs concerns characters on supplementary planes in
Xalan; the other is that genx, the C XML writer library, doesn't barf
when you try to write out an element and pass NULL as the name.
Neither of these bugs will be hit very often in practice.

> Why would one want to sink resources into an architecture that
> required XML-centric design (always use a tree model, never do string
> concatenation,

A great many XML applications use streaming parsers and streaming output
generators.  Using XMLWriter (versions available for Perl, Python,
PHP, Java, C#) or genx (for C) definitely allows string concatenation,
but escapeworthy characters are properly escaped (differently
for attribute values and for character content, as XML requires) and
an exception is raised if tags aren't properly balanced.
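
To make the two escaping rules concrete, here is a small sketch using
Python's standard xml.sax.saxutils helpers rather than any particular
XMLWriter port; the element name, attribute name, and sample string are
made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

raw = 'Tom & "Jerry" <cat>'

# Character content: &, < and > are escaped; quote characters are left alone.
text = escape(raw)

# Attribute value: the same escaping, plus the value is wrapped in quotes,
# with the quote character chosen (or escaped) so the value stays intact.
attr = quoteattr(raw)

# Concatenation is fine once each piece has been escaped for its context.
fragment = "<show title=%s>%s</show>" % (attr, text)
print(fragment)
```

A writer library does exactly this bookkeeping for you, so the caller
never has to remember which rule applies where.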

Given that, the library can be as simple as startDocument(),
startElement(name, attributeMap), outputText(text), endElement(name),
endDocument(), and still be guaranteed to produce well-formed XML
(modulo bugs).  In practice, you want namespace support, the ability to
write out PIs and DOCTYPE declarations and comments, and so on.
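
As a rough illustration of how small such an interface can be, here is a
sketch in Python.  The class name TinyXMLWriter is hypothetical and this
is not the API of XMLWriter or genx; it only borrows the five method names
above.  It escapes text and attribute values and rejects unbalanced tags,
but it does not validate element names, enforce a single root element, or
filter characters that XML disallows, all of which a real library would do.

```python
import io
from xml.sax.saxutils import escape, quoteattr


class TinyXMLWriter:
    """Hypothetical minimal streaming writer, for illustration only."""

    def __init__(self, out):
        self.out = out
        self.open_elements = []  # stack of currently open element names

    def startDocument(self):
        self.out.write('<?xml version="1.0" encoding="UTF-8"?>\n')

    def startElement(self, name, attributeMap=None):
        self.out.write("<" + name)
        for key, value in (attributeMap or {}).items():
            # Attribute values get attribute-style escaping and quoting.
            self.out.write(" %s=%s" % (key, quoteattr(value)))
        self.out.write(">")
        self.open_elements.append(name)

    def outputText(self, text):
        if not self.open_elements:
            raise ValueError("text outside of any element")
        # Character content gets content-style escaping.
        self.out.write(escape(text))

    def endElement(self, name):
        # The end tag must match the most recently opened element.
        if not self.open_elements or self.open_elements[-1] != name:
            raise ValueError("unbalanced end tag: " + name)
        self.open_elements.pop()
        self.out.write("</%s>" % name)

    def endDocument(self):
        if self.open_elements:
            raise ValueError("unclosed elements: " + ", ".join(self.open_elements))


buf = io.StringIO()
w = TinyXMLWriter(buf)
w.startDocument()
w.startElement("p", {"class": "a&b"})
w.outputText("1 < 2")
w.endElement("p")
w.endDocument()
result = buf.getvalue()
print(result)
```

The caller still writes output piece by piece, much as with string
concatenation, but a mismatched endElement() fails immediately instead of
producing a broken document.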

> religiously remove all XML-disallowed characters from any input,
> anywhere, deal with the speed hit implied by these things) given an
> alternative option?

If you do not *like* crottled greeps, do not order them.

> On the other hand consistent parsing with rules understandable by 
> mortals is nice. I don't think anyone would want a language with foster 
> parenting or the adoption agency if it wasn't really needed for 
> compatibility.

I'm glad we agree there.

> Doing XML:the good bits ("XML5") seems like a no-brainer 
> if it can give you the best of both approaches.

I look forward to seeing your design.

-- 
What has four pairs of pants, lives             John Cowan
in Philadelphia, and it never rains             http://www.ccil.org/~cowan
but it pours?                                   cowan@ccil.org
        --Rufus T. Firefly

Received on Thursday, 12 November 2009 16:13:38 UTC