Re: What to do about newlines in attribute values?

James Clark scripsit:

> a) allow newlines but normalize them to spaces (as XML): compatible with
> XML, but leaves MicroXML with ugly and surprising

Yes, we don't want Teh Ugly, and transforming data behind the scenes
is Teh Ugly.  I assume this normalization was included in the days of
fixed-length records (actual or virtual punch cards) where you might be
forced to split an attribute value across physical records.  But even
the simplest editors allow long lines nowadays.
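For anyone who has not run into this normalization, here is a minimal
sketch of what a conforming XML parser does today. It assumes Python's
standard-library xml.etree.ElementTree (expat underneath); the element
and attribute names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# A literal newline and a literal tab inside an attribute value.
# The names "msg" and "body" are hypothetical, chosen for this example.
doc = '<msg body="line one\nline two\tend"/>'

# XML attribute-value normalization replaces each literal whitespace
# character (newline, tab) with a single space before the application
# ever sees the value.
body = ET.fromstring(doc).get("body")
print(repr(body))
```

The application never sees the newline or the tab, and gets no
indication that anything was changed.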

> b) disallow literal newlines: compatible with XML; users can still
> include newlines using numeric character references; possibly better
> error recovery when a closing quote is missing; maybe a surprising
> limitation, but then again lots of programming languages don't allow
> literal newlines in string literals

I think this is the least surprising behavior for people who don't know
XML's quirks: if they try putting multi-line text into an attribute
value, it will fail fast.  At Google, I was asked to take a look at an
XML-based API that was "losing data" between the sender and the receiver.
It turned out that the format called for putting a whole RFC 822 email
into an attribute value.  "Won't work, guys", said I.
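A rough sketch of that kind of data loss, and of the numeric character
reference workaround mentioned under (b), again assuming Python's
xml.etree.ElementTree. The helper escape_attr, the element name "mail",
and the sample message are all hypothetical; this is not the API in
question.

```python
import xml.etree.ElementTree as ET


def escape_attr(value: str) -> str:
    """Escape text for a double-quoted XML attribute (hypothetical helper).

    Whitespace characters are written as numeric character references,
    which attribute-value normalization leaves alone.
    """
    return (
        value.replace("&", "&amp;")  # must come first
        .replace("<", "&lt;")
        .replace('"', "&quot;")
        .replace("\t", "&#9;")
        .replace("\n", "&#10;")
        .replace("\r", "&#13;")
    )


# A tiny stand-in for a multi-line mail message (made-up content).
message = "From: a@example.org\nSubject: test\n\nHello"

# Pasting the text in literally: every newline comes back as a space.
naive = '<mail raw="' + message + '"/>'
lost = ET.fromstring(naive).get("raw")

# Using character references: the value survives the round trip.
escaped = '<mail raw="' + escape_attr(message) + '"/>'
kept = ET.fromstring(escaped).get("raw")

print(repr(lost))
print(repr(kept))
```

Under (b) the first document would be rejected outright instead of
being silently altered, and the second would still parse the same way.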

> c) allow newlines but don't normalize them: most useful behaviour but
> incompatible with XML

And therefore a non-starter.

> Currently the spec has (c), but I can see definite advantages to (b).

I think we should shift to (b).

-- 
An observable characteristic is not necessarily         John Cowan
a functional requirement.  --John Hudson                cowan@ccil.org

Received on Thursday, 13 September 2012 06:48:38 UTC