Re: What to do about newlines in attribute values?

On Thu, Sep 13, 2012 at 12:48 AM, John Cowan wrote:

> James Clark scripsit:
> > a) allow newlines but normalizes to spaces (as XML): compatible with
> > XML, but leaves MicroXML with ugly and surprising
> Yes, we don't want Teh Ugly, and transforming data behind the scenes
> is Teh Ugly.  I assume this normalization was included in the days of
> fixed-length records (actual or virtual punch cards) where you might be
> forced to split an attribute value across physical records.  But even
> the simplest editors allow long lines nowadays.
> > b) disallow literal newlines: compatible with XML; users can still
> > include newlines using numeric character references; possibly better
> > error recovering when a closing quote is missing; maybe a surprising
> > limitation, but then again lots of programming languages don't allow
> > literal newlines in string literals
> I think this is the least surprising behavior for people who don't know
> XML's quirks: if they try putting multi-line text into an attribute
> value, it will fail fast.  At Google, I was asked to take a look at an
> XML-based API that was "losing data" between the sender and the receiver.
> It turned out that the format called for putting a whole RFC 822 email
> into an attribute value.  "Won't work, guys", said I.
> > c) allow newlines but don't normalize them: most useful behaviour but
> > incompatible with XML
> And therefore a non-starter.

I'm not so sure about this.  As I understand it, the backward
compatibility goal is satisfied as long as every MicroXML document is a
well-formed XML document, and this issue doesn't affect that: a document
with literal newlines in attribute values is still well-formed XML.  I am
personally OK with such a minor but sensible processing difference, so
I'm still +1 on (c).
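
For anyone who wants to see the XML behaviour being discussed, here is a
minimal sketch.  It assumes Python's standard xml.etree.ElementTree parser,
which is my choice for illustration only and has nothing to do with the
MicroXML drafts; the helper name attr_value is hypothetical.  It shows that
an XML parser turns a literal newline in an attribute value into a space,
while a numeric character reference survives as a real newline.

```python
import xml.etree.ElementTree as ET


def attr_value(xml_text, name="a"):
    """Parse a one-element document and return the named attribute value."""
    return ET.fromstring(xml_text).get(name)


# A literal newline inside the attribute value: XML attribute-value
# normalization replaces it with a single space.
literal = attr_value('<e a="line1\nline2"/>')

# The same newline written as a numeric character reference: the
# reference is expanded after normalization, so the newline is kept.
escaped = attr_value('<e a="line1&#10;line2"/>')

print(repr(literal))  # 'line1 line2'
print(repr(escaped))  # 'line1\nline2'
```

Under (c), a MicroXML processor would report the first value with its
newline intact, so the same well-formed document would yield different
attribute data depending on which kind of parser read it.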

Uche Ogbuji
Founding Partner, Zepheira

Received on Thursday, 13 September 2012 14:46:28 UTC