[Bug 28812] JSON options 'unescape' and 'liberal' prevent use of off-the-shelf JSON parsers from bugzilla@jessica.w3.org on 2015-06-15 (public-qt-comments@w3.org from June 2015)

From: <bugzilla@jessica.w3.org>
Date: Mon, 15 Jun 2015 17:20:33 +0000
To: public-qt-comments@w3.org
Message-ID: <bug-28812-523-bdSVpJQrXD@http.www.w3.org/Bugs/Public/>

https://www.w3.org/Bugs/Public/show_bug.cgi?id=28812

--- Comment #1 from Michael Kay <mike@saxonica.com> ---
I would not be in favour of this change.

Firstly, it has been our tradition to write our specification based on user
requirements not on ease of implementation. For example, we have made no
concessions to ease of implementation when defining our regular expression
language, or when defining the precise rules for Base64. We have occasionally
taken into account what can be done with standard libraries, for
example when specifying the trig functions library, but only where there was no
sacrifice in functionality or usability.

Secondly, writing your own JSON parser takes a couple of hours. It's
considerably easier than regular expressions and hardly more difficult than
Base64. The Saxon implementation is about 650 lines of code, including all the
options appearing in the spec.

It is legitimate to support liberal='true' as a no-op; you can ignore the
option if you are happy to impose strict conformance on your users. It's there
because 15 years of experience with XML tells us that the internet mantra of
"be liberal in what you accept" sometimes makes life easier for users,
especially when consuming data feeds over which they have no control. No doubt
the kind of deviations from the strict grammar that it is useful to allow will
emerge over time.

The option unescape=false is important because it is the only way we know to
accept ALL JSON input, even if it contains characters that are not legal in
XML. OK, if you support XML 1.1, there is only one such character (x00), but
one character is enough to cause problems.

You are right to point out that unescape=false fails to meet its intended
purpose of ensuring that non-XML characters can always be handled. That's a
bug: it should say that if unescape=false is specified, escaped characters will
be retained in escaped form, and unescaped characters will be converted to
escaped form if they are not valid XML characters. It's true we could also meet
this use case with the "fallback" option, but that surely has the same problems
in terms of off-the-shelf library support.
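
To make the proposed fix concrete, here is a minimal Python sketch of the
behaviour described above for unescape=false. It is not spec text and not
Saxon's code: the function names are hypothetical, the validity check assumes
the XML 1.0 Char production, and the uppercase four-digit \uXXXX output form
is an arbitrary choice made for the illustration.

```python
def is_xml10_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production (assumed here)."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def keep_escaped(raw: str) -> str:
    """Process the raw contents of a JSON string literal (the text between the quotes).

    Existing backslash escapes are copied through untouched; unescaped
    characters that are not valid XML characters become \\uXXXX escapes.
    """
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw):
            # Retain the escape as written: \uXXXX spans six characters, others two.
            length = 6 if raw[i + 1] == "u" else 2
            out.append(raw[i:i + length])
            i += length
        elif is_xml10_char(ord(ch)):
            # Valid XML character: pass through unchanged.
            out.append(ch)
            i += 1
        else:
            # Unescaped but not representable in XML: convert to escaped form.
            out.append("\\u%04X" % ord(ch))
            i += 1
    return "".join(out)
```

Note that the function works on the raw lexical form of the string, before
any unescaping has happened, which is exactly the information a typical
off-the-shelf parser has already discarded by the time it returns a value.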

-- 
You are receiving this mail because:
You are the QA Contact for the bug.

Received on Monday, 15 June 2015 17:20:36 UTC