So if we can't fix everything, what can we fix?

Ideally, by XML-ER N.0 we could turn any arbitrary file into the fully-documented
abstract syntax tree representing the source code that generated that file -- and
while we're at it, spin straw into gold.

But for version 1.0 of XML-ER, maybe we should start with a taxonomy of conditions
we'd like to recover from, and work up?

It seems to me that if we have some good use cases, we can figure out the
general scope of the work.

For example:

a. XML-ish document contains nulls or control characters.
b. XML-ish document contains other characters outside the supported character set.
   b.1. XML-ish document contains a name starting with a digit.
   b.2. XML-ish document contains a name with a character that is a valid XML char but not allowed in names.
c. XML-ish document contains a free &.
d. XML-ish document contains a single unclosed element.
e. XML-ish document contains an element name with two colons.

And in each case, we can figure out what the user actually meant.
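
To make the taxonomy concrete, here is a minimal Python sketch of how a
recovery tool might detect which of these conditions a document exhibits
before attempting any repair. Everything in it is an assumption for
illustration: the function names (classify, unclosed_elements), the regular
expressions, and the choice to scan raw text with regexes instead of a real
tokenizer are not part of any XML-ER proposal. It skips b.2, and it will be
fooled by markup inside comments, CDATA sections, or attribute values that
contain ">".

```python
import re

# (a) NUL and C0 control characters other than tab, LF and CR.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# (b) Other code points excluded by the XML 1.0 Char production.
NON_XML_CHARS = re.compile(r"[\ud800-\udfff\ufffe\uffff]")

# (b.1) A start tag whose name begins with a digit.
DIGIT_NAME = re.compile(r"<[0-9][^\s/>]*")

# (c) An '&' that does not begin an entity or character reference.
FREE_AMPERSAND = re.compile(
    r"&(?![A-Za-z_:][\w.:-]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)"
)

# (e) A tag name containing two or more colons.
TWO_COLONS = re.compile(r"</?[^\s/>:]*:[^\s/>:]*:[^\s/>]*")

# Hypothetical, deliberately naive tag matcher: groups are the closing
# slash, the tag name, and the self-closing slash.
TAG = re.compile(r"<(/?)([^\s/>!?][^\s/>]*)[^>]*?(/?)>")


def unclosed_elements(text):
    """Return names of elements that were opened but never closed."""
    stack, unclosed = [], []
    for match in TAG.finditer(text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue
        if not closing:
            stack.append(name)
        elif name in stack:
            # Anything opened after `name` and still open was never closed.
            while stack[-1] != name:
                unclosed.append(stack.pop())
            stack.pop()
        # A stray end tag with no matching start tag is ignored here.
    return unclosed + stack[::-1]


def classify(text):
    """Return the taxonomy labels (a, b, b.1, c, d, e) that apply to text."""
    found = []
    if CONTROL_CHARS.search(text):
        found.append("a")
    if NON_XML_CHARS.search(text):
        found.append("b")
    if DIGIT_NAME.search(text):
        found.append("b.1")
    if FREE_AMPERSAND.search(text):
        found.append("c")
    if len(unclosed_elements(text)) == 1:
        found.append("d")
    if TWO_COLONS.search(text):
        found.append("e")
    return found
```

For example, classify("<doc>AT&T</doc>") reports only c, and
classify("<doc><p>text</doc>") reports only d. Detection is the easy half;
deciding what the user actually meant in each case is where the real design
work would be.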

(Perhaps we should rename XML-ER to XML-DWIMNWIS[1]?)

[1]: http://www.urbandictionary.com/define.php?term=dwimnwis

-- 
TONY LAVINIO <mailto:alavinio@progress.com>
Principal Software Architect

PROGRESS SOFTWARE CORPORATION <http://www.progress.com/>

Received on Monday, 27 February 2012 03:16:48 UTC