Re: tag name state

On 3/3/2012 6:00 AM, David Carlisle wrote:
>
> I think if we can decide that there is a requirement that the output from
> xml-er is (or corresponds to) well formed xml, then decisions about whether
> we express that using a data model or as a literal xml document or both,
> will be a lot easier.

Good point.

Still, given the advantages in the many cases where the output will
correspond to well-formed XML, we may want to document text output for all
cases.  Specifically, and this is just a proposal for consideration:

1. When the input is well formed, we specify a 1-to-1 mapping (unless 
someone sees value in using c14n)

2. In the many cases where the fixed up output does correspond to well 
formed XML, we specify a mapping to a well formed XML stream

3. In the cases where the output does not correspond to a well formed 
stream, e.g. because we allow non-XML characters, we specify a mapping to a 
non-well formed stream, e.g. one in which the tags contain characters that 
XML considers illegal.
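
For illustration only, the split between cases 1 and 2 on one side and case 
3 on the other could be checked with a stock XML parser. This is a minimal 
Python sketch, assuming the fixed-up output is available as a string; the 
helper name is_well_formed is hypothetical and not part of any proposal:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if a standard XML parser accepts the stream (cases 1 & 2).

    Hypothetical helper: it only reports whether the text parses as XML;
    it says nothing about how the text was produced.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Case 3: the stream is not well formed, e.g. a tag name
        # contains a character that XML does not allow in names.
        return False
    return True


# A stream that existing XML tools can consume directly.
print(is_well_formed("<a><b/></a>"))
# A stream whose tag name contains '$', which XML does not allow in names.
print(is_well_formed("<a$b/>"))
```

Anything for which this returns True can be handed to existing XML tools 
unchanged; only the remainder would need the generalized models discussed 
below.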

Again, I think the advantage is that this builds on the formal definition 
of what well formed XML is, i.e. a stream of characters. It ensures that in 
cases 1 & 2, which are very common, >all< existing XML specifications are 
directly applicable.

For case 3, we might additionally want to provide advice on generalization 
of certain models, such as the DOM or Infoset, so that processors would 
have the option to provide them in an interoperable way if desired.

Again, I am in any case >not< proposing that the typical processor would 
have to serialize the output as an intermediate step in building a DOM, 
etc.; rather for conformance checking, it would have to show in cases 1 & 2 
that the DOM or DM or whatever it built is indeed the one that would have 
resulted from parsing the specified fixed-up XML.
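
As a rough sketch of what such a conformance check might look like, assuming 
purely for illustration that the processor builds a Python ElementTree 
rather than a DOM, and that the specified fixed-up XML is available as a 
string (the names same_tree and conforms are hypothetical):

```python
import xml.etree.ElementTree as ET


def same_tree(a: ET.Element, b: ET.Element) -> bool:
    """Compare two element trees by tag, attributes, text and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # Treat a missing text or tail the same as an empty string.
    if (a.text or "") != (b.text or "") or (a.tail or "") != (b.tail or ""):
        return False
    if len(a) != len(b):
        return False
    return all(same_tree(x, y) for x, y in zip(a, b))


def conforms(built_root: ET.Element, expected_xml: str) -> bool:
    """Check the tree a processor built against the one a parser yields
    from the specified fixed-up XML. No serialization of built_root occurs."""
    return same_tree(built_root, ET.fromstring(expected_xml))


# Stand-in for a tree that a processor built directly, without serializing.
root = ET.Element("a")
child = ET.SubElement(root, "b", {"x": "1"})
child.text = "hi"

print(conforms(root, '<a><b x="1">hi</b></a>'))
```

The processor never serializes anything here; only the test harness parses 
the reference text.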

Noah

Received on Saturday, 3 March 2012 16:41:14 UTC