tricky rxp/xmlproc question

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The question of pinning down the [element content whitespace] property
of character information items in the output of processors conforming
to the minimum and basic XML processor profiles has come up.  Our
reading of the XML spec is that it leaves open the question of whether
a non-validating processor supplies that information to applications.
My reading of the Infoset spec is that it carefully allows for the
possibility that such processors do.

So, some questions:

 1) In the spirit of trying to provide a deterministic answer to the
    "What infoset do you get" question wrt our profiles, what should
    we say wrt [element content whitespace], i.e. must it be absent,
    or must it be present and accurate?

 2) Does RXP/xmlproc provide e-c-w information when it's not
    validating but there is a doctype?

 3) Do you happen to know what any other parser does in this regard?

 4) Or, do you think the above analysis is wrong and it's actually an
    error for a processor which isn't validating to supply e-c-w
    information?
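
For anyone who wants to check a parser empirically (questions 2 and 3),
here is a rough probe, offered only as a sketch.  It assumes a SAX-style
API in which e-c-w is reported through a separate ignorableWhitespace
callback, and it uses Python's xml.sax, which is non-validating, purely
as a stand-in.  The test document and the names Recorder and probe are
invented for illustration; which callback receives the whitespace is
exactly the parser-dependent behaviour in question.

```python
import xml.sax

# Hypothetical test document: the internal subset declares <a> with
# element content, so the whitespace around <b> is e-c-w to any
# processor that reads the declarations.
DOC = b"""<!DOCTYPE a [
<!ELEMENT a (b)>
<!ELEMENT b (#PCDATA)>
]>
<a>
  <b>text</b>
</a>"""


class Recorder(xml.sax.ContentHandler):
    """Records which callback each run of character data arrives through."""

    def __init__(self):
        super().__init__()
        self.chars = []      # data delivered via characters()
        self.ignorable = []  # data delivered via ignorableWhitespace()

    def characters(self, content):
        self.chars.append(content)

    def ignorableWhitespace(self, whitespace):
        self.ignorable.append(whitespace)


def probe(doc):
    """Parse doc without validation and return the populated Recorder."""
    rec = Recorder()
    xml.sax.parseString(doc, rec)
    return rec


if __name__ == "__main__":
    rec = probe(DOC)
    print("characters():         ", rec.chars)
    print("ignorableWhitespace():", rec.ignorable)
```

If the whitespace around <b> turns up under ignorableWhitespace(), the
parser is supplying e-c-w information without validating; if it turns
up under characters(), it is not.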

Thanks,

ht
- -- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)

iD8DBQFMBS39kjnJixAXWBoRAi56AJ0dOwor1ZBplVZDmm3fSdqTzBobugCfXR25
TWixdzAQe0l/3XSsUQZAujA=
=THUd
-----END PGP SIGNATURE-----

Received on Tuesday, 1 June 2010 15:58:33 UTC