
Re: [xmlProfiles-29] xml subsetting in IETF XMPP

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: 02 Apr 2003 10:05:01 +0100
To: Tim Bray <tbray@textuality.com>
Cc: Noah Mendelsohn/Cambridge/IBM <noah_mendelsohn@us.ibm.com>, Chris Lilley <chris@w3.org>, Fabrice Desré <fabrice.desre@francetelecom.com>, www-tag@w3.org
Message-ID: <f5bsmt146de.fsf@erasmus.inf.ed.ac.uk>

As I've said before, I don't like _any_ of "subset", "profile" or
"usage convention" -- if we decide to address the requirements in this
area, I believe the right way to do it is with a new conformance class
alongside the two already provided ('validating' and
'non-validating').

Call such a conformance class 'minimal' -- it can be trivially defined
as a further restriction of 'non-validating', as follows (this is an
edited copy of text from section 5.1 of XML 1.0 2e [1], changes in
bold):

  **Minimal** processors are required to check only the document entity,
  including the entire internal DTD subset, for well-formedness,
  *except that they must not process any general or parameter entity
  declarations*. [Definition: While they are not required to check the
  document for validity, they are required to *minimally* process all
  the *non-entity* declarations they read in the internal DTD subset,
  up to the first reference to a parameter entity *[deleted]*; that is
  to say, they must use the information in those declarations to
  normalize attribute values *[deleted]*.] Except when
  standalone="yes", they must not process *[deleted]* attribute-list
  declarations encountered after a reference to a parameter entity
  *[deleted]*, since the entity may have contained overriding
  declarations.

The crucial difference between such an approach and the 'profile' or
'subset' approach is that it doesn't change the fundamental
universality of XML -- all conformant processors can process all XML
documents.  That minimal processors will 'produce' slightly different
infosets from some inputs than non-validating processors is nothing
new: it's already true that non-validating processors will 'produce'
slightly different infosets from some inputs than validating
processors.
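
To make the infoset point concrete, here is a minimal sketch, assuming
Python's xml.parsers.expat as a stand-in for a conformant non-validating
processor.  The sample document and the names DOC and report are
hypothetical, chosen only for illustration.  It shows the three things
the unamended section 5.1 text asks a non-validating processor to do
with internal-subset declarations: normalize attribute values, include
the replacement text of internal entities, and supply default attribute
values.  On one reading of the amended text above, a 'minimal' processor
would keep only the first of these; how it would report the entity
reference is not spelled out in this message.

```python
from xml.parsers import expat

# Hypothetical sample document: one internal entity, one defaulted
# attribute, and one NMTOKENS attribute whose value needs normalizing.
DOC = """<!DOCTYPE msg [
  <!ENTITY who "world">
  <!ATTLIST msg kind CDATA "chat"
                tags NMTOKENS #IMPLIED>
]>
<msg tags="  a   b ">hello &who;</msg>"""


def report(doc):
    """Return (attributes, text) as expat reports them for the document."""
    attrs, chunks = {}, []
    parser = expat.ParserCreate()

    def start(name, attributes):
        # Only one element in DOC, so this collects the root's attributes.
        attrs.update(attributes)

    parser.StartElementHandler = start
    # Character data may arrive in several pieces; collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(doc, True)
    return attrs, "".join(chunks)


attrs, text = report(DOC)
print(attrs)
print(text)
```

With expat, the tags value comes back collapsed to "a b", the kind
attribute appears with its declared default "chat", and the text reads
"hello world".  Under the proposed class only the first of those three
results would be required, which is the kind of infoset difference the
paragraph above has in mind.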

This approach does not address the question of processing instructions
and comments -- like the TAG, I think applications' existing freedom to
ignore these is sufficient.

ht

[1] http://www.w3.org/TR/REC-xml#sec-conformance
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Wednesday, 2 April 2003 04:05:11 GMT
