Re: [xmlProfiles-29] xml subsetting in IETF XMPP

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: 02 Apr 2003 10:05:01 +0100
To: Tim Bray <tbray@textuality.com>
Cc: Noah Mendelsohn/Cambridge/IBM <noah_mendelsohn@us.ibm.com>, Chris Lilley <chris@w3.org>, Fabrice Desré <fabrice.desre@francetelecom.com>, www-tag@w3.org
Message-ID: <f5bsmt146de.fsf@erasmus.inf.ed.ac.uk>

As I've said before, I don't like _any_ of "subset", "profile" or
"usage convention" -- if we decide to address the requirements in this
area, I believe the right way to do it is with a new conformance class
alongside the two already provided ('validating' and
'non-validating').

Call such a conformance class 'minimal' -- it can be trivially defined
as a further restriction of 'non-validating', as follows (this is an
edited copy of text from section 5.1 of XML 1.0 2e [1], with changes
marked by asterisks):

  **Minimal** processors are required to check only the document entity,
  including the entire internal DTD subset, for well-formedness,
  *except that they must not process any general or parameter entity
  declarations*. [Definition: While they are not required to check the
  document for validity, they are required to *minimally* process all
  the *non-entity* declarations they read in the internal DTD subset,
  up to the first reference to a parameter entity *[deleted]*; that is
  to say, they must use the information in those declarations to
  normalize attribute values *[deleted]*.] Except when
  standalone="yes", they must not process *[deleted]* attribute-list
  declarations encountered after a reference to a parameter entity
  *[deleted]*, since the entity may have contained overriding
  declarations.
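
As a rough illustration of the rule quoted above, here is a hypothetical
Python sketch of which attribute-list declarations a 'minimal' processor
would act on. The names TOKEN and attlists_for_minimal are invented for
this example, and the regex tokenizer is a simplification, not a real
XML parser. It also assumes that with standalone="yes" the processor
chooses to keep reading past a parameter-entity reference; the text only
lifts the prohibition and does not require this.

```python
import re

# Simplified tokenizer for an internal DTD subset (an assumption of this
# sketch: real DTD parsing has more cases than these four).
TOKEN = re.compile(
    r"""<!--.*?-->                                      # comment
      | <\?.*?\?>                                       # processing instruction
      | <!(?P<kind>[A-Z]+)(?:[^>"']|"[^"]*"|'[^']*')*>  # markup declaration
      | %(?P<peref>[^;\s]+);                            # parameter-entity reference
    """,
    re.S | re.X,
)


def attlists_for_minimal(internal_subset, standalone=False):
    """Return the ATTLIST declarations a 'minimal' processor would use.

    ENTITY declarations are never processed. Unless standalone is true,
    nothing after the first parameter-entity reference is processed.
    """
    kept = []
    for m in TOKEN.finditer(internal_subset):
        if m.group("peref") is not None:
            if not standalone:
                break  # later declarations might have been overridden
            continue  # reference is not expanded; just skip it
        if m.group("kind") == "ATTLIST":
            kept.append(m.group(0))
    return kept


subset = """
<!ENTITY % extra "<!ATTLIST a y CDATA 'q'>">
<!ENTITY copy "&#169; > 2003">
<!ATTLIST a id ID #IMPLIED>
%extra;
<!ATTLIST a x NMTOKENS #IMPLIED>
"""

print(attlists_for_minimal(subset))
print(attlists_for_minimal(subset, standalone=True))
```

In this sketch the two ENTITY declarations are skipped entirely, so the
ATTLIST hidden inside the replacement text of %extra; is never seen; that
is the sense in which a minimal processor may report a slightly different
infoset than a non-validating one.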

The crucial difference between such an approach and the 'profile' or
'subset' approach is that it doesn't change the fundamental
universality of XML -- all conformant processors can process all XML
documents.  That minimal processors will 'produce' slightly different
infosets from some inputs than non-validating processors is nothing
new, it's already true that non-validating processors will 'produce'
slightly different infosets from some inputs than validating
processors.

This approach does not address the question of processing instructions
and comments -- like the TAG, I think applications' existing freedom to
ignore these is sufficient.


[1] http://www.w3.org/TR/REC-xml#sec-conformance
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Wednesday, 2 April 2003 04:05:11 UTC