[xmlProfiles-29] TAG recommendation for work on subset of XML 1.1

From: Henry S. Thompson <ht@cogsci.ed.ac.uk>
Date: 05 Feb 2003 11:29:14 +0000
To: www-tag@w3.org
Message-ID: <f5bwukf3r05.fsf@erasmus.inf.ed.ac.uk>

I am unconvinced of the necessity for subsetting the language, as
opposed to identifying a new conformance class alongside the two
already provided ('validating' and 'non-validating').

Call such a conformance class 'minimal' -- it can be trivially defined
as a further restriction of 'non-validating', as follows (this is an
edited copy of text from section 5.1 of XML 1.0 2e [1], changes in
bold):

  *Minimal* processors are required to check only the document entity,
  including the entire internal DTD subset, for well-formedness,
  *except that they must not process any general or parameter entity
  declarations*. [Definition: While they are not required to check the
  document for validity, they are required to *minimally* process all
  the *non-entity* declarations they read in the internal DTD subset,
  up to the first reference to a parameter entity *[deleted]*; that is
  to say, they must use the information in those declarations to
  normalize attribute values *[deleted]*.] Except when
  standalone="yes", they must not process *[deleted]* attribute-list
  declarations encountered after a reference to a parameter entity
  *[deleted]*, since the entity may have contained overriding
  declarations.
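
One reading of the wording above can be written down as a small
decision rule. This is only a sketch: the function name minimal_rule,
the declaration-kind strings, and the three return values are
hypothetical labels chosen for illustration, and the 'not required'
case is an assumption about what the text leaves open rather than
something it states.

```python
def minimal_rule(kind, after_pe_reference, standalone):
    """Classify one internal-subset declaration for a 'minimal' processor.

    kind: "general-entity", "parameter-entity", "attlist", "element",
          or "notation" (hypothetical labels for declaration types).
    after_pe_reference: True once a parameter entity reference has been
          seen earlier in the internal subset.
    standalone: True when the document declares standalone="yes".
    """
    # Entity declarations are never processed, wherever they appear.
    if kind in ("general-entity", "parameter-entity"):
        return "must not process"
    # Non-entity declarations before the first parameter entity
    # reference must be used (for attribute-value normalization).
    if not after_pe_reference:
        return "must process"
    # After a parameter entity reference, attribute-list declarations
    # are off limits unless standalone="yes": the unread entity may
    # have contained overriding declarations.
    if kind == "attlist" and not standalone:
        return "must not process"
    # Assumption: everything else is neither required nor forbidden.
    return "not required"


if __name__ == "__main__":
    for args in [("attlist", False, False),
                 ("general-entity", False, True),
                 ("attlist", True, False),
                 ("attlist", True, True)]:
        print(args, "->", minimal_rule(*args))
```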

The crucial difference between such an approach and the 'profile' or
'subset' approach is that it doesn't change the fundamental
universality of XML -- all conformant processors can process all XML
documents.  That minimal processors will 'produce' slightly different
infosets from some inputs than non-validating processors is nothing
new: it is already true that non-validating processors 'produce'
slightly different infosets from some inputs than validating
processors.
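
To make the infoset point concrete, here is a sketch of what an
existing non-validating processor reports for a document whose
internal subset carries an attribute default and an internal entity.
It uses Python's standard xml.parsers.expat binding; the sample
document and the helper name report are made up for illustration. As
I read the edited text against section 5.1, a 'minimal' processor
would be expected to report neither the defaulted attribute nor the
entity's replacement text, but that is an interpretation, not
something this code demonstrates.

```python
import xml.parsers.expat

# Hypothetical sample document: one attribute default and one
# internal general entity, both declared in the internal subset.
DOC = """<!DOCTYPE note [
  <!ATTLIST note lang CDATA "en">
  <!ENTITY who "world">
]>
<note>hello &who;</note>"""


def report(doc):
    """Parse doc with expat; return (root attributes, character data)."""
    seen = {"attrs": None, "text": []}
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # Record only the root element's attributes.
        if seen["attrs"] is None:
            seen["attrs"] = dict(attrs)

    def chars(data):
        seen["text"].append(data)

    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.Parse(doc, True)
    return seen["attrs"], "".join(seen["text"])


if __name__ == "__main__":
    attrs, text = report(DOC)
    print(attrs)  # includes the defaulted 'lang' attribute
    print(text)   # the entity reference has been replaced
```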

This approach does not address the question of processing instructions
and comments -- like the TAG, I think applications' existing freedom to
ignore these is sufficient.

ht

[1] http://www.w3.org/TR/REC-xml#sec-conformance
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                      Half-time member of W3C Team
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
 [mail really from me _always_ has this .sig -- mail without it is forged spam]
Received on Wednesday, 5 February 2003 06:29:05 GMT
