Re: RELAX NG and xml:id

From: Daniel Veillard <veillard@redhat.com>
Date: Wed, 13 Aug 2008 09:22:26 -0400
To: Murata Makoto <EB2M-MRT@asahi-net.or.jp>
Cc: public-xml-id@w3.org, Michael.Brauer@sun.com, eb2m-mrt@j.asahi-net.or.jp
Message-ID: <20080813132225.GI24173@redhat.com>

On Wed, Aug 13, 2008 at 07:12:03PM +0900, Murata Makoto wrote:
> Dear colleagues,


> I have a question about interactions of xml:id and 
> validation.
> Consider a RELAX NG schema that defines xml:id as xsd:NCName rather
> than xsd:ID and a DTD-free instance valid against this schema.  The
> schema does NOT use the RELAX NG DTD compatibility specification.
> Since RELAX NG validation does not change the information set, 
> the [attribute type] property of the attribute xml:id is unknown.

  Well, in my opinion xml:id support is really a property of the XML
parser. Validation that is not DTD validation (RELAX NG, XSD, etc.)
comes, at least logically, as a separate step after parsing.
   So if the parser is xml:id aware, then after parsing and before the
information set is handed to RELAX NG, the ID type assignment (section 4
bullet 2) has been performed and the attribute is of type ID, not unknown.
   But if the parser is not xml:id aware, the attribute would be of type
unknown at that point.
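
  A minimal sketch of that ordering, assuming Python's
xml.etree.ElementTree as a parser that is not xml:id aware. The helpers
assign_xml_ids and check_ncname are hypothetical names, and the NCName
pattern is an ASCII-only approximation, not the real xsd:NCName definition:

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# ASCII-only approximation of xsd:NCName, for illustration only.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def assign_xml_ids(root):
    """Hypothetical post-parse step emulating an xml:id aware parser."""
    ids = {}
    for elem in root.iter():
        raw = elem.get(XML_ID)
        if raw is None:
            continue
        # ID attribute normalization: trim and collapse #x20 characters.
        value = " ".join(part for part in raw.split(" ") if part)
        elem.set(XML_ID, value)
        # ID type assignment: record the element under its ID (first one wins).
        ids.setdefault(value, elem)
    return ids


def check_ncname(root):
    """Stand-in for the schema rule: it checks values, it assigns no type."""
    return all(
        NCNAME.match(e.get(XML_ID))
        for e in root.iter()
        if e.get(XML_ID) is not None
    )


doc = ET.fromstring('<doc><a xml:id="  foo "/><b xml:id="bar"/></doc>')
ids = assign_xml_ids(doc)   # parser-level step runs first
valid = check_ncname(doc)   # validation-level step sees normalized values
```

  The point of the sketch is only the ordering: the ID table exists before
the validation-like step runs, and that step merely checks the values.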

> I believe that there is nothing wrong in applying "ID attribute
> normalization" and "ID type assignment" to xml:id in this instance
> document.

  Agreed, this will happen (at least logically) before the RELAX NG processor
receives the information from the parser.

> In my understanding, xml:id tries to separate ID processing
> from validation as much as possible.

  Yes, it tries to implement IDness at the parser level, i.e. to provide
IDness even if no DTD is available.

> However, Section 4 of the xml:id recommendation says:
> 	The declared type of the attribute, if it has one, is "ID".
> 	All declarations for xml:id attributes must specify "ID" as
> 	the type of the attribute.
> Does this sentence prohibit my scenario?  The pattern for xml:id 
> specifies xsd:NCName rather than xsd:ID.

  In spirit yes: at the validation level you should not conflict with
what a non-validating parser supporting xml:id would provide.
  In practice I would not see that as a hard problem myself. Since
RELAX NG does not modify the infoset, that rule in your schema is
in my opinion just verifying that the values passed are compatible
with xsd:NCName. It is type checking, not a type definition.

  Since the infoset is not changed, the only impact of the RNG mismatch
is that the schema won't be able to catch some problems:
   - conflicting IDs; but assuming xml:id processing and no DTD, one would
     expect IDness to be only xml:id based, and conflicts will be reported
     by the parser itself
   - ID/IDREF mismatches

So by miscategorizing the attribute you lose some quality of checking.
It sounds like a schema bug, but one of limited impact.
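
  The two kinds of problems listed above can be sketched as follows,
assuming Python's xml.etree.ElementTree. The function id_problems and the
"ref" attribute standing in for an IDREF are hypothetical; without a schema
that types some attribute as IDREF, the caller has to name such attributes:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def id_problems(root, idref_attrs=("ref",)):
    """Return (duplicate_ids, dangling_refs) for a parsed document.

    idref_attrs is an assumption: the attribute names treated as IDREFs.
    """
    counts = Counter(
        e.get(XML_ID) for e in root.iter() if e.get(XML_ID) is not None
    )
    # Conflicting IDs: the same xml:id value used more than once.
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    # ID/IDREF mismatches: references to an ID that no element carries.
    dangling = sorted({
        e.get(a)
        for e in root.iter()
        for a in idref_attrs
        if e.get(a) is not None and e.get(a) not in counts
    })
    return duplicates, dangling


doc = ET.fromstring(
    '<doc><a xml:id="x"/><b xml:id="x"/><c ref="y"/><d ref="x"/></doc>'
)
print(id_problems(doc))  # prints (['x'], ['y'])
```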

> Furthermore, what will happen if the xml:id attribute is validated
> against wildcards?  For example: 
>   anyAtt = attribute * { xsd:string }.

  Sounds similar to the previous case to me: you use a generic rule and
as a result lose some quality in the checking.

> Such wildcards are useful when we would like to allow foreign elements
> to contain any attribute.   Since the RELAX NG DTD compatibility 
> specification allows the use of xsd:ID only when we precisely know 
> the element name as well as the attribute name, we cannot 
> have: 
>   anyElement = element * {attribute xml:id {xsd:ID}?, anyElement*}
> If the "anyAtt" define statement shown above is what you mean by
> "declaration", we cannot allow xml:id within foreign elements 
> without giving up RELAX NG validation. 

  I think that's an extreme viewpoint. To me it just means that for foreign
elements you will rely on the parser itself to detect xml:id IDness
and conflicts between IDs declared in the full document. You will lose
ID/IDREF reference checking for foreign elements, but that again sounds like
a rather limited loss, because you would expect ID/IDREF linking to happen
between elements of a common vocabulary, not foreign elements pertaining to
a different logic.

  my 2 cents.


Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
Received on Wednesday, 13 August 2008 13:23:14 UTC
