Re: New FAQ: entities and NCRs

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 06 Jul 2005 14:41:48 +0900
To: "public-i18n-geo@w3.org" <public-i18n-geo@w3.org>
Message-ID: <op.stg8rywjx1753t@ibm-60d333fc0ec.w3.mag.keio.ac.jp>

The table at the end of my last mail got garbled; I hope you can still
understand it :)

Cheers, Felix

On Wed, 06 Jul 2005 14:39:54 +0900, Felix Sasaki <fsasaki@w3.org> wrote:

>
> Hi Richard, hi all,
>
> This is a summary of the issue "NCRs and schema languages", which has
> some overlap with the FAQ "entities and NCRs". It describes ways of
> encapsulating NCRs, of which entities are only one possibility. Which way
> is useful and possible depends on the schema language. This summary might
> become an input to the FAQ about entities and NCRs, if Richard and others
> think it is useful.
>
> Cheers, Felix.
>
> The following discussion on entities for numeric character references  
> (NCRs) and other, alternative ways of encapsulating numeric character  
> references concentrates on four schema languages: XML DTDs, XML Schema,  
> RELAX NG and Schematron.
>
> All four schema languages allow entities to be used for NCRs in XML
> documents. They differ in how entities are declared. With XML DTDs,
> entities can be defined A) in the declaration subset of the XML
> document, or B) in the external DTD ("NCR" is used as a placeholder for
> a numeric character reference):
>
> A)
> <!DOCTYPE mydoc [
> <!ENTITY mychar "NCR">
> ]>
>
> or
>
> B)
> <!DOCTYPE mydoc SYSTEM "mydtd.dtd">
>
> where "mydtd.dtd" contains the entity declaration <!ENTITY mychar "NCR">.
>
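As a small illustration of case A), the following Python sketch parses a document whose internal subset declares the entity. The NCR "&#xE9;" (é) is an arbitrary stand-in for the "NCR" placeholder, and the choice of Python's standard xml.etree.ElementTree module (a non-validating, expat-based parser) is an assumption of this example, not something the mail prescribes.

```python
import xml.etree.ElementTree as ET

# Case A): the entity is declared in the internal subset of the document.
# "&#xE9;" is an arbitrary example NCR standing in for the "NCR" placeholder.
DOC_A = """<!DOCTYPE mydoc [
<!ENTITY mychar "&#xE9;">
]>
<mydoc>caf&mychar;</mydoc>"""

root = ET.fromstring(DOC_A)

# The parser expands &mychar; before the application sees the text,
# so the element content already contains the referenced character.
print(root.text)
```

Because the declaration travels with the document, any conforming XML processor should be able to expand the reference, whichever schema language is applied afterwards.
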
> XML Schema, RELAX NG and Schematron allow entities to be declared as in
> A). They do not allow entities to be declared as in B), i.e. as part of
> the external schema. Strictly speaking, entity declaration and expansion
> are out of scope for XML Schema, RELAX NG and Schematron. All these
> schema languages rely on an XML processor that expands the entities
> before validation against the schema starts. Non-validating XML
> processors are required to read only the document itself, including its
> internal declaration subset, and not external declarations. Hence,
> whether external entity declarations can be resolved depends on the
> implementation of the XML processor.
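
The parser dependence described above can be observed with a sketch like the following. It assumes Python's standard xml.etree.ElementTree module, which does not fetch the external DTD; other XML processors may load "mydtd.dtd" and expand the entity. The helper name try_parse is hypothetical and only serves this illustration.

```python
import xml.etree.ElementTree as ET

# Case B): the entity is declared only in the external DTD ("mydtd.dtd",
# the file name used in the mail), which this parser does not read.
DOC_B = """<!DOCTYPE mydoc SYSTEM "mydtd.dtd">
<mydoc>caf&mychar;</mydoc>"""

def try_parse(xml_text):
    """Return the root element's text, or None if the parser rejects the document."""
    try:
        return ET.fromstring(xml_text).text
    except ET.ParseError as err:
        # Raised here because the declaration of mychar was never seen.
        print("could not resolve entity:", err)
        return None

result = try_parse(DOC_B)
```

With this particular parser the reference stays unresolved and parsing fails; a processor that reads the external subset would return the expanded text instead.
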
>
> XML Schema provides a different way to encapsulate numeric character
> references: the numeric character reference can be defined as a fixed
> default value for an element:
>
> <xsd:element name="mychar" type="xsd:token" fixed="NCR"/>
>
> In an XML document, the element can then be used like this:
>
> <mydoc> ... <mychar/>...</mydoc>
>
> RELAX NG and XML DTDs do not allow default values to be defined for
> element content, and Schematron does not support this solution either.
> But XML DTDs, XML Schema and RELAX NG allow default values to be declared
> for attributes. Hence, for XML DTDs the following alternative way of
> attaching a name to a numeric character reference is possible:
>
> <!ELEMENT mychar EMPTY>
> <!ATTLIST mychar ncr NMTOKEN #FIXED "...">
>
> or in RELAX NG:
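> <element name="mychar"
>     xmlns="http://relaxng.org/ns/structure/1.0"
>     xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
>   <attribute name="ncr" a:defaultValue="NCR"/>
>   <empty/>
> </element>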
>
> or in XML Schema:
> <xsd:attribute name="mychar" type="xsd:token" fixed="NCR"/>
>
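For the XML DTD variant of the attribute approach, a sketch along these lines shows the fixed value being supplied to the application. The declarations are placed in the internal subset here, "&#xE9;" is an arbitrary example NCR, and the use of Python's standard xml.etree.ElementTree module is an assumption of this example.

```python
import xml.etree.ElementTree as ET

# The element and its #FIXED attribute are declared in the internal subset,
# so that a non-validating parser reads them. "&#xE9;" is an example NCR.
DOC = """<!DOCTYPE mydoc [
<!ELEMENT mychar EMPTY>
<!ATTLIST mychar ncr NMTOKEN #FIXED "&#xE9;">
]>
<mydoc><mychar/></mydoc>"""

root = ET.fromstring(DOC)
mychar = root.find("mychar")

# The attribute is absent from the instance; the parser supplies the
# fixed value declared in the ATTLIST.
print(mychar.get("ncr"))
```

The character is then available as an attribute value rather than as element content, so the application has to pull it out of the ncr attribute itself.
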
> As for XML DTDs, there seems to be no real need to choose this method,
> since DTDs allow entities to be declared in the external DTD.
>
> The following table summarizes the ways of declaring entities and  
> alternative methods to encapsulate numeric character references in  
> different schema languages.
>
>             Declaration  External          Element        Attribute
>             subset       subset            default value  default value
> XML DTDs    +            +                 -              +
> XML Schema  +            parser-dependent  +              +
> RELAX NG    +            parser-dependent  -              +
> Schematron  +            parser-dependent  -              -
>
Received on Wednesday, 6 July 2005 05:41:54 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 8 January 2008 14:12:40 GMT