RE: comments and pi inside simple content

From: <noah_mendelsohn@us.ibm.com>
Date: Fri, 4 Oct 2002 12:53:54 -0400
To: "Dare Obasanjo" <dareo@microsoft.com>
Cc: ht@cogsci.ed.ac.uk, tmoog@sarvega.com, xmlschema-dev@w3.org
Message-ID: <OF6F4609FC.CE263E52-ON85256C48.005C8689@lotus.com>

Schema >never< changes existing properties or information items when 
building a PSVI from Infoset.  They all remain available.  All additional 
information is added in new information items and/or new properties. Note 
that for attributes and for elements validated by simple types the PSVI 
>adds< a property named [schema normalized value] which includes only the 
character children, and which (by the way) also reflects the result of 
type-specific whitespace processing.  This property 
in fact contains the characters which are the lexical form used as input 
to simple type validation. 

So, PSVI gives you both:  the original children are all there, with PIs, 
comments, leading and trailing whitespace, etc., and a new property that 
gathers just the characters actually used for validation.  In that new 
property, PIs, Comments etc. don't show, and this does indeed have 
consequences for the validation of constructs like:

<number>1<!-- this is a comment -->2</number>
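
A minimal sketch of that distinction, assuming Python's standard
xml.dom.minidom as a stand-in for the Infoset. It is not a schema
processor and builds no PSVI; the helper names character_children and
collapse are hypothetical and only approximate what goes into
[schema normalized value] for a type whose whitespace facet is collapse:

```python
from xml.dom import minidom, Node


def character_children(xml_text):
    """Join the character-data children of the document element,
    skipping comments and processing instructions."""
    root = minidom.parseString(xml_text).documentElement
    kept = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)
    return "".join(c.data for c in root.childNodes if c.nodeType in kept)


def collapse(value):
    """Approximate whiteSpace=collapse: map tab, LF and CR to spaces,
    squeeze runs of spaces, and trim both ends."""
    for ch in "\t\n\r":
        value = value.replace(ch, " ")
    return " ".join(part for part in value.split(" ") if part)


doc = "<number>1<!-- this is a comment -->2</number>"
root = minidom.parseString(doc).documentElement

# The original children are all still there: text, comment, text.
print([c.nodeType == Node.COMMENT_NODE for c in root.childNodes])

# The gathered characters are what simple type validation would see.
print(collapse(character_children(doc)))
```

The first print shows that the comment remains among the children; the
second shows the string "12" that would be handed to validation.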

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

"Dare Obasanjo" <dareo@microsoft.com>
Sent by: xmlschema-dev-request@w3.org
10/04/02 10:31 AM

 
        To:     "Henry S. Thompson" <ht@cogsci.ed.ac.uk>, "Tom Moog" <tmoog@sarvega.com>
        cc:     <xmlschema-dev@w3.org>, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        RE: comments and pi inside simple content

This may have interesting repercussions for XQuery which I hadn't
considered. So in the PSVI, does the element information item's
[children] contain the comment information item or not? 

-- 
PITHY WORDS OF WISDOM 
The meek shall inherit the Earth....if that's all right with the rest of
you. 

This posting is provided "AS IS" with no warranties, and confers no
rights. 

> 
> 
> -----Original Message-----
> From: Henry S. Thompson [mailto:ht@cogsci.ed.ac.uk] 
> Sent: Friday, October 04, 2002 3:21 AM
> To: Tom Moog
> Cc: xmlschema-dev@w3.org
> 
> 
> Tom Moog <tmoog@sarvega.com> writes:
> 
> > In 3.1.4 of Structures, it states that comments and pi are ignored 
> > when constructing the content of an element.  This does not appear to 
> > depend on whether the content is simple or complex.  Thus
> > 
> >      <number>1<!-- this is a comment -->2</number>
> > 
> > is the same as <number>12</number> for purposes of validation.
> 
> That's what the REC says, yes.
> 
> > This seems rather counter-intuitive.  It also conflicts with the way 
> > xslt and other xml processors handle content - they would treat the 
> > "1" and the "2" as separate text items.
> 
> XML Schema doesn't have a notion of 'text item', for better or worse.
> 
> > Do I correctly understand the document, or did I miss something ?
> 
> I think you understand it correctly.
> 
> ht
> --
>   Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
>           W3C Fellow 1999--2002, part-time member of W3C Team
>      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
>                    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
>                                     URL: http://www.ltg.ed.ac.uk/~ht/
> [mail really from me _always_ has this .sig -- mail without it is forged spam]
> 
> 
Received on Friday, 4 October 2002 12:57:18 GMT
