- From: Ivan Herman <ivan@w3.org>
- Date: Wed, 31 Oct 2007 13:01:26 +0100
- To: Mark Birbeck <mark.birbeck@formsPlayer.com>
- Cc: W3C RDFa task force <public-rdf-in-xhtml-tf@w3.org>
- Message-ID: <47286E96.9040603@w3.org>
Mark,

to back this up: I have the impression that a number of XML toolkits
(eg, XSLT) essentially implement the XPath stuff (because that is what
they 'know') and not the CSS stuff (because that is the stuff they do
not 'know'). Ie, your proposal may make the life of implementers
easier...

I think this is a good move.

Ivan

Mark Birbeck wrote:
> Hello all,
>
> We've discussed defining two of our processing rules in terms of other
> sets of rules, such as those in CSS:
>
> * the conversion of a sequence of elements that contain text into one
>   text string;
>
> * the removal of leading and trailing whitespace.
>
> Although there is still some discussion to have on these, I realised
> today that we could actually define these rules quite clearly by using
> the XPath specification--and in many ways that would be more
> 'correct'. So I'm sending this email mainly as a
> 'reminder-to-self-and-Shane' to look into this further, but I thought
> I'd put it on the list in case anyone has any comments, particularly
> from an implementation perspective.
>
> The key 'concepts' that I'm thinking we need from XPath are the
> function normalize-space() and the idea that all nodes have a
> 'string-value'.
>
> For example, the 'string-value' of an element is defined as:
>
>   The string-value of an element node is the concatenation of the
>   string-values of all text node descendants of the element node in
>   document order.
>
> In turn, the 'string-value' of a text node is defined as:
>
>   The string-value of a text node is the character data. A text node
>   always has at least one character of data.
>
> And so on, including a definition of the 'string-value' of the root node.
>
> You can see that this is better than the definition I added to the
> syntax document:
>
>   The actual literal is either the value of @content (if present) or a string
>   created by concatenating the text content of each of the child elements
>   of the [current element] in document order...
>
> since I don't define "text content", whilst the idea of "character
> data" is very familiar. Also, given that XPath underpins a number of
> specifications it would be wise to use their version of any concepts
> that we share, rather than writing them afresh.
>
> On the space normalisation, XPath defines the normalize-space()
> function as follows:
>
>   Function: string normalize-space(string?)
>
>   The normalize-space function returns the argument string with
>   whitespace normalized by stripping leading and trailing whitespace
>   and replacing sequences of whitespace characters by a single space.
>   Whitespace characters are the same as those allowed by the S
>   production in XML. If the argument is omitted, it defaults to the context
>   node converted to a string, in other words the string-value of the
>   context node.
>
> This again is much better than what we have at the moment, since the
> idea is to use CSS rules:
>
>   ... by concatenating the text content of each of the child elements
>   of the [current element] in document order, and then normalising
>   white-space according to [WHITESPACERULES].
>
> In my view the XPath approach is better since it specifically refers
> to the "S production in XML", and given that we are using XHTML at the
> moment, this seems to me to be suitably precise.
>
> So I believe we should either refer to these two ideas, or even import
> the prose as is, if we have to.
>
> One way of using XPath by reference would be to define our processing
> in terms of the XPath concepts. At the moment we say this:
>
>   The actual literal is either the value of @content (if present) or a string
>   created by concatenating the text content of each of the child elements
>   of the [current element] in document order, and then normalising
>   white-space according to [WHITESPACERULES].
>
> But we could say:
>
>   The actual literal is either the value of @content (if present) or a string
>   created by {processing that has the same effect as taking the XPath
>   string-value of the [current element] and passing it to the XPath
>   normalize-space() function.}
>
> Or some such wording. Essentially all we're really saying is that a
> processor must act as if it has done this:
>
>   normalize-space( string-value of [current element] )
>
> Regards,
>
> Mark
>

--
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
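
For implementers without an XPath engine to hand, the proposed rule, normalize-space( string-value of [current element] ), might be approximated as in the rough Python sketch below. It is only an illustration under assumptions: the helper names (string_value, normalize_space, literal_for) and the sample markup are hypothetical and not taken from the syntax document, and the standard library's ElementTree stands in for whatever parser a real processor would use.

```python
import re
import xml.etree.ElementTree as ET

# Whitespace as allowed by the XML "S" production:
# space, tab, carriage return, line feed.
_XML_WS = " \t\r\n"
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def string_value(element):
    """Approximate the XPath string-value of an element node:
    the text of all descendant text nodes, in document order."""
    return "".join(element.itertext())


def normalize_space(text):
    """Approximate XPath normalize-space(): strip leading and trailing
    whitespace, then replace each whitespace run with a single space."""
    return _XML_WS_RUN.sub(" ", text.strip(_XML_WS))


def literal_for(element):
    """Hypothetical helper: use @content if present, otherwise the
    normalised string-value of the element."""
    content = element.get("content")
    if content is not None:
        return content
    return normalize_space(string_value(element))


# Hypothetical sample markup, for illustration only.
fragment = ET.fromstring(
    "<span property='dc:title'>\n  RDFa <em>in</em>\n  XHTML\n</span>"
)
print(literal_for(fragment))  # prints: RDFa in XHTML
```

The explicit character class is used rather than Python's bare str.split(), since the latter treats a wider set of Unicode characters as whitespace than the S production does.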
Received on Wednesday, 31 October 2007 12:01:27 UTC