- From: Mark Birbeck <mark.birbeck@formsPlayer.com>
- Date: Wed, 31 Oct 2007 11:14:48 +0000
- To: "W3C RDFa task force" <public-rdf-in-xhtml-tf@w3.org>
Hello all,

We've discussed defining two of our processing rules in terms of other sets of rules, such as those in CSS:

* the conversion of a sequence of elements that contain text into one text string;
* the removal of leading and trailing whitespace.

Although there is still some discussion to have on these, I realised today that we could actually define these rules quite clearly by using the XPath specification--and in many ways that would be more 'correct'.

So I'm sending this email mainly as a 'reminder-to-self-and-Shane' to look into this further, but I thought I'd put it on the list in case anyone has any comments, particularly from an implementation perspective.

The key 'concepts' that I'm thinking we need from XPath are the function normalize-space() and the idea that all nodes have a 'string-value'. For example, the 'string-value' of an element is defined as:

    The string-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order.

In turn, the 'string-value' of a text node is defined as:

    The string-value of a text node is the character data. A text node always has at least one character of data.

And so on, including a definition of the 'string-value' of the root node.

You can see that this is better than the definition I added to the syntax document:

    The actual literal is either the value of @content (if present) or a string created by concatenating the text content of each of the child elements of the [current element] in document order...

since I don't define "text content", whilst the idea of "character data" is very familiar. Also, given that XPath underpins a number of specifications, it would be wise to use its version of any concepts that we share, rather than writing them afresh.

On the space normalisation, XPath defines the normalize-space() function as follows:

    Function: string normalize-space(string?)

    The normalize-space function returns the argument string with whitespace normalized by stripping leading and trailing whitespace and replacing sequences of whitespace characters by a single space. Whitespace characters are the same as those allowed by the S production in XML. If the argument is omitted, it defaults to the context node converted to a string, in other words the string-value of the context node.

This again is much better than what we have at the moment, where the idea is to use CSS rules:

    ... by concatenating the text content of each of the child elements of the [current element] in document order, and then normalising white-space according to [WHITESPACERULES].

In my view the XPath approach is better since it specifically refers to the "S production in XML", and given that we are using XHTML at the moment, this seems to me to be suitably precise.

So I believe we should either refer to these two ideas, or even import the prose as is, if we have to.

One way of using XPath by reference would be to define our processing in terms of the XPath concepts. At the moment we say this:

    The actual literal is either the value of @content (if present) or a string created by concatenating the text content of each of the child elements of the [current element] in document order, and then normalising white-space according to [WHITESPACERULES].

But we could say:

    The actual literal is either the value of @content (if present) or a string created by {processing that has the same effect as taking the XPath string-value of the [current element] and passing it to the XPath normalize-space() function.}

Or some such wording. Essentially all we're really saying is that a processor must act as if it has done this:

    normalize-space( string-value of [current element] )

Regards,

Mark

--
Mark Birbeck, formsPlayer

mark.birbeck@formsPlayer.com | +44 (0) 20 7689 9232
http://www.formsPlayer.com | http://internet-apps.blogspot.com

standards. innovation.
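From an implementation perspective, the "act as if" rule proposed in the message can be sketched without an XPath engine. The Python below is only an illustration under stated assumptions: the helper names (`string_value`, `normalize_space`, `literal_for`) are hypothetical and appear in no RDFa draft, @content is assumed to be a plain un-namespaced attribute, and the standard library's ElementTree is assumed to be an adequate stand-in for an XHTML DOM.

```python
import re
import xml.etree.ElementTree as ET

# The four characters allowed by the S production in XML:
# space (#x20), tab (#x9), carriage return (#xD), line feed (#xA).
XML_S = " \t\r\n"
_WS_RUN = re.compile("[ \t\r\n]+")


def string_value(element):
    """Concatenate all descendant text nodes in document order,
    mirroring the XPath string-value of an element node."""
    return "".join(element.itertext())


def normalize_space(text):
    """Mimic XPath normalize-space(): strip leading and trailing
    whitespace and collapse each internal run to a single space."""
    return _WS_RUN.sub(" ", text.strip(XML_S))


def literal_for(element):
    """Hypothetical helper: @content wins if present, otherwise
    normalize-space(string-value of the element)."""
    content = element.get("content")
    if content is not None:
        return content
    return normalize_space(string_value(element))


doc = ET.fromstring(
    "<p>  Hello <em>RDFa</em>\n   task   <strong>force</strong> </p>"
)
print(literal_for(doc))  # prints: Hello RDFa task force
```

The sketch deliberately avoids Python's bare `str.split()` and `str.strip()`, which treat a wider set of characters as whitespace than the S production does; a non-breaking space, for instance, is left untouched here.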
Received on Wednesday, 31 October 2007 11:14:59 UTC