- From: Mark Birbeck <mark.birbeck@formsPlayer.com>
- Date: Wed, 31 Oct 2007 11:14:48 +0000
- To: "W3C RDFa task force" <public-rdf-in-xhtml-tf@w3.org>
Hello all,
We've discussed defining two of our processing rules in terms of other
sets of rules, such as those in CSS:
* the conversion of a sequence of elements that contain text into one
text string;
* the removal of leading and trailing whitespace.
Although there is still some discussion to have on these, I realised
today that we could actually define these rules quite clearly by using
the XPath specification--and in many ways that would be more
'correct'. So I'm sending this email mainly as a
'reminder-to-self-and-Shane' to look into this further, but I thought
I'd put it on the list in case anyone has any comments, particularly
from an implementation perspective.
The key 'concepts' that I'm thinking we need from XPath are the
function normalize-space() and the idea that all nodes have a
'string-value'.
For example, the 'string-value' of an element is defined as:
  The string-value of an element node is the concatenation of the
  string-values of all text node descendants of the element node in
  document order.
In turn, the 'string-value' of a text node is defined as:
  The string-value of a text node is the character data. A text node
  always has at least one character of data.
And so on, including a definition of the 'string-value' of the root node.
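To make the element case concrete, here is a minimal Python sketch of the
'string-value' idea. It assumes the standard library's
xml.etree.ElementTree as a stand-in for whatever DOM a real processor
uses; the helper name string_value is mine, not XPath's.

```python
import xml.etree.ElementTree as ET


def string_value(element):
    """Approximate the XPath string-value of an element node.

    Concatenates the character data of every descendant text node,
    in document order. (Hypothetical helper name, for illustration.)
    """
    return "".join(element.itertext())


# Nested inline markup collapses into one string; whitespace is untouched.
doc = ET.fromstring("<p>Hello <em>big <b>wide</b></em> world</p>")
print(string_value(doc))
```

Note that nothing is trimmed or collapsed at this stage; whitespace
handling is a separate step.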
You can see that this is better than the definition I added to the
syntax document:
  The actual literal is either the value of @content (if present) or a string
  created by concatenating the text content of each of the child elements
  of the [current element] in document order...
since I don't define "text content", whilst the idea of "character
data" is very familiar. Also, given that XPath underpins a number of
specifications, it would be wise to use its version of any concepts
that we share, rather than writing them afresh.
On the space normalisation, XPath defines the normalize-space()
function as follows:
  Function: string normalize-space(string?)
  The normalize-space function returns the argument string with
  whitespace normalized by stripping leading and trailing whitespace
  and replacing sequences of whitespace characters by a single space.
  Whitespace characters are the same as those allowed by the S
  production in XML. If the argument is omitted, it defaults to the context
  node converted to a string, in other words the string-value of the
  context node.
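For implementers, a rough Python equivalent of that definition might look
like the sketch below. The function name normalize_space is my own. The
character class spells out the four characters of the XML S production
(space, tab, carriage return, line feed) instead of relying on Python's
broader notion of whitespace, which also covers characters such as
U+00A0.

```python
import re

# XML 1.0 "S" production: (#x20 | #x9 | #xD | #xA)+
_XML_WHITESPACE_RUN = re.compile(r"[ \t\r\n]+")


def normalize_space(value):
    """Sketch of XPath normalize-space() for an explicit string argument.

    Collapses each run of XML whitespace to a single space, then strips
    the (at most one) space left at either end.
    """
    return _XML_WHITESPACE_RUN.sub(" ", value).strip(" ")


print(repr(normalize_space("  Hello \n\t world  ")))
```

Using str.split() and " ".join() would be shorter, but it would also
collapse characters that are outside the S production.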
This again is much better than what we have at the moment, since the
idea is to use CSS rules:
  ... by concatenating the text content of each of the child elements
  of the [current element] in document order, and then normalising
  white-space according to [WHITESPACERULES].
In my view the XPath approach is better since it specifically refers
to the "S production in XML", and given that we are using XHTML at the
moment, this seems to me to be suitably precise.
So I believe we should either refer to these two ideas, or even import
the prose as is, if we have to.
One way of using XPath by reference would be to define our processing
in terms of the XPath concepts. At the moment we say this:
  The actual literal is either the value of @content (if present) or a string
  created by concatenating the text content of each of the child elements
  of the [current element] in document order, and then normalising
  white-space according to [WHITESPACERULES].
But we could say:
  The actual literal is either the value of @content (if present) or a string
  created by {processing that has the same effect as taking the XPath
  string-value of the [current element] and passing it to the XPath
  normalize-space() function.}
Or some such wording. Essentially all we're really saying is that a
processor must act as if it has done this:
  normalize-space( string-value of [current element] )
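Put together, the proposed rule could be sketched in Python as follows.
This is only an illustration under my own assumptions: the function name
literal_for is hypothetical, ElementTree stands in for the processor's
document model, and @content is returned as-is because the wording above
applies normalisation only to the element's text.

```python
import re
import xml.etree.ElementTree as ET


def literal_for(element):
    """Sketch of the proposed literal rule (hypothetical helper).

    Returns @content if present; otherwise the equivalent of
    normalize-space(string-value of the element).
    """
    content = element.get("content")
    if content is not None:
        return content
    # string-value: all descendant text nodes, in document order
    text = "".join(element.itertext())
    # normalize-space: collapse XML S whitespace, trim both ends
    return re.sub(r"[ \t\r\n]+", " ", text).strip(" ")


inline = ET.fromstring(
    '<span property="dc:title">\n  RDFa <em>in</em>\n  XHTML\n</span>'
)
override = ET.fromstring(
    '<span property="dc:title" content="Override">ignored</span>'
)
print(literal_for(inline))
print(literal_for(override))
```

A processor need not be implemented this way; it only has to produce the
same result.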
Regards,
Mark
--
Mark Birbeck, formsPlayer
mark.birbeck@formsPlayer.com | +44 (0) 20 7689 9232
http://www.formsPlayer.com | http://internet-apps.blogspot.com
standards. innovation.
Received on Wednesday, 31 October 2007 11:14:59 UTC