Re: DOM L3 Core spec.: textContent specification ambiguity from Doug Schepers on 2010-06-25 (www-dom@w3.org from April to June 2010)

From: Doug Schepers <schepers@w3.org>
Date: Fri, 25 Jun 2010 10:13:21 +0100
To: Daniel Barclay <daniel@fgm.com>
CC: Robin Berjon <robin@berjon.com>, www-dom@w3.org
Message-ID: <4C247331.90009@w3.org>

Hi, Daniel-

I agree with Robin that the text seems clear, and that the string 
"&lt;e/&gt;" doesn't contain markup... in fact, it's specifically 
escaped so that the UA doesn't misinterpret it as markup.
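
To make the point concrete, here is a minimal sketch. It assumes Python's standard xml.etree.ElementTree as a stand-in for a DOM implementation, and it joins itertext() as a rough analogue of textContent (ElementTree has no textContent attribute). The variable names are illustrative only. It parses the example document from the quoted message, with the closing tag written as </root>:

```python
import xml.etree.ElementTree as ET

# Example document from the thread; the escaped characters are text, not tags.
doc = "<root><sub>&lt;e/&gt;</sub></root>"
root = ET.fromstring(doc)

# Rough analogue of DOM textContent: concatenate all descendant text.
text_content = "".join(root.itertext())

# The parser created no "e" element; <sub> has only character data.
sub = root.find("sub")
child_count = len(list(sub))

print(repr(text_content))  # the unescaped characters as plain text
print(child_count)         # number of child elements under <sub>
```

The returned string holds the characters <, e, / and >, yet the tree has no e element: nothing was serialized and nothing was added to represent child elements.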

However, if you still think it's ambiguous, the most productive way 
forward is not to present arguments, but to propose alternate wording.

Regards-
-Doug Schepers
W3C Team Contact, SVG and WebApps WGs


Daniel Barclay wrote (on 6/21/10 4:42 PM):
> Robin Berjon wrote:
>> Hi Daniel,
>>
>> On Jun 8, 2010, at 19:43 , Daniel Barclay wrote:
>>> The wording in the definition of the textContent attribute of the
>>> Node interface seems to be ambiguous (or at least misleading).
>>>
>>> The text says:
>>>
>>> "On getting, no serialization is performed, the returned string
>>> does not contain any markup."
>>>
>>> The intent of the latter part of that sentence is to say that the
>>> string does not contain any added markup to represent any child
>>> elements, etc.
>>>
>>> However, that wording sounds like it's saying that the string cannot
>>> contain any text that looks like markup.
>>
>> I find the sentence to be rather clear in fact.
>
> Note that how you find the sentence isn't necessarily the issue.
> If it's ambiguous, some are going to find the sentence to be saying
> one thing, and some are going to find it to be saying something else.
> Such ambiguity is inappropriate for a technical specification.
>
> For this particular "contains no markup" phrase, note how that might
> be used in, say, a description of a database field or web service
> parameter for a web application to try to specify that it is
> restricted to values that can be inserted into an HTML page without
> having to encode the value. Yes, it might not be good practice to
> skip encoding in such cases because you might then forget to encode
> in other cases, but that existing wording usage certainly could
> influence how readers interpret that same phrase in the DOM
> specification.
>
>
>>  It says that the
>>  returned string contains no markup, which to me sounds like it's
>>  saying that it contains no markup;
>
> Huh? Saying that "contains no markup" sounds like "contains no markup"
> isn't much of an argument; it certainly doesn't address the issue.
> The ambiguity is in the phrase "contains no markup."
>
> It means to say that it contains no markup at the relevant level of
> interpretation, but doesn't limit itself to sounding like only that.
>
>
>> if it said that the returned
>> string doesn't contain anything that could be mistakenly interpreted
>> as containing markup, then it'd probably sound like it's saying that
>> the string cannot contain any text that might perhaps look like markup.
>> But it doesn't :)
>
> If A implies B and A is false, trying to argue that therefore B is
> false is an invalid argument. (Other things can imply B.)
>
>
>>> If the difference isn't clear, consider getting the text content of
>>> the root element of this document:
>>>
>>> <root><sub>&lt;e/&gt;</sub></root>
>>>
>>> The textContent attribute string would be "<e/>", right?
>>
>> Which is fine: it's not markup. It's just text.
>
> Not quite. Yes, it is true that it is text and not markup _at_the_
> _intended_ level of interpretation. However, it is markup at a
> different level of interpretation. And yes, that other level of
> interpretation (taking text content from one level and re-interpreting
> it, that is, parsing it again) usually is irrelevant to XML/HTML
> specifications.
>
> However, wording that sounds like it covers that other level makes
> the level relevant at least to the degree of avoiding mistaken
> implications/inferences about it.
>
>
> Ah, maybe here's part of why we're arguing. You write:
>
> it's not markup. It's just text.
>
> But markup _is_ text. (In SGML/HTML/XML, markup is not binary codes
> marking beginnings and endings of ranges of represented text; it is
> _text_ marking up beginnings and endings of ranges of represented
> text.)
>
> You (and the above wording in the spec) probably need to distinguish
> more clearly between text and represented text, or text and marked-up
> text, or something like that.
>
>
>>  You can then go
>>  el.textContent = "<e/>" and it'll roundtrip because it's not markup.
>>
>>> That string _does_ contain markup
>>
>> No, it doesn't. That's like saying that the following XML document
>>  isn't well-formed because the "b" element isn't closed:
>>
>> <a><![CDATA[<b>]]></a>
>
> Yes, it's true that one can't say that in that XML document, the b
> element isn't closed, or the b start tag isn't balanced with an end
> tag, because there is no b start tag in _that_ XML element.
>
> However, if someone refers to the b tag, you can't say that there
> is no b tag at all.
>
> Yes, the b tag is only in the XML document that results from taking
> the text represented by the a element in the given XML document (and
> interpreting it as an XML document), and, yes, usually that is
> completely irrelevant.
>
> However, if one says there's no b tag, something has to limit that
> to saying there's no b tag in the _given_ XML document, or one is
> saying that there is no b tag at all, which isn't true.
>
>
> Why not have the spec say what it means without sounding like it
> means more than it intends to mean?
>
> Daniel
>
>
>
>
>
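
The round trip and the CDATA case discussed in the quoted message can be sketched the same way. This is an illustration only, again assuming Python's xml.etree.ElementTree rather than a real DOM; the "wrapper" element and the variable names are hypothetical. The last step shows the "other level of interpretation" Daniel describes, where the extracted text is deliberately parsed a second time:

```python
import xml.etree.ElementTree as ET

# Round trip: assign the string as text, serialize, then parse again.
el = ET.Element("sub")
el.text = "<e/>"
serialized = ET.tostring(el, encoding="unicode")  # text is escaped on output
reparsed = ET.fromstring(serialized)

# CDATA example from the thread: well-formed, and "<b>" is character data.
a = ET.fromstring("<a><![CDATA[<b>]]></a>")

# The other level: take the extracted text and parse it again as XML.
# Only at this second level does an "e" element exist.
second_level = ET.fromstring("<wrapper>" + reparsed.text + "</wrapper>")

print(serialized)
print(repr(reparsed.text), repr(a.text))
print([child.tag for child in second_level])
```

At the first level neither document contains an e or b element; a tag appears only when the returned string is fed back to a parser, which is the distinction the two sides of the thread are describing.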

--
Received on Friday, 25 June 2010 10:06:14 UTC