[Bug 7062] replace terms "CDATA element" and "RCDATA element" with... something better

From: <bugzilla@wiggum.w3.org>
Date: Fri, 21 Aug 2009 01:46:43 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1MeJDH-0004E3-DC@wiggum.w3.org>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=7062


Michael(tm) Smith <mike@w3.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|NEEDSINFO                   |




--- Comment #6 from Michael(tm) Smith <mike@w3.org>  2009-08-21 01:46:42 ---
(In reply to comment #5)
> RCDATA stands for "replaceable character data", so "replaceable text" seems
> like a better term for it, along with a note saying it means text that can have
> character references.

Calling it any kind of "text" at all, and thus needing to add a note to say
that it's text that can contain character references, is the reason I suggested
"replaceable character data" initially.

What the spec currently defines as "text" cannot contain character references.
Also, what it defines as "text" has two possible forms:

   - "raw" text that is allowed to contain unparsed markup characters
   - "non-raw" text that is not allowed to contain unparsed markup characters

...where "unparsed markup characters" essentially means the character "<" and
the strings "<!--" and "-->".

So there are three ways in which the text/html syntax allows those two forms of
text to be combined with character references:

  1. non-raw text that can be combined with character references
  2. raw text that can be combined with character references (RCDATA)
  3. raw text that cannot be combined with character references

One way to describe the above more succinctly is:

  1. normal character data
  2. replaceable character data
  3. non-replaceable character data

Or maybe "raw character data" would be a better term for #3 (which is what the
spec now calls "raw text" and which it previously called "CDATA").

But regardless, the term "character data" seems very useful as a general term
for describing all three of those possible combinations, and each of them could
be defined specifically by preceding "character data" with some adjective to
describe what type of character data it is.
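
As a rough illustration only, the three kinds of character data can be modeled
as two independent properties: whether unparsed markup characters are allowed,
and whether character references are recognized. The sketch below is not spec
text. The names CharacterData, is_allowed, and resolve are hypothetical, the
list of markup characters is taken directly from the description above, and
Python's html.unescape stands in for character-reference processing. It also
ignores the restriction that raw content cannot contain its own end tag.

```python
from dataclasses import dataclass
from html import unescape

# The "unparsed markup characters" as listed in the message above.
MARKUP_TOKENS = ("<", "<!--", "-->")


@dataclass(frozen=True)
class CharacterData:
    """Hypothetical model of one kind of character data."""
    name: str
    raw: bool        # True if unparsed markup characters are allowed
    char_refs: bool  # True if character references are recognized


# The three combinations named in the message.
NORMAL = CharacterData("normal character data", raw=False, char_refs=True)
REPLACEABLE = CharacterData("replaceable character data", raw=True, char_refs=True)
RAW = CharacterData("raw character data", raw=True, char_refs=False)


def is_allowed(kind: CharacterData, text: str) -> bool:
    """Return True if `text` is acceptable content for `kind` (simplified)."""
    if kind.raw:
        return True
    return not any(token in text for token in MARKUP_TOKENS)


def resolve(kind: CharacterData, text: str) -> str:
    """Expand character references only where `kind` recognizes them."""
    return unescape(text) if kind.char_refs else text


if __name__ == "__main__":
    sample = "1 &lt; 2"
    for kind in (NORMAL, REPLACEABLE, RAW):
        print(f"{kind.name}: {resolve(kind, sample)!r}")
```

Under this model, the same input "1 &lt; 2" is expanded for normal and
replaceable character data but left untouched for raw character data, while a
literal "<" is rejected only for normal character data.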


Received on Friday, 21 August 2009 01:46:53 UTC
