Re: [LC Review] of WebCGM 2.0 from Lofton Henderson on 2006-07-10 (public-webcgm-wg@w3.org from July 2006)

From: Lofton Henderson <lofton@rockynet.com>
Date: Mon, 10 Jul 2006 10:27:04 -0600
To: public-webcgm-wg@w3.org
Message-Id: <5.1.0.14.2.20060709154303.04235650@localhost>
[...I see in the archive that Benoit has replied also.  But I'm still 
having trouble with receipt of WG email, which is routed through OASIS to 
me...]

Here are my thoughts on I18N's three comments...

At 10:52 PM 7/7/2006 +0900, Felix Sasaki wrote:
>Hello,
>
>These are comments on
>
>WebCGM 2.0, http://www.w3.org/TR/2006/WD-webcgm20-20060623/
>
>sent on behalf of the i18n core working group.
>
>Best regards, Felix Sasaki.
>
>Comment 1 (editorial): <title> elements in some files are confusing
>It seems that some <title> elements contain "OASIS CGM Open
>specification - ...", e.g.
>http://www.w3.org/TR/2006/WD-webcgm20-20060623/WebCGM20-TOC.html
>"OASIS CGM Open specification - WebCGM Profile - Expanded Table of Contents"
>This is just confusing and should be fixed.

PROPOSED REPLY:
Agreed, we will fix.  Thanks for catching this.  The <title> elements 
should match the title text that immediately precedes the horizontal rule 
at the top of each chapter.


>Comment 2 (editorial): Reference to Unicode
>In
>http://www.w3.org/TR/2006/WD-webcgm20-20060623/WebCGM20-Intro.html#norm-ref
>  , you have two references to Unicode, one generic reference, and one to
>version 4.01. Is there a reason for that? If not, please reference to
>Unicode following the description at
>http://www.w3.org/TR/charmod/#sec-RefUnicode , that is, only in a
>generic manner.

DISCUSSION:
Although it is not a good answer to "Is there a reason for that?", this 
particular advice (generic+specific) came from Chris, as we negotiated 
resolutions to his own comments on the WebCGM 2.0 Submission text.  We 
should involve Chris in this discussion.  Here is email from Chris to me on 
18th May, replying to questions from me ("LH>").  (Thierry was Cc'd, but 
the message was not archived, as we didn't yet have an archived email list.)

[[[
On Thursday, May 18, 2006, 12:11:16 AM, Lofton wrote:
LH> Hi Chris,
LH> We're just finishing up a document "recommended changes before Last Call",
LH> for TC to offer to WG. I'm processing one of a couple remaining comments
LH> of yours:
LH> http://www.oasis-open.org/committees/download.php/17844/CL-comments_proposed_resolution.htm#CL-d1

LH> CharMod has this section:
LH> http://www.w3.org/TR/2005/REC-charmod-20050215/#sec-RefUnicode
LH> Do you have any advice or guidance about specific version versus generic
LH> reference ("..as may from time to time..")? Or is that purely a matter of
LH> the requirements of the particular group?
Charmod says
C063 [S] A generic reference to the Unicode Standard MUST be made if
it is desired that characters allocated after a specification is
published are usable with that specification. A specific reference to
the Unicode Standard MAY be included to ensure that functionality
depending on a particular version is available and will not change
over time.
LH> One prejudice of mine is fixed conformance requirements. E.g., if an error
LH> in unicode were fixed and a code were changed, it is not completely clear
LH> in high-legacy environments like aerospace whether you want the viewer to
LH> change and track the correction.
Referencing both a fixed and a generic version of Unicode guards against
the (rare) case where a code is removed. AFAIK this has only happened
once, for some Korean codes that were genuinely completely wrong.
LH> On the other hand, fixed version
LH> eliminates future code pages (this might be a minor point, in the
LH> ASCII-centric aerospace environment now).
It's not 'pages' but character code points, which may be added
individually as well as in blocks of arbitrary size. For example, the
Euro character was added that way (to an existing block, at an unused
code point).
LH> I understand that "both" (specific and generic) is an option, and that
LH> such a dual reference (per CharMod) will freeze existing codes while
LH> allowing new codes.
Yes.
LH> Still, I'm inclined toward just specific, as I think the
LH> constituent interest in new code pages is relatively low.
So, for example, you would be happy to test for WebCGM 1.0 viewers
correctly rejecting a Euro, or rejecting characters added in Unicode 4.0
and 4.1? I see no value in doing so.
LH> But I could be
LH> convinced otherwise.
LH> Any advice would be appreciated.
Reference the specific and the generic. Characters get added over time.
You do not want to have to revise the WebCGM spec whenever new
characters are added. XML 1.1 is a good example of a spec that does the
right thing here.
In addition, using the exact form of words from charmod means that
reviewers in Last Call just note that and move on.
]]]

Bottom line -- I don't feel strongly, but Chris made arguments for 
generic+specific.  Would those satisfy Felix?
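
To make the generic-versus-specific distinction concrete, here is a minimal 
Python sketch.  It is not taken from WebCGM or CharMod.  It assumes Python's 
standard unicodedata module: its ucd_3_2_0 object is a frozen Unicode 3.2.0 
database, standing in here for a "specific" reference, while the module-level 
functions stand in for a "generic" one.  The helper names, and the choice of 
U+20B9 as a character added well after 3.2, are illustrative only.

```python
import unicodedata

# Frozen database: stands in for a "specific" Unicode version reference.
FROZEN = unicodedata.ucd_3_2_0


def is_assigned_in_frozen(ch: str) -> bool:
    """True if ch is an assigned character in the frozen 3.2.0 database."""
    # General category "Cn" means "unassigned" in that version.
    return FROZEN.category(ch) != "Cn"


def is_assigned_in_current(ch: str) -> bool:
    """True if ch is assigned in the database shipped with this Python."""
    return unicodedata.category(ch) != "Cn"


if __name__ == "__main__":
    # U+20AC EURO SIGN predates 3.2; U+20B9 was added later.
    for ch in ("\u20ac", "\u20b9"):
        print(
            f"U+{ord(ch):04X}",
            "frozen:", is_assigned_in_frozen(ch),
            "current:", is_assigned_in_current(ch),
        )
```

A viewer tied only to the frozen table would have to treat the later 
character as unassigned, which is the situation the dual reference is meant 
to avoid.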


>Comment 3 (editorial): Why not Unicode as the default encoding?
>In
>http://www.w3.org/TR/2006/WD-webcgm20-20060623/WebCGM20-Concepts.html#webcgm_2_4
>, (sec. 2.5.4), you describe isolatin1 as the default "character set".
>We would propose to describe UTF-8 as the default character encoding,
>and to use the term "character encoding" instead of "character set". See
>also http://www.w3.org/TR/charmod/#C020 .

PROPOSED REPLY (perhaps too wordy):
Simple answer:  legacy.  WebCGM 1.0 (1999) uses the default of IsoLatin1, 
as does the ISO CGM:1999 standard upon which the WebCGM 1.0 profile is 
based (ignoring some fine distinctions between graphical and 
non-graphical text).  Changing the default for WebCGM 2.0 would be fairly 
disruptive, without apparent commensurate gain, particularly since it is a 
simple matter for a metafile instance to reset its "character set" (more 
about terminology below).
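
As a rough illustration of why the default matters to legacy content, here 
is a minimal Python sketch.  It does not model CGM's actual text or 
character set elements.  The function decode_with_default and its 
parameters are hypothetical, and plain byte strings stand in for metafile 
text.

```python
from typing import Optional


def decode_with_default(
    raw: bytes, declared: Optional[str] = None, default: str = "latin-1"
) -> str:
    """Decode raw text bytes, using `default` when nothing is declared."""
    return raw.decode(declared or default)


if __name__ == "__main__":
    legacy = "\u00e9".encode("latin-1")  # one byte, 0xE9, as in old content

    # Under the existing default, legacy bytes decode as intended.
    print(decode_with_default(legacy))

    # If the default were switched to UTF-8, the same bytes would be invalid.
    try:
        decode_with_default(legacy, default="utf-8")
    except UnicodeDecodeError as err:
        print("legacy bytes fail under a UTF-8 default:", err.reason)

    # Content that declares its encoding is unaffected by the default.
    print(decode_with_default("\u00e9".encode("utf-8"), declared="utf-8"))
```

The last case corresponds to a metafile that resets its "character set" 
explicitly rather than relying on the default.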

In addition, there may be an issue with CGM:1999's "Rules for 
Profiles".  At best, it is unclear whether a valid CGM profile can 
redefine the defaults specified in the base standard, CGM:1999, for 
profile-conforming metafile instances.  Even if such a redefinition 
doesn't violate the letter of the rules in CGM:1999 clause 9 ("Profiles 
and conformance"), it appears to violate the spirit.  Again, I would think 
that it would be a hardship on implementations to have defaults that are 
profile-sensitive and at variance with the base standard.

Aside... If the ISO CGM standard were being written today (instead of 
descending from the venerable original Version 1 of ISO CGM:1987), Unicode 
would certainly have been the chosen default.  Note that the same pertains 
to CGM's terminology of "character set", instead of the correct 
terminology, "character encoding".  We are aware that it is at variance 
with CharMod.  However, the incorrect terminology "character set" descends 
from the original ISO CGM:1987, and indeed there are CGM element names 
(CHARACTER SET LIST, CHARACTER SET INDEX, etc.) that embed the incorrect 
terminology.  If this is thought to be important, we could perhaps include 
the correct terminology in an explanatory note?  And link occurrences of 
"character set" to that note?

QUESTION:  Is use of the correct terminology ("character encoding") 
sufficiently important to change it throughout WebCGM 2.0 (except for 
proper element names like CHARACTER SET LIST), at variance with ISO 
CGM:1999 and WebCGM 1.0?  Or could an explanatory note suffice?

All for now,
-Lofton.
Received on Monday, 10 July 2006 16:27:26 UTC