issue-113 (Re: Comment on ITS 2.0 specification WD) from Felix Sasaki on 2013-01-24 (www-international@w3.org from January to March 2013)

From: Felix Sasaki <fsasaki@w3.org>
Date: Thu, 24 Jan 2013 10:07:51 +0100
To: Norbert Lindenberg <w3@norbertlindenberg.com>
CC: public-multilingualweb-lt-comments@w3.org, www-international <www-international@w3.org>
Message-ID: <5100F9E7.1000804@w3.org>
Dear Norbert,

thanks a lot for the comments. Below are our WG replies. We will track 
edits related to this comment via
https://www.w3.org/International/multilingualweb/lt/track/actions/423

Am 19.01.13 07:29, schrieb Norbert Lindenberg:
> Dear Multilingual Web LT group,
>
> Below is a collection of comments on the Last Call draft. These comments are not directly related to internationalization, so I don't expect the Internationalization WG to track or endorse them.
>
> I've also submitted through the issue tracker of the Internationalization WG a number of issues today that I consider internationalization issues (I18N-ISSUE-238 through I18N-ISSUE-247). Note that the working group has not reviewed these issues yet, so at this point they should be considered personal comments.
>
> All comments are on
> http://www.w3.org/TR/2012/WD-its20-20121206
>
> Regards,
> Norbert
>
>
> Global comments:
>
> - A number of headings are missing spaces between section number and title.

Thanks, will fix.

>
> - Numerous sentences appear to be missing articles. I've noted some of them below, but some careful copy-editing seems necessary.

Arle will do copy-editing after last call to handle this, see 
https://www.w3.org/International/multilingualweb/lt/track/actions/422

>
>
> Status of this Document
> J Acknowledgements (Non-Normative)
>
> The name "MultilingualWeb-LT" Working Group should be spelled out. What does "LT" stand for?

We want to keep the ambiguity, since it is on purpose: LT could stand
for "localization technology", "language technology", "language-related
technology", ...

>
>
> 1 Introduction
>
> - "NIF" should be spelled out on first use - I didn't know what it is.

Sure, will do that.

>
>
> 1.2 Motivation for ITS
>
> - Links for DocBook and DITA would be useful.

I will add these.

>
>
> 1.3.1.6Text Analytics
>
> - "These types of users": Do you mean "this type of service"?

OK, we will say "this type of service".

>
>
> 1.3.1.7Localization Workflow Managers
>
> - concerend -> concerned

will fix.

>
> - "bitext format": what's that? (Yes, I found out. Still, I first thought this was a typo...)

will change this to "bitext (aligned source and target content for the 
purposes of translation) like XLIFF"

>
>
> 1.4.1 Support for legacy HTML content
>
> - "migrate their content to HTML" -> "migrate their content to HTML5"

will fix.

>
> - "in older versions of HTML ... its-* attributes will be marked as invalid in validators": The W3C validator also reject its-* attributes in HTML5

This is just a question of time; ITS 2.0 validation will be integrated
into the W3C validator. It is already available at http://validator.nu/ .

> , and I don't see anything in the HTML5 spec that would allow such attributes in conforming HTML5 documents. (The spec allows for conforming HTML5+XXX documents, but there's no way to tell the validator that you'd like to apply HTML5+ITS rules and what they are).

See above.
>
>
> 2.1 Selection
>
> - This section needs to clearly specify what it means by "node". This term is not defined in the XML specification, and it's defined differently in XPATH (which includes attributes in its definition) and HTML5/DOM (which exclude attributes). I guess you want to follow the XPATH definition.

Correct. We will add a reference to the XPath 1.0 spec to make that clear.
We will also change "primarily element and attribute nodes" to "i.e.,
element and attribute nodes".

>
> - "CSS and other query languages": I guess you mean Selectors (formerly CSS selectors)?

Yes, will fix.

>
> - "supported by application" -> "supported by the application"

will fix.

>
> - "http://docbook.org/ns /docbook" -> "http://docbook.org/ns/docbook"

will fix.
>
>
> 3.7 The Term HTML
>
> - This section requires a normative reference to the HTML5 specification; a non-normative one is inadequate.

will do.

>
>
> 4 Conformance
>
> - servers -> serves

will fix.
>
>
> 4.4 Conformance Class for HTML5+ITS documents
>
> - This section should refer to HTML5 section 2.2.3 Extensibility.

We are referring to the HTML5 spec; we don't think we need to refer to
specific sections. This is also safer in terms of the stability of
identifiers in the HTML5 spec.

>
> - This section should note that conforming HTML5+ITS documents in HTML syntax that include ITS markup are not conforming HTML5 documents.

I don't think we want to state this; it would scare users of ITS. We allow
extension attributes and we supply a definition of HTML5+ITS
conforming documents.
>
>
> 5.3.3 CSS Selectors
>
> - Selectors are now known as just Selectors, even though they originated in CSS.

Will fix.

>
> - This makes the identifier "css" a bit unfortunate.

We want to keep the "css" identifier, since CSS is the main use case for
selectors.

>
> - This section requires a normative reference to the Selectors Level 3 specification, but there is none in Appendix A.

Will add that.

>
>
> 5.3.4 Additional query languages
>
> - The "MAY" after "Future versions of this specification" is probably not the MAY of RFC 2119.

Will make this a lower-case "may".

>
>
> 5.7 Conversion to NIF
>
> - This section requires a normative reference to the NIF specification, but there is none in Appendix A.

We will change that, see also our issue
https://www.w3.org/International/multilingualweb/lt/track/issues/73

>
>
> 6 Using ITS Markup in HTML
>
> - This section should clarify that by "HTML" it really means "HTML5 (or successor) in HTML syntax". It's not HTML 4, because that doesn't have a translate attribute. It's also not HTML5 in XHTML syntax, because that is case sensitive and has real namespaces.

We will add a generic section at the start saying that "HTML" always 
refers to "HTML5 or its successor". See also

http://www.w3.org/TR/2012/WD-its20-20121206/#usage-in-legacy-html
Here we will change
"Users are encouraged to migrate their content to HTML or XHTML. " ->
"Users are encouraged to migrate their content to HTML5 or XHTML5. "

>
> - This section requires a normative reference to the HTML5 specification, but there is none in Appendix A.

Will add that.
>
>
> 6.1 Mapping of Local Data Categories to HTML
>
> - Is it really necessary to use case-insensitive matching for attribute values? A long discussion with the CSS group has convinced us that case-insensitive matching is generally a bad idea. The case of attribute names in HTML syntax is unfortunately decided...

We had a related comment from Henry S. asking for a change:
http://lists.w3.org/Archives/Public/public-multilingualweb-lt-comments/2013Jan/0083.html
We want to align the behaviour with the rest of HTML by following that
comment. The values in question always use ASCII characters only
(pre-defined constants like "yes", "no", ...).
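
As an illustration only, here is a minimal Python sketch of what matching against such ASCII-only constants could look like. The function names are hypothetical and not taken from the specification; the sketch assumes that only the letters A-Z are folded to lower case, so non-ASCII look-alikes never match.

```python
import string

# Translation table that lowercases only A-Z; every other character,
# including non-ASCII letters, is left untouched.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(value: str) -> str:
    """Lowercase the ASCII letters of value and nothing else."""
    return value.translate(_ASCII_LOWER)


def matches_constant(value: str, constant: str) -> bool:
    """Compare an attribute value to a pre-defined constant,
    ignoring case for ASCII letters only (hypothetical helper)."""
    return ascii_lower(value) == ascii_lower(constant)


if __name__ == "__main__":
    print(matches_constant("YES", "yes"))
    print(matches_constant("\u212a", "k"))  # KELVIN SIGN is not ASCII "K"
```

Restricting the folding to ASCII avoids locale-dependent surprises that full Unicode case mapping can introduce.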

>
> - "Name of HTML attribute" -> "The name of the HTML attribute", "the name of attribute" -> "the name of the attribute"

will fix.
>
> - "will gets" -> "will get"

will fix.
>
>
> 6.3 Standoff Markup in HTML
>
> - The forward references to unexplained but complex sounding concepts are rather unfortunate.

Understood, but it is hard to have a spec that you can read linearly. If
you have a suggestion for how to solve this, let us know :)

>
>
> 7 Using ITS Markup in XHTML
>
> - I assume this section is also meant to cover HTML5 documents in XHTML syntax. If so, this should be called out.

Correct and we will make explicit that this covers XHTML5 too.

>
>
> 8.7 Language Information
> E References (Non-Normative)
>
> - The 5th edition of the XML specification refers to BCP 47, so there's no need to discuss RFC 3066 anymore.

thanks, will update.

>
>
> 8.9 Domain
>
> - The substeps of step 3-1-2 and step 3-2 are the same.

It is lengthy but clearer in our opinion.

>   The algorithm would be simpler if these substeps only occured once. Step 3-1-2 iterates over a string list of length > 1, step 3-2 over a string list of length = 1 - that should be easy to merge.
>
> - Step 3-1-2-5 refers to a mapping without saying where it comes from and how it's defined. This should be clarified based on the later description of domainMapping.

We would change " Check if there is a mapping for the string" -> " Check
the domainMapping attribute if there is a mapping for the string"

>
> - Step 3-1-2-5 says "the mapping is case-insensitive". This should say that the string being processed is matched against the left part of the pair in a case-insensitive manner (but see also I18N-ISSUE-242).

We addressed that, see mail from Yves Savourel on the topic.
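
As an illustration only, the lookup discussed here could be sketched in Python as follows. This is not the algorithm text of the draft. It assumes that the domainMapping value is a comma-separated list of pairs, that the left and right part of each pair are separated by whitespace, and that parts containing spaces are quoted. The function names and sample values are hypothetical.

```python
import shlex
from typing import Dict


def parse_domain_mapping(domain_mapping: str) -> Dict[str, str]:
    """Parse an assumed 'left right, left right' mapping string.

    The left part is stored lowercased so that later lookups can be
    case-insensitive; the right part is kept unchanged.
    """
    mapping: Dict[str, str] = {}
    for pair in domain_mapping.split(","):
        parts = shlex.split(pair)  # honours '...' and "..." quoting
        if len(parts) != 2:
            continue  # skip empty or malformed pairs in this sketch
        left, right = parts
        mapping[left.lower()] = right
    return mapping


def map_domain(value: str, mapping: Dict[str, str]) -> str:
    """Return the mapped value, or the trimmed input if no pair matches."""
    trimmed = value.strip()
    return mapping.get(trimmed.lower(), trimmed)


if __name__ == "__main__":
    rules = parse_domain_mapping("automotive auto, 'criminal law' law")
    print(map_domain("Automotive", rules))
    print(map_domain("finance", rules))
```

Only the string being processed and the left part of each pair are compared without regard to case; the right part is returned exactly as written.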

>
> - Steps 4 and 5 refer to "the resulting string". This should be "the resulting string list".

It is only one string that contains several values, so we intend to keep 
this as is.
>
> - The recommendation to use <meta name="keywords"> for HTML seems misguided. Typically the keywords don't contain domain information (e.g., automotive), but are stuffed with words that the authors hope search engines will match against user input (e.g., Toyota Camry, VW Passat, Honda Accord).

Agree, we will say "a possible way" instead of "preferred way".

>
>
> 8.10 Disambiguation
>
> - "what WordNet services do", "such as DBpedia": What are WordNet and DBpedia? (Yes, I found out, but informative references would help, if these services really need to be mentioned.)

We will add references.

>
> - "serialize in RDFa Lite or Microdata": need informative references.

We will add references.

>
>
> 8.11 Locale Filter
>
> - "can include the wildcard extended language range '*'": this is part of the definition of extended language ranges in BCP 47 and doesn't need to be stated here.

OK, will fix.

>
> - "included in any local" -> "included in any locale"

That will go away in a different edit.

>
>
> 8.15 Id Value
>
> - Different parts of this section seem contradictory: First, an id value is supposed to be a "unique identifier for a given part of the content", but then there's a selector that "selects the nodes to which this rule applies", with "nodes" in plural. If the selector selects multiple nodes, then the identifier isn't unique. A name that can be used to select multiple nodes is called a "class" in HTML. So, should this section be about classes, or should the selector be required to select a single node?


I think we addressed this comment in this mail exchange:
http://lists.w3.org/Archives/Public/public-multilingualweb-lt-comments/2013Jan/0158.html


>
> - "xml:id (which is defined by XML)": I can't find this in the XML specification. Can you provide a reference?
We will provide a reference.
>
>
> 8.16 Preserve Space
>
> - "not applicable to HTML documents": Not quite correct - it is applicable to HTML documents in XHTML syntax. On the other hand, the non-applicability should be mentioned in normative text, in the Definition section: "The Preserve Space data category does not apply to HTML documents in HTML syntax."

OK, will make both changes.

>
>
> 8.21 Storage Size
>
> - "character set encoding" -> "character encoding" (multiple times)
will fix.

>
> - Example 94: It would be worth pointing out that CONTINUE doesn't fit.

Will do.
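
As an illustration only, a storage size check could be sketched in Python as below. It assumes that the limit counts bytes of the text in the given character encoding. The function name, the limit of 8 and the encodings are hypothetical and are not taken from Example 94.

```python
def fits_storage(text: str, storage_size: int, encoding: str = "utf-8") -> bool:
    """Return True if text, encoded with encoding, needs at most
    storage_size bytes (hypothetical helper)."""
    return len(text.encode(encoding)) <= storage_size


if __name__ == "__main__":
    # The same string can fit in one encoding and not in another.
    print(fits_storage("CONTINUE", 8, "utf-8"))      # 8 bytes
    print(fits_storage("CONTINUE", 8, "utf-16-le"))  # 16 bytes
```

The check works on encoded bytes rather than on character counts, which is why the character encoding has to be known alongside the size limit.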

>
>
> A References
>
> - The XML 1.0 reference should be to the 5th edition.

Will add.

>
>
> Appendix B:
>
> - File extensions are commonly specified with leading period, i.e., ".its".

Will add.

>
> - .its is used for some other file types - I don't know whether that's likely to cause problems:
> http://www.fileinfo.com/extension/its

Understood, but we think that this will not cause problems.

>
>
> C Values for the Localization Quality Issue Type
>
> - locale-violation: Both YYYY-MM-DD and DD.MM.YYYY are valid date formats in Germany according to DIN 5008.

Thanks, we will replace it with a Japanese example.
Received on Thursday, 24 January 2013 09:08:24 UTC