- From: Andrea Rendine <master.skywalker.88@gmail.com>
- Date: Fri, 13 Mar 2015 23:18:57 +0100
- To: public-html-comments@w3.org
- Message-ID: <CAGxST9mQGKOyQbXoxUCsFUD1B4kiWFasWyc1ZD_8o8kZvZ1RmQ@mail.gmail.com>
Gannon, please, before a Librarians' Flash Mob crashes through my door and whatever is behind it (!), tell me in layman's terms what it is about ISO 639-1 and -2 that would rule out what I have in mind.

My proposal referenced BCP (Best Current Practice) 47, "Tags for Identifying Languages", as specified at https://tools.ietf.org/html/bcp47 and referenced by all modern HTML specifications:

*"The lang attribute (in no namespace) specifies the primary language for the element's contents and for any of the element's attributes that contain text. Its value must be a valid BCP 47 language tag, or the empty string. Setting the attribute to the empty string indicates that the primary language is unknown. [BCP47]"*

Now, this is what BCP 47 says about private use subtags:

"Private use subtags are used to indicate distinctions in language that are important in a given context by private agreement. The following rules apply to private use subtags:

1. Private use subtags are separated from the other subtags defined in this document by the reserved single-character subtag 'x'.
2. Private use subtags MUST conform to the format and content constraints defined in the ABNF for all subtags; that is, they MUST consist solely of letters and digits and not exceed eight characters in length.
3. Private use subtags MUST follow all primary language, extended language, script, region, variant, and extension subtags in the tag. Another way of saying this is that all subtags following the singleton 'x' MUST be considered private use. Example: The subtag 'US' in the tag "en-x-US" is a private use subtag.
4. A tag MAY consist entirely of private use subtags.
5. No source is defined for private use subtags. Use of private use subtags is by private agreement only.
6. Private use subtags are NOT RECOMMENDED where alternatives exist or for general interchange. See Section 4.6 for more information on private use subtag choice."

(BCP 47, page 17, what luck!)
The preceding section of BCP 47 covers extension subtags, which provide "a mechanism for extending language tags for use in various applications. They are intended to identify information that is commonly used in association with languages or language tags but that is not part of language identification." I am not considering registering an extension, because that would require a formal submission procedure that I have no authority to start, and also because extension subtags must follow a primary language subtag (which wouldn't be preferable in our case). Extensions are introduced by a singleton, which can be any single letter or digit EXCEPT x, because x identifies private use.

On pages 3-4 of the same spec, the rules for forming language tags say that a tag may consist of a private use sequence alone: the singleton "x" followed by 1*("-" (1*8alphanum)), that is, one or more subtags of 1 to 8 alphanumeric characters each. If I understand correctly, that means x-php and x-alpha1 are valid private use sequences and, since a language tag may consist of such a sequence alone, also valid language tags. x-c++ and x-javascript would NOT be valid: the first uses characters outside the alphanumeric range, and the second has a subtag longer than eight characters. Thus the need for an agreement.

Because x marks private use, a subtag after it that happens to match a registered subtag is not to be interpreted with its registered meaning. x-US is valid, but en-x-US has nothing to do with the United States (unless the underlying agreement says so, and even then it isn't the same as en-US).

So in short, in your opinion: would lang="x-perl" and lang="en-US-x-perl" be valid? If not, why?
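
To make the private use rule concrete, here is a minimal sketch in Python. It assumes only the privateuse production as I read it from the ABNF, "x" 1*("-" (1*8alphanum)); the names PRIVATE_USE, is_private_use_tag and split_private_use are hypothetical, and nothing in front of the x singleton is validated.

```python
import re

# privateuse = "x" 1*("-" (1*8alphanum)), as read from the BCP 47 ABNF.
# Language tags are case-insensitive, hence re.IGNORECASE.
PRIVATE_USE = re.compile(r"x(-[a-z0-9]{1,8})+", re.IGNORECASE)


def is_private_use_tag(tag: str) -> bool:
    """True if the whole tag is a well-formed private use sequence."""
    return PRIVATE_USE.fullmatch(tag) is not None


def split_private_use(tag: str) -> tuple[str, list[str]]:
    """Split a tag at the first 'x' singleton.

    Returns (prefix, private_subtags). The prefix is NOT validated here;
    checking it would need the rest of the grammar and the subtag registry.
    """
    subtags = tag.split("-")
    for i, subtag in enumerate(subtags):
        if subtag.lower() == "x":
            return "-".join(subtags[:i]), subtags[i + 1:]
    return tag, []


if __name__ == "__main__":
    for candidate in ["x-php", "x-alpha1", "x-c++", "x-javascript", "x-perl"]:
        print(candidate, is_private_use_tag(candidate))
    print(split_private_use("en-US-x-perl"))
    print(split_private_use("en-x-US"))
```

Under this sketch x-perl passes as a tag on its own, and en-US-x-perl splits into the prefix en-US plus the private use subtag perl. Whether that prefix is itself valid is outside what the sketch checks.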
Received on Friday, 13 March 2015 22:19:24 UTC