- From: Phillips, Addison <addison@amazon.com>
- Date: Tue, 24 Mar 2009 12:07:35 -0700
- To: Alan Ruttenberg <alanruttenberg@gmail.com>, Sandro Hawke <sandro@w3.org>
- CC: "public-rdf-text@w3.org" <public-rdf-text@w3.org>, "team-rif-chairs@w3.org" <team-rif-chairs@w3.org>, "team-owl-chairs@w3.org" <team-owl-chairs@w3.org>
> Here is my take on the editor notes:
>
> Issue 1, re: an infinity of characters in Unicode, seems wrong
> according to the documentation of Unicode "All three encoding forms
> need at most 4 bytes (or 32-bits) of data for each character", but
> arguments for defining it that way are pragmatic. It would seem that
> this needs to be a technical decision about this, probably by vote
> if there is not consensus at this point.

The largest Unicode code point is 0x10FFFF. Period. There is not an infinity of Unicode code points. A better solution would just be to drop this sentence:

--
The set of available characters is assumed to be infinite, and it is thus independent of the current version of UCS and Unicode.
--

The set of characters is independent of the version of Unicode provided that the full range is supported.

> Issue 2 asks for an example of pattern and langpattern.
>
> An example of pattern would be "(in)|(out)", which matches the
> character sequences "in" and "out" and nothing else. It is unclear
> to me whether the literal should be written as a plain literal or not,
> but I am guessing so.
>
> An example of a langpattern is "(en)|(en-.+)" - one could get more
> precise by following http://www.rfc-editor.org/rfc/rfc4647.txt but
> I'm not sure it's worth it.

I think it's important to follow RFC 4647. A multiplicity of formats makes it more difficult to work with languages and the most likely useful source of 'langpattern' will be RFC 4647-style language priority lists.

Also: following the pattern shown would NOT be compliant with BCP 47 language tag matching. (en-.+) matches many invalid tags, for example.

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG
Editor -- IETF LTRU WG (BCP 47)

Internationalization is not a feature.
It is an architecture.
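
A minimal Python sketch of the two points above, offered as an illustration under stated assumptions rather than as text for the specification. It shows that the code space ends at 0x10FFFF and compares the quoted pattern "(en)|(en-.+)" with RFC 4647 basic filtering. The names `is_code_point`, `LOOSE_TAG`, and `basic_filter` are hypothetical, and `LOOSE_TAG` is a deliberately simplified stand-in for the full BCP 47 tag grammar (it ignores private-use and grandfathered tags, among other things).

```python
import re
import sys

# The largest Unicode code point; the code space is finite.
MAX_CODE_POINT = 0x10FFFF


def is_code_point(value: int) -> bool:
    """Return True if value lies inside the Unicode code space."""
    return 0 <= value <= MAX_CODE_POINT


# The langpattern quoted above, applied as a whole-string match.
NAIVE_LANGPATTERN = re.compile(r"(en)|(en-.+)")

# Assumption: a simplified well-formedness check, NOT the full BCP 47
# grammar. It only requires hyphen-separated subtags of 1 to 8
# alphanumerics, with an alphabetic first subtag of 2 to 8 letters.
LOOSE_TAG = re.compile(r"[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*")


def basic_filter(language_range: str, tag: str) -> bool:
    """Sketch of RFC 4647 basic filtering (section 3.3.1).

    A range matches a tag if they are equal, or if the range is a
    prefix of the tag and the next character in the tag is "-".
    Comparison is case-insensitive; "*" matches any tag. Tags that
    fail the simplified LOOSE_TAG check are rejected up front.
    """
    if LOOSE_TAG.fullmatch(tag) is None:
        return False
    if language_range == "*":
        return True
    rng, candidate = language_range.lower(), tag.lower()
    return candidate == rng or candidate.startswith(rng + "-")


if __name__ == "__main__":
    print("max code point:", hex(sys.maxunicode))
    for tag in ["en", "en-US", "EN-gb", "en-!!!", "en--", "eng"]:
        naive = NAIVE_LANGPATTERN.fullmatch(tag) is not None
        print(f"{tag!r}: naive={naive} basic_filter={basic_filter('en', tag)}")
```

The quoted pattern accepts strings such as "en-!!!" and "en--", which are not language tags at all, and it misses "EN-gb" because it is case-sensitive, whereas language tag comparison is case-insensitive.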
Received on Tuesday, 24 March 2009 19:08:15 UTC