- From: Boris Motik <boris.motik@comlab.ox.ac.uk>
- Date: Wed, 8 Apr 2009 11:32:57 +0100
- To: "'Axel Polleres'" <axel.polleres@deri.org>
- Cc: <public-rdf-text@w3.org>
Hello,

All such keywords have been typeset in accordance with the W3C guidelines; see
the W3C manual of style:

http://www.w3.org/2001/06/manual/#RFC

Regards,

Boris

> -----Original Message-----
> From: Axel Polleres [mailto:axel.polleres@deri.org]
> Sent: 08 April 2009 11:13
> To: Boris Motik
> Cc: 'Phillips, Addison'; public-rdf-text@w3.org; public-i18n-core@w3.org
> Subject: Re: comments on rdf:text draft
>
> Thanks Boris, Addison for your continuous efforts!
> I think the document looks very good now.
>
> one more thing... should we capitalize the use of words MUST, SHOULD,
> etc. to make the normative usage of these words clear or do we not want
> to imply that?
>
> Axel
>
> Boris Motik wrote:
> > Hello,
> >
> >> -----Original Message-----
> >> From: Phillips, Addison [mailto:addison@amazon.com]
> >> Sent: 07 April 2009 21:39
> >> To: Boris Motik; public-rdf-text@w3.org
> >> Cc: public-i18n-core@w3.org
> >> Subject: RE: comments on rdf:text draft
> >>
> >>> Thanks for this comment. I'm afraid, however, that in response to
> >>> Sandro's comments, I have rewritten earlier today this part of the
> >>> introduction. I've adopted the "elevator pitch" that Sandro suggested.
> >>> Please let me know should you consider that the current intro needs
> >>> further revision.
> >>>
> >> The new text is okay, although I think it might leave the average reader
> >> slightly mystified about what rdf:text is for. There is a lot of text about
> >> different literal flavors, but no mention about why the presence or absence
> >> of a language tag is interesting. And it concludes with this paragraph,
> >> which suggests some confusion about how to represent text in RDF:
> >>
> >> --
> >> RDF tools may use other mechanisms for representing internationalized text,
> >> such as the xml:lang feature of the rdf:XMLLiteral datatype. The rdf:text
> >> datatype does not provide a replacement for such mechanisms.
> >> --
> >>
> >> It seems to me that the introduction should say why these three classes of
> >> literals are related and why rdf:text might be interesting. I would at
> >> least include some sort of notation about why language tags might be
> >> needed. Perhaps add a third bullet point:
> >>
> >> --
> >> * Literals often contain human-readable natural language text. RDF needs a
> >> mechanism for representing literals in various different languages, for
> >> selecting the proper literal in a specific language, and to allow
> >> applications to keep language information with literals to facilitate
> >> processing that is language affected.
> >> --
> >>
> >
> > I agree with your point. I've rewritten the first paragraph of the
> > introduction along the lines of what you suggested, and I hope that the
> > introduction is now clearer.
> >
> >> Minor notes: first bullet s/literals/literal/
> >> Also: "internationalized text" is a misnomer. Perhaps "text in different
> >> languages"??
> >>
> >
> > I've changed this.
> >
> >>>> 3. The intro to section 2 is still not quite right. Instead of the first
> >>>> paragraph, I think it suffices to say:
> >>>>
> >>>> --
> >>>> A 'character' is an atomic unit of text, as defined in [Unicode] and/or
> >>>> [ISO/IEC 10646] and corresponding to the 'Char' production from [XML].
> >>>> --
> >>>>
> >>> This formulation was taken from XML Schema. Nevertheless, your suggestion
> >>> is an improvement, modulo the fact that, if a character must match the
> >>> 'Char' production, it is not defined as in [Unicode]. Therefore, I've
> >>> rewritten the first two sentences like this:
> >> I'm not sure what you mean by this. Unicode defines a range of code points
> >> and 'Char' mirrors it. The definition of 'Char' actually says "Unicode code
> >> points" :-).
> >>
> >
> > But the 'Char' production seems to actually exclude some Unicode code
> > points. Here is how I interpreted all the specs:
> >
> > - Any integer between 0 and 0x10FFFF is a Unicode code point.
> > - Not every such integer, however, matches the 'Char' production from XML.
> >
> > Because of that, it seems to me that the two definitions (i.e., the one in
> > Unicode and the one in XML) are *not* equivalent; in fact, the latter is a
> > proper subset of the former.
> >
> > [snip]
> >
> >>>> 4. The sentence "Code points are written as U+ followed by the
> >>>> hexadecimal value of the code point" is not quite right. You might
> >>>> moderate this by saying "are represented by U+ (etc.) in this document".
> >>>> Although you barely use the U+ syntax in the document. Note that the
> >>>> sentence is also incomplete: the usual minimum length of a U+ hex
> >>>> sequence is four hex digits (U+00E9).
> >>> I've rephrased the sentence like this:
> >>>
> >>> Code points are represented in this document as U+ followed by a
> >>> four-digit hexadecimal value of the code point.
> >> That sounds good, although I'd even tend to say "are sometimes
> >> represented", since there are plenty of code points that are represented as
> >> ASCII characters :-).
> >>
> >
> > I've added "sometimes".
> >
> > [snip]
> >
> >>> Nevertheless, I don't understand now whether foo-bar is a valid language
> >>> tag. It does seem to match the production from BCP 47, so I'd say it is.
> >>> Your explanation, however, suggests that the "en" part must be registered;
> >>> is this really the case? In any case, I strongly believe that the
> >>> definitions *must not* depend on any kind of a registry, as this would
> >>> make the consequences of an OWL 2 ontology possibly vary over time.
> >> This is why I referred specifically to the conformance requirements in BCP
> >> 47, which defines two separate terms:
> >>
> >> - "well-formed" means matching the ABNF/grammar but not necessarily
> >> checking to see if the subtags are registered. This is the sort of
> >> conformance you have.
> >> - "valid" means "well-formed" plus checking that the subtags are each
> >> properly registered (and a few other very minor checks on stuff like
> >> extensions). This is not the sort of conformance you require, although you
> >> allow it.
> >>
> >
> > OK, thanks. But would it then be possible to have an example with
> > "xy-fubar"? Currently, we are using "en-fubar", which seems to suggest that
> > the first part ("en") must be somehow valid. (That is, I don't see a point
> > in the 'langtag' production of BCP 47 which would force the first part to be
> > "en", "de", or anything registered.) I'd like to use an example where
> > nothing is "valid", just to drive the message home that no registry must be
> > taken into account. Hence, if you agree, I'd change the example to
> > "xy-fubar".
> >
> >> Hope this helps.
> >>
> >
> > It does indeed: thanks a lot for your expert help!
> >
> > Regards,
> >
> > Boris
> >
> >> Addison
> >>
> >> Addison Phillips
> >> Globalization Architect -- Lab126
> >>
> >> Internationalization is not a feature.
> >> It is an architecture.
> >>
> >>
> >
> >
>
>
> --
> Dr. Axel Polleres
> Digital Enterprise Research Institute, National University of Ireland,
> Galway
> email: axel.polleres@deri.org url: http://www.polleres.net/
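The point made in the quoted discussion, that every integer from 0 to 0x10FFFF
is a Unicode code point but only some of them match the XML 'Char' production,
can be illustrated with a short Python sketch. It assumes the 'Char' ranges as
given in XML 1.0; the names XML_CHAR_RANGES, is_unicode_code_point, is_xml_char
and u_plus are hypothetical and are not taken from the rdf:text draft. The
u_plus helper follows the "U+ followed by at least four hex digits" convention
mentioned in the thread.

```python
# Ranges of the 'Char' production, assuming the XML 1.0 definition:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
XML_CHAR_RANGES = [
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_unicode_code_point(n: int) -> bool:
    """True for any integer in the Unicode code space 0..0x10FFFF."""
    return 0 <= n <= 0x10FFFF


def is_xml_char(n: int) -> bool:
    """True if the integer falls in one of the assumed 'Char' ranges."""
    return any(lo <= n <= hi for lo, hi in XML_CHAR_RANGES)


def u_plus(n: int) -> str:
    """Format a code point as U+ followed by at least four hex digits."""
    return f"U+{n:04X}"


if __name__ == "__main__":
    # 0x0 and 0xD800 are code points that 'Char' excludes; 0xE9 is in both sets.
    for cp in (0x0, 0xE9, 0xD800, 0xFFFE, 0x10FFFF):
        print(u_plus(cp), is_unicode_code_point(cp), is_xml_char(cp))
```

Under these assumptions the set accepted by is_xml_char is a proper subset of
the set accepted by is_unicode_code_point, which is the relationship described
in the message.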
Received on Wednesday, 8 April 2009 10:34:14 UTC