RE: comments on rdf:text draft

Hello,

All such keywords have been typeset in accordance with the W3C guidelines; see
the W3C manual of style:

http://www.w3.org/2001/06/manual/#RFC

Regards,

	Boris

> -----Original Message-----
> From: Axel Polleres [mailto:axel.polleres@deri.org]
> Sent: 08 April 2009 11:13
> To: Boris Motik
> Cc: 'Phillips, Addison'; public-rdf-text@w3.org; public-i18n-core@w3.org
> Subject: Re: comments on rdf:text draft
> 
> Thanks Boris, Addison for your continuous efforts!
> I think the document looks very good now.
> 
> One more thing: should we capitalize the words MUST, SHOULD, etc. to
> make their normative usage clear, or do we not want to imply that?
> 
> Axel
> 
> Boris Motik wrote:
> > Hello,
> >
> >> -----Original Message-----
> >> From: Phillips, Addison [mailto:addison@amazon.com]
> >> Sent: 07 April 2009 21:39
> >> To: Boris Motik; public-rdf-text@w3.org
> >> Cc: public-i18n-core@w3.org
> >> Subject: RE: comments on rdf:text draft
> >>
> >>> Thanks for this comment. I'm afraid, however, that in response to Sandro's
> >>> comments, I rewrote this part of the introduction earlier today. I've adopted
> >>> the "elevator pitch" that Sandro suggested. Please let me know should you
> >>> consider that the current intro needs further revision.
> >>>
> >> The new text is okay, although I think it might leave the average reader
> >> slightly mystified about what rdf:text is for. There is a lot of text about
> >> different literal flavors, but no mention of why the presence or absence of
> >> a language tag is interesting. And it concludes with this paragraph, which
> >> suggests some confusion about how to represent text in RDF:
> >>
> >> --
> >> RDF tools may use other mechanisms for representing internationalized text,
> >> such as the xml:lang feature of the rdf:XMLLiteral datatype. The rdf:text
> >> datatype does not provide a replacement for such mechanisms.
> >> --
> >>
> >> It seems to me that the introduction should say why these three classes of
> >> literals are related and why rdf:text might be interesting. I would at least
> >> include some sort of note about why language tags might be needed. Perhaps
> >> add a third bullet point:
> >>
> >> --
> >> * Literals often contain human-readable natural language text. RDF needs a
> >> mechanism for representing literals in various different languages, for
> >> selecting the proper literal in a specific language, and for allowing
> >> applications to keep language information with literals to facilitate
> >> processing that is affected by language.
> >> --
> >>
> >
> > I agree with your point. I've rewritten the first paragraph of the introduction
> > along the lines of what you suggested, and I hope that the introduction is now
> > clearer.
> >
> >> Minor notes: first bullet s/literals/literal/
> >> Also: "internationalized text" is a misnomer. Perhaps "text in different
> >> languages"??
> >>
> >
> > I've changed this.
> >
> >>>> 3. The intro to section 2 is still not quite right. Instead of the first
> >>>> paragraph, I think it suffices to say:
> >>>>
> >>>> --
> >>>> A 'character' is an atomic unit of text, as defined in [Unicode] and/or
> >>>> [ISO/IEC 10646] and corresponding to the 'Char' production from [XML].
> >>>> --
> >>>>
> >>> This formulation was taken from XML Schema. Nevertheless, your suggestion is
> >>> an improvement, modulo the fact that, if a character must match the 'Char'
> >>> production, it is not defined as in [Unicode]. Therefore, I've rewritten the
> >>> first two sentences like this:
> >> I'm not sure what you mean by this. Unicode defines a range of code points and
> >> 'Char' mirrors it. The definition of 'Char' actually says "Unicode code
> >> points" :-).
> >>
> >
> > But the 'Char' production seems to actually exclude some Unicode code points.
> > Here is how I interpreted all the specs:
> >
> > - Any integer between 0 and 0x10FFFF is a Unicode code point.
> > - Not every such integer, however, matches the 'Char' production from XML.
> >
> > Because of that, it seems to me that the two definitions (i.e., the one in
> > Unicode and the one in XML) are *not* equivalent; in fact, the latter is a
> > proper subset of the former.
> >
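To illustrate the subset relationship described above, here is a minimal Python sketch. The function names are illustrative only, and the ranges are assumed to follow the 'Char' production of XML 1.0 (XML 1.1 defines 'Char' differently).

```python
def is_unicode_code_point(n: int) -> bool:
    """Any integer from 0 to 0x10FFFF (inclusive) is a Unicode code point."""
    return 0 <= n <= 0x10FFFF


def matches_xml_char(n: int) -> bool:
    """Check n against the XML 1.0 'Char' production:
    #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    """
    return (
        n in (0x9, 0xA, 0xD)
        or 0x20 <= n <= 0xD7FF
        or 0xE000 <= n <= 0xFFFD
        or 0x10000 <= n <= 0x10FFFF
    )


# Code points that are valid in Unicode but rejected by 'Char':
# most C0 controls, the surrogate range, and U+FFFE / U+FFFF.
excluded = [
    n for n in range(0x110000)
    if is_unicode_code_point(n) and not matches_xml_char(n)
]
```

Under these assumptions, `excluded` is non-empty (it contains, for example, 0x0 and 0xD800), while every integer accepted by `matches_xml_char` is also a code point, which is what "proper subset" means here.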
> > [snip]
> >
> >>>> 4. The sentence "Code points are written as U+ followed by the hexadecimal
> >>>> value of the code point" is not quite right. You might moderate this by saying
> >>>> "are represented by U+ (etc.) in this document", although you barely use the
> >>>> U+ syntax in the document. Note that the sentence is also incomplete: the
> >>>> usual minimum length of a U+ hex sequence is four hex digits (U+00E9).
> >>> I've rephrased the sentence like this:
> >>>
> >>> Code points are represented in this document as U+ followed by a
> >>> four-digit hexadecimal value of the code point.
> >> That sounds good, although I'd even tend to say "are sometimes represented",
> >> since there are plenty of code points that are represented as ASCII characters
> >> :-).
> >>
> >
> > I've added "sometimes".
> >
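The U+ notation discussed above can be sketched in a few lines of Python. The helper name `format_code_point` is hypothetical, and the sketch assumes the usual convention mentioned earlier in the thread: a minimum of four hexadecimal digits, with more digits used for code points beyond U+FFFF.

```python
def format_code_point(cp: int) -> str:
    """Render a code point in U+ notation, zero-padded to at least four hex digits."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {cp!r}")
    # 04X pads to four digits but does not truncate longer values.
    return f"U+{cp:04X}"


print(format_code_point(0xE9))      # U+00E9
print(format_code_point(0x1F600))   # U+1F600
```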
> > [snip]
> >
> >>> Nevertheless, I'm now unsure whether foo-bar is a valid language tag. It does
> >>> seem to match the production from BCP 47, so I'd say it is. Your explanation,
> >>> however, suggests that the "en" part must be registered; is this really the
> >>> case? In any case, I strongly believe that the definitions *must not* depend
> >>> on any kind of registry, as this would make the consequences of an OWL 2
> >>> ontology possibly vary over time.
> >> This is why I referred specifically to the conformance requirements in BCP 47,
> >> which defines two separate terms:
> >>
> >> - "well-formed" means matching the ABNF/grammar but not necessarily checking
> >> to see if the subtags are registered. This is the sort of conformance you
> >> have.
> >> - "valid" means "well-formed" plus checking that the subtags are each properly
> >> registered (and a few other very minor checks on stuff like extensions). This
> >> is not the sort of conformance you require, although you allow it.
> >>
> >
> > OK, thanks. But would it then be possible to have an example with "xy-fubar"?
> > Currently, we are using "en-fubar", which seems to suggest that the first part
> > ("en") must be somehow valid. (That is, I don't see a point in the 'langtag'
> > production of BCP 47 which would force the first part to be "en", "de", or
> > anything registered.) I'd like to use an example where nothing is "valid", just
> > to drive the message home that no registry needs to be taken into account.
> > Hence, if you agree, I'd change the example to "xy-fubar".
> >
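The difference between "well-formed" and "valid" tags discussed above can be sketched in Python as follows. This is a rough approximation under stated assumptions, not a conforming BCP 47 implementation: the regular expression is a simplified rendering of the 'langtag' grammar (it omits grandfathered and private-use-only tags and does not reject duplicate variants or extension singletons), and `TOY_REGISTRY` is a hypothetical stand-in for the subtag registry, not real registry data. The "valid" check is likewise reduced to a plain lookup of every subtag.

```python
import re

# Simplified approximation of the BCP 47 'langtag' grammar (assumption).
LANGTAG = re.compile(
    r"(?:[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4,8})"  # language, optional extlang
    r"(?:-[a-z]{4})?"                                # script
    r"(?:-(?:[a-z]{2}|[0-9]{3}))?"                   # region
    r"(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*"      # variants
    r"(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*"          # extensions
    r"(?:-x(?:-[a-z0-9]{1,8})+)?",                   # private use
    re.IGNORECASE,
)

# Hypothetical toy registry, for illustration only.
TOY_REGISTRY = {"en", "de", "us"}


def is_well_formed(tag: str) -> bool:
    """Purely syntactic check: no registry lookup involved."""
    return LANGTAG.fullmatch(tag) is not None


def is_valid(tag: str, registry=TOY_REGISTRY) -> bool:
    """Well-formed, and every subtag is found in the given registry."""
    return is_well_formed(tag) and all(
        subtag.lower() in registry for subtag in tag.split("-")
    )


for tag in ("en-US", "en-fubar", "xy-fubar", "foo-bar"):
    print(tag, is_well_formed(tag), is_valid(tag))
```

With this sketch, "en-fubar", "xy-fubar", and "foo-bar" all pass `is_well_formed`, and none of them passes `is_valid` against the toy registry. Only `is_valid` can change its answer when the registry changes, which is why a definition based on well-formedness alone stays stable over time.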
> >> Hope this helps.
> >>
> >
> > It does indeed: thanks a lot for your expert help!
> >
> > Regards,
> >
> > 	Boris
> >
> >> Addison
> >>
> >> Addison Phillips
> >> Globalization Architect -- Lab126
> >>
> >> Internationalization is not a feature.
> >> It is an architecture.
> >>
> >>
> >
> >
> >
> 
> 
> --
> Dr. Axel Polleres
> Digital Enterprise Research Institute, National University of Ireland,
> Galway
> email: axel.polleres@deri.org  url: http://www.polleres.net/

Received on Wednesday, 8 April 2009 10:34:14 UTC