RE: ISSUE-191 reference filtering in RFC 4647 (was Re: Request for feedback on SKOS Last Call Working Draft)

Hello Alistair,

Except as noted, this response is on behalf of the I18N Core WG (see [1]).

This change looks good and will meet the Internationalization working group's concerns. Thank you for your help on this. We also support Richard Ishida's comments to you and I will reply to your email on that topic under separate cover.

=== personal observations ===

The proposed text works pretty well. I would suggest only a couple of minor changes.

""" It is suggested that applications match requests for labels in a
given language to labels with related language tags that are provided
by a SKOS concept scheme,

AP> I would suggest not using the term "related" here, since BCP 47 points out that tags that share a prefix may not actually be mutually intelligible. Perhaps:

...(of which there could be many), and are compatible with SKOS concept
schemes that provide only those labels whose lexical forms are
distinct for a given language or collection of languages.  """

AP> This last bit might be a little too "tight around the collar"? It suggests that only a few SKOS schemes will limit their array of tags, whereas, in truth, most will provide limited coverage, even within a language family. Furthermore, BCP 47 has long recommended using the simplest language tag possible. Perhaps:

"... (of which there could be many), since, in keeping with best practices, most SKOS concept schemes will provide the simplest language tag for a given label and only supply those additional labels whose lexical forms are distinct for a given language variation."
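
As an illustration only, the fallback behaviour could be sketched roughly as below. This is a simplified reading of the RFC 4647 "lookup" scheme in Python, not proposed text for the Reference. The `lookup` function and the `labels` dictionary are hypothetical names made up for the example, and wildcard ranges and priority lists are deliberately not handled.

```python
def lookup(language_range, labels):
    """Simplified RFC 4647 lookup: truncate the range until a tag matches.

    `labels` maps language tags to label text (a hypothetical structure,
    not anything defined by SKOS). Returns None when nothing matches.
    """
    available = {tag.lower(): text for tag, text in labels.items()}
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in available:
            return available[candidate]
        subtags.pop()
        # A single-letter or single-digit subtag (such as "x") is removed
        # together with the subtag that followed it.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return None


# Only the distinct spelling gets its own, longer tag.
labels = {"en": "color", "en-GB": "colour"}

print(lookup("en-GB", labels))  # exact match on "en-GB"
print(lookup("en-AU", labels))  # falls back to the "en" label
print(lookup("fr-CA", labels))  # nothing to fall back to
```

With this kind of matching, a scheme that supplies only "en" and "en-GB" still answers a request for "en-AU" or "en-NZ", which is why the additional labels are needed only where the lexical form actually differs.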

=== end personal observations ===

My foregoing observations are certainly optional. I18N looks forward to seeing your WG announce advancement of SKOS Reference to CR.

Best Regards,

Addison (for I18N)

[1] http://www.w3.org/2009/03/11-core-minutes.html#item06


Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

> -----Original Message-----
> From: Alistair Miles [mailto:alistair.miles@zoo.ox.ac.uk]
> Sent: Saturday, March 14, 2009 11:44 AM
> To: Phillips, Addison
> Cc: Antoine Isaac; Ralph R. Swick; Richard Ishida; public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
> Subject: Re: ISSUE-191 reference filtering in RFC 4647 (was Re: Request for feedback on SKOS Last Call Working Draft)
>
> Hi Addison,
>
> I have published a new revision of the SKOS Reference Editors'
> draft:
>
> http://www.w3.org/2006/07/SWD/SKOS/reference/20081001/ (revision
> 1.86)
>
> In this draft in section 5.6.5 (Labeling and Language Tags) I have
> added the following paragraph:
>
> """ It is suggested that applications match requests for labels in a
> given language to labels with related language tags that are provided
> by a SKOS concept scheme, e.g. by implementing the "lookup" algorithm
> defined by [BCP 47]. Applications that perform matching in this way do
> not require labels to be provided in all possible language variations
> (of which there could be many), and are compatible with SKOS concept
> schemes that provide only those labels whose lexical forms are
> distinct for a given language or collection of languages.  """
>
> I basically followed your suggested prose, with a few tweaks to fit
> with the language of the document.
>
> I'm going to offer this revision to the working group for progression
> to candidate recommendation, if you have any further comments or
> suggestions then please let me know as soon as possible.
>
> Note that there are also some other editorial changes in section 5 in
> response to suggestions by Richard, which I will describe in a
> separate email.
>
> Kind regards,
>
> Alistair
>
>
>
> On Tue, Mar 10, 2009 at 12:51:54PM +0000, Alistair Miles wrote:
> > Hi Addison,
> >
> > I've raised an issue in the SWDWG tracker to ensure that this
> > discussion is recorded in the WG's audit trail:
> >
> > http://www.w3.org/2006/07/SWD/track/issues/191
> >
> > With respect to the SKOS Reference, would you be minimally satisfied
> > if section 5.6.5 [1] were to include the following sentence:
> >
> > """ It is suggested that applications match requests for a label in a
> > given language to related language tags that exist in the SKOS
> > document, e.g. by implementing the "lookup" algorithm from [BCP
> > 47]. This practice is compatible with SKOS concept schemes that
> > provide only those labels whose lexical forms are distinct for a given
> > language or collection of languages. """
> >
> > My initial reaction was to view this as an aspect of best practice
> > that is out of scope for the SKOS Reference and would be better dealt
> > with in a separate document, but I don't have a strong feeling about
> > this and am happy to include a note in the SKOS Reference.
> >
> > Kind regards,
> >
> > Alistair
> >
> > [1] http://www.w3.org/2006/07/SWD/SKOS/reference/20081001/#L1629
> >
> > On Mon, Mar 09, 2009 at 11:51:42AM -0700, Phillips, Addison wrote:
> > > Hi, (personal comment follows)
> > >
> > > I don't agree that SKOS should ignore this issue in its documents. My concern is that the text and examples in SKOS may go too far by concentrating on the fact that different language tags are separate. I don't think that SKOS has to promote a particular matching scheme or implementation of language tags, but it needs to balance separation of tags for RDF purposes against an acknowledgement of how language tags are typically expected/supposed to work. The fact that this thread is tied up in knots on the issue should be an indicator that users of the Reference and Primer might need a hint of how to proceed.
> > >
> > > I think, in fact, that this text in the Primer is misleading:
> > >
> > > --
> > > Note that the notion of preferred label implies that a resource can only have one such label per language, as it is mentioned in Section 5 of the SKOS Reference [SKOS-REFERENCE].
> > >
> > > Following common practice in KOS design, the preferred label of a concept may be also used to unambiguously represent this concept within one KOS and its applications. Although SKOS semantics do not formally enforce it, it is therefore recommended that no two concepts in the same KOS be given the same preferred lexical label in any two given languages.
> > > --
> > >
> > > No mention is made of the overlapping nature of tags. This suggests that you would only label the "differences" in a SKOS document between two related languages:
> > >
> > >    skos:prefLabel "red"@en
> > >    ...
> > >    skos:prefLabel "green"@en
> > >    ...
> > >    skos:prefLabel "color"@en <!-- cultural bias here -->
> > >    skos:prefLabel "colour"@en-GB
> > >
> > > Again, this suggests a resource tree rather than a dictionary. Also: your recommendation will be problematic when there are cross-language homonyms. For example, both English and French have the word "chat" (but it means something different in each); while the word "machine" exists in both and means (roughly) the same thing.
> > >
> > > So I might say the following instead of the above text:
> > >
> > > --
> > > Note that the notion of preferred label means that a resource can only have one such label per language tag, as is mentioned in Section 5 of the SKOS Reference [SKOS-REFERENCE].
> > >
> > > Following common practice in KOS design, the preferred label of a concept may be also used to unambiguously represent this concept within one KOS and its applications. Although SKOS semantics do not formally enforce it, it is therefore recommended that no two concepts in the same KOS be given the same preferred lexical label using the same language tag.
> > >
> > > Two languages might sometimes apply the same label to different concepts in different contexts: this should be avoided to the extent possible. In addition, it may sometimes be desirable to use the same label with different language tags, even if the languages are related.
> > >
> > > Because there are many more language tags that can be generated than there are distinct labels needed in any particular KOS, it is recommended that implementations match requests for a label in a given language to related language tags that exist in the SKOS document, perhaps by implementing the "lookup" algorithm from IETF BCP 47. This allows the SKOS document to carry only those labels that are distinct for a given language or collection of languages.
> > > --
> > >
> > > Something like that. Otherwise I think you'll run afoul of implementers making all manner of (problematic) assumptions about what language tag presence or absence means in SKOS labels.
> > >
> > > Regards,
> > >
> > > Addison
> > >
> > > Addison Phillips
> > > Globalization Architect -- Lab126
> > >
> > > Internationalization is not a feature.
> > > It is an architecture.
> > >
> > > > -----Original Message-----
> > > > From: Antoine Isaac [mailto:aisaac@few.vu.nl]
> > > > Sent: Saturday, March 07, 2009 6:12 AM
> > > > To: Phillips, Addison
> > > > Cc: Alistair Miles; Ralph R. Swick; Richard Ishida; public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
> > > > Subject: Re: Request for feedback on SKOS Last Call Working Draft
> > > >
> > > > Hi Addison,
> > > >
> > > > To clarify my previous mail. Your point makes much sense to me, but I don't think we should add this in the SKOS documents (that's true for the Reference, and even more true for the Primer).
> > > > These matters are indeed quite complex, especially for "normal" RDF users who are not aware of these things. Furthermore, they are not really specific to SKOS, but to every data representation means which use language tags. And they are more related to the way one consumes data than to the way it is represented and exchanged, which I feel is the core business of SKOS.
> > > >
> > > > Note that this position is just my own, I'm not speaking for the SWD WG here.
> > > >
> > > > Best
> > > >
> > > > Antoine
> > > >
> > > > > Hi Addison,
> > > > >
> > > > > It makes sense!
> > > > >
> > > > > Antoine
> > > > >
> > > > >> Hi Antoine,
> > > > >>
> > > > > >> Yes, as I said the SKOS model is technically correct, accurate, and complete. The issue is what users and implementations do with it. I think the main concern I have is that SKOS Reference makes quite clear that you can have multiple labels with related-but-not-identical language tags. It is just that, having gone out of its way to say that 'en' != 'en-US', it doesn't further clarify that the presence of an 'en' tag is allowed to imply a match with e.g. 'en-AU' or 'en-NZ', if the latter are not provided as distinct labels.
> > > > >>
> > > > >> Does that make sense?
> > > > >>
> > > > >> Addison
> > > > >>
> > > > >> Addison Phillips
> > > > >> Globalization Architect -- Lab126
> > > > >>
> > > > >> Internationalization is not a feature.
> > > > >> It is an architecture.
> > > > >>
> > > > >>
> > > > >>> -----Original Message-----
> > > > >>> From: Antoine Isaac [mailto:aisaac@few.vu.nl]
> > > > >>> Sent: Wednesday, March 04, 2009 10:00 AM
> > > > >>> To: Phillips, Addison
> > > > > >>> Cc: Alistair Miles; Ralph R. Swick; Richard Ishida; public-swd-wg@w3.org; public-i18n-core@w3.org; 'Felix Sasaki'
> > > > > >>> Subject: Re: Request for feedback on SKOS Last Call Working Draft
> > > > >>>
> > > > >>> Hi Addison,
> > > > >>>
> > > > > >>> Thanks for the explanation, which makes a bit clearer what I had understood from [1]:
> > > > > >>> "Matching different language tags is important for a number of applications. According to BCP 47 'en' can be said to match 'en-GB'."
> > > > > >>>
> > > > > >>> If I understand well, there are applications that could do this filtering, and if they use data which was not intended for filtering (that is, data including language tag variation, because their original context of application was concerned with that), then there could be trouble.
> > > > > >>>
> > > > > >>> But maybe this is not so much trouble in fact: that kind of matching does not amount to producing new RDF data (in your example, a new triple ex:walkingPath skos:prefLabel "sidewalk"@en. ), does it?
> > > > > >>> If the data stays the same, and if as you say it is technically valid, then there is no possible inconsistency with what the SKOS model specifies.
> > > > >>>
> > > > >>> Best,
> > > > >>>
> > > > >>> Antoine
> > > > >>>
> > > > > >>> [1] http://www.w3.org/International/articles/language-tags/
> > > > >>>
> > > > >>>
> > > > >>>> Hello Alistair,
> > > > >>>>
> > > > >>>> Thanks for the note back.
> > > > >>>>
> > > > > >>>> I'm aware of the SPARQL function: I helped the WG craft the text about it. The query function might turn out to be a problem and I may not have given the right feedback in my last email. Let me explain.
> > > > > >>>> My concern is that, if you have a triple like:
> > > > >>>>
> > > > >>>> ex:walkingPath rdf:type skos:Concept;
> > > > >>>>   skos:prefLabel "sidewalk"@en-US;
> > > > >>>>   skos:prefLabel "pavement"@en
> > > > >>>>
> > > > > >>>> ... then SKOS rightly asserts that "en" and "en-US" are different languages exclusive of one another. This implies that one must include a separate prefLabel for every possible language tag variation one wishes to support. This is not generally the intention when applying language tags.
> > > > > >>>> So my example doesn't say whether the label for "en" covers a user who speaks "en-GB" or "en-AU" or "en-NZ" (for example). Those are all different languages not specified. Typically, a request for the label from the SKOS description of an ontology will contain the user's fully qualified language preference--that is, they are specifying the MOST information that they care to provide about their language. The matching scheme in RFC 4647 for that is called "lookup" and it falls back (a request for "en-GB" in my example would find "pavement", labeled as "en"). That is, a SKOS file contains what we I18N folks would call a "resource bundle" or "message catalog".
> > > > > >>>> In any case, SKOS is technically correct, but I think my advice would be to add some note clarifying that a natural language label defined in SKOS should be considered to apply to any request not masked by some other label. It is possible but very difficult to construct using SPARQL langMatches, whose purpose is actually different.
> > > > > >>>> So I guess I'd request notes in the Reference and Primer clarifying that, although (for example) "en" and "en-US" are considered to be different, one may consider a shorter language tag that is a "prefix" (by language tag standards) to match a longer "language range" in a request. That is, you don't need to supply "en-AU" if it is not different from "en".
> > > > >>>> Regards,
> > > > >>>>
> > > > >>>> Addison
> > > > >>>>
> > > > >>>> Addison Phillips
> > > > >>>> Globalization Architect -- Lab126
> > > > >>>>
> > > > >>>> Internationalization is not a feature.
> > > > >>>> It is an architecture.
> > > > >>>>
> >
> > --
> > Alistair Miles
> > Senior Computing Officer
> > Image Bioinformatics Research Group
> > Department of Zoology
> > The Tinbergen Building
> > University of Oxford
> > South Parks Road
> > Oxford
> > OX1 3PS
> > United Kingdom
> > Web: http://purl.org/net/aliman
> > Email: alistair.miles@zoo.ox.ac.uk
> > Tel: +44 (0)1865 281993
> >
>
> --
> Alistair Miles
> Senior Computing Officer
> Image Bioinformatics Research Group
> Department of Zoology
> The Tinbergen Building
> University of Oxford
> South Parks Road
> Oxford
> OX1 3PS
> United Kingdom
> Web: http://purl.org/net/aliman
> Email: alistair.miles@zoo.ox.ac.uk
> Tel: +44 (0)1865 281993

Received on Saturday, 14 March 2009 19:11:45 UTC