
Re: Request for feedback on SKOS Last Call Working Draft

From: Alistair Miles <alistair.miles@zoo.ox.ac.uk>
Date: Wed, 11 Mar 2009 08:37:28 +0000
To: Richard Ishida <ishida@w3.org>
Cc: public-swd-wg@w3.org, "'Ralph R. Swick'" <swick@w3.org>, public-i18n-core@w3.org, 'Felix Sasaki' <fsasaki@w3.org>
Message-ID: <20090311083727.GB9315@skiathos>
Hi Richard,

I will make these changes.

Thanks again,

Alistair

On Tue, Mar 10, 2009 at 06:36:33PM -0000, Richard Ishida wrote:
> Small editorial comments:
> 
> Btw, I also think you should change "in a given natural language, such as English or Japanese Hiragana." to read "in a given natural language, such as English or Japanese (written here in hiragana)."   (The language of ja-hira is still just Japanese, even though the tag also indicates that it is written using hiragana.)
> 
> Also, very minor nit, I don't think you need to titlecase Kanji, Hiragana, etc in 5.6.5.
> 
> RI
> 
> ============
> Richard Ishida
> Internationalization Lead
> W3C (World Wide Web Consortium)
> 
> http://www.w3.org/International/
> http://rishida.net/
> 
> 
> 
> 
> > -----Original Message-----
> > > From: public-i18n-core-request@w3.org
> > > [mailto:public-i18n-core-request@w3.org] On Behalf Of Richard Ishida
> > Sent: 10 March 2009 18:16
> > To: 'Alistair Miles'
> > Cc: public-swd-wg@w3.org; 'Ralph R. Swick'; public-i18n-core@w3.org; 'Felix
> > Sasaki'
> > Subject: RE: Request for feedback on SKOS Last Call Working Draft
> > 
> > Alistair,
> > 
> > Thanks for this reply.  Sorry it has taken me so long to find time to reply,
> > though I have been following the discussion with Addison.
> > 
> > I understand better the position now, and the Japanese example you cited
> > was quite helpful.
> > 
> > Having said that, I needed your explanation to clarify that, and I think that
> > other people are also inclined to see a usage in a spec and assume that that
> > is an example of best practice.  So I would really like to see a condensed
> > statement of what you say below, as a warning.  How about following on
> > from the sentence at the end of 5.1 as follows:
> > 
> > "See the SKOS Primer for more examples of labeling SKOS concepts. Note
> > that the labeling shown in these examples does not necessarily indicate best
> > practice. The SKOS Reference tries to establish a general framework that is
> > applicable across a range of situations, which may then be refined and/or
> > constrained by usage conventions for more specific situations. Application
> > and language-specific usage conventions are out of scope for the SKOS
> > Reference."
> > 
> > Cheers,
> > RI
> > 
> > ============
> > Richard Ishida
> > Internationalization Lead
> > W3C (World Wide Web Consortium)
> > 
> > http://www.w3.org/International/
> > http://rishida.net/
> > 
> > 
> > 
> > 
> > > -----Original Message-----
> > > From: Alistair Miles [mailto:alistair.miles@zoo.ox.ac.uk]
> > > Sent: 26 February 2009 11:33
> > > To: Richard Ishida
> > > > Cc: public-swd-wg@w3.org; 'Ralph R. Swick'; public-i18n-core@w3.org;
> > > > 'Felix Sasaki'
> > > Subject: Re: Request for feedback on SKOS Last Call Working Draft
> > >
> > > Dear Richard,
> > >
> > > Some comments on specific points of your discussion inline below...
> > >
> > > On Tue, Feb 24, 2009 at 06:50:53PM -0000, Richard Ishida wrote:
> > > > > From: Felix Sasaki [mailto:fsasaki@w3.org]
> > > > > Sent: 03 February 2009 02:24
> > > > > To: Richard Ishida
> > > > > Cc: public-swd-wg@w3.org; 'Ralph R. Swick'; public-i18n-core@w3.org
> > > > > Subject: Re: Request for feedback on SKOS Last Call Working Draft
> > > > >
> > > > > > Richard Ishida wrote:
> > > > > > > I agree that using the word 'language' to describe every
> > > > > > > different language tag, including en-GB and en-US and en,
> > > > > > > doesn't sound right.
> > > > > >
> > > > > > I have another question too.  In example 11 we see
> > > > > >
> > > > > > <AnotherResource>
> > > > > >   skos:prefLabel "東"@ja-Hani ;
> > > > > >   skos:prefLabel "ひがし"@ja-Hira ;
> > > > > >   skos:altLabel "あずま"@ja-Hira ;
> > > > > >   skos:prefLabel "ヒガシ"@ja-Kana ;
> > > > > >   skos:altLabel "アズマ"@ja-Kana ;
> > > > > >   skos:prefLabel "higashi"@ja-Latn ;
> > > > > >   skos:altLabel "azuma"@ja-Latn .
> > > > > >
> > > > > >
> > > > > > > Here there are four prefLabels associated with the same word in
> > > > > > > Japanese (just spelled in four different ways).  From a semantic
> > > > > > > point of view, I'm not sure that this makes sense, and I would
> > > > > > > have expected the kana and romaji versions to be altLabels. What
> > > > > > > is the value of having more than one prefLabel for a given
> > > > > > > language when the word being used is exactly the same?
> > > > >
> > > > >  From http://www.w3.org/TR/skos-primer/#secpref
> > > > > "RDF plain literals are formally defined as character strings with
> > > > > optional language tags [RDF-CONCEPTS]. SKOS thereby enables a
> > > simple
> > > > > form of multilingual labelling. "
> > > >
> > > > > Right.  But I don't think that addresses my question.  If you use
> > > > > the word language in my question to refer to a natural language,
> > > > > such as in this case Japanese, my question still stands: What is the
> > > > > value of having more than one prefLabel for a given language, albeit
> > > > > with different spellings, when the word being used is exactly the
> > > > > same?
> > >
> > > A typical use case would be adapting a user interface to a user's
> > > locale. For example, if you consider en-GB vs. en-US, it makes sense
> > > to provide a prefLabel in both en-GB and en-US, so that a UI could
> > > choose the preferred label for a concept depending on the user's
> > > locale.
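
The locale-driven choice described above can be sketched roughly as follows. This is only an illustration, not anything defined by SKOS: the `PREF_LABELS` data and the `pref_label_for_locale` helper are hypothetical names, and the sketch assumes the labels have already been extracted from the RDF graph as (language tag, literal) pairs.

```python
from typing import List, Optional, Tuple

# Hypothetical preferred labels for one concept: (language tag, literal).
PREF_LABELS: List[Tuple[str, str]] = [
    ("en-GB", "colour"),
    ("en-US", "color"),
]


def pref_label_for_locale(labels: List[Tuple[str, str]], locale: str) -> Optional[str]:
    """Return the preferred label whose tag equals the user's locale.

    Language tags are compared case-insensitively. No fallback is
    attempted here: if no label carries exactly this tag, None is returned.
    """
    wanted = locale.lower()
    for tag, text in labels:
        if tag.lower() == wanted:
            return text
    return None


print(pref_label_for_locale(PREF_LABELS, "en-GB"))
print(pref_label_for_locale(PREF_LABELS, "en-us"))
```

With one prefLabel per region, a UI can make this choice without guessing which spelling the user expects.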
> > >
> > > So in the general case, I think it makes sense to provide more than
> > > one preferred label with the same primary language subtag (e.g. "en")
> > > but with different script and/or regions subtags. I.e. in principle, I
> > > don't see anything fundamentally wrong with the possibility to provide
> > > multiple prefLabels with the same primary language subtag. Do you
> > > agree?
> > >
> > > > This is the immediate issue for the WG. The SKOS Reference tries to
> > > establish a general framework that is applicable across a range of
> > > situations, which may then be refined and/or constrained by usage
> > > conventions for more specific situations.
> > >
> > > I.e. For specific applications, it may not make sense to provide more
> > > than one prefLabel for a given primary language subtag, as you
> > > suggest. This would then constitute an application-, community- or
> > > language-specific usage convention, which is perfectly reasonable, but
> > > which is out of scope for the SKOS reference.
> > >
> > > For example, I understand from discussions with Shigeo Sugimoto and
> > > Mitsuharu Nagamori of the University of Tsukuba, who have worked on a
> > > SKOS representation of the Japanese National Diet Library Subject
> > > Headings (NDLSH), that the typical requirement for rendering the NDLSH
> > > for a Japanese user is to display both the Kanji and the Yomi
> > > transcription for each label (see e.g. attachment to [1]). Their
> > > solution, I believe, is to provide prefLabels in both Kanji and Yomi,
> > > and then to use a custom extension to SKOS to explicitly link each
> > > Kanji label to its Yomi transcription so the labels may be associated
> > > in the display.
> > >
> > > So based on their work, I understood that there is nothing
> > > fundamentally wrong with example 11 in the SKOS Reference [2], which
> > > serves to convey the general principle that multiple preferred labels
> > > *may* be given with script or region variations on a common primary
> > > subtag.
> > >
> > > > You might consider that, for specific use cases, it is more
> > > appropriate to provide a single prefLabel with the "ja" primary
> > > subtag, and to provide all script- or region-specific labels as
> > > altLabels, however this would be an application and language-specific
> > > usage convention, which is out of scope for the SKOS Reference, and
> > > which needs to be established within the relevant community of
> > > practice.
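
To make the distinction between the general framework and a stricter usage convention concrete, here is a small sketch. It assumes, as I understand the SKOS Reference, that the general rule is "no more than one prefLabel per language tag", and it contrasts that with a hypothetical stricter convention of one prefLabel per primary language subtag. The function names are illustrative only; the data is example 11 as quoted in this thread.

```python
from collections import Counter

# Labels from example 11 as quoted in this thread:
# (property, literal, language tag).
EXAMPLE_11 = [
    ("prefLabel", "東", "ja-Hani"),
    ("prefLabel", "ひがし", "ja-Hira"),
    ("altLabel", "あずま", "ja-Hira"),
    ("prefLabel", "ヒガシ", "ja-Kana"),
    ("altLabel", "アズマ", "ja-Kana"),
    ("prefLabel", "higashi", "ja-Latn"),
    ("altLabel", "azuma", "ja-Latn"),
]


def full_tag(tag):
    """Key for the general rule: the whole tag, case-insensitively."""
    return tag.lower()


def primary_subtag(tag):
    """Key for the stricter (hypothetical) convention: primary subtag only."""
    return tag.lower().split("-")[0]


def duplicated_pref_keys(labels, key):
    """Return the keys that carry more than one prefLabel."""
    counts = Counter(key(tag) for prop, _, tag in labels if prop == "prefLabel")
    return sorted(k for k, n in counts.items() if n > 1)


print(duplicated_pref_keys(EXAMPLE_11, full_tag))
print(duplicated_pref_keys(EXAMPLE_11, primary_subtag))
```

Under the per-tag rule example 11 raises no duplicates, while the per-primary-subtag convention flags "ja", which is the kind of constraint a specific community could choose to adopt.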
> > >
> > > Does this make sense?
> > >
> > > Kind regards,
> > >
> > > Alistair
> > >
> > > [1] http://lists.w3.org/Archives/Public/public-esw-thes/2007Mar/0015.html
> > > [2] http://www.w3.org/2006/07/SWD/SKOS/reference/20081001/#labels
> > >
> > >
> > > >
> > > > >
> > > > > > >  I suppose I could see the use of contrasting "東"@ja with
> > > > > > > "higashi"@ja-Latn so that non-Japanese people could state a
> > > > > > > preference to see the transcribed form of the Japanese word
> > > > > > > (though from a semantic point of view, presumably
> > > > > > > skos:prefLabel "East"@en would be better?).  But maybe this is
> > > > > > > idiosyncratic to Japanese, since for Japanese people the
> > > > > > > hiragana and katakana transcriptions are usually just
> > > > > > > alternatives to the kanji version.
> > > > > >
> > > > >
> > > > > > Correct, but a multilingual system may be used by non-Japanese
> > > > > > persons, e.g. learning Japanese, who rely on "higashi"@ja-Latn.
> > > > > > You could argue if multilingual fits to Japanese written with
> > > > > > Latin script versus Japanese script, but I think we don't have to
> > > > > > argue ...
> > > >
> > > > > But isn't the meaning what's important here?  Why would a
> > > > > non-Japanese person use higashi rather than East?  That would only
> > > > > be of use to a person who happens to speak Japanese but not write
> > > > > it, right?
> > > >
> > > >
> > > > >
> > > > > .
> > > > >
> > > > > > > On a slightly different tack, what's the advice wrt when one
> > > > > > > should use, eg., en-GB / en-US / en?
> > > > >
> > > > > Are you asking about preferred, alternative or hidden lexical labels?
> > > > >
> > > > > > >  I would have thought that one should use en unless there are
> > > > > > > divergent spellings (eg. colour vs color) or locutions (eg. lift
> > > > > > > vs elevator), but example 19 shows
> > > > > >
> > > > > > "color"@en , "color"@en-US , "colour"@en-GB .
> > > > > >
> > > > > > which seems to present two problems:
> > > > > >
> > > > >
> > > > > Maybe these sections
> > > > > http://www.w3.org/TR/skos-primer/#secpref
> > > > > http://www.w3.org/TR/skos-primer/#secalt
> > > > > http://www.w3.org/TR/skos-primer/#sechidden
> > > > > explain the problems, and the difference between the three labels?
> > > > >
> > > > > > > [1] it requires a lot more annotation than strictly necessary,
> > > > > > > since applications using this data ought to be able to tell that
> > > > > > > "color"@en  is appropriate for en-US in the absence of a specific
> > > > > > > "color"@en-US label (three is already doubly redundant here, but
> > > > > > > there are more varieties of English than this, eg. en-AU,en-IR,
> > > > > > > etc....)
> > > > > >
> > > > > > > [2] without this matching capability, you could end up with
> > > > > > > unnecessary gaps in the data (for example, what about a search
> > > > > > > originating from an en-AU context?
> > > > >
> > > > > Note that the role of labels can be very different. From
> > > > > http://www.w3.org/TR/skos-primer/#seclabel
> > > > > "Each property implies a specific status for the label it introduces,
> > > > > ranging from a strong, univocal denotation relationship, to a string to
> > > > > aid in lookup. "
> > > > > So matching is not necessarily an application for a label.
> > > >
> > > > > Yes, I had already read those sections, but the difference between
> > > > > the labels doesn't seem to be directly related to my question.
> > > > > Example 19 in http://www.w3.org/TR/2008/WD-skos-reference-20080829/
> > > > > relates to a *single* type of label afaict.  Perhaps it would help
> > > > > for me to first focus attention specifically on the part of the
> > > > > example that says "color"@en , "color"@en-US.  Why is it necessary to
> > > > > have "color"@en-US when you already have "color"@en, which is
> > > > > indistinguishable in meaning and spelling? Is it in fact necessary,
> > > > > or just an error in the example, or just something that may happen?
> > > >
> > > > > Next, let's look at "color"@en-US , "colour"@en-GB. This question is
> > > > > about the use of language tags for dialects. Is it necessary to add
> > > > > "colour"@en-AU etc, or is the intent here just to capture an
> > > > > alternative spelling and label it with something reasonably
> > > > > intelligent but different from 'color', with the assumption that
> > > > > labelling it as en-GB will be sufficient for Australians to find and
> > > > > use it?  Or does one have to systematically apply labels with all the
> > > > > possible variations to support the likely 'user' environments? (I'm
> > > > > hoping not.)
> > > >
> > > > > What I'm getting at here, is that I think a search for an English
> > > > > term should not fail if there is an @en label only but the search is
> > > > > done from an @en-GB source, and vice versa; and that having both @en
> > > > > and @en-US seems redundant and wasteful.  I'm probing to understand
> > > > > the role and application of matching of language tags in SKOS, since
> > > > > it wasn't clear to me from what I had read.
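
The fallback behaviour asked about here (a request for en-GB or en-AU still finding a label tagged only en) can be sketched as a truncating lookup, loosely in the spirit of the lookup scheme in RFC 4647. This is a simplified illustration, not something SKOS itself specifies: `lookup_label` and `LABELS` are hypothetical names, and the sketch leaves out details of the RFC such as the handling of single-character subtags and default values.

```python
from typing import Dict, Optional


def lookup_label(labels: Dict[str, str], requested_tag: str) -> Optional[str]:
    """Try the requested tag, then progressively shorter prefixes of it.

    For "en-AU" the candidates are "en-AU" and then "en". Tags are
    compared case-insensitively.
    """
    by_tag = {tag.lower(): text for tag, text in labels.items()}
    subtags = requested_tag.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_tag:
            return by_tag[candidate]
        subtags.pop()  # drop the last subtag and try again
    return None


# Hypothetical labels for one concept, keyed by language tag.
LABELS = {"en": "color", "en-GB": "colour"}

print(lookup_label(LABELS, "en-GB"))
print(lookup_label(LABELS, "en-AU"))
print(lookup_label(LABELS, "ja"))
```

Truncation only walks from a longer tag to a shorter one, so a request for en-AU falls back to the en label rather than the en-GB one, and a request for plain en would not find a label tagged only en-GB. Covering that reverse direction would need a different matching scheme, such as prefix-based filtering.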
> > > >
> > > > >
> > > > >
> > > > > Felix
> > > > >
> > > > > > > As it stands, the implication seems to be that it wouldn't match
> > > > > > > this perfectly adequate literal).
> > > > > >
> > > > > > > I would have expected that processing tools should recognise that
> > > > > > > a search originated from an en-GB context also matches en in the
> > > > > > > absence of alternatives with longer subtags.
> > > > > >
> > > > > > > There is another small issue here related to the "color"@en
> > > > > > > declaration. Why is the American spelling used for en? What would
> > > > > > > happen if the English spelling were used in some places? Is there
> > > > > > > a stated policy that en = US English?
> > > >
> > > > These questions remain unanswered.
> > > >
> > > >
> > > > RI
> > > >
> > > >
> > > > > >
> > > > > > Cheers,
> > > > > > RI
> > > > > >
> > > > > > ============
> > > > > > Richard Ishida
> > > > > > Internationalization Lead
> > > > > > W3C (World Wide Web Consortium)
> > > > > >
> > > > > > http://www.w3.org/International/
> > > > > > http://rishida.net/
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >> -----Original Message-----
> > > > > >> From: Felix Sasaki [mailto:fsasaki@w3.org]
> > > > > >> Sent: 24 January 2009 08:19
> > > > > >> To: Ralph R. Swick
> > > > > > >> Cc: public-i18n-core@w3.org; chairs@w3.org; ishida@w3.org;
> > > > > > >> public-swd-wg@w3.org
> > > > > >> Subject: Re: Request for feedback on SKOS Last Call Working Draft
> > > > > >>
> > > > > >> I looked at this briefly and have a personal, editorial comment.
> > > > > >>
> > > > > >> You write for example in sec. 5
> > > > > >>
> > > > > >> "The following graph is consistent, and illustrates the provision of
> > > > > >> lexical labels in four different languages (Japanese Kanji, Japanese
> > > > > >> Hiragana, Japanese Katakana and Japanese Rōmaji)."
> > > > > >>
> > > > > >> I would rather say
> > > > > >>
> > > > > >> "The following graph is consistent, and illustrates the provision of
> > > > > >> lexical labels in four different variations (Japanese written with
> > > > > > >> Kanji, the Hiragana script, the Katakana script or with Latin
> > > > > > >> characters (Rōmaji))."
> > > > > >>
> > > > > >> Since all examples are Japanese and differ only with regards to the
> > > > > >> script in use.
> > > > > >>
> > > > > >> I think this concerns sec. 5.1 ("Japanese Hiragana"), 5.4, and 5.5.
> > > > > >>
> > > > > >> Regards, Felix
> > > > > >>
> > > > > > >> Ralph R. Swick wrote:
> > > > > >>
> > > > > >>> Dear I18N Core Working Group (and other interested Chairs),
> > > > > >>>
> > > > > > >>> The Semantic Web Deployment Working Group requests any
> > > > > > >>> feedback you may have on the Simple Knowledge Organization
> > > > > > >>> System (SKOS) Vocabulary Reference specification [1].
> > > > > >>>
> > > > > >>>   [1] http://www.w3.org/TR/2008/WD-skos-reference-20080829/
> > > > > >>>
> > > > > >>> This document was published as a W3C Last Call Working Draft
> > > > > >>> on 29 August 2008 [2]. The SemWeb Deployment Working Group
> > > > > >>> requested CR transition on 7 January 2009 [3].
> > > > > >>>
> > > > > >>>   [2] http://www.w3.org/News/2008#item148
> > > > > >>>   [3]
> > > http://lists.w3.org/Archives/Member/chairs/2009JanMar/0000.html
> > > > > >>>
> > > > > > >>> It appears that due to an oversight there was not an explicit
> > > > > > >>> notice to chairs@w3.org of the Last Call publication.  Therefore
> > > > > > >>> we cannot be assured that you had the necessary notice should
> > > > > > >>> you have planned to do an I18N review of this document.
> > > > > >>>
> > > > > >>> The most likely subject matter for I18N consideration is the
> > > > > >>> SKOS lexical labelling properties [4].
> > > > > >>>
> > > > > > >>>   [4] http://www.w3.org/TR/2008/WD-skos-reference-20080829/#L2831
> > > > > >>>
> > > > > >>> On behalf of the Semantic Web Deployment Working Group,
> > > > > > >>> I request that you consider whether you wish to offer any
> > > > > >>> comments on the SKOS Reference Last Call Working Draft
> > > > > >>> and to let us know an approximate schedule should you wish
> > > > > >>> to send comments.
> > > > > >>>
> > > > > >>> Thank you,
> > > > > >>> Ralph Swick
> > > > > >>> SemWeb Deployment WG Team Contact
> > > > > >>>
> > > > > >>>
> > > > > >>>
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > >
> > > >
> > > >
> > >
> > > --
> > > Alistair Miles
> > > Senior Computing Officer
> > > Image Bioinformatics Research Group
> > > Department of Zoology
> > > The Tinbergen Building
> > > University of Oxford
> > > South Parks Road
> > > Oxford
> > > OX1 3PS
> > > United Kingdom
> > > Web: http://purl.org/net/aliman
> > > Email: alistair.miles@zoo.ox.ac.uk
> > > Tel: +44 (0)1865 281993
> 
> 
> 

-- 
Alistair Miles
Senior Computing Officer
Image Bioinformatics Research Group
Department of Zoology
The Tinbergen Building
University of Oxford
South Parks Road
Oxford
OX1 3PS
United Kingdom
Web: http://purl.org/net/aliman
Email: alistair.miles@zoo.ox.ac.uk
Tel: +44 (0)1865 281993
Received on Wednesday, 11 March 2009 08:38:08 GMT
