- From: Bernard Vatant <bernard.vatant@mondeca.com>
- Date: Tue, 26 Dec 2006 18:27:22 +0100
- To: Sue Ellen Wright <sellenwright@gmail.com>
- Cc: Felix Sasaki <fsasaki@w3.org>, Gerhard Budin <gerhard.budin@univie.ac.at>, Addison Phillips <addison@yahoo-inc.com>, Mark Davis <mark.davis@jtcsv.com>, Thomas Baker <baker@sub.uni-goettingen.de>, public-esw-thes@w3.org
Hi all

To keep a summary in a permanent place, I've created a page on the ESW
wiki to track this question:
http://esw.w3.org/topic/Languages_as_RDF_Resources

Now I think I've caught the point made by Felix and Sue Ellen, thanks!
So I came up with an approach where Language (langtag) is defined as a
class but, given the open-ended combination of subtags, the instances of
this class are not identified by a URI. Instead they are defined as
anonymous resources to which subtag values are attached as properties.

Subtags themselves are defined as SKOS concepts, with URIs based on the
subtag type and value, and additional information can be attached to
them using the SKOS vocabulary, such as:

    <bcp47:LanguageSubtag rdf:about="#language-fr" skos:prefLabel="fr">
      <bcp47:suppressScript rdf:resource="#script-Latn"/>
      <skos:definition xml:lang="en">French</skos:definition>
      <skos:definition xml:lang="fr">Français</skos:definition>
    </bcp47:LanguageSubtag>

Definitions can be added in other languages of course.

The most important point is how to use this in metadata, as in the
following example. We have a document in language en-US (as far as I can
figure) whose subject is French as it is spoken in Québec. The code for
such a language should certainly be fr-CA; BCP 47 does not make
provision for different flavours of Canadian French. But note that, the
language being defined as an anonymous node, there is no absolute rule
of identification. It's up to applications to decide the "same-ness
rules". Some could use the "langtag" value; others, which don't care
about regional distinctions, would rely on the "primaryLanguage" value
only.
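As an illustration only (it is not part of the sample RDF file), the two
"same-ness rules" just described could be sketched in Python as below.
The dictionary layout and the function names are assumptions made for
this sketch; a real application would read the property values from the
anonymous bcp47:Language nodes in the RDF graph.

```python
# Hypothetical sketch: each anonymous bcp47:Language node is modelled as a
# plain dict of its properties, because a blank node has no URI to compare.

def same_by_langtag(a, b):
    """Strict rule: two nodes match only if their full language tags match."""
    # BCP 47 tags are case-insensitive, so compare them in lower case.
    return a["langtag"].lower() == b["langtag"].lower()


def same_by_primary_language(a, b):
    """Loose rule: ignore region and other subtags, compare primaryLanguage only."""
    return a["primaryLanguage"] == b["primaryLanguage"]


# Illustrative nodes; the fr-FR node is invented for the comparison.
fr_ca = {"langtag": "fr-CA", "primaryLanguage": "#language-fr", "region": "#region-CA"}
fr_fr = {"langtag": "fr-FR", "primaryLanguage": "#language-fr", "region": "#region-FR"}

print(same_by_langtag(fr_ca, fr_fr))           # False
print(same_by_primary_language(fr_ca, fr_fr))  # True
```

An application that cares about regional variants would use the first
rule, one that does not would use the second; nothing in the model
forces either choice.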
    <foaf:Document rdf:about="http://en.wikipedia.org/wiki/Quebec_French">
      <dc:language>
        <bcp47:Language bcp47:langtag="en-US">
          <bcp47:primaryLanguage rdf:resource="#language-en"/>
          <bcp47:region rdf:resource="#region-US"/>
          <rdfs:label xml:lang="en">US English</rdfs:label>
          <rdfs:label xml:lang="fr">Anglais américain</rdfs:label>
        </bcp47:Language>
      </dc:language>
      <dc:subject>
        <bcp47:Language bcp47:langtag="fr-CA">
          <bcp47:primaryLanguage rdf:resource="#language-fr"/>
          <bcp47:region rdf:resource="#region-CA"/>
          <rdfs:label xml:lang="en">Quebec French</rdfs:label>
          <rdfs:label xml:lang="fr">Français québécois</rdfs:label>
        </bcp47:Language>
      </dc:subject>
    </foaf:Document>

A full RDF file with those examples is at
http://perso.orange.fr/universimmedia/lang/bcp47_sample.rdf

Waiting for comments of course, but certainly not before next year!

Best to all

Bernard

Sue Ellen Wright wrote:
> Hi, All,
> I'm sure Felix will get back to us, but I know what he means about a
> finite list. RFC 4646 defines rules for generating language tags
> based on the various code components that can be included. The
> potential for possible combinations is huge, although, as the document
> points out, some combinations are unrealistic or silly (Aleut as
> spoken in Belgium is a great example.)
> Bye for now
> Sue Ellen
>
>
> On 12/22/06, Bernard Vatant <bernard.vatant@mondeca.com> wrote:
>
> > Hi Felix
> >
> > Thanks for jumping in.
> >
> > > I'm trying to understand what you want to achieve: Is it URIs for
> > > language values, e.g.
> > > http://www.w3.org/2004/02/skos/language#en-US ?
> >
> > Indeed. The whole point is to identify and represent languages as
> > concepts, in order to be able to make RDF assertions about them,
> > beyond the "tag" use.
> >
> > > I don't think that it is feasible to have everything after "#" as an
> > > URI, since RFC 4646 or its successor define a grammar for
> > > language tags.
> >
> > Do you mean there is a technical issue that prevents building valid
> > URIs out of language tags?
> > Note that although a single # namespace is the first idea which
> > comes to mind, it's not the only option.
> > It could as well be http://www.w3.org/2004/02/skos/language/en/US
> > or even an opaque URI http://www.w3.org/2004/02/skos/language#1234
> > In any case subtag elements and other properties such as revision
> > date will be explicitly attached as properties. You can't rely on
> > the URI string to carry semantics. This is a "Semantic Web Axiom" :-)
> >
> > > That is, you cannot have a finite set of URIs built out of that.
> >
> > Sorry, I don't catch the point. What do you mean by a "finite set"?
> > Could you expand on that?
> >
> > > Have you thought of registering an XPointer scheme at W3C? E.g.
> > > something like "language()" which can be used e.g. in
> > > http://www.w3.org/2004/02/skos/language#(en-US) . You would have to
> > > define that the scheme data "()" contains a BCP 47 identifier.
> >
> > I think I see what you have in mind, but remember RDF is not mainly
> > about the structure of a published XML document, but about the
> > semantics of URIs.
> > Besides the language values themselves, and even before, we need a
> > namespace for the ontology, the "Language" class, the different
> > "subtag" properties etc.
> > And defining a namespace is more or less dependent on the vocabulary
> > publication.
> > See e.g. http://www.w3.org/TR/swbp-vocab-pub/
> >
> > Hope that helps, and that we don't talk past each other.
> >
> > Regards
>

-- 
Bernard Vatant
Knowledge Engineering
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web: www.mondeca.com <http://www.mondeca.com>
----------------------------------------------------
Tel: +33 (0) 871 488 459
Mail: bernard.vatant@mondeca.com
Blog: Leçons de Choses <http://mondeca.wordpress.com/>
Received on Tuesday, 26 December 2006 17:27:35 UTC