[VM] Need a section about (Formal and Natural) Languages in Vocabularies?

Re-reading the draft over my morning coffee ...

I think the document is missing an important section about the language(s) in which a
Vocabulary is published.
By "language" I mean both so-called "Natural Languages" (e.g.
http://psi.oasis-open.org/iso/639/#fra) and "Formal Languages" (XML, RDF, OWL ...). I
would gladly see both added to the consensual glossary, with something like:

Formal Language : A formal standard syntax in which the Vocabulary is issued
Natural Language : A language in which the terms are originally expressed (wording to
improve here)

I think we address the former at length in Semantic Web specs and literature, but the
latter is too often ignored, or quickly brushed aside with something like "If you really
need other languages - read: other than universal English ;-) - use a "label", but it
will bear no semantics ...".

If we want the SW to be adopted in, say, the European Community, we need to say something
about multilingual practices.

BTW, Tom and Jim replied to my previous post that the name vs. concept distinction was not
a pragmatic issue (read: it is an old, unresolvable academic debate we should not confuse
people with). Well, it seems to me that, from the multilingual viewpoint, it is a
pragmatic issue, as the following points try to show.

When publishing a multi-lingual vocabulary, which is the best practice for identifiers
among the following options?

1. Have a single URI to identify the concept, and attach the names in different languages
as labels.
This option considers that labels in different languages represent in fact one single
term-concept.

Option 1.a : Use the "default" or "base" language to build human-readable URIs, like:

	Vocabulary Default Language : 	French
	Other Language :			English
	Term : 					http://MyAuthorityDomain/MyDirectory/MyNameSpace#Societe
	English Label: 				Company

This option implicitly assumes that the Vocabulary was conceived and built in French, then
translated into English.
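
To make Option 1.a concrete, here is a minimal Python sketch. It assumes the French label is spelled "Société" and that the URI fragment is obtained by stripping accents and non-alphanumeric characters; the helper name uri_from_default_label and the dictionary layout are hypothetical, chosen only for illustration.

```python
import unicodedata

# Namespace taken from the example above.
BASE = "http://MyAuthorityDomain/MyDirectory/MyNameSpace#"


def uri_from_default_label(label: str) -> str:
    """Build a human-readable URI from the default-language label (hypothetical helper)."""
    # Decompose accented characters, then drop the combining marks:
    # "Société" becomes "Societe".
    folded = unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode("ascii")
    # Keep only letters and digits so the fragment contains no spaces or punctuation.
    fragment = "".join(ch for ch in folded if ch.isalnum())
    return BASE + fragment


# One URI for the concept; names in other languages are attached as labels.
term = {
    "uri": uri_from_default_label("Société"),
    "default_language": "fr",
    "labels": {"fr": "Société", "en": "Company"},
}
```

Note that the URI is fixed once it is coined from the default language: changing or adding labels later leaves it untouched, which is exactly why the choice of default language matters.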

Option 1.b : Use language-neutral, hence non-human-readable, URIs, and a label in each
language

	Vocabulary Languages : 	French, English
	Term : 			http://MyAuthorityDomain/MyDirectory/MyNameSpace#Concept2546
	English Label:		Company
	French Label: 		Societe

This option implicitly assumes that similar terms in English and French have been agreed
to represent a more or less language-independent concept. Such vocabularies are indeed
often built from various monolingual sources. A good example is GEMET (GEneral Multilingual
Environmental Thesaurus of the European Environment Agency - 19 languages and growing).
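
A minimal Python sketch of Option 1.b, assuming a plain in-memory dictionary as the vocabulary store; the function name find_concept is hypothetical. The point it tries to show is that a label in any language resolves to the same opaque identifier.

```python
from typing import Optional

# Namespace taken from the example above.
BASE = "http://MyAuthorityDomain/MyDirectory/MyNameSpace#"

# One language-neutral identifier per concept, one label per language.
concepts = {
    BASE + "Concept2546": {"en": "Company", "fr": "Societe"},
}


def find_concept(label: str, lang: str) -> Optional[str]:
    """Return the URI of the concept carrying `label` in language `lang`, or None."""
    for uri, labels in concepts.items():
        if labels.get(lang) == label:
            return uri
    return None
```

With this layout, find_concept("Company", "en") and find_concept("Societe", "fr") return the same URI, and adding a third language only means adding one more label, never a new identifier.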

2. Use one identifier per language, and if necessary link them by any relevant formal or
informal relation. This option considers that a term can't be considered independently of
the natural language in which it was first expressed. It can be chosen for domains where
vocabularies are likely to carry a strong language-specific or cultural bias, e.g. legal
concepts in Belgium, in both French and Dutch (again, this is a use case I've been working
on lately).

	Vocabulary Languages : 	French, English
	French Term:			http://MyAuthorityDomain/MyDirectory/MyNameSpace#Societe
	English Term:			http://MyAuthorityDomain/MyDirectory/MyNameSpace#Company

In this latter option, the concept schemes (to use SKOS wording) are developed
independently, but clearly need matching. Here again, SKOS has opened the way towards
expressing such matching (exact, loose, formal ...).
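
A minimal Python sketch of Option 2, assuming the matching is kept as a simple list of (source, relation, target) triples. The relation names "exact" and "loose" are placeholders echoing the wording above, not actual SKOS property names, and treating a match as navigable in both directions is my own simplifying assumption.

```python
# Namespace taken from the example above.
BASE = "http://MyAuthorityDomain/MyDirectory/MyNameSpace#"

# One identifier per language; neither is derived from the other.
FRENCH_TERM = BASE + "Societe"
ENGLISH_TERM = BASE + "Company"

# Each mapping is (source term, relation, target term).
mappings = [
    (FRENCH_TERM, "loose", ENGLISH_TERM),
]


def matches(term, relation=None):
    """Return terms linked to `term` in either direction, optionally filtered by relation."""
    found = []
    for source, rel, target in mappings:
        if relation is not None and rel != relation:
            continue
        if source == term:
            found.append(target)
        elif target == term:
            found.append(source)
    return found
```

Here matches(FRENCH_TERM) returns the English term, while matches(FRENCH_TERM, "exact") returns nothing, so an application can decide how strict a match it is willing to accept.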

Bernard

**********************************************************************************

Bernard Vatant
Senior Consultant
Knowledge Engineering
bernard.vatant@mondeca.com

"Making Sense of Content" :  http://www.mondeca.com
"Everything is a Subject" :  http://universimmedia.blogspot.com

**********************************************************************************

> -----Original Message-----
> From: public-swbp-wg-request@w3.org
> [mailto:public-swbp-wg-request@w3.org] On Behalf Of Thomas Baker
> Sent: Wednesday, 27 October 2004 15:56
> To: SW Best Practices
> Subject: [VM] Roles for VM Task Force members
>
>
>
> Dear all,
>
> As explained in my earlier posting, the latest draft includes
> a lot of TASKS for specific members of the Task Force based on
> my "best guess" as to what text contributions are needed in
> specific sections.  These TASKS reflect the following rough
> division of labor:
>
>     Tom       - coordinator and editor; Dublin Core
>     Libby     - FOAF, W3C specs and findings
>     Dan       - FOAF, W3C specs and findings
>     Alistair  - SKOS, thesauri, TAG on versioning
>     Bernard   - OASIS Public Subjects
>     Ralph     - W3C specs and findings
>     James     - Use of vocabularies in Semantic Web
>     Aldo      - Princeton Wordnet
>     Alan      - Maybe an example ontology?
>     Natasha   - Maybe an example ontology?
>
> I list below how these specializations translate into
> specific tasks.  I would appreciate if you could provide
> me with feedback on these tasks over the coming week --
> if only to confirm that the default tasks seem reasonable.
>
> Of course, my intention with proposing specific tasks is
> to help us focus -- not to set limits on participation!
> If you would like to adjust the scope of your involvement,
> please let me know so I can adjust your TASKS before the
> draft goes up on the Wiki.
>
> Thank you,
> Tom
>
> -----
>
> Alan
> - An example of a large-scale ontology?
> - "What constitutes a change?"
>
> Aldo
> - One paragraph about wordnet issues
> - Sentence or two on Wordnet term URIrefs
> - Describe maintenance policies for Wordnet
> - Short paragraph on versioning in Wordnet
> - One sentence pointing to Wordnet Web documents
> - Two sentences on Wordnet schemas.
> - Short paragraph on Wordnet dereferencing policy
> - Short paragraph on what Wordnet schemas assert.
> - Annotate Glossary with Wordnet terminology where appropriate
>
> Alistair
> - One paragraph about SKOS
> - Sentence or two on SKOS term URIrefs
> - Describe maintenance policies for SKOS
> - TAG Versioning on "semantic stability"
> - Short paragraph on versioning in SKOS
> - What TAG says about versioning
> - One sentence pointing to SKOS Web documents
> - Two sentences on SKOS schemas.
> - Short paragraph on SKOS dereferencing policy
> - Short paragraph on what SKOS schemas assert.
> - Discuss alternative ways to model a thesaurus
> - Annotate Glossary with SKOS terminology where appropriate
>
> Bernard
> - Bullet point on OASIS Published Subjects
> - What PSI says about identifying terms
> - What PSI says about maintenance policies
> - Short paragraph on versioning in PSI
> - One sentence pointing to PSI Web documents
> - Two sentences on PSI schemas.
> - Paragraph on PSI dereferencing policy
> - Short paragraph on what PSI schemas assert.
> - Reuse of existing terms in a local context
> - Annotate Glossary with PSI terminology where appropriate
>
> DanBri and/or Libby
> - One paragraph on FOAF
> - Bullet point on W3C good-practice documents
> - Describe W3C usage of the word "namespace"
> - Define "URI Reference", elaborating in the Glossary
> - Sentence or two on FOAF term URIrefs
> - What W3C says about identifying terms
> - Describe maintenance policies for FOAF
> - What W3C says about maintenance policies
> - Short paragraph on versioning in FOAF
> - One sentence pointing to FOAF Web documents
> - One sentence pointing to W3C Web documents
> - Two sentences on FOAF schemas.
> - Two sentences on W3C schemas.
> - Short paragraph on FOAF dereferencing policy
> - Short paragraph on what FOAF schemas assert.
> - Short paragraph on what W3C schemas assert.
> - Describe the "vocabulary market"
> - Formation of URI strings ("hash or slash" etc)
> - Define URI Reference
> - Annotate Glossary with FOAF terminology where appropriate
>
> James
> - One page on "vocabularies in Semantic Web"
>
> Jeremy (if willing)
> - Summarize discussion of "social meaning"
>
> Natasha
> - An example of a large-scale ontology?
>
> Ralph
> - Longer paragraph on versioning in W3C
> - Paragraph or two on W3C dereferencing policy
> - Annotate Glossary with W3C terminology where appropriate
>
> Tom
> - One paragraph about Dublin Core
> - Sentence or two on DCMI term URIrefs
> - A sentence on the "CORES Resolution"
> - Describe maintenance policies for DCMI
> - Short paragraph on versioning in DCMI
> - One sentence pointing to DCMI Web documents
> - Two sentences on DCMI schemas.
> - Short paragraph on DCMI dereferencing policy
> - Short paragraph on what DCMI schemas assert.
> - DCMI on "terms usable as RDF properties"
> - Describe the DCMI notion of "application profile"
> - DCMI endorsing assertions about MARC Relator terms
> - DCMI guidelines on coining URI references
> - DCMI perspective on "namespace hosting"
> - Annotate Glossary with DCMI terminology where appropriate
>
> Everyone
> - Using terms outside of their original contexts
> - Describe other notions of "application profile"
> - Comment on the role of the "vocabulary owner"
> - When and how to declare new or reuse existing terms
>
> --
> Dr. Thomas Baker                        Thomas.Baker@izb.fraunhofer.de
> Institutszentrum Schloss Birlinghoven         mobile +49-160-9664-2129
> Fraunhofer-Gesellschaft                          work +49-30-8109-9027
> 53754 Sankt Augustin, Germany                    fax +49-2241-144-2352
> Personal email: thbaker79@alumni.amherst.edu
>

Received on Friday, 29 October 2004 10:51:24 UTC