Re: telco today at 15:00 from Philipp Cimiano on 2014-06-12 (public-ontolex@w3.org from June 2014)

From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Date: Thu, 12 Jun 2014 23:53:10 +0200
To: Manuel Fiorelli <manuel.fiorelli@gmail.com>, "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>
CC: Jorge Gracia <jgracia@fi.upm.es>, "public-ontolex@w3.org" <public-ontolex@w3.org>
Message-ID: <539A2146.8060309@cit-ec.uni-bielefeld.de>
Dear all,

I think it was clear that we recommend using BCP 47 in the context of
lemon. So yes, "eng" should be "en"; I changed this several times and
other people changed it back ;-)
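
For illustration, the wiki example quoted below would then read roughly
as follows. This is a minimal sketch only: the ex: namespace is a
placeholder, and the ontolex: prefix URI is assumed here rather than
taken from the wiki.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .   # assumed prefix URI
@prefix ex: <http://example.org/lexicon#> .                # hypothetical namespace

ex:lex_marry a ontolex:LexicalEntry ;
    ontolex:canonicalForm ex:form_marry ;
    ontolex:otherForm ex:form_marries .

# BCP 47 uses the two-letter subtag "en" for English, not "eng"
ex:form_marry ontolex:writtenRep "marry"@en .
ex:form_marries ontolex:writtenRep "marries"@en .
```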

Actually, according to the recommendations of the BPMLOD group, every
string should have a language tag, so we should follow this best
practice in our examples.

Yes, you can represent dialect variations using BCP 47, e.g. en-GB or
en-US, and these should be attached to different forms rather than to
the lexicon, as John mentioned.
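
A minimal sketch of what attaching regional tags to separate forms could
look like. The entry, the form identifiers and the ex: namespace are
hypothetical, the ontolex: prefix URI is assumed, and the choice of
ontolex:lexicalForm for both forms is just one possible modelling, not a
group decision.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .   # assumed prefix URI
@prefix ex: <http://example.org/lexicon#> .                # hypothetical namespace

# One lexical entry with two regional spellings, each as its own form
ex:lex_colour a ontolex:LexicalEntry ;
    ontolex:lexicalForm ex:form_colour_gb , ex:form_color_us .

# The BCP 47 region subtag sits on the string literal of each form
ex:form_colour_gb a ontolex:Form ;
    ontolex:writtenRep "colour"@en-GB .
ex:form_color_us a ontolex:Form ;
    ontolex:writtenRep "color"@en-US .
```

The language information here lives on the written representations
themselves, so nothing has to be stated at the level of the lexicon.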

Hope this clarifies.

Philipp.

Am 06.06.14 17:49, schrieb Manuel Fiorelli:
> Hi John, All
>
> Concerning the property language, I noticed that it is defined on the 
> ISO 639-{1,3} codes, while RDF 1.1 refers to BCP 47 (RDF referred to a 
> now obsolete RFC).
>
> In fact, BCP 47 reuses ISO codes, but it also commits to very specific 
> decisions. For instance, I am quite sure that English should be 
> expressed only as "en" rather than "eng" (in the official registry 
> there is no eng tag: 
> http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry). 
> In such cases, we could have values for the property language that 
> should not appear as language tags in the actual RDF data.
>
> If my concerns are true, then the following example from the Wiki 
> would be problematic (excuse me if the problem has already been 
> addressed):
> ex:lex_marry a ontolex:LexicalEntry ;
>    ontolex:canonicalForm ex:form_marry ;
>    ontolex:otherForm ex:form_marries .
>
> ex:form_marry ontolex:writtenRep "marry"@eng .
> ex:form_marries ontolex:writtenRep "marries"@eng .
>
> Moreover, if we allow ontolex:languageUri to represent any language 
> beyond the scope of the ISO repertory, then we could not have any 
> language tag to use.
>
> Should we avoid language tags altogether, and instead rely on the use 
> of ontolex:language for each lexical form?
>
> One interesting feature of BCP 47 is the ability to represent country 
> variations, such as en-GB or en-US. I suspect that ISO 639-{1,3} codes 
> cannot represent these variations. Do we care about them?
>
> Furthermore, concerning the existence of two related properties, I 
> wonder whether they are formally related or not. For instance, can 
> they be used together, or are they mutually exclusive?
>
>
>
> 2014-06-06 16:27 GMT+02:00 John P. McCrae 
> <jmccrae@cit-ec.uni-bielefeld.de>:
>
>     Hi Gil, Jorge,
>
>     Thanks for the comment, we have discussed it in the telco. The
>     decision that is proposed is to have two properties
>
>       * *language* whose value must be a two-letter ISO 639-1 code or
>         a three-letter ISO 639-3 code (ISO 639-2 is not supported, to
>         avoid ambiguity)
>       * *languageURI* whose value should refer to an RDF language
>         resource, for example the Library of Congress identifier or
>         (better) the LexVo identifier
>
>     The second property is better from a semantic point of view (as we
>     can use the extra information given by LexVo) and allows us to
>     refer to definitions for languages that don't have an ISO code
>     (e.g., Dothraki, Jèrriais)
>
>     Are there any objections to this scheme?
>
>     Regards,
>     John
>
>
>     On Fri, Jun 6, 2014 at 3:14 PM, Jorge Gracia <jgracia@fi.upm.es>
>     wrote:
>
>         Hi Philipp,
>
>         Let me add another issue for the first part
>
>         1.6) In ontolex:language, is it better to have a URI as range
>         instead of a string? See DCAT for instance
>         http://www.w3.org/TR/vocab-dcat/#Property:catalog_language
>
>         Regards,
>         Jorge
>
>
>
>
>         2014-06-06 8:59 GMT+02:00 Philipp Cimiano
>         <cimiano@cit-ec.uni-bielefeld.de>:
>
>             Dear all,
>
>             We have a few things to discuss today; I would propose
>             splitting the slot in two parts:
>
>             1) Discussion about ontolex changes (30 mins, with
>             decisions on the single points)
>
>                1.1) Introducing Lexicalization into the core model
>             (decision)
>                1.2) Naming the property between a "Lexical Sense" and
>             a "Lexical Concept"; contains was not regarded as
>             appropriate by many, so proposals on the table are:
>             realizes/isRealizedBy, lexicalizes/isLexicalizedBy,
>             instantiates/isInstantiatedBy,
>             substantiates/isSubstantiatedBy, means/isMeaningOf as well
>             as expresses/isExpressedBy; I am fine with at least 3 of
>             them ;-)
>                1.3) Discussion: renaming property lexicalForm to
>             simply "form"
>                1.4) Discussion: introducing property "definition" as a
>             subproperty of rdfs:comment with domain ontolex:LexicalSense
>                1.5) Discussion: explicitly introducing the class
>             "Reference" as the range of "reference" as we have it
>             anyway in most of our diagrams; this has neither practical
>             nor theoretical implications other than clarity (IMHO) and
>             increasing the size of the module by one class
>
>             2) Discussion on lime proposal sent by Manuel/Armando
>             (this assumes that Armando will be there to walk us
>             through) -> 30 mins. (no decision)
>
>             Btw: I finally managed to find a nice tool to produce
>             UML-style visualizations of our models. It is called
>             draw.io ;-) I attach a diagram that
>             reflects the current state of the ontolex module. The
>             diagram is in the GIT repo as well (where cardinalities
>             are not indicated they are 0..n).
>
>             I propose to postpone the discussion about Translation for
>             another occasion. I need to make up my mind myself there.
>             I will send a separate email on this.
>
>             Access details can be found here as usual:
>             https://www.w3.org/community/ontolex/wiki/Teleconference,_2014.06.06,_15-16_pm_CET
>
>             Talk to you later!
>
>             Philipp.
>
>             -- 
>
>             Prof. Dr. Philipp Cimiano
>
>             Phone: +49 521 106 12249
>             Fax: +49 521 106 12412
>             Mail: cimiano@cit-ec.uni-bielefeld.de
>
>             Forschungsbau Intelligente Systeme (FBIIS)
>             Raum 2.307
>             Universität Bielefeld
>             Inspiration 1
>             33619 Bielefeld
>
>
>
>
>         -- 
>         Jorge Gracia, PhD
>         Ontology Engineering Group
>         Artificial Intelligence Department
>         Universidad Politécnica de Madrid
>         http://delicias.dia.fi.upm.es/~jgracia/
>
>
>
>
>
> -- 
> Manuel Fiorelli


-- 

Prof. Dr. Philipp Cimiano

Phone: +49 521 106 12249
Fax: +49 521 106 12412
Mail: cimiano@cit-ec.uni-bielefeld.de

Forschungsbau Intelligente Systeme (FBIIS)
Raum 2.307
Universität Bielefeld
Inspiration 1
33619 Bielefeld
Received on Thursday, 12 June 2014 21:53:42 UTC