Re: telco today at 15:00 from Manuel Fiorelli on 2014-06-06 (public-ontolex@w3.org from June 2014)

From: Manuel Fiorelli <manuel.fiorelli@gmail.com>
Date: Fri, 6 Jun 2014 17:49:12 +0200
To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>
Cc: Jorge Gracia <jgracia@fi.upm.es>, Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>, "public-ontolex@w3.org" <public-ontolex@w3.org>
Message-ID: <CAGDmdGi-Eqdd8cys+Pa1UyuU8LUs6Xvrdwk0DAUXULywADOMWQ@mail.gmail.com>
Hi John, All

Concerning the property language, I noticed that it is defined in terms of
ISO 639-{1,3} codes, while RDF 1.1 refers to BCP 47 for language tags (the
earlier RDF specification referred to a now obsolete RFC).

BCP 47 does reuse ISO codes, but it also makes some very specific choices.
For instance, I am quite sure that English should be expressed only as "en"
rather than "eng" (the official registry has no "eng" subtag:
http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry).
In such cases, the property language could hold values that should not
appear as language tags in the actual RDF data.

If my concerns are justified, then the following example from the Wiki would
be problematic (apologies if this has already been addressed):

ex:lex_marry a ontolex:LexicalEntry ;
  ontolex:canonicalForm ex:form_marry ;
  ontolex:otherForm ex:form_marries .

ex:form_marry ontolex:writtenRep "marry"@eng .
ex:form_marries ontolex:writtenRep "marries"@eng .
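
For comparison, a sketch of how the same forms might look with the two-letter
BCP 47 tag. The ontolex: namespace URI and the ex: namespace below are
assumptions made for illustration and are not taken from the Wiki:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .  # assumed namespace URI
@prefix ex:      <http://example.org/> .                  # hypothetical namespace

# Same forms as above, tagged with the registered subtag "en" instead of "eng".
ex:form_marry   ontolex:writtenRep "marry"@en .
ex:form_marries ontolex:writtenRep "marries"@en .
```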

Moreover, if we allow ontolex:languageURI to identify languages outside the
ISO repertoire, then there may be no language tag at all that we could use
for them.

Should we avoid language tags altogether, and instead rely on the use of
ontolex:language for each lexical form?
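
A rough sketch of what that alternative might look like, with plain literals
and the language stated per form. Whether ontolex:language may be attached to
a form at all is part of the open question; the namespace URIs are assumptions
made for illustration:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .  # assumed namespace URI
@prefix ex:      <http://example.org/> .                  # hypothetical namespace

# No language tag on the literal; the language is given by a separate property.
ex:form_marry ontolex:writtenRep "marry" ;
  ontolex:language "en" .   # hypothetical: ontolex:language used on a form

ex:form_marries ontolex:writtenRep "marries" ;
  ontolex:language "en" .
```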

One interesting feature of BCP 47 is its ability to represent regional
variants, such as en-GB or en-US. I suspect that ISO 639-{1,3} codes cannot
represent these variants. Do we care about them?
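
To make the point concrete, a small sketch of region-qualified tags on written
representations. The forms ex:form_colour and ex:form_color and both namespace
URIs are hypothetical, chosen only for illustration:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .  # assumed namespace URI
@prefix ex:      <http://example.org/> .                  # hypothetical namespace

# BCP 47 tags combine a language subtag with a region subtag.
ex:form_colour ontolex:writtenRep "colour"@en-GB .
ex:form_color  ontolex:writtenRep "color"@en-US .
```

A bare ISO 639 code such as "en" or "eng" identifies only the language, so it
could not distinguish these two spellings by itself.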

Furthermore, concerning the existence of two related properties, I wonder
whether they are formally related or not. For instance, can they be used
together, or are they mutually exclusive?
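
If the two properties can indeed be used together, the data might look roughly
like the sketch below. The choice of subject, the namespace URIs, and the
pairing of "en" with the LexVo identifier for English are assumptions made for
illustration, not something the proposal states:

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .  # assumed namespace URI
@prefix ex:      <http://example.org/> .                  # hypothetical namespace

# Both properties on the same resource: a string code and a language URI.
ex:lex_marry a ontolex:LexicalEntry ;
  ontolex:language "en" ;
  ontolex:languageURI <http://lexvo.org/id/iso639-3/eng> .
```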



2014-06-06 16:27 GMT+02:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:

> Hi Gil, Jorge,
>
> Thanks for the comment, we have discussed it in the telco. The decision
> that is proposed is to have two properties
>
>    - *language* whose value must be a two-letter ISO 639-1 code or a
>    three-letter ISO 639-3 code (ISO 639-2 is not supported to avoid ambiguity)
>    - *languageURI* whose value should refer to an RDF language resource,
>    for example the Library of Congress identifier or (better) the LexVo
>    identifier
>
> The second property is better from a semantic point of view (as we can use
> the extra information given by LexVo) and allows us to refer to definitions
> for languages that don't have an ISO code (e.g., Dothraki, Jèrriais).
>
> Are there any objections to this scheme?
>
> Regards,
> John
>
>
> On Fri, Jun 6, 2014 at 3:14 PM, Jorge Gracia <jgracia@fi.upm.es> wrote:
>
>> Hi Philipp,
>>
>> Let me add another issue for the first part
>>
>> 1.6) In ontolex:language, is it better to have a URI as range instead of
>> a String? See DCAT for instance:
>> http://www.w3.org/TR/vocab-dcat/#Property:catalog_language
>>
>> Regards,
>> Jorge
>>
>>
>>
>>
>> 2014-06-06 8:59 GMT+02:00 Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>:
>>
>>> Dear all,
>>>
>>> We have a few things to discuss today. I propose splitting the
>>> slot into two parts:
>>>
>>> 1) Discussion about ontolex changes (30 mins, with decisions on the
>>> single points)
>>>
>>>    1.1) Introducing Lexicalization into the core model (decision)
>>>    1.2) Naming the property between a "Lexical Sense" and a "Lexical
>>> Concept"; contains was not regarded as appropriate by many, so proposals on
>>> the table are: realizes/isRealizedBy, lexicalizes/isLexicalizedBy,
>>> instantiates/isInstantiatedBy, substantiates/isSubstantiatedBy,
>>> means/isMeaningOf as well as expresses/isExpressedBy; I am fine with at
>>> least 3 of them ;-)
>>>    1.3) Discussion: renaming property lexicalForm to simply "form"
>>>    1.4) Discussion: introducing property "definition" as a subproperty of
>>> rdfs:comment with domain ontolex:LexicalSense
>>>    1.5) Discussion: explicitly introducing the class "Reference" as the
>>> range of "reference", as we have it anyway in most of our diagrams; this
>>> has neither practical nor theoretical implications other than clarity
>>> (IMHO) and increasing the size of the module by one class
>>>
>>> 2) Discussion on lime proposal sent by Manuel/Armando (this assumes that
>>> Armando will be there to walk us through) -> 30 mins. (no decision)
>>>
>>> Btw: I finally managed to find a nice tool to produce UML-style
>>> visualizations of our models. It is called draw.io ;-) I attach a
>>> diagram that reflects the current state of the ontolex module. The diagram
>>> is in the GIT repo as well (where cardinalities are not indicated they are
>>> 0..n).
>>>
>>> I propose to postpone the discussion about Translation for another
>>> occasion. I need to make up my mind myself there. I will send a separate
>>> email on this.
>>>
>>> Access details can be found here as usual:
>>> https://www.w3.org/community/ontolex/wiki/Teleconference,_2014.06.06,_15-16_pm_CET
>>>
>>> Talk to you later!
>>>
>>> Philipp.
>>>
>>> --
>>>
>>> Prof. Dr. Philipp Cimiano
>>>
>>> Phone: +49 521 106 12249
>>> Fax: +49 521 106 12412
>>> Mail: cimiano@cit-ec.uni-bielefeld.de
>>>
>>> Forschungsbau Intelligente Systeme (FBIIS)
>>> Raum 2.307
>>> Universität Bielefeld
>>> Inspiration 1
>>> 33619 Bielefeld
>>>
>>>
>>
>>
>> --
>> Jorge Gracia, PhD
>> Ontology Engineering Group
>> Artificial Intelligence Department
>> Universidad Politécnica de Madrid
>> http://delicias.dia.fi.upm.es/~jgracia/
>>
>
>


-- 
Manuel Fiorelli
Received on Friday, 6 June 2014 15:49:42 UTC