Re: FRaC Faliscan language Example

From: Thierry Declerck <declerck@dfki.de>
Date: Mon, 8 Mar 2021 19:42:55 +0100
To: Christian Chiarcos <christian.chiarcos@web.de>, Fahad Khan <anasfkhan81@gmail.com>
Cc: public-ontolex <public-ontolex@w3.org>, Valeria Quochi <vquochi@gmail.com>
Message-ID: <1045bbb6-2b23-2b07-ff84-040aff53e175@dfki.de>
Dear All,

Thanks for the interesting discussion. Just to mention that I had 
similar issues: with Forms that have the same writtenRep, the same 
case/gender, but different pronunciation or different stress 
(https://de.wiktionary.org/wiki/Vollzug -- Wiktionary has 2 entries for 
this example).

I played with this kind of representation (making use of Lexicog):

Similar with "lead" in English, with two pronunciations, where I tried 
something a bit different, with two entries and two forms:

Well I do not know if this fits in this thread.

On 08.03.2021 at 18:42, Christian Chiarcos wrote:
> On Mon, 8 Mar 2021 at 18:07, Fahad Khan <anasfkhan81@gmail.com> wrote:
>     Dear Christian, all,
>     On Mon, 8 Mar 2021 at 16:54, Christian Chiarcos
>     <christian.chiarcos@web.de> wrote:
>         Hi Fahad, dear all,
>         On Mon, 8 Mar 2021 at 12:17, Fahad Khan
>         <anasfkhan81@gmail.com> wrote:
>             Hi Everyone,
>             I have been working on modelling an entry from a lexicon
>             currently being compiled as part of an Italian project on
>             Italic languages and I think it potentially shows some
>             limitations in the current ontolex/FRaC approach.  I would
>             like to discuss this at the next telco but I will give a
>             description here in order to get some feedback from the
>             list too.
>             In the example in question we have a Faliscan word,
>             ekupetaris, which has different attested representations
>             for the same form (or same morphological variant). That
>             is, the masculine, nominative, singular form has been
>             attested in the following written variants:  "ECVPETARIS",
>             "EQUPETARS", "ekupetaris", "ekvopetaris", "ekvopetars",
>             "epetaris", "eppetaris".  Each of these written variants
>             has at least one attestation in some inscription. In the
>             case of "ekupetaris" there are four different
>             attestations; the others have one apiece.
>             According to the ontolex-lemon model these are all written
>             representations of the same Form element (the masculine,
>             nominative, singular form of the noun).
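The single-Form reading described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Form` class and the `ATTESTATION_COUNTS` name are hypothetical stand-ins, not part of OntoLex, FRaC, or any library; the spellings and counts are the ones given in the message (four attestations for "ekupetaris", one for each of the others).

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for an ontolex:Form; not a real library class.
@dataclass
class Form:
    features: tuple                                   # morphosyntactic description
    written_reps: list = field(default_factory=list)  # plain strings, like writtenRep literals


# Attestation counts per spelling, as stated in the message above.
ATTESTATION_COUNTS = {
    "ECVPETARIS": 1,
    "EQUPETARS": 1,
    "ekupetaris": 4,
    "ekvopetaris": 1,
    "ekvopetars": 1,
    "epetaris": 1,
    "eppetaris": 1,
}

# One Form carrying all seven spellings.
nom_sg_masc = Form(
    features=("masculine", "nominative", "singular"),
    written_reps=list(ATTESTATION_COUNTS),
)

# If attestations can only be attached to the Form as a whole,
# the per-spelling breakdown collapses into a single total.
total_attestations = sum(ATTESTATION_COUNTS.values())
print(len(nom_sg_masc.written_reps), total_attestations)
```

In this sketch the written representations are bare strings, so there is nothing for an individual attestation to point to except the Form itself, which seems to be the limitation under discussion.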
>         You seem to assume that the same features for the same lexical
>         representation lead to exactly one Form. I don't think this is
>         required. In fact, we can have different forms with identical
>         features but differences in usage. Think of English "has" and
>         "hath", which probably should be two forms. Despite both being
>         3.sg.ind.prs, they are not interchangeable. Looking at your
>         examples, these forms also differ *phonologically*, not just
>         orthographically. There are at least five phonologically
>         differentiable forms here:
>         "ECVPETARIS", "ekupetaris",
>         "EQUPETARS",
>         "ekvopetaris",
>         "ekvopetars",
>         "epetaris", "eppetaris"
>         Everything else is just orthography. If your resource
>         *decides* to define forms as phonologically-based (this is not
>         required), these would probably be it.
>         However, this is pre-standardized writing, and you could go as
>         far as to distinguish every attested form simply because you
>         can *never* be certain whether there really are no
>         phonological differences (epe- vs. eppe- may be a difference,
>         for example).
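The two options mentioned above (forms grouped phonologically, or one form per attested spelling) can be compared in a short Python sketch. The grouping is copied from the list in the message; the function names and the dictionary layout are hypothetical and only serve the illustration.

```python
# Grouping of spellings into phonologically differentiable forms,
# copied from the list in the message above.
PHONOLOGICAL_GROUPS = [
    ["ECVPETARIS", "ekupetaris"],
    ["EQUPETARS"],
    ["ekvopetaris"],
    ["ekvopetars"],
    ["epetaris", "eppetaris"],
]

# All forms share the same morphosyntactic features.
FEATURES = ("masculine", "nominative", "singular")


def forms_by_phonology(groups):
    """One form per phonological group; a form may carry several spellings."""
    return [{"features": FEATURES, "written_reps": list(group)} for group in groups]


def forms_by_attested_spelling(groups):
    """One form per attested spelling (the maximally cautious reading)."""
    return [
        {"features": FEATURES, "written_reps": [rep]}
        for group in groups
        for rep in group
    ]


print(len(forms_by_phonology(PHONOLOGICAL_GROUPS)))
print(len(forms_by_attested_spelling(PHONOLOGICAL_GROUPS)))
```

Under the first option there are five forms, two of which keep more than one written representation; under the second there are seven, each with exactly one spelling, so any attestation of a form is also an attestation of that spelling.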
>     The ontolex guidelines are seemingly clear on this: that Form
>     should be used only for morphological or grammatical variation
>     (which afaik is usually defined as morphosyntactic variation). To
>     underline this, the example of "privacy" is given as a Form with
>     two different phonetic representations (Lexicon Model for
>     Ontologies: Community Report, 10 May 2016 (w3.org)
>     <https://www.w3.org/2016/05/ontolex/#forms>). If you are
>     suggesting that we could consider other kinds of relevant
>     variation (e.g., representing phonological differences) in
>     defining forms then the guidelines should probably be adjusted (as
>     Gilles' mail and other comments I've heard suggest, there is at
>     least a potential ambiguity in the current wording).
>     Indeed the solution which you suggest in which each variant would
>     be a separate Form, so that we would have seven forms (each of
>     which is marked as singular, masculine, nominative) with their own
>     separate attestations and written representations is the one which
>     we had originally wanted to use before checking what the
>     guidelines said.  (In addition, the OED, for instance, seems to
>     use Form in the broader sense which you mean, Christian, rather
>     than the one in the guidelines.)
>             Another possibility could be the creation of a new class
>             (in FRaC), something like AttestedRepresentation, which is
>             also a FRaC observable, with associated properties
>             attestedRep and stringValue such that writtenRep is
>             equivalent to the property chain attestedRep o stringValue.
>         I would rather avoid that. For many reasons: I'm not sure we
>         can axiomatize the values of datatype properties in this way.
>         It would create something nearly identical with Attestation,
>         leading to a lot of confusion among users of FRaC. If this is
>         an observable, this would mean that it can have Attestations
>         in its own right -- what is an Attestation of an
>         AttestationRep? It would introduce at least two new properties
>         and one new class (as opposed to just one reifiable vartrans
>         property that uses the same construction template as we
>         previously used for lexical relations), and it would be
>         *highly specific* for a use case relevant for epigraphy -- but
>         not much beyond that (I might be wrong on that one). For a
>         vartrans relation between forms, I can see other uses (e.g.,
>         systematic mappings between related forms of different
>         lexemes, e.g., from different languages). For the
>         attestationRepresentation, I'm not sure these do exist.
>     The idea would be for AttestedRepresentation to be a reification
>     of a writtenRep
> If so, we should probably just reify writtenRep. I can see other uses 
> for that, but it would be a major change to the OntoLex core. And one 
> might wonder what the difference to Form is.
> I'm actually in favor of rewording the OntoLex spec to make that more 
> explicit. It meets a demand, it seems to follow current practice and 
> (in my reading) it doesn't contradict anything in the spec. The 
> suggested change is minor:
> "Different forms are used to express different morphological forms of 
> the entry."
> =>
> "Different forms of the same entry are used to express forms that 
> differ in their (morphological or phonological) structure."
> Note that "differ in their (phonological) structure" is sufficiently 
> general to admit multiple phonologicalReps for BE and AE at the same 
> form -- if a data provider asserts that these do not differ in their 
> phonological *structure*, but just in their articulation. (All 
> phonemes are 1:1 mappable.)
> But if data providers *want to actually assert* that these are 
> different in structure (in this case, few people would -- unless you 
> go into dialectology, where these differences are the subject of 
> investigation), they would be able to do so. Personally, I would 
> prefer to keep OntoLex sufficiently flexible to not systematically 
> rule out entire fields of research ;)
> Best,
> Christian
>     which we could predicate additional information of, in particular
>     via the use of an Attestation. The Attestation of an
>     AttestationRep would be a text (or locus) in which a form is
>     spelled/written in a certain way. For instance, say we wanted to
>     add information to a lexicon about the first attestation of the
>     spelling of the word colour as "color" to check whether the
>     American or the English spelling came first. Currently we
>     can't do this.
>         Best,
>         Christian
>     Cheers
>     Fahad
Thierry Declerck
Senior Consultant at DFKI GmbH, Multilinguality and Language Technology
Stuhlsatzenhausweg, 3
D-66123 Saarbruecken
Phone: +49 681 / 857 75-53 58
Fax: +49 681 / 857 75-53 38
email: declerck@dfki.de

Deutsches Forschungszentrum für Künstliche Intelligenz GmbH
Trippstadter Strasse 122, D-67663 Kaiserslautern, Germany

Prof. Dr. Antonio Krüger

Chairman of the Supervisory Board:
Dr. Gabriël Clemens
Amtsgericht Kaiserslautern, HRB 2313
Received on Monday, 8 March 2021 18:43:18 UTC