OntoLex-FrAC: Suggested revisions for frac:Corpus and frac:total

Dear all,

at today's call, we discussed three open issues in the current FrAC draft,
with the goal of arriving at a consolidated status quo for the Day of W3C
Language Technology Community Groups (aka 4th OntoLex Face-to-Face meeting)
at LDK on Sep 12 [1]. The issues surrounding frac:Corpus and frac:total
have been open for a long time now. Instead of seeking consensus before
revising the diagram (as we tried for more than a year), we now propose a
revision for the community to approve.

I made a few suggestions, and we agreed to draft the revision and have it
approved by a wider audience (we were only 3-4 people in the call).
However, if any of the ideas below meets resistance from the very
beginning, please drop me a line, so that I can stop early and avoid
modelling and describing things that are unlikely to be approved.

This concerns the following changes:
- frac:corpus (pointing from a frac:Observation to the frac:Corpus in which
the observation was made) is to be renamed frac:observedIn (frac:locus
remains unchanged)
- frac:Corpus (representing a text, a collection of texts, annotated or
unannotated, or the bibliographical metadata of a text or a collection of
texts) was felt to be too broadly defined to be understood. Suggestion:
This is to be abandoned. Instead, we add a comment that the object of
frac:observedIn (formerly frac:corpus) should be defined as a dct:DCMIType
and give dct:Collection, dct:Dataset and dct:Text as examples in the text.
- with frac:Corpus abandoned, we can no longer define the domain of
frac:total in strict RDF semantics. Instead, we introduce it as a property
that can be used to describe any dct:DCMIType object.
- rename frac:CorpusFrequency to frac:Frequency (because there is no formal
corpus object anymore)
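
As a rough illustration of the two renames listed above, here is a minimal
Python sketch that rewrites triples from the current draft into the
proposed vocabulary. The triple representation (plain tuples of prefixed
names), the RENAMES table and the ex: example data are assumptions made for
illustration only; they are not part of the FrAC draft. How existing
"a frac:Corpus" statements should be migrated is not covered here.

```python
# Hypothetical rename table: term in the current draft -> proposed term.
# Only the two renames named in the list above are included.
RENAMES = {
    "frac:corpus": "frac:observedIn",
    "frac:CorpusFrequency": "frac:Frequency",
}


def migrate(triples):
    """Return a new list of triples with renamed terms substituted.

    Every position (subject, predicate, object) is checked, so the class
    rename also applies when the class appears as an rdf:type object.
    """
    return [
        tuple(RENAMES.get(term, term) for term in triple)
        for triple in triples
    ]


# Illustrative data only; ex:freq1 and ex:myCorpus are made-up resources.
old_triples = [
    ("ex:freq1", "rdf:type", "frac:CorpusFrequency"),
    ("ex:freq1", "frac:corpus", "ex:myCorpus"),
]

new_triples = migrate(old_triples)
for triple in new_triples:
    print(" ".join(triple), ".")
```

Terms that are not in the table (such as frac:locus) pass through
unchanged.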

In the past, we discussed at length how to bundle counts with units (say,
tokens, sentences, etc.). Suggestions:
- introduce frac:unit as a datatype property of frac:Frequency; all counts
of that frequency object are then relative to this unit.
- define the range of frac:total as frac:Frequency (rather than int), so we
can provide units along with counts
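
To make the motivation for bundling counts with units more concrete, here
is a small Python sketch. It is a plain-Python stand-in and not the RDF
model itself: the Frequency class, its value and unit fields, and the
relative_frequency helper are hypothetical names chosen for illustration.
The point is only that an observed count and a total can be compared
safely once both carry a unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frequency:
    """Stand-in for a frequency object: a count plus the unit it counts."""
    value: int
    unit: str  # e.g. "tokens" or "sentences"


def relative_frequency(observed, total):
    """Divide an observed count by a total, refusing mismatched units."""
    if observed.unit != total.unit:
        raise ValueError(
            f"unit mismatch: {observed.unit!r} vs {total.unit!r}"
        )
    if total.value <= 0:
        raise ValueError("total must be positive")
    return observed.value / total.value


# Illustrative numbers only.
observed = Frequency(value=25, unit="tokens")
total = Frequency(value=1000, unit="tokens")
share = relative_frequency(observed, total)
print(share)
```

With a bare int as the total, the mismatch between, say, a token count and
a sentence total could not be detected.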

We are aware that these revisions are rather deep, but they should resolve
the existing issues. If you are familiar with FrAC and have major concerns
about any of the suggestions above, please drop me a line. Otherwise, I
will prepare a revised text for the next call in two weeks, where it will
be discussed.

At the same time, people using FrAC in current publications should refer to
the model, which has otherwise remained stable for more than a year now, as
"OntoLex-FrAC draft of January 2023". The changes proposed here are not
backward-compatible, but the old and new terms will map onto each other
directly.

Best,
Christian

[1] https://www.w3.org/community/ontolex/wiki/W3c_community_day_@_LDK2023

Received on Thursday, 22 June 2023 14:11:57 UTC