Re: frequency dictionaries?

From: Christian Chiarcos <christian.chiarcos@web.de>
Date: Mon, 2 Jul 2018 17:18:03 +0200
Message-ID: <CAC1YGdgofwcfGwR5+x5ihgcd3e32XzO1_GpUB0z5GKon=BhLBg@mail.gmail.com>
To: John McCrae <john.mccrae@insight-centre.org>
Cc: public-ontolex <public-ontolex@w3.org>

>> I think the least abusive way of applying ontolex here would be to use an
>> application-specific frequency property and multiple lexinfo:partOfSpeech
>> properties, with a blank node as argument and one associated OLiA class
>> each. As the blank node does not exhibit a unique reference, all blank
>> nodes could in theory resolve to the same URI, so formally, the
>> one-POS-per-entry constraint isn't broken. But this clearly is a hack and
>> I'm not sure this should be recommended.
> I think you need to introduce a specific modelling as I can't see such a
> modelling encouraging reuse and semantic interoperability. For OntoLex, it
> would be great if you could propose such a model that could be introduced
> into the lexicography module.

I just talked with Julia about introducing a lex:freq property for
absolute counts, and she also had the idea that synBehavior may be a place
to record prepositional vs. complementizer uses independently of
lexinfo:partOfSpeech. Making the SyntacticFrame argument (that synBehavior
requires) an instance of olia:SubordinatingConjunction or olia:Adposition,
respectively, would be the most compact encoding.
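
A minimal sketch of what this compact encoding might look like, using the
counts from the quoted example further down. The lex: and default
namespaces are placeholders, and attaching lex:freq directly to the frame
node is an assumption; the thread leaves open where the count would go.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix synsem:  <http://www.w3.org/ns/lemon/synsem#> .
@prefix olia:    <http://purl.org/olia/olia.owl#> .
@prefix lex:     <http://example.org/lex#> .    # hypothetical namespace for the proposed lex:freq
@prefix :        <http://example.org/dict#> .   # hypothetical dictionary namespace

:after a ontolex:LexicalEntry ;
    # prepositional use: the frame itself is typed with the OLiA class
    synsem:synBehavior [
        a synsem:SyntacticFrame, olia:Adposition ;
        lex:freq 90        # assumption: absolute count attached to the frame
    ] ;
    # complementizer use
    synsem:synBehavior [
        a synsem:SyntacticFrame, olia:SubordinatingConjunction ;
        lex:freq 10
    ] .
```

Each use costs one blank node, and lexinfo:partOfSpeech is not touched at
all, so the one-POS-per-entry wording is not an issue here.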

I think that a native ontolex+lexinfo solution might work, too (with some
extensions of lexinfo), but I am not a big fan of this solution because of
its verbosity (in modelling, but more importantly, for querying): Prepositional
uses may be encoded with lexinfo:PrepositionFrame with nominal complement,
subordinating uses with lexinfo:PrepositionFrame with clausal complement.
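
To illustrate the verbosity, a rough sketch of the native variant follows.
The ex: properties stand in for the lexinfo extensions mentioned above;
they are placeholders invented for this illustration and not existing
lexinfo terms, and the default namespace is equally hypothetical.

```turtle
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix synsem:  <http://www.w3.org/ns/lemon/synsem#> .
@prefix lexinfo: <http://www.lexinfo.net/ontology/2.0/lexinfo#> .
@prefix ex:      <http://example.org/ext#> .    # hypothetical extension vocabulary
@prefix :        <http://example.org/dict#> .   # hypothetical dictionary namespace

:after a ontolex:LexicalEntry ;
    synsem:synBehavior :after_prep_frame, :after_subord_frame .

# prepositional use: frame with a nominal complement
:after_prep_frame a lexinfo:PrepositionFrame ;
    ex:nominalComplement [ a synsem:SyntacticArgument ] .   # placeholder property

# subordinating use: same frame class, but with a clausal complement
:after_subord_frame a lexinfo:PrepositionFrame ;
    ex:clausalComplement [ a synsem:SyntacticArgument ] .   # placeholder property
```

A query that separates the two uses then has to go from the entry through
the frame to the type of complement, instead of just reading off a class.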

Just having two POS would be way more natural, though. I guess changing the
wording in the description is not an option? This is just about dropping
the word "single" -- which has been suggested before AFAIK, and for
independent reasons.


> John
>> A few triples more than in my original modelling, but it would work for
>> me.
>> Thanks,
>> Christian
>> PS: In fact, with the blank objects, we can have something almost
>> equivalent without reification:
>> _:after lexinfo:partOfSpeech [ a olia:SubordinatingConjunction;
>> prov:wasDerivedFrom <https://catalog.ldc.upenn.edu/ldc99t42>; my:freq
>> "10"; lexinfo:confidence "0.1" ].
>> _:after lexinfo:partOfSpeech [ a olia:Adposition; prov:wasDerivedFrom
>> <https://catalog.ldc.upenn.edu/ldc99t42>; my:freq "90"; lexinfo:confidence
>> "0.9" ].
>> This means we have multiple lexical-entry-specific POS categories. This
>> is much more readable but less precise, as unifying both blank nodes just
>> gives nonsense.
Received on Monday, 2 July 2018 15:18:27 UTC