- From: Jorge Gracia <jgracia@fi.upm.es>
- Date: Tue, 10 Oct 2017 11:56:48 +0200
- To: Tom Knorr <tknorr@neurocollective.com>
- Cc: "public-ontolex@w3.org" <public-ontolex@w3.org>
- Message-ID: <CANzuSaM7FNPjGcScMV_cUNoTPCGRMr7iuvPfTujQVuWu_Oa5vg@mail.gmail.com>
Dear Tom,

Thanks for sharing your thoughts. Let me comment on a couple of points you raised:

* about context, sense-ordering, and examples

> In the future, where an electronic personal assistant holds a conversation with the user,
> the context is known and should re-prioritize word choices and samples, even withhold
> dictionary entries that are clearly not in context.
> The computers can do that, I would not spend much time on in what sequence the
> entries should be displayed/stored. If anything, I would focus on the ability of the computer
> to select based on context.

In my view, this translates into a model that supports ANY ordering criterion and is agnostic of the particular sequence in which the entries should be displayed/stored. That sequence is left to the dictionary creator's choice or to an NLP service that assigns it dynamically according to the context. In principle we should focus on the modelling side (which will be discussed in issue 6).

* about provenance

> What I think is important is that the sources of the dictionary entries are tracked,
> in order to track back where entries came from, until they are confirmed from
> several reputable sources.

There are "standard" metadata vocabularies (PROV-O, DC-Terms) that support provenance description. I do not think that additional properties are needed in the case of dictionaries. However, I think it might be useful to describe a new issue or "best practice" to clarify how to do that.

Best,
Jorge

2017-10-10 1:19 GMT+02:00 Tom Knorr <tknorr@neurocollective.com>:

> I have some observations on the current discussion of the open issues.
>
> I1: The German ‘Leiter’ can be feminine and translate into en:ladder,
> a noun, or it can be masculine and translate into en:leader, either a noun
> or a role, if you support that. The two are unrelated or, if at all, only
> very esoterically related; one could maybe construct that if you move up
> the corporate ladder you end up being a leader.
> Other than the article they have a common
> morphology but they should not be the same lexical entry.
>
> A bit more interesting is the German ‘Bank’. ‘Die Bank’, the bench, has a
> plural of ‘Bänke’ while ‘die Bank’, the place you bring your money to, has
> a plural of ‘Banken’. Entirely different morphology but same gender.
> Definitely different lexical entries.
>
> Off the top of my head, these are examples of many I encountered processing
> Europarl and the German Wikipedia dump. It required us to modify some code
> to properly map the morphology to the corresponding lexemes.
>
>
> I have an opinion about the usage examples and ordering of dictionary
> entries.
>
> Usage examples are useful but they should come from the semantically
> linked data that is behind the dictionary entries. I think we need to take
> a hard look at how the systems are going to be used in the future. In a
> printed dictionary, even if it is a translation app, the publisher does not
> know what the user intends to look up, therefore the ‘printed’ dictionary
> needs a representative set of examples and an ordering that is based on some
> acceptable scientific reasoning. In the future, where an electronic personal
> assistant holds a conversation with the user, the context is known and
> should re-prioritize word choices and samples, even withhold dictionary
> entries that are clearly not in context. The computers can do that, I would
> not spend much time on in what sequence the entries should be
> displayed/stored. If anything, I would focus on the ability of the computer to
> select based on context.
>
> What I think is important is that the sources of the dictionary entries
> are tracked, in order to track back where entries came from, until they are
> confirmed from several reputable sources.
>
> I will try to catch the call tomorrow, if I get out of bed in time. The
> call is at 5am PST for me. If not I will read the transcript and answer
> through the list.
>
>
> regards
>
> Tom
>
> On 10/02/2017 07:28 AM, Philipp Cimiano wrote:
>
> Dear all,
>
> I propose we have another ontolex telco on the 10th of October, 14:00 CEST.
>
> I propose we continue discussing the concrete examples that Julia and
> Jorge have been preparing.
>
> I think the conclusion we had is that we wanted to continue working
> bottom-up from examples of current lexica and then try to get an
> abstract model that is able to accommodate future dictionaries that are
> native LLD dictionaries.
>
> Let's try!
>
> We will again hold the meeting via Skype. It worked quite well last time.
>
> Greetings,
>
> Philipp.
>
>
> --
> Tom Knorr
> Independent Consultant currently working on The NeuroCollective
> www.NeuroCollective.com
>
> Blog: http://www.neurocollective.com/blog/
>
> LinkedIn: https://www.linkedin.com/in/tom-knorr-4406406/
>

--
Jorge Gracia, PhD
Ontology Engineering Group
Artificial Intelligence Department
Universidad Politécnica de Madrid
http://jogracia.url.ph/web/
Received on Tuesday, 10 October 2017 09:57:39 UTC