- From: Tom Knorr <tknorr@NeuroCollective.com>
- Date: Mon, 9 Oct 2017 16:19:21 -0700
- To: public-ontolex@w3.org
- Message-ID: <ea71a751-54cb-c69d-7015-54da96085903@NeuroCollective.com>
I have some observations on the current discussion of the open issues.

I1: The German ‘Leiter’ can be feminine and translate into en:ladder, a noun, or it can be masculine and translate into en:leader, either a noun or a role, if you support that. The two senses are unrelated, or at most related in a very esoteric way; one could perhaps argue that if you move up the corporate ladder you end up being a leader. Other than the article they have a common morphology, but they should not be the same lexical entry.

A bit more interesting is the German ‘Bank’. ‘Die Bank’, the bench, has the plural ‘Bänke’, while ‘die Bank’, the place you bring your money to, has the plural ‘Banken’. Entirely different morphology but the same gender. Definitely different lexical entries.

Off the top of my head, these are two of the many examples I encountered while processing Europarl and the German Wikipedia dump. We had to modify some code to map the morphology properly to the corresponding lexemes.

I have an opinion about the usage examples and the ordering of dictionary entries. Usage examples are useful, but they should come from the semantically linked data that is behind the dictionary entries. I think we need to take a hard look at how the systems are going to be used in the future. With a printed dictionary, even if it is a translation app, the publisher does not know what the user intends to look up, so the ‘printed’ dictionary needs a representative set of examples and an ordering that is based on some acceptable scientific reasoning. In the future, when an electronic personal assistant holds a conversation with the user, the context is known and should re-prioritize word choices and samples, and even withhold dictionary entries that are clearly out of context. Computers can do that, so I would not spend much time on the sequence in which the entries are displayed or stored. If anything, I would focus on the ability of the computer to select based on context.
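To make the ‘Leiter’ and ‘Bank’ cases and the context-based selection concrete, here is a minimal Python sketch. It is an illustration only, not the code mentioned above and not the OntoLex model: the `LexicalEntry` class, its fields, and the `domains` keyword sets are all hypothetical, and plurals are filled in only where this message states them.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class LexicalEntry:
    """One lexeme. Homographs get separate entries (hypothetical structure)."""
    lemma: str
    gender: str                   # grammatical gender: "f" or "m"
    gloss: str                    # English translation
    domains: FrozenSet[str]       # illustrative context keywords, assumed here
    plural: Optional[str] = None  # None where the message gives no plural


ENTRIES = [
    LexicalEntry("Leiter", "f", "ladder", frozenset({"climb", "roof"})),
    LexicalEntry("Leiter", "m", "leader", frozenset({"team", "company"})),
    LexicalEntry("Bank", "f", "bench", frozenset({"park", "sit"}), plural="Bänke"),
    LexicalEntry("Bank", "f", "bank", frozenset({"money", "loan"}), plural="Banken"),
]


def entries_for_form(form):
    """Map a surface form back to every entry it can belong to."""
    return [e for e in ENTRIES if form in (e.lemma, e.plural)]


def rank_by_context(lemma, context_words):
    """Order the entries for a lemma by keyword overlap with the context."""
    context = set(context_words)
    candidates = [e for e in ENTRIES if e.lemma == lemma]
    # sorted() is stable, so entries with equal overlap keep their stored order.
    return sorted(candidates, key=lambda e: len(e.domains & context), reverse=True)
```

With these entries, `entries_for_form("Banken")` returns only the money sense, while `entries_for_form("Leiter")` returns both homographs, and `rank_by_context("Bank", ["money", "loan"])` puts the money sense first. The stored order of `ENTRIES` matters only as a tie-breaker; the context does the real prioritization.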
What I think is important is that the sources of the dictionary entries are tracked, so that entries can be traced back to where they came from until they have been confirmed by several reputable sources.

I will try to catch the call tomorrow, if I get out of bed in time. The call is at 5am PST for me. If not, I will read the transcript and answer through the list.

regards
Tom

On 10/02/2017 07:28 AM, Philipp Cimiano wrote:
> Dear all,
>
> I propose we have another ontolex telco on the 10th of October, 14:00 CEST.
>
> I propose we continue discussing the concrete examples that Julia and
> Jorge have been preparing.
>
> I think the conclusion we had is that we wanted to continue working
> bottom-up from examples of current lexica and then try to get an
> abstract model that is able to accommodate future dictionaries that are
> native LLD dictionaries.
>
> Let's try!
>
> We will again hold the meeting via skype. It worked quite well last time.
>
> Greetings,
>
> Philipp.
>

--
Tom Knorr
Independent Consultant
currently working on The NeuroCollective
www.NeuroCollective.com
Blog: http://www.neurocollective.com/blog/
LinkedIn: https://www.linkedin.com/in/tom-knorr-4406406/
Received on Monday, 9 October 2017 23:19:46 UTC