- From: Felix Sasaki <fsasaki@w3.org>
- Date: Mon, 25 Jun 2012 10:32:45 +0200
- To: Tadej Štajner <tadej.stajner@ijs.si>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czqYgoDNewDbbinbXsu7s9JTXWnSZwa-Rh=0+6FSza1fTA@mail.gmail.com>
Hi Tadej,

sorry for the late reply. So this sounds like we would have an "entity" data category instead of "disambiguation". Disambiguation would then be one usage scenario for "entity".

I had proposed at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0133.html
that you, Tadej, write a "disambiguation" section, but maybe it makes sense to have an "entity" section with use cases (and markup) for "named entity" and "word sense disambiguation". The "terminology" aspect (linking to a term lexicon) could be realized by updating the existing terminology data category with a lexicon link.

What do you or others think?

Best,

Felix

2012/6/21 Tadej Štajner <tadej.stajner@ijs.si>

> Hi,
> this is feasible. The rationale behind my decision was that having
> individual attributes for different relationships is less verbose, at the
> expense of having more attributes in the spec. If minimising the latter is
> higher priority, then I agree with this way.
>
> Some points: in example 2, this syntax has no way to simultaneously
> express that "Mike Jones" can actually be described with a pointer to
> a resource (let's say, http://dbpedia.org/resource/Mike_Jones_(poet)).
> So, basically, saying both that he is a Person and that he's actually some
> concrete person. This entails introducing this distinction:
>
> for unknown but detected entities:
> <span entityType="ne-type" entityIdent="Person" entityResource="http://www.schema.org/">Mike Jones</span>
>
> for known entities:
> <span entityType="ne-ref" entityIdent="http://dbpedia.org/resource/Mike_Jones_(poet)" entityResource="http://dbpedia.org/">Mike Jones</span>
>
> which is not ideal and reduces expressivity, since we're unable to assert
> both at the same time within the same element. I guess nesting the elements
> could work, but that's introducing complexities in markup. In a global
> selector setting, it's probably fine.
>
> And re your comments:
> - that's the current state of the software, yes. Automation of 3) is
> possible provided that a term lexicon is specified.
> - agree, but there can be a pretty big number of such rules following this
> example, especially since we'd have to explicitly state every type mapping,
> since the selector doesn't reason that an itemtype=Musician (for example) is
> also a Person. Is this something that is worth maintaining?
>
> -- Tadej
>
>
> On 20. 06. 2012 20:41, Felix Sasaki wrote:
>
> Tadej, all,
>
> I was looking at
>
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Terminology
> and I'm wondering whether your proposal can be merged. Let me start with
> examples bottom-up
>
> 1)
> <span entityType="wsd" entityIdent="synsets-836" entityResource="http://example.com/myWordnet">bank</span>
> tries to capture
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#disambiguation
>
> 2)
> <span entityType="ne" entityIdent="Person" entityResource="http://www.schema.org/">Mike Jones</span>
> tries to capture
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#namedEntity
>
> 3)
> <span entityType="term" entityIdent="lexEntry473" entityResource="http://example.com/myLexion">language technology</span>
> tries to capture
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#terminology_2
>
> Does the above merging make sense? One motivation for me is to propose as
> few attributes as possible - in that way we can
> Also, some general questions / comments:
> - I assume that 1) and 2) could be automatically generated by tools, but
> 3) not?
> - to allow people to re-use existing annotations (e.g. from schema.org),
> we could define global rules like this:
> <its:entityRule selector="//div[@itemtype='Person']" entityResource="http://www.schema.org/" entityType="ne"/>
>
> Felix
>
>
> 2012/6/19 Tadej Stajner <tadej.stajner@ijs.si>
>
>> Hi, Felix,
>> I've cleaned up the Terminology section in the requirements document with
>> regard to recent discussions on the list and in Dublin. What kind of
>> workflow do we have in order to update the draft, to post recommendations,
>> examples, etc.? Is the Requirements wiki page the right place for this?
>>
>> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Terminology
>>
>> -- Tadej
>>
>>
>> On 6/19/2012 12:09 PM, Maxime Lefrançois wrote:
>>
>> Hi,
>>
>> The taskforce is on the HTML to RDFa algorithm.
>> It should be ready by tomorrow afternoon for review.
>>
>> Maxime
>>
>> ------------------------------
>>
>> *From: *"Felix Sasaki" <fsasaki@w3.org>
>> *To: *"Jirka Kosek" <jirka@kosek.cz>
>> *Cc: *public-multilingualweb-lt@w3.org
>> *Sent: *Tuesday 19 June 2012 12:00:25
>> *Subject: *Re: [All] ITS 2.0 first draft, please review by Thursday
>>
>>
>> 2012/6/19 Jirka Kosek <jirka@kosek.cz>
>>
>>> On 19.6.2012 5:48, Felix Sasaki wrote:
>>>
>>> > Thanks for the reminder - just changed this.
>>> >
>>> > I also created a section including examples
>>> >
>>> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#usage-in-html5
>>> > and
>>> >
>>> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selection-global-html5
>>> > please have a look.
>>>
>>> Looks good. Except small typo:
>>>
>>> <link href="EX-translateRule-html5-1.xml" type="itsRules"/>
>>>
>>> Should read as:
>>>
>>> <link href="EX-translateRule-html5-1.xml" rel="itsRules"/>
>>>
>>> Also I think that for consistency we should use lower-case letters in
>>> rel value, either type="itsrules" or type="its-rules".
>>>
>>
>> Thanks, fixed.
>>
>> Felix
>>
>>
>>>
>>> Jirka
>>> --
>>> ------------------------------------------------------------------
>>> Jirka Kosek e-mail: jirka@kosek.cz http://xmlguru.cz
>>> ------------------------------------------------------------------
>>> Professional XML consulting and training services
>>> DocBook customization, custom XSLT/XSL-FO document processing
>>> ------------------------------------------------------------------
>>> OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
>>> ------------------------------------------------------------------
>>>
>>
>> --
>> Felix Sasaki
>> DFKI / W3C Fellow
>>
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>

--
Felix Sasaki
DFKI / W3C Fellow
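
To make the point about global rules and type mappings concrete, here is a minimal sketch, not an ITS implementation. It applies one rule modelled on the `//div[@itemtype='Person']` example from the thread to a small sample document. The sample markup, the "Miles Davis" entry, the `RULES` list and the `apply_rules` helper are all assumptions made up for illustration. The selector uses the relative `.//` form because Python's `xml.etree.ElementTree` supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the itemtype values follow the thread.
SAMPLE = """
<body>
  <div itemtype="Person">Mike Jones</div>
  <div itemtype="Musician">Miles Davis</div>
</body>
"""

# One global rule carrying the same attributes as the rule proposed in the thread.
# ElementTree needs ".//" where the original selector used "//".
RULES = [
    {
        "selector": ".//div[@itemtype='Person']",
        "entityType": "ne",
        "entityResource": "http://www.schema.org/",
    },
]


def apply_rules(root, rules):
    """Return (text, entityType, entityResource) for each element a rule selects."""
    annotations = []
    for rule in rules:
        for element in root.findall(rule["selector"]):
            annotations.append(
                (element.text, rule["entityType"], rule["entityResource"])
            )
    return annotations


annotations = apply_rules(ET.fromstring(SAMPLE.strip()), RULES)
print(annotations)
```

Only the `Person` element is selected. The `Musician` element is skipped because the selector compares attribute strings and knows nothing about the type hierarchy, so covering it would take a second, explicitly stated rule, which is the maintenance cost raised above.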
Received on Monday, 25 June 2012 08:33:17 UTC