Re: Terminology merging ? (Re: [All] ITS 2.0 first draft, please review by Thursday)

From: Tadej Štajner <tadej.stajner@ijs.si>
Date: Thu, 21 Jun 2012 10:53:53 +0200
Message-ID: <4FE2E121.3010805@ijs.si>
To: public-multilingualweb-lt@w3.org
Hi,
this is feasible. The rationale behind my decision was that having 
individual attributes for different relationships is less verbose, at 
the expense of having more attributes in the spec. If minimising the 
latter is the higher priority, then I agree with this approach.

Some points: in example 2, this syntax has no way to simultaneously 
express that "Mike Jones" can also be described with a pointer 
to a resource (let's say, 
http://dbpedia.org/resource/Mike_Jones_(poet)). That is, it cannot say 
both that he is a Person and that he is one particular, known person. 
Working around this entails introducing the following distinction:

for unknown but detected entities:
<span entityType="ne-type" entityIdent="Person" 
entityResource="http://www.schema.org/">Mike Jones</span>

for known entities:
<span 
entityType="ne-ref" entityIdent="http://dbpedia.org/resource/Mike_Jones_(poet)" 
entityResource="http://dbpedia.org/">Mike Jones</span>

which is not ideal and reduces expressivity, since we're unable to 
assert both at the same time within the same element. I guess nesting 
the elements could work, but that introduces complexity in the markup. 
In a global selector setting, it's probably fine.
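
To make the nesting option concrete, here is a rough sketch. It only 
reuses the attribute names and values from the two examples above. 
Putting the type on the outer element and the reference on the inner 
one is an assumption on my side, not something we have agreed on.

```xml
<!-- Outer span: the detected type (Person).
     Inner span: the concrete resource.
     The nesting order is an assumption made for illustration. -->
<span entityType="ne-type" entityIdent="Person"
      entityResource="http://www.schema.org/"><span
      entityType="ne-ref"
      entityIdent="http://dbpedia.org/resource/Mike_Jones_(poet)"
      entityResource="http://dbpedia.org/">Mike Jones</span></span>
```

Both assertions then cover the same text, at the cost of one extra 
element per annotated entity.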

And re your comments:
- That's the current state of the software, yes. Automation of 3) is 
possible provided that a term lexicon is specified.
- Agreed, but following this example there can be a pretty big number 
of such rules, especially since we'd have to state every type 
mapping explicitly: the selector doesn't reason that an 
itemtype=Musician (for example) is also a Person. Is this something 
that is worth maintaining?
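
As a rough illustration of the second point, this is what the rule 
set might start to look like. It follows the global rule from your 
mail quoted below. Reading the element name as its:entityRule and 
assuming the its prefix is bound to the ITS namespace are both 
assumptions on my side. Only Person and Musician come from this 
thread.

```xml
<!-- Sketch only: element and attribute names follow the proposal in
     this thread and are not taken from a published spec. -->

<!-- Matches only elements whose itemtype is exactly 'Person'. -->
<its:entityRule selector="//div[@itemtype='Person']"
    entityResource="http://www.schema.org/" entityType="ne"/>

<!-- The rule above does not match itemtype='Musician', so each
     subtype needs its own rule, and so on for every further type. -->
<its:entityRule selector="//div[@itemtype='Musician']"
    entityResource="http://www.schema.org/" entityType="ne"/>
```

The selectors could be folded into one XPath expression such as 
//div[@itemtype='Person' or @itemtype='Musician'], but every subtype 
still has to be listed by hand.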

-- Tadej

On 20. 06. 2012 20:41, Felix Sasaki wrote:
> Tadej, all,
>
> I was looking at
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Terminology
> and I'm wondering whether your proposal can be merged. Let me start 
> with examples bottom-up
>
> 1)
> <span entityType="wsd" entityIdent="synsets-836" 
> entityResource="http://example.com/myWordnet">bank</span>
> tries to capture 
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#disambiguation
>
> 2)
> <span entityType="ne" entityIdent="Person" 
> entityResource="http://www.schema.org/">Mike Jones</span>
> tries to capture 
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#namedEntity
>
> 3)
> <span entityType="term" entityIdent="lexEntry473" 
> entityResource="http://example.com/myLexion">language technology</span>
> tries to capture 
> http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#terminology_2
>
> Does the above merging make sense? One motivation for me is to propose as 
> few attributes as possible - in that way we can
> Also, some general questions / comments:
> - I assume that 1) and 2) could be automatically generated by tools, 
> but 3) not?
> - to allow people to re-use existing annotations (e.g. from schema.org 
> <http://schema.org>), we could define global rules like this:
> <its:entityRule selector="//div[@itemtype='Person']" 
> entityResource="http://www.schema.org/" entityType="ne"/>
>
> Felix
>
>
> 2012/6/19 Tadej Stajner <tadej.stajner@ijs.si 
> <mailto:tadej.stajner@ijs.si>>
>
>     Hi, Felix,
>     I've cleaned up the Terminology section in the requirements
>     document with regard to recent discussions on the list and in
>     Dublin. What kind of workflow do we have in order to update the
>     draft, to post recommendations, examples, etc? Is the Requirements
>     wiki page the right place for this?
>
>     http://www.w3.org/International/multilingualweb/lt/wiki/Requirements#Terminology
>
>      -- Tadej
>
>
>
>
>     On 6/19/2012 12:09 PM, Maxime Lefrançois wrote:
>>     Hi,
>>
>>     The taskforce is on the HTML to RDFa algorithm.
>>     It should be ready by tomorrow afternoon for review.
>>
>>     Maxime
>>
>>     ------------------------------------------------------------------------
>>
>>         *From: *"Felix Sasaki" <fsasaki@w3.org> <mailto:fsasaki@w3.org>
>>         *To: *"Jirka Kosek" <jirka@kosek.cz> <mailto:jirka@kosek.cz>
>>         *Cc: *public-multilingualweb-lt@w3.org
>>         <mailto:public-multilingualweb-lt@w3.org>
>>         *Sent: *Tuesday, 19 June 2012 12:00:25
>>         *Subject: *Re: [All] ITS 2.0 first draft, please review by Thursday
>>
>>
>>
>>         2012/6/19 Jirka Kosek <jirka@kosek.cz <mailto:jirka@kosek.cz>>
>>
>>             On 19.6.2012 5:48, Felix Sasaki wrote:
>>
>>             > Thanks for the reminder  - just changed this.
>>             >
>>             > I also created a section including examples
>>             >
>>             http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#usage-in-html5
>>             > and
>>             >
>>             http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#selection-global-html5
>>             > please have a look.
>>
>>             Looks good. Except small typo:
>>
>>             <link href="EX-translateRule-html5-1.xml" type="itsRules"/>
>>
>>             Should read as:
>>
>>             <link href="EX-translateRule-html5-1.xml" rel="itsRules"/>
>>
>>             Also I think that for consistency we should use
>>             lower-case letters in the rel value,
>>             either rel="itsrules" or rel="its-rules".
>>
>>
>>         Thanks, fixed.
>>
>>         Felix
>>
>>
>>                                    Jirka
>>             --
>>             ------------------------------------------------------------------
>>              Jirka Kosek      e-mail: jirka@kosek.cz
>>             <mailto:jirka@kosek.cz> http://xmlguru.cz
>>             ------------------------------------------------------------------
>>                   Professional XML consulting and training services
>>              DocBook customization, custom XSLT/XSL-FO document
>>             processing
>>             ------------------------------------------------------------------
>>              OASIS DocBook TC member, W3C Invited Expert, ISO
>>             JTC1/SC34 member
>>             ------------------------------------------------------------------
>>
>>
>>
>>
>>         -- 
>>         Felix Sasaki
>>         DFKI / W3C Fellow
>>
>>
>
>
>
>
> -- 
> Felix Sasaki
> DFKI / W3C Fellow
>
Received on Thursday, 21 June 2012 08:54:28 UTC
