
Re: [All] domain data category section proposal, please review

From: Felix Sasaki <fsasaki@w3.org>
Date: Fri, 29 Jun 2012 08:52:01 +0200
Message-ID: <CAL58czrMgsC4coERpDp0RAn1SWOg9FoWXSrvjewggszaWZY7ZQ@mail.gmail.com>
To: public-multilingualweb-lt@w3.org
Hi all,

FYI, I wrote the domain section based on the initial proposal and this
thread. Please have a look at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#domain

This closes ACTION-144. I also updated

http://www.w3.org/International/multilingualweb/lt/wiki/Implementation_Commitments#New_ITS_2.0_categories
with a link to the section.

Best,

Felix

2012/6/27 Felix Sasaki <fsasaki@w3.org>

> Declan, all, thanks a lot for your feedback. I think we are close to
> consensus about this, and I have given myself an ACTION-144 to put this
> into the draft by next week.
>
> Best,
>
> Felix
>
>
> 2012/6/26 Declan Groves <dgroves@computing.dcu.ie>
>
>> Felix,
>>
>> Thanks for your proposal for the domain category. I think it outlines the
>> best approach for dealing with this complex data category, so good job!
>>
>> The data category agnostic approach makes more sense, and allows for more
>> flexibility, particularly for existing commercial MT service providers who
>> will already have their own list of pre-defined domain categories. I am not
>> too familiar with DCR so I don't feel qualified to comment on Arle's
>> suggestion.
>>
>> Dublin Core, however, is a good pointer to use because of its fairly
>> wide adoption (on this: is it worth providing a URL to the relevant Dublin
>> Core content?). I know that many MT systems that implement domain
>> metadata do so using high-level domains either taken directly from Dublin
>> Core or adapted from it (e.g. I think the LetsMT project uses Dublin Core as
>> a starting point for defining domains). One thing to keep in mind is
>> that the proposal should be as clear and concise as possible. In terms of
>> providing pointers to what codes people can use, I think we are better off
>> limiting these, as promoting interoperability is key and providing a list
>> of alternative implementation strategies may over-complicate things.
>>
>> It is good to emphasise the optional domainMapping attribute, and I would
>> perhaps add to the paragraph concerning the explanation of domainMapping
>> that although optional, it is recommended that details for the attribute be
>> provided. For our implementation, I expect to carry out something similar
>> to Thomas - create a mapping from the provided domain metadata to domains
>> that are available for our trained systems.
>>
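The mapping step described above could be sketched as follows. This is only an illustration: it assumes that a domainMapping value is a comma-separated list of "source target" pairs, with quotes around values that contain spaces or commas, as in the later published ITS 2.0 Recommendation. The draft discussed in this thread may have used a different syntax, and the function name parse_domain_mapping is hypothetical.

```python
import re

# A token is a quoted string, a bare word, or a comma that separates two pairs.
# Assumed syntax, not confirmed by this thread.
_TOKEN = re.compile(r"""'([^']*)'|"([^"]*)"|([^\s,]+)|(,)""")


def parse_domain_mapping(value):
    """Parse an assumed domainMapping string into a {source: target} dict."""
    mapping = {}
    pair = []
    # Append a comma so the last pair is flushed like all the others.
    for single, double, bare, comma in _TOKEN.findall(value + ","):
        if comma:
            if not pair:
                continue  # tolerate empty input or a trailing comma
            if len(pair) != 2:
                raise ValueError(f"expected 'source target', got {pair!r}")
            mapping[pair[0]] = pair[1]
            pair = []
        else:
            pair.append(single or double or bare)
    return mapping
```

A consumer such as an MT service would then look up each domain value found in the content and pass the mapped value to its own system, falling back to a default when no mapping exists.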
>> typo: "In source content... " -> "In the source content..."
>>       "no agreed upon set of value sets" -> "no agreed upon value sets"
>>
>> Declan
>>
>>
>>
>> On 25 June 2012 15:43, Felix Sasaki <fsasaki@w3.org> wrote:
>>
>>> Hi Arle, Thomas, all,
>>>
>>> thanks for your feedback, Thomas, I'll fix the typos you found.
>>>
>>> 2012/6/25 Arle Lommel <arle.lommel@dfki.de>
>>>
>>>> Was this an area where the ISO data category registry might come into
>>>> play?
>>>>
>>>
>>> No - this proposal is "data category agnostic". The idea is to provide a
>>> mechanism to map existing value lists (like the one Thomas mentioned).
>>>
>>>
>>>> That is, could we declare an agreed upon selection of fairly broad
>>>> top-level domains to promote interoperability while still allowing for
>>>> specification by users?
>>>>
>>>
>>>
>>> After our discussion in Dublin and quite a few mails about this, see
>>> e.g. the summary at
>>>
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012May/0165.html
>>> or David's proposal at
>>>
>>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012May/0079.html
>>>
>>> I don't see an agreement for even top level domains.
>>>
>>>
>>>
>>>>
>>>> Unfortunately there is a lot of complexity around this issue in general
>>>> that we will not resolve and that may indeed be fundamentally unresolvable.
>>>> But perhaps using the DCR as a place where domain ontologies can be
>>>> declared in an authoritative resource and pointed to we could at least
>>>> provide a way for someone to share what they mean.
>>>>
>>>
>>>
>>> There are so many running systems using their own value lists for domain
>>> - I wouldn't expect that Lucy software or others would change their
>>> systems. The benefit they would get with the proposal in this thread is
>>> that connecting systems (e.g. MT + CMS) gets easier.
>>>
>>> Of course one could point users to what codes they should use. The
>>> Dublin Core subject field I have put into the draft is such a pointer. In
>>> addition I would be happy to name DCR as another area to look into, like
>>> TAUS top level categories, Let's MT top level categories, etc. That is, of
>>> course we want people to be aware of DCR.
>>>
>>> I also saw your question wrt DCR in the other thread, but I also don't
>>> recall an area where we would have a direct dependency. But as I said
>>> above, it would be good to inform readers of ITS 2.0 about where relying on
>>> DCR makes sense.
>>>
>>> A related question: if I want to refer to DCR in an HTML "meta" element,
>>> how would the DCR "scheme" be identified? Here is an example from Dublin
>>> Core:
>>>
>>> <meta name="DCTERMS.issued" scheme="DCTERMS.W3CDTF" content="2003-11-01" />
>>>
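For illustration, a consuming tool could read the name, scheme and content of "meta" elements like the one above with the Python standard library. This is a minimal sketch and does not answer how a DCR scheme would be identified; the class MetaCollector and the function collect_meta are hypothetical names.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect (name, scheme, content) from every named "meta" element."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes:
                self.entries.append(
                    (attributes["name"],
                     attributes.get("scheme"),   # None when no scheme is given
                     attributes.get("content"))
                )


def collect_meta(html):
    """Return the list of (name, scheme, content) tuples found in html."""
    parser = MetaCollector()
    parser.feed(html)
    parser.close()
    return parser.entries
```

The scheme value is what would tell a consumer which value list the content comes from, so an equivalent identifier would be needed for DCR.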
>>>
>>> If there is an approach to do that with DCR, I think we should have an
>>> example about it in ITS 2.0. Maybe you can check with the DCR experts in
>>> Madrid?
>>>
>>>
>>> Best,
>>>
>>> Felix
>>>
>>>
>>>>
>>>> Arle
>>>>
>>>> --
>>>> Arle Lommel
>>>> Berlin, Germany
>>>> Skype: arle_lommel
>>>> Phone (US): +1 707 709 8650
>>>>
>>>> Sent from a mobile device. Please excuse any typos.
>>>>
>>>> On Jun 25, 2012, at 16:02, "Thomas Ruedesheim" <
>>>> thomas.ruedesheim@lucysoftware.com> wrote:
>>>>
>>>> Hi Felix,
>>>>
>>>> I agree with your proposal. (There are just 2 typos in the examples: ""
>>>> in domainPointer attributes.)
>>>> Lucy's MT engine accepts a global SUBJECT_AREAS parameter holding a
>>>> list of domain names. Domains are organized in a hierarchy.
>>>> Here is a short excerpt (first 2 levels):
>>>>   General Vocabulary
>>>>     Common Social Voc.
>>>>       Art & Literature
>>>>       Ecology, Environment Protection
>>>>       Economy & Trade
>>>>       Law & Legal Science
>>>>       ...
>>>>     Common Technical Voc.
>>>>       Agriculture & Fishing
>>>>       Civil Engineering
>>>>       Data Processing
>>>>       ...
>>>> We will read the metadata and apply the mapping. Of course, the
>>>> mapping is specific to the MT tool in use.
>>>>
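The reading-and-mapping step described above might look roughly like the sketch below. The hierarchy is copied from the excerpt in this message; the METADATA_TO_ENGINE table and the function names are hypothetical and only stand in for whatever tool-specific mapping an MT vendor maintains. Nothing here reflects Lucy's actual implementation.

```python
# Engine-side hierarchy, copied from the excerpt above (elided entries omitted).
ENGINE_DOMAINS = {
    "General Vocabulary": {
        "Common Social Voc.": [
            "Art & Literature",
            "Ecology, Environment Protection",
            "Economy & Trade",
            "Law & Legal Science",
        ],
        "Common Technical Voc.": [
            "Agriculture & Fishing",
            "Civil Engineering",
            "Data Processing",
        ],
    }
}

# Hypothetical tool-specific mapping from content metadata to engine domains.
METADATA_TO_ENGINE = {
    "law": "Law & Legal Science",
    "criminal law": "Law & Legal Science",
    "it": "Data Processing",
    "farming": "Agriculture & Fishing",
}


def engine_path(domain):
    """Return the path from the hierarchy root to domain, or None if unknown."""
    for top, groups in ENGINE_DOMAINS.items():
        for group, leaves in groups.items():
            if domain in leaves:
                return [top, group, domain]
    return None


def subject_areas(metadata_values):
    """Map metadata values to engine domains, dropping unknowns and duplicates."""
    result = []
    for value in metadata_values:
        target = METADATA_TO_ENGINE.get(value.strip().lower())
        if target is not None and target not in result:
            result.append(target)
    return result
```

The resulting list is the kind of value that could be handed to a global parameter such as SUBJECT_AREAS; values without a mapping are silently dropped here, which is a design choice a real tool might handle differently.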
>>>> Cheers,
>>>> Thomas
>>>>
>>>>
>>>>
>>>>  ------------------------------
>>>> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
>>>> *Sent:* Montag, 25. Juni 2012 08:48
>>>> *To:* public-multilingualweb-lt@w3.org
>>>> *Subject:* [All] domain data category section proposal, please review
>>>>
>>>> Hi all,
>>>>
>>>> I have created a proposal for the domain data category, see attachment.
>>>> This would resolve ISSUE-11, with the input from ACTION-87 taken into
>>>> account.
>>>>
>>>> Declan, Thomas, I think this is esp. important for you - we need to
>>>> know whether an implementation as described would be feasible and useful
>>>> for you. Of course, others, feel welcome to contribute.
>>>>
>>>> Please make comments in this thread - I will use them to provide
>>>> another version of the section.
>>>>
>>>> Thanks,
>>>>
>>>> Felix
>>>>
>>>> --
>>>> Felix Sasaki
>>>> DFKI / W3C Fellow
>>>>
>>>>
>>>
>>>
>>> --
>>> Felix Sasaki
>>> DFKI / W3C Fellow
>>>
>>>
>>
>>
>> --
>> Dr. Declan Groves
>> Research Integration Officer
>> Centre for Next Generation Localisation (CNGL)
>> Dublin City University
>>
>> email: dgroves@computing.dcu.ie
>>  phone: +353 (0)1 700 6906
>>
>
>
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Friday, 29 June 2012 06:52:31 UTC
