- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 27 Jun 2012 00:13:32 +0200
- To: Declan Groves <dgroves@computing.dcu.ie>
- Cc: Arle Lommel <arle.lommel@dfki.de>, Thomas Ruedesheim <thomas.ruedesheim@lucysoftware.com>, "<public-multilingualweb-lt@w3.org>" <public-multilingualweb-lt@w3.org>
- Message-ID: <CAL58czrF2TkMz1e_on9k=kLa-smH2ExgSFE-veLiwa=t69rG-w@mail.gmail.com>
Declan, all,

thanks a lot for your feedback. I think we are close to consensus about this, and I have given myself an ACTION-144 to put this into the draft by next week.

Best,

Felix

2012/6/26 Declan Groves <dgroves@computing.dcu.ie>

> Felix,
>
> Thanks for your proposal for domain category, which I think outlines the
> best approach for dealing with the complex domain category so good job!
>
> The data category agnostic approach makes more sense, and allows for more
> flexibility, particularly for existing commercial MT service providers who
> will already have their own list of pre-defined domain categories. I am not
> too familiar with DCR so I don't feel qualified to comment on Arle's
> suggestion.
>
> Using Dublin Core, however, is a good pointer to use due to its fairly
> wide adoption (on this - is it worth providing a URL to the relevant Dublin
> Core content?) - I know that many MT systems that do implement domain
> metadata do so using high-level domains either taken directly from Dublin
> Core or adapted from it (e.g. I think the LetsMT project use dublin core as
> a starting point for defining domain). One thing to keep in mind is that
> the proposal should be as clear and concise as possible. In terms of
> providing pointers to what codes people can use, I think we are better off
> limiting this as promoting interoperability is key and providing a list
> of alternative implementation strategies may over-complicate things.
>
> It is good to emphasise the optional domainMapping attribute, and I would
> perhaps add to the paragraph concerning the explanation of domainMapping
> that although optional, it is recommended that details for the attribute be
> provided. For our implementation, I expect to carry out something similar
> to Thomas - create a mapping from the provided domain metadata to domains
> that are available for our trained systems.
>
> typo: "In source content... " -> "In the source content..."
> "no agreed upon set of value sets" -> "no agreed upon value sets"
>
> Declan
>
> On 25 June 2012 15:43, Felix Sasaki <fsasaki@w3.org> wrote:
>
>> Hi Arle, Thomas, all,
>>
>> thanks for your feedback, Thomas, I'll fix the typos you found.
>>
>> 2012/6/25 Arle Lommel <arle.lommel@dfki.de>
>>
>>> Was this an area where the ISO data category registry might come into
>>> play?
>>
>> No - this proposal is "data category agnostic". The idea is to provide a
>> mechanism to map existing value lists (like the one Thomas mentioned).
>>
>>> That is, could we declare an agreed upon selection of fairly broad
>>> top-level domains to promote interoperability while still allowing for
>>> specification by users?
>>
>> After our discussion in Dublin and quite a few mails about this, see e.g.
>> the summary at
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012May/0165.html
>> or David's proposal at
>>
>> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012May/0079.html
>>
>> I don't see an agreement for even top level domains.
>>
>>> Unfortunately there is a lot of complexity around this issue in general
>>> that we will not resolve and that may indeed be fundamentally unresolvable.
>>> But perhaps using the DCR as a place where domain ontologies can be
>>> declared in an authoritative resource and pointed to we could at least
>>> provide a way for someone to share what they mean.
>>
>> There are so many running systems using their own value lists for domain
>> - I wouldn't expect that Lucy software or others would change their
>> systems. The benefit they would get with the proposal in this thread is
>> that connecting systems (e.g. MT + CMS) gets easier.
>>
>> Of course one could point users to what codes they should use. The dublin
>> core subject field I have put into the draft is such a pointer.
>> In addition I would be happy to name DCR as another area to look into,
>> like TAUS top level categories, Let's MT top level categories, etc. That
>> is, of course we want people to be aware of DCR.
>>
>> I also saw your question wrt DCR in the other thread, but I also don't
>> recall an area where we would have a direct dependency. But as I said
>> above, it would be good to inform readers of ITS 2.0 about where relying on
>> DCR makes sense.
>>
>> A related question: if I want to refer to DCR in an HTML "meta" element,
>> how would the DCR "scheme" be identified? Here is an example from dublin
>> core:
>>
>> <meta name="DCTERMS.issued" scheme="DCTERMS.W3CDTF" content="2003-11-01" />
>>
>> If there is an approach to do that with DCR, I think we should have an
>> example about it in ITS 2.0. Maybe you can check with the DCR experts in
>> Madrid?
>>
>> Best,
>>
>> Felix
>>
>>> Arle
>>>
>>> --
>>> Arle Lommel
>>> Berlin, Germany
>>> Skype: arle_lommel
>>> Phone (US): +1 707 709 8650
>>>
>>> Sent from a mobile device. Please excuse any typos.
>>>
>>> On Jun 25, 2012, at 16:02, "Thomas Ruedesheim" <thomas.ruedesheim@lucysoftware.com> wrote:
>>>
>>> Hi Felix,
>>>
>>> I agree with your proposal. (There are just 2 typos in the examples: ""
>>> in domainPointer attributes.)
>>> Lucy's MT engine accepts a global SUBJECT_AREAS parameter holding a list
>>> of domain names. Domains are organized in a hierarchy.
>>> Here is a short excerpt (first 2 levels):
>>> General Vocabulary
>>> Common Social Voc.
>>> Art & Literature
>>> Ecology, Environment Protection
>>> Economy & Trade
>>> Law & Legal Science
>>> ...
>>> Common Technical Voc.
>>> Agriculture & Fishing
>>> Civil Engineering
>>> Data Processing
>>> ...
>>> We will read the meta data and apply the mapping. Of course, the mapping
>>> is specific for the used MT tool.
>>>
>>> Cheers,
>>> Thomas
>>>
>>> ------------------------------
>>> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
>>> *Sent:* Montag, 25. Juni 2012 08:48
>>> *To:* public-multilingualweb-lt@w3.org
>>> *Subject:* [All] domain data category section proposal, please review
>>>
>>> Hi all,
>>>
>>> I have created a proposal for the domain data category, see attachment.
>>> This would resolve ISSUE-11, with the input from ACTION-87 taken into
>>> account.
>>>
>>> Declan, Thomas, I think this is esp. important for you - we need to know
>>> whether an implementation as described would be feasible and useful for
>>> you. Of course, others, feel welcome to contribute.
>>>
>>> Please make comments in this thread - I will use them to provide another
>>> version of the section.
>>>
>>> Thanks,
>>>
>>> Felix
>>>
>>> --
>>> Felix Sasaki
>>> DFKI / W3C Fellow
>>
>> --
>> Felix Sasaki
>> DFKI / W3C Fellow
>
> --
> Dr. Declan Groves
> Research Integration Officer
> Centre for Next Generation Localisation (CNGL)
> Dublin City University
>
> email: dgroves@computing.dcu.ie <dgroves@computing.dcu.ie>
> phone: +353 (0)1 700 6906

--
Felix Sasaki
DFKI / W3C Fellow
Received on Tuesday, 26 June 2012 22:13:59 UTC