- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 10 Jul 2012 17:15:53 +0200
- To: Yves Savourel <ysavourel@enlaso.com>
- Cc: public-multilingualweb-lt@w3.org, David Lewis <dave.lewis@cs.tcd.ie>
- Message-ID: <CAL58czpRK+w+jxiPQhSwBZvU2vwY38PudOY3YgkbYAM2SoNaeg@mail.gmail.com>
Hi Yves, all,

thanks for your comments. I tried to implement the comments at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0112.html
via edits at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#domain
See the CVS diff here:

--- WWW/International/multilingualweb/lt/drafts/its20/its20.odd 2012/07/07 11:52:03 1.25
+++ WWW/International/multilingualweb/lt/drafts/its20/its20.odd 2012/07/10 15:03:51 1.26
@@ -2718,7 +2718,7 @@ spotted coats.</code></p>
 <list type="unordered">
 <item>A required <att>selector</att> attribute. It contains an XPath expression which selects the nodes to which this rule applies.</item>
 <item>A required <att>domainPointer</att> attribute that contains a relative XPath expression pointing to a node that contains the domain information.</item>
-<item>An optional <att>domainMapping</att> attribute that contains a comma separated list of mappings between values in the content and workflow specific values. The values may contain spaces; in that case they <ref target="#rfc-keywords">MUST</ref> be delimited by quotation marks.</item>
+<item>An optional <att>domainMapping</att> attribute that contains a comma separated list of mappings between values in the content and workflow specific values. The left part of the pair is part of the source content and unique within the mapping. The right part of the mapping belongs to the workflow. Several left parts can map to a single right part. The values in the left or the right part of the mapping may contain spaces; in that case they <ref target="#rfc-keywords">MUST</ref> be delimited by quotation marks, that is pairs of APOSTROPHE (Unicode code point U+0027) or QUOTATION MARK (U+0023).</item>
 </list>
 <note>
 <p>Although the <att>domainMapping</att> attribute it is optional, its usage is recommended. Many commercial machine translation systems use their own domain definitions; the <att>domainMapping</att> attribute will foster interoperability between these definitions and metadata items like <code>DC.subject</code> in Web pages or other types of content.</p>
@@ -3135,4 +3135,4 @@ documents with ITS markup.</p>
 </div>
 </back>
 </text>
-</TEI><!-- timestamp $Id: its20.odd,v 1.25 2012/07/07 11:52:03 fsasaki Exp $ -->
+</TEI>

Does that address your comments?

Regarding your questions at
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jul/0115.html
Yes, you are right, and I updated the table at
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#datacategories-defaults-etc

Regarding your question below:

2012/7/10 Yves Savourel <ysavourel@enlaso.com>

> Hi Felix, Dave, all,
>
> Sorry, one more question related to the implementation of Domain:
>
> I was looking for example and run into this DocBook one:
>
> <article xmlns='http://docbook.org/ns/docbook'>
>  <info>
>   <title>Example of subjectset</title>
>   <subjectset scheme="libraryofcongress">
>    <subject>
>     <subjectterm>Electronic Publishing</subjectterm>
>    </subject>
>    <subject>
>     <subjectterm>SGML (Computer program language)</subjectterm>
>    </subject>
>   </subjectset>
>  </info>
>  <para>Text of the document</para>
> </article>
>
> Where they explain that the //subjectset/subjectterm element indicates the
> DC subject (so it falls into our domain data category). See
> http://www.docbook.org/tdg5/publishers/5.1b3/en/html/ch02.html#ch-gsxml.3.8
>
> As you can see, there are actually two entries in the example, so two
> domains.
> The question is: Can we have more than one domain associated with a
> content?

I don't know.

>
> Just wondering what the implications are for the tools downstream like MT.

I don't know either - we very likely need feedback from Thomas and Declan on this. My feeling is that whatever solution we take, this might lead to a best practice about how to make domain information in source content "digestible" for MT or other downstream tools.

Best,

Felix

>
> If the answer is 'no'. Then how do we know which one to use? We just leave
> that decision to the author (i.e. s/he is responsible to provide only a
> mapping to a single entry per document)?
>
> Or do we provide some kind of default behavior, like: the first or last
> one wins?
>
> Thanks,
> -yves
>
>
>

-- 
Felix Sasaki
DFKI / W3C Fellow
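
To make the domainMapping wording in the diff above more concrete, here is a minimal Python sketch of how a consumer might read such a value. It is an illustration under assumptions, not the specified algorithm: the left and right parts of a pair are assumed to be separated by whitespace, the quoting characters are taken to be APOSTROPHE (U+0027) and QUOTATION MARK (U+0022), unmapped values are passed through unchanged, and the function names and the workflow values `publishing` and `markup` are hypothetical. The sketch keeps every domain it finds, so it does not settle the open question of whether content may carry more than one domain.

```python
import re

# A value is either quoted with APOSTROPHE (U+0027) or QUOTATION MARK (U+0022),
# or a bare token containing no whitespace, comma, or quote character.
_VALUE = r"""(?:'([^']*)'|"([^"]*)"|([^\s,'"]+))"""
# Assumption: left and right parts of a pair are separated by whitespace.
_PAIR = re.compile(r"\s*" + _VALUE + r"\s+" + _VALUE + r"\s*(?:,|$)")


def parse_domain_mapping(value):
    """Return a dict from content-side (left) to workflow-side (right) values."""
    mapping = {}
    pos = 0
    while pos < len(value):
        match = _PAIR.match(value, pos)
        if match is None:
            raise ValueError(f"malformed domainMapping near offset {pos}")
        groups = match.groups()
        left = next(g for g in groups[:3] if g is not None)
        right = next(g for g in groups[3:] if g is not None)
        # The left part must be unique within the mapping.
        if left in mapping:
            raise ValueError(f"left value {left!r} is not unique")
        mapping[left] = right
        pos = match.end()
    return mapping


def map_domains(found, mapping):
    """Map each domain found in the content; keep order, drop duplicates.

    Assumption: values without a mapping are passed through unchanged.
    """
    result = []
    for domain in found:
        target = mapping.get(domain, domain)
        if target not in result:
            result.append(target)
    return result


if __name__ == "__main__":
    # Hypothetical mapping for the two DocBook subject terms quoted above.
    mapping = parse_domain_mapping(
        "'Electronic Publishing' publishing, "
        "'SGML (Computer program language)' markup"
    )
    terms = ["Electronic Publishing", "SGML (Computer program language)"]
    print(map_domains(terms, mapping))
```

A "first one wins" or "last one wins" default, as raised in the quoted question, would amount to taking the first or last element of the list that `map_domains` returns.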
Received on Tuesday, 10 July 2012 15:16:24 UTC