Re: [Issue-75] - Domain

Hi Christian and all,

I wouldn't call it a messy situation/solution, because it is one
possible way of representing the information in the current framework.

Well, if we merge ideas, then the arbitrary string value(s) for a
potential itsDomain attribute like "financials" in your proposal should
be replaced by a link value... (which in general would make more sense,
since it would be applicable across language services).

Cheers -- Jörg

On Jan 17, 2013, at 16:07 (UTC+1), "Lieske, Christian" wrote:
> Hi Jörg, Felix, all,
> Unfortunately, I still don't understand. The current draft doesn't have provisions for
> CL>>    Global: <its:domainRule selector="/h:html/h:body" its-domain="financials">
> CL>>    Local: <em its-domain="financials">IMF</em>
> If we don't have these provisions, we may end up with the messy situation/solution that Jörg sketches.
> Cheers,
> Christian
> -----Original Message-----
> From: Jörg Schütz []
> Sent: Wednesday, 16 January 2013 15:28
> To:
> Cc:
> Subject: Re: [Issue-75] - Domain
> Hi Felix, Christian, and all,
> ITS should not be hijacked to take over the role of a workflow engine or
> similar application because there might be several consumers of ITS information...
> @Christian > [Could you provide one or two examples/proofs for this?]
> Here is an outline of my idea (which potentially also hijacks ITS to
> some extent):
> Possible ITS Application Scenario to Extend the "Domain" Data Category
> (1) Use (general) domain pointing for the broad classification of your
> content (global reach), i.e. employ the domain data category.
> (2) In cases where (1) is either too general (broad), or you want to
> further classify only parts of your content (local reach), use the
> disambiguation data category. This includes the further classifying of a
> sequence of strings which do not represent what usually is called a term
> (domain-specific vocabulary) or a multi-word unit (mwu).
> (3) For the term and mwu case use the terminology data category.
> Case (3) is applied as described in the ITS 2.0 specification; always
> consider linking to an appropriate authoritative internal or external
> terminology resource or ontology (e.g. Cyc, Snomed, MeSH, etc.) on which
> both producer and consumer have agreed (in this sense ITS is also
> part of a contract).
> In this scenario, case (2) is a bit trickier because "officially"
> disambiguation is also applied to meaningful string sequences, i.e. a
> word or a mwu, as in the terminology case, but now we extend this data
> category to arbitrary elements, for example an entire paragraph, with the
> restriction that the attributes disambigConfidence and particularly
> disambigGranularity have a broader meaning such as the conceptual
> association to a domain's root element or to certain upper model elements.
> HTML Example (local)
> ...
> <p><span its-disambig-confidence="0.9"
> its-disambig-class-ref="">
>      Ambroxol has mucolytic and local-anaesthetic pharmacological effects
>      </span>.
> </p>
> ...
> Note: In this example, only the disambigClassRef attribute is used to
> account for the "broader" employment of the data category.
> This use case scenario might sound like a bootstrap paradox... but this
> is one possibility of using ITS 2.0 ... ;-)
> All the best -- Jörg
> On Jan 16, 2013, at 14:23 (CET), Felix Sasaki wrote:
>> On 16.01.13 12:15, Lieske, Christian wrote:
>>> Hi Felix, Pablo, all,
>>> Please find some of my thoughts on the reply below.
>>> Cheers,
>>> Christian
>>> -----Original Message-----
>>> From: Felix Sasaki []
>>> Sent: Wednesday, 16 January 2013 08:07
>>> To: Pablo Nieto Caride
>>> Cc: Lieske, Christian;
>>> Subject: Re: [Issue-75] - Domain
>>> (trying to minimize the number of mails, hence replying to several
>>> aspects in this mail)
>>> Hi Christian, Pablo, all,
>>> at Christian: you write at
>>> that 2b of your comment is resolved. How about 2a? If you are not
>>> satisfied with the replies in this thread, could you propose a change to
>>> the spec?
>>> CL>> Currently, I consider 2a as being unresolved.
>>> CL>> Addressing 2a (capture the information "This is for component X")
>>> to me does not appear to be straightforward, since
>>> CL>> you would need to accommodate an additional piece of information.
>>> One could imagine representations such as
>>> CL>>     <its:domainRule ...
>>> CL>>        domainMapping=
>>> CL>>            'MT-engine-X,"automotive auto, medical medicine,
>>> 'criminal law' law, 'property law' law"',
>>> CL>>             'TM-system-Y,"automotive X, 'criminal law' L,
>>> 'property law' law"'
>>> CL>>      />
>> Such a specification of the engine could lead to conflicting information:
>> MT-engine-X has a module for automotive. If however the engine is not
>> mentioned in a domain mapping, but a different one (which does not have
>> the automotive module): which one to choose?
>> It looks like what you add as information (= choosing the engine) is
>> something one would do after the domain mapping, not at the same time.
>> Otherwise you may run into the conflict described above.
>>> CL>> This, however, is not in line with the current normative text on
>>> "domain".
>>> Wrt to your proposal below (add a note about 2b to the spec): sure, do
>>> you want to draft something? The same for 2a (if you don't have a
>>> specific solution in mind, stating the issue might already be helpful).
>>> CL>> How about the following additional paragraph for the first note
>>> in ( for 2b?
>>> CL>>
>>> CL>> "domainMapping" even allows "domain" systems/hierarchies to be
>>> encoded. domainMapping="FIN, 'A A-1 A-1-X'" could for example be used
>>> to capture the following information:
>> Would it be OK to re-formulate that sentence above like this:
>> [
>> the domainMapping attribute does not itself specify how to encode
>> "domain" systems/hierarchies. An application using domainMapping hence is
>> free to work with application specific hierarchies to capture
>> information like:
>> ]
>> It seems this is more in line with the language tag example: it is
>> saying that applications can do things that are on purpose underspecified.
>>> CL>> a. There exists a domain system that includes domains (e.g. A),
>>> sub-domains (e.g. A-1), and sub-subdomains (e.g. A-1-X)
>>> CL>> b. Prefer the lowest level in the system (e.g. work with an MT
>>> engine for A-1-X if available, otherwise work with one for A-1 or even
>>> A if available)
>>> CL>>
>>> CL>> This "power to encode and to interpret" is similar to matching of
>>> language tags, see
>>> CL>> "Language tag matching is a tool, and does not by itself specify
>>> a  complete procedure for the use of language tags ...
>>> CL>> The matching specification itself makes clear that there are many
>>> CL>> aspects that are left out for actually using language tags. But
>>> having no matching at all would mean even less interoperability, hence
>>> the "imperfect" matching scheme.
>> Best,
>> Felix
>>> Wrt to 1 (local domain): would this also be relevant for other
>>> implementers of domain (asking again)?
>> About this one: we have Pablo and Yves saying in separate mails this
>> might be of interest - enough to get through the w3c process. But is it
>> worth another last call period?
>> Best,
>> Felix
>>> Best,
>>> Felix
>>> On 15.01.13 19:32, Pablo Nieto Caride wrote:
>>>> Hi all,
>>>> Felix, I think that a local domain could be interesting, at least WP4
>>>> client would be happy with that, I don't know what the others think.
>>>> Christian, regarding the domain mapping I think that Yves and Felix
>>>> are right, you can implement your own mapping, you can adapt it to
>>>> specific MT if you want, as for the example <its:domainRule
>>>> selector="/h:html/h:body" ... domainMapping="FIN, 'A A-1 A1-A1X'"/>,
>>>> I am certain MT systems can manage the precedence by themselves.
>>>> Cheers,
>>>> Pablo.
>>>> Hi,
>>>> I wonder if it would be a good idea to add the scenario I have provided
>>>> (domain "system") and Felix' information on how to approach it
>>>> (namely similar to language tag matching) to one of the "notes" that
>>>> are currently in place in the "domain" section.
>>>> Best regards,
>>>> Christian
>>>> -----Original Message-----
>>>> From:
>>>> Sent: Tuesday, 15 January 2013 08:10
>>>> To: 'Felix Sasaki';
>>>> Subject: RE: [Issue-75] - Domain
>>>> Hi Felix,
>>>> I follow your line of thought related to the similarities between
>>>> "domainMapping" and matching of language tags. Thus, it would be OK
>>>> for me to consider 2.b of
>>>> closed.
>>>> Cheers,
>>>> Christian
>>>> -----Original Message-----
>>>> From: Felix Sasaki []
>>>> Sent: Monday, 14 January 2013 19:27
>>>> To:
>>>> Subject: Re: [Issue-75] - Domain
>>>> Hi Christian, Yves, all,
>>>> On 14.01.13 16:52, Yves Savourel wrote:
>>>>> Hi Christian, all,
>>>>> CL>> It seems as if I didn't manage to make my point about this aspect of
>>>>> "domain" clear.
>>>>> CL>> Let me try to provide a remedy by adding to my original
>>>>> comment:
>>>>> CL>> Something like its-domain="financials" could not just be imagined
>>>>> CL>>to work in  a global rule (e.g. instead of a pointer); in
>>>>> addition, a local use of "domain"
>>>>> CL>> could be imagined
>>>>> CL>>    Global: <its:domainRule selector="/h:html/h:body"
>>>>> its-domain="financials">
>>>>> CL>>    Local: <em its-domain="financials">IMF</em>
>>>>> So (if I'm getting this right) you'd like a way to override the
>>>>> domain for spans of content? (Since the Dublin Core in HTML doesn't
>>>>> let you do that (the subject is defined at the document level)).
>>>>> I think one of the reasons I heard early on was that today it would
>>>>> be difficult to make that distinction at the MT level. But I suppose
>>>>> MT engine selection is not the only application for domain. Maybe
>>>>> others have additional reasons why we don't have a local domain?
>>>> Given the implementation driven approach we have made so far I would
>>>> ask: is there an implementation on the horizon that would process
>>>> local domain?
>>>>> CL>> Why do you think that the scenario that I sketch (multiple domain
>>>>> CL>> "systems" used in a processing chain) implies that a standard
>>>>> exists?
>>>>> CL>> I would rather think that the implication is the other way round:
>>>>> CL>> Since there is no standard, there is a need to accommodate
>>>>> heterogeneity.
>>>>> I agree, but so far that has not been part of the scope of ITS.
>>>>> CL>> I guess your point is valid in the sense that one could go for
>>>>> CL>> something like <its:domainRule selector="/h:html/h:body" ...
>>>>> CL>> domainMapping="FIN, 'A A-1 A1-A1X'"/>.
>>>>> CL>> However, this would require that additional information would have
>>>>> CL>> to be captured elsewhere (so that for example, the precedence
>>>>> CL>> 'A > A-1 > A1-A1X' could be captured).
>>>>> ITS doesn't prescribe what the right part of the mapping must be or
>>>>> how it should be used.
>>>>> It's really just a way to allow user-defined mechanisms to be
>>>>> connected to the input metadata.
>>>>> I suppose it is also beyond the scope of ITS.
>>>> As I understand Christian, he does not ask to prescribe a mapping, but
>>>> "to accommodate heterogeneity": allow people to formulate their own
>>>> mapping.
>>>> I think we do that: we don't make the usage of the mapping attribute
>>>> mandatory. It is an optional attribute. If "our" mapping algorithm
>>>> doesn't respond to a specific mapping approach, everybody can implement
>>>> his own mapping.
>>>> This is similar to matching of language tags, see
>>>> "Language tag matching is a tool, and does not by itself specify a
>>>> complete procedure for the use of language tags.  Such procedures are
>>>> intimately tied to the application protocol in which they occur."
>>>> The matching specification itself makes clear that there are many
>>>> aspects that are left out for actually using language tags. But having
>>>> no matching at all would mean even less interoperability, hence the
>>>> "imperfect" matching scheme.
>>>> Best,
>>>> Felix
>>>>> cheers,
>>>>> -yves
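
To make the domainMapping discussion in this thread concrete, here is a
minimal Python sketch of how a consumer might read a mapping such as
"automotive auto, 'criminal law' law" and then apply the "prefer the
lowest level available" precedence from the A / A-1 / A-1-X scenario. It
is an illustration only, not anything the ITS draft prescribes: the
function names parse_domain_mapping and pick_most_specific are
hypothetical, the parser assumes that quoted values contain no commas, and
the ordering "broadest first, most specific last" is an application-side
convention assumed here, since ITS leaves the right-hand side of the
mapping to the consumer.

```python
import shlex


def parse_domain_mapping(value):
    """Parse "automotive auto, 'criminal law' law" into a dict.

    Assumption: mappings are comma separated, each one is a
    "source target" pair, and values containing spaces are quoted.
    Quoted values are assumed not to contain commas.
    """
    mapping = {}
    for item in value.split(","):
        # shlex.split honours single and double quotes around values.
        parts = shlex.split(item)
        if len(parts) != 2:
            raise ValueError(f"expected 'source target', got {item!r}")
        source, target = parts
        mapping[source] = target
    return mapping


def pick_most_specific(levels, available):
    """Return the most specific level that the consumer supports.

    `levels` lists a hierarchy from broadest to most specific,
    e.g. "A A-1 A-1-X" (an application-specific convention, not ITS).
    `available` is the set of levels the consumer has resources for.
    """
    for level in reversed(levels.split()):
        if level in available:
            return level
    return None


if __name__ == "__main__":
    mapping = parse_domain_mapping(
        "automotive auto, medical medicine, "
        "'criminal law' law, 'property law' law"
    )
    print(mapping["criminal law"])
    print(pick_most_specific("A A-1 A-1-X", {"A", "A-1"}))
```

The precedence logic sits entirely on the consumer side, which matches
the point made above that choosing an engine happens after the domain
mapping rather than as part of it.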

Received on Thursday, 17 January 2013 16:21:39 UTC