- From: Felix Sasaki <fsasaki@w3.org>
- Date: Sat, 22 Sep 2012 08:19:27 +0200
- To: Yves Savourel <ysavourel@enlaso.com>
- Cc: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czoWdQstNwe4yLuXOFX5Tg=yN0vSZr0RsSUKnDmv5KWe+w@mail.gmail.com>
Hi Yves, all,

2012/9/22 Yves Savourel <ysavourel@enlaso.com>

> Hi Felix, all,
>
> > I think it would be OK to have a data category "ITS Tool information"
> > which is available both locally and globally. Locally,
>
> I can see one implementation complication with this method.
> It has to do with overriding (obviously):
>
> Imagine the following file:
>
> <doc>
>  <head>
>   ...
>  </head>
>  <body>
>   ...
>  </body>
> </doc>
>
> Tool ABC processes this document and adds the tool-ref attribute to <body>
> to be sure any annotation it makes inside <body> is properly associated with
> its information:
>
> <doc>
>  <head>
>   ...
>  </head>
>  <body its:toolRef="#tABC">
>   ...<span its:someStuff="abc">...
>  </body>
> </doc>
>
> Then Tool XYZ takes this as input and processes it for another data category.
> But this time Tool XYZ chooses to set the tool reference information on
> <doc>:
>
> <doc its:toolRef="#tXYZ">
>  <head>
>   ...
>  </head>
>  <body its:toolRef="#tABC">
>   ...<span its:someStuff="abc">...
>   ...<span its:someOtherStuff="xyz">...
>  </body>
> </doc>
>
> Now the information its:someOtherStuff added by Tool XYZ is seen as being
> added by Tool ABC (or more exactly by no tool, since Tool ABC is not labeled
> for the data category corresponding to its:someOtherStuff).
>
> Obviously a tool can solve such an issue by searching for toolRef in the
> whole document and adding a reference to itself when appropriate.

Another solution would be to ask Tool XYZ to create a global rule:

<its:toolRule selector="//*[@its:someOtherStuff]" toolRef="#tXYZ"/>

That would not be done during stream-based data category processing, but separately. And asking a tool that produces "its:someOtherStuff" attributes to create, even automatically, the XPath expression "//*[@its:someOtherStuff]" is maybe OK?

> But it can be a bit complicated to do. Especially if the tool is
> stream-based rather than DOM-based. My point is that the mechanism is not
> necessarily easy to implement correctly.
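
To make the overriding problem and the global-rule alternative concrete, here is a minimal Python sketch. It assumes that its:toolRef is inherited from the nearest ancestor-or-self element, that Tool ABC's reference sits on <body> as the prose describes, and that its:toolRule / its:toolRef are only the names proposed in this thread. The helper names effective_tool_refs and make_tool_rule are hypothetical, and its:someStuff / its:someOtherStuff are the placeholders from the example.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
TOOL_REF = f"{{{ITS_NS}}}toolRef"  # attribute name as proposed in this thread

# The document after Tool ABC and then Tool XYZ have processed it.
DOC = f"""<doc xmlns:its="{ITS_NS}" its:toolRef="#tXYZ">
  <head>...</head>
  <body its:toolRef="#tABC">
    <span its:someStuff="abc">...</span>
    <span its:someOtherStuff="xyz">...</span>
  </body>
</doc>"""


def effective_tool_refs(root, attr_local_name):
    """Return (tag, toolRef) pairs for every element carrying its:<attr_local_name>.

    Assumption: the toolRef that applies is the one on the nearest
    ancestor-or-self element, so an inner value overrides an outer one.
    """
    attr = f"{{{ITS_NS}}}{attr_local_name}"
    results = []

    def walk(elem, inherited):
        current = elem.get(TOOL_REF, inherited)
        if attr in elem.attrib:
            results.append((elem.tag, current))
        for child in elem:
            walk(child, current)

    walk(root, None)
    return results


def make_tool_rule(attr_local_name, tool_ref):
    """Build the global rule a tool could emit in a separate pass."""
    selector = f"//*[@its:{attr_local_name}]"
    return f'<its:toolRule selector="{selector}" toolRef="{tool_ref}"/>'


root = ET.fromstring(DOC)
print(effective_tool_refs(root, "someOtherStuff"))
print(make_tool_rule("someOtherStuff", "#tXYZ"))
```

Under these assumptions the its:someOtherStuff span resolves to #tABC rather than #tXYZ, which is the mis-attribution described above. The generated rule ties the attribute to Tool XYZ by selector instead of by position in the tree, so the tool does not need to inspect existing toolRef attributes while streaming.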
> Another question is whether tools will have to support this new data
> category if they support one of the data categories that make use of it.
> My first thought would be no: so far we've kept each data category
> separated, and identifying the tool that added some information may be seen
> as optional in many cases.

I would agree.

> So there is little reason to make it mandatory.
> But this can lead to new problems: if it's optional, two tools can process
> the same document for the same data category, e.g. mtConfidence, but if only
> one provides the tool reference, then all mtConfidence markup is seen as
> done by the lone tool that provided the tool reference.

I wouldn't force tools and wouldn't interrelate data categories. Tool information is just a hint, like the HTML "meta" generator. As with the "meta" generator, the tool info cannot be checked (e.g. how to check that something has been created by a Drupal CMS?). Another reason is that I wouldn't see a way to write test cases.

> So, it seems we must force tools to implement that data category, but it's
> not an easy one to implement...
>
> Anyway, some food for thought...
>
> Cheers,
> -yves

--
Felix Sasaki
DFKI / W3C Fellow
Received on Saturday, 22 September 2012 06:19:55 UTC