- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 18 Sep 2012 12:26:00 +0200
- To: public-multilingualweb-lt@w3.org
- Message-ID: <CAL58czr1KwFt43UFt8tC6O0LKFHOrYy==rEr9eoWNWddBwX1Ng@mail.gmail.com>
Hi all,

Quite a while ago I had a mail exchange with Christian Lieske, who made me aware of a potential solution to issue-42. Please have a look below, and let's also discuss this in Prague.

- Felix

2012/7/27 Lieske, Christian <christian.lieske@sap.com>

> Hi,
>
> As mentioned by Felix, the MultilingualWeb-LT Working Group
> <http://www.w3.org/International/multilingualweb/lt/> is discussing what
> could be called "Configurations for/Information on Linguistic Processors":
> if, for example, a translation is produced by a Machine Translation engine,
> the corresponding translation may need to be annotated with information
> such as "This translation was produced with version X of engine Y. The
> dataset that was used to train the engine was D, and the corresponding
> language model was L". In a world which recognizes the importance of
> trust/reliability, this kind of metadata is, from my point of view, highly
> important. The "Configurations for/Information on Linguistic Processors"
> topic is thus also related to the discussion surrounding "provenance".
>
> The MLW-LT discussion reminded me that I did some work on the topic a
> while ago: in the context of the Open Lexicon Interchange Format (OLIF; see
> http://www.tekom.de/upload/2284/OASIS_40_Lieske.pdf ), I investigated what
> type of configuration/information may have to be captured, for example, in
> the context of Term Extraction processors. The outcome of the
> investigations found its way into the latest version of OLIF in the guise
> of the "termExtrInfo" element (see the attached screenshot of the OLIF 3.0
> schema, and the schema at
> http://www.olif.net/downloads/OLIF-3.0-Beta-20Feb2008-v5.zip).
>
> The "termExtrInfo" element is basically a set of data categories that
> allow you to capture, for example, the following:
>
> 1. Tool Info and Features: info on features of the tool that was used
>    (you can, for example, provide info on which approach to morphological
>    analysis is implemented)
> 2. Input Info: info on features of the data that was fed into the tool
>    (you can, for example, get a feeling for the quality you can expect)
> 3. Process Info: info on the process that involved tool and input (you
>    can, for example, capture that you did something for a specific client)
>
> Possibly, the "termExtrInfo" could serve as a model for a more general
> "lingProcInfo".
>
> Cheers,
> Christian
>
> Christian Lieske
> Knowledge Architect
> SAP Language Services (SLS)
> SAP Globalization Services
> SAP AG

--
Felix Sasaki
DFKI / W3C Fellow
Attachments
- image/jpeg attachment: image003.jpg
Received on Tuesday, 18 September 2012 10:26:30 UTC