
RE: Term Identification: Requirement

From: Lieske, Christian <christian.lieske@sap.com>
Date: Thu, 31 Mar 2005 17:05:18 +0200
Message-ID: <0F568FE519230641B5F84502E0979DD1028A6512@dewdfe12.wdf.sap.corp>
To: <public-i18n-its@w3.org>
Cc: "Masaki Itagaki" <imasaki@qwest.net>

Hi everyone,

Sorry for sending a pointer to an inaccessible link.

I include the text below ...

Best regards,
Christian
---
Hi there, 

Since the discussion of intra-source/intra-target extension points revolves around terminology- and linguistics-related examples, and since we are touching on this area in other contexts as well, here are a couple of thoughts about the representation of terminology- and linguistics-related information ...

I would favor an approach which some might term 'normalized' (in the sense of database theory), others 'single source-oriented/non-redundant'. It boils down to the following:

If you have chunks of information (e.g. all information related to a terminology concept), then don't repeat them; rather, link to them. Accordingly, Yves' example would become:

<source>Our guests can appease their spirit of adventure and itchy 
feet by exploring the various islands of our small <xyz:term cID="1234ABE34FE">archipelago</xyz:term>.</source>

'cID' abbreviates 'concept identifier'.

The identifier would refer to an entry in a terminological database, which in turn is identified by the URN to which the namespace prefix is bound.

An XML serialization of the concept identified by 'cID' would look like the following:

<xyz:termEntry cID="1234ABE34FE">
 <xyz:term>archipelago</xyz:term>
 <xyz:pos>noun</xyz:pos>
 <xyz:info>
  <xyz:def>Group of islands</xyz:def>
  <xyz:pronunciation>"är-k&amp;-'pe-l&amp;-"gO, "är-ch&amp;-</xyz:pronunciation>
 </xyz:info>
</xyz:termEntry>
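
To illustrate how a consumer might follow such a link, here is a minimal Python sketch. It assumes, purely for illustration, that the inline markup and the term entry sit in one hypothetical wrapper document and that the 'xyz' prefix is bound to a made-up URN; in practice the entry would live in the terminological database. The function names are my own.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URN standing in for the terminological database.
XYZ = "urn:example:termbase"
NS = {"xyz": XYZ}

# Hypothetical wrapper document: one inline reference plus the single entry.
DOC = f"""
<doc xmlns:xyz="{XYZ}">
  <source>... exploring the various islands of our small
    <xyz:term cID="1234ABE34FE">archipelago</xyz:term>.</source>
  <xyz:termEntry cID="1234ABE34FE">
    <xyz:term>archipelago</xyz:term>
    <xyz:pos>noun</xyz:pos>
    <xyz:info>
      <xyz:def>Group of islands</xyz:def>
    </xyz:info>
  </xyz:termEntry>
</doc>
"""


def build_index(root):
    """Map each concept identifier to its (single) termEntry element."""
    return {e.get("cID"): e for e in root.iter(f"{{{XYZ}}}termEntry")}


def resolve(root):
    """Yield (surface form, part of speech, definition) per inline term."""
    index = build_index(root)
    for term in root.findall(".//source/xyz:term", NS):
        entry = index[term.get("cID")]  # follow the link instead of repeating data
        yield (
            term.text,
            entry.findtext("xyz:pos", namespaces=NS),
            entry.findtext("xyz:info/xyz:def", namespaces=NS),
        )


root = ET.fromstring(DOC)
resolved = list(resolve(root))
print(resolved)
```

The inline element carries nothing but the identifier; everything else is looked up once per concept.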

Using the Open Lexicon Interchange Format (OLIF; see http://www.olif.net) as the representation for the XML serialization, you could end up with XLIFF which looks like the following:

<xliff version="1.1" xmlns="urn:oasis:names:tc:xliff:document:1.1" xmlns:xyz="urn:appInfo:Items" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:olf="http://www.olif.net/base/sampleDatabase">
 <file original="island.txt" source-language="en" target-language="fr" datatype="plaintext">
  <header>
   <glossary>
    <internal-file>
     <olif OlifVersion="2.0, February 2002">
       <header CreaTool="CoolTerm" CreaToolVersion="1.4.3" OrigFormat="internal" AdminLang="EN" CreaDate="20031119091301Z" CreaId="X">
       <dataCatReg>
        <subjFieldDCS DCSType="replacement">CompLingCompany</subjFieldDCS>
       </dataCatReg>
       <contentInfo>
        <quotMarkInfo QuotMarkRet="some"/>
        <langIdUse>region_exception</langIdUse>
       </contentInfo>
      </header>
      <body>
       <entry ConceptUserId="2312">
        <mono MonoUserId="1232">
         <keyDC>
          <canForm>archipelago</canForm>
          <language>en</language>
          <ptOfSpeech>noun</ptOfSpeech>
          <subjField>general-naturalScience-geography</subjField>
         </keyDC>
         <monoDC>
          <monoSem>
           <definition>Group of islands</definition>
          </monoSem>
         </monoDC>
         <generalDC>
          <note>pron:"är-k&amp;-'pe-l&amp;-"gO, "är-ch&amp;-</note>
         </generalDC>
        </mono>
       </entry>
      </body>
     </olif>
    </internal-file>
   </glossary>
  </header>
  <body>
   <trans-unit id="x">
    <source>Our guests can appease their spirit of adventure and itchy feet by exploring the various islands of our small <olf:term cID="2312">archipelago</olf:term>.</source>
    <target>Nos visiteurs ...</target>
   </trans-unit>
  </body>
 </file>
</xliff>

The benefits of this 'link-only' approach would clearly start to show if, for example, the term 'archipelago' occurred 1000 times in your document: each occurrence would carry only the identifier, while the entry itself would be stored once ...
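
A rough way to see the effect is to compare markup volume for both styles. The sketch below is only an illustration: the inline variant with 'pos' and 'def' attributes is a hypothetical stand-in for a redundant representation (not Yves' actual markup), and the function name is my own.

```python
# Hypothetical redundant form: the concept data is repeated at every occurrence.
INLINE = '<xyz:term pos="noun" def="Group of islands">archipelago</xyz:term>'

# Link-only form: each occurrence carries just the concept identifier.
LINKED = '<xyz:term cID="1234ABE34FE">archipelago</xyz:term>'

# The entry that the identifier points to, stored exactly once.
ENTRY = (
    '<xyz:termEntry cID="1234ABE34FE">'
    "<xyz:term>archipelago</xyz:term>"
    "<xyz:pos>noun</xyz:pos>"
    "<xyz:info><xyz:def>Group of islands</xyz:def></xyz:info>"
    "</xyz:termEntry>"
)


def markup_size(occurrences):
    """Return (redundant size, link-only size) in characters."""
    inline_total = occurrences * len(INLINE)
    linked_total = occurrences * len(LINKED) + len(ENTRY)
    return inline_total, linked_total


for n in (1, 1000):
    inline_total, linked_total = markup_size(n)
    print(n, inline_total, linked_total)
```

For a single occurrence the link-only form is the larger one, since the entry has to be stored somewhere; the saving grows with every further occurrence. Size is only part of the picture: a corrected definition also needs to be changed in just one place.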

Best regards,
Christian
-----Original Message-----
From: public-i18n-its-request@w3.org [mailto:public-i18n-its-request@w3.org] On Behalf Of Masaki Itagaki
Sent: Thursday, March 31, 2005 4:47 PM
To: public-i18n-its@w3.org
Subject: RE: Term Identification: Requirement


Hi Christian

Thank you for your feedback. Yves was talking about TMX-link in a previous
teleconference. Let's discuss that today. I would like to look at the
contents of the OASIS link, but it asks for an ID and password. Would you
mind posting the page itself here (if it's not too long)?

Masaki

-----Original Message-----
From: Lieske, Christian [mailto:christian.lieske@sap.com] 
Sent: Thursday, March 31, 2005 2:18 AM
To: public-i18n-its@w3.org
Cc: Masaki Itagaki
Subject: RE: Term Identification: Requirement

Hi Masaki,

Great. Some feedback:

1. I am not sure that something like TMX-Link exists.
2. Possibly, we can reuse some of the thoughts from

http://www.oasis-open.org/apps/org/workgroup/xliff/email/archives/200407/msg00026.html


Best regards,
Christian
      
Received on Thursday, 31 March 2005 15:05:22 GMT
