Re: JLA OSDS seminar from Felix Sasaki on 2006-02-01 (www-i18n-comments@w3.org from February 2006)

From: Felix Sasaki <fsasaki@w3.org>
Date: Wed, 01 Feb 2006 10:44:09 +0900
To: "Hiroki Sato" <hrs@allbsd.org>
Cc: "www-i18n-comments@w3.org" <www-i18n-comments@w3.org>, "public-i18n-its@w3.org" <public-i18n-its@w3.org>
Message-ID: <op.s39trvv9x1753t@ibm-60d333fc0ec.mag.keio.ac.jp>
Hi Hiroki,

Do you mind if I send your comments to the list www-i18n-comments@w3.org?  
This is the public list our working group is using for discussions on  
working drafts. You can also post to that list directly, if you want.
I will give some answers below.

On Wed, 01 Feb 2006 04:08:35 +0900, Hiroki Sato <hrs@allbsd.org> wrote:

> "Felix Sasaki" <fsasaki@w3.org> wrote
>   in <op.s36jx3z8x1753t@ibm-60d333fc0ec.mag.keio.ac.jp>:
>
> fs> There is no reason except that the current working draft of the ITS tagset
> fs> at http://www.w3.org/TR/its is very incomplete. The ITS working group has
> fs> a requirement, see http://esw.w3.org/topic/its0503ReqSpan , about this
> fs> topic. We don't know yet if we will have enough time to add the definition
> fs> of a span element to the next working draft, but such an element will
> fs> definitely be part of the final tag set.
>
>  Thanks for the clarification.  I totally thought it was almost in the
>  final stage of the standardization process.
>
> fs> One reason why we mainly define attributes is to have a small impact on
> fs> existing markup schemes: Attributes are easier to integrate into a schema
> fs> than elements.
>
>  I see.  I think such impacts depend on how well modularized the schema is.
>  Attributes are also difficult to integrate, especially if the schema
>  is written as a DTD.  This may be a moot point, though.

In our latest DTD version of the ITS schema, we make definitions like the  
one which I attach at the end of this mail. We are using many parameter  
entities. I think that to add ITS attributes to an existing DTD, you only  
need to add the parameter entities att.datacats.attributes and  
att.selector.attributes to the attribute list of each element of your  
schema. As for ITS elements, you need to add the declaration of the  
documentRules element to your schema. What problems do you think could  
occur with this design?
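
As a rough sketch of that integration (not taken from the ITS draft): the host DTD below, its doc and para elements, and the file name its.dtd are all hypothetical. It assumes the declarations attached at the end of this mail are saved as its.dtd and pulled in as an external parameter entity. The host DTD itself also has to be an external subset, because parameter entity references inside a declaration are not allowed in the internal subset.

```dtd
<!-- Hypothetical host DTD (external subset); not part of the ITS draft.
     Assumption: the ITS declarations attached to this mail are stored
     in a file called its.dtd next to this one. -->
<!ENTITY % its.decls SYSTEM "its.dtd">
%its.decls;

<!-- The root element admits an optional block of ITS rules before its
     own content. %n.documentRules; expands to its:documentRules. -->
<!ELEMENT doc ((%n.documentRules;)?, para+)>
<!-- DTD validation treats the namespace declaration as an ordinary
     attribute, so it has to be declared as well. -->
<!ATTLIST doc
  xmlns:its CDATA #IMPLIED >

<!-- Each host element picks up both ITS attribute classes. -->
<!ELEMENT para (#PCDATA)>
<!ATTLIST para
  %att.datacats.attributes;
  %att.selector.attributes; >
```

With declarations along these lines, a para could carry its:translate="no" directly, and an its:documentRules block could appear once at the start of doc.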

>
>  BTW, in addition to a span-like element, I would like an attribute which
>  indicates a "translation unit."  The translation unit I mean here is
>  a range which can be translated independently.  This sort of range indicator
>  is very useful for automatic segmentation in TRADOS-like translation tools; such
>  tools need to analyze the document and separate translatable parts on a sentence
>  basis.  its:translate is not enough for the purpose because it only covers
>  translatability and is not useful for relating two blocks written in
>  two languages, so the translation unit is orthogonal
>  to translatability.  More specifics are here.  Let us assume there is
>  the following XML fragment (in DocBook vocabulary):
>
>  <para>The floppy disk should be removed before
>     the system<footnote><para>applicable only if the system has FDD.</para>
>     </footnote> enters <term>safe mode</term>.</para>
>
>  and we mark it up using the its:translate attribute:
>
>  <para its:translate="yes">The floppy disk should be removed before
>     the system<footnote><para>applicable only if the system has FDD.</para>
>     </footnote> enters <term its:translate="no">safe mode</term>.</para>
>
>  This markup indicates translatability, but we still cannot identify
>  a translation unit, that is, a part that should have a translated
>  counterpart and that sits at a certain "level" of composition.
>  In particular, its:translate does not convey the latter information.
>
>  This idea may be an extension of its:term in its0503ReqTermIdentification and
>  is closely related to its0504ReqPurposeSpecMap.  The translation unit always
>  implies which component it is, such as sentence, word, term, and so on.
>  The primary difference from its0504ReqPurposeSpecMap is that this does not
>  depend on the base vocabulary (this is not a 1:1 element mapping).  As an
>  experiment, if we use its:transunit="XXX" for the purpose, the above example
>  can be written as:
>
>  <para its:translate="yes" its:transunit="paragraph">The floppy disk should be removed before
>     the system<footnote its:transunit="footnote"><para>applicable only if the system has FDD.</para>
>     </footnote> enters <term its:translate="no" its:transunit="term">safe mode</term>.</para>
>
>  or
>
>  <its:documentRules>
>    <its:documentRule its:transunit="sentence" its:transunitScope="//para" />
>    <its:documentRule its:transunit="term" its:transunitScope="//term" />
>    <its:documentRule its:transunit="footnote" its:transunitScope="//footnote" />
>  </its:documentRules>
>  <para its:translate="yes">The floppy disk should be removed before
>     the system<footnote><para>applicable only if the system has FDD.</para>
>     </footnote> enters <term its:translate="no">safe mode</term>.</para>
>
>  As you probably noticed, this may be a duplicate definition because
>  the <para> element already indicates that it is a paragraph.  However,
>  I think it is not bad for ITS to have its own markers for categorizing
>  localizable objects.

This is a very interesting proposal. I think it is somehow related to the  
requirement of identifying inline and subflow elements, see  
http://people.w3.org/rishida/localizable-dtds/#inline-elements . We are  
currently working on this and would like to take your feedback into  
account.

>
>  I am still reading the materials on the related topics from
>  http://www.w3.org/International/its/Overview.html, so I may be
>  missing something.  If I come up with other ideas, I will send
>  them as well, if you do not mind.  Thanks.

Thanks a lot!

Felix Sasaki

>
> --
> | Hiroki SATO



<!ENTITY % NS 'its:' >
<!ENTITY % n.documentRule "%NS;documentRule">
<!ENTITY % n.documentRules "%NS;documentRules">
<!ENTITY % n.ruby "%NS;ruby">
<!ENTITY % n.rubyBase "%NS;rubyBase">
<!ENTITY % n.rubyText "%NS;rubyText">
<!ENTITY % n.schemaRule "%NS;schemaRule">

<!-- start datatypes -->

<!ENTITY % data.selector ' CDATA' >

<!ENTITY % data.itsBoolean '(yes|no)' >

<!-- end datatypes -->

<!-- start predeclared patterns -->

<!-- start rest of patterns -->

<!-- end patterns -->

<!-- start classes -->

<!ENTITY % att.datacats.attributes '
  %NS;translate %data.itsBoolean;  #IMPLIED
  %NS;locInfo CDATA #IMPLIED
  %NS;locInfoType CDATA #IMPLIED
  %NS;term CDATA #IMPLIED
  %NS;termRef CDATA #IMPLIED
  %NS;dir CDATA #IMPLIED
  %NS;rubyText CDATA #IMPLIED'>
<!ENTITY % att.selector.attributes '
  %NS;translateSelector %data.selector;  #IMPLIED
  %NS;locInfoSelector %data.selector;  #IMPLIED
  %NS;termSelector %data.selector;  #IMPLIED
  %NS;dirSelector %data.selector;  #IMPLIED
  %NS;rubySelector %data.selector;  #IMPLIED'>
<!-- stop classes -->

<!-- start elements -->

<!--doc:A rule to express ITS information and select parts of a
      document respectively. documentRule is to be used in a
      dislocated position. -->
<!ELEMENT %n.documentRule; EMPTY>
<!ATTLIST %n.documentRule;
  %att.selector.attributes;
  %att.datacats.attributes; >
<!--doc:This element contains rules for ITS information, to be used in documents. -->
<!ELEMENT %n.documentRules; (%n.documentRule;)+>
<!ATTLIST %n.documentRules; >
<!--doc: -->
<!ELEMENT %n.ruby; (%n.rubyBase;,%n.rubyText;)>
<!ATTLIST %n.ruby; >
<!--doc: -->
<!ELEMENT %n.rubyBase;  (#PCDATA)>
<!ATTLIST %n.rubyBase; >
<!--doc: -->
<!ELEMENT %n.rubyText;  (#PCDATA)>
<!ATTLIST %n.rubyText; >
<!--doc:A rule to express ITS information about the element
      declaration to which the schemaRule element is attached as
      schema annotation. -->
<!ELEMENT %n.schemaRule; EMPTY>
<!ATTLIST %n.schemaRule;
  %att.datacats.attributes; >
<!-- end elements -->
Received on Wednesday, 1 February 2006 01:44:16 UTC