
RE: Term identification

From: Masaki Itagaki <imasaki@qwest.net>
Date: Sun, 27 Feb 2005 03:27:48 +0900
Message-Id: <6.0.0.20.2.20050227032740.0409bec0@localhost>
To: public-i18n-its@w3.org




Hi Martin

Thank you for your clarification and some comments on my posting.

 >First, I think the result of this WG should be on two levels:
 >1) Advice to DTD designers(*) about what kind of things need tagging.
 >2) An actual set of tags

This is what we confirmed during the last conference call. Richard pointed
out that it's often hard to persuade information architects to
reorganize or redesign original data structures for i18n, so ITS is more
likely to be applied to extended data structures for translation. I would
imagine that this process is more like transforming source structures into
ITS-compliant ones, which will be transformed back to the original structure
after localization.
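
A minimal sketch of that round trip, reusing the <idx> sample quoted later in this thread. The namespace URI, the "term" attribute, and both function names are assumptions made for illustration; nothing here is markup defined by the WG.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and attribute; the WG has not defined any markup yet.
L10N_NS = "urn:example:l10n"
TERM_ATTR = f"{{{L10N_NS}}}term"

def to_localization_form(root):
    """Extend the source structure: flag every <idx> element as a term."""
    for el in root.iter("idx"):
        el.set(TERM_ATTR, "yes")
    return root

def to_source_form(root):
    """Transform back: strip the localization-only attribute everywhere."""
    for el in root.iter():
        el.attrib.pop(TERM_ATTR, None)
    return root

SOURCE = (
    "<p>Payment terms can range from simple to complex, depending on the "
    'policy of your organization or <idx item="Cost Center" '
    'sortingstr="costcenter">cost center</idx>.</p>'
)

original = ET.fromstring(SOURCE)
extended = to_localization_form(ET.fromstring(SOURCE))
# Serialize and re-parse to mimic handing the file to and from localization.
restored = to_source_form(ET.fromstring(ET.tostring(extended)))
```

The point of the sketch is only that the extension is additive, so removing it restores the attributes the source DTD expects.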

Masaki Itagaki

-----Original Message-----
From: public-i18n-its-request@w3.org [mailto:public-i18n-its-request@w3.org]
On Behalf Of Martin Duerst
Sent: Thursday, February 24, 2005 7:54 PM
To: Masaki Itagaki; public-i18n-its@w3.org
Subject: RE: Term identification


[By way of introduction, I'm not participating in the teleconferences
to make it possible to hold them at a more agreeable time for
everybody else, but I'm very interested in the work of this WG.]

At 07:22 05/02/25, Masaki Itagaki wrote:

  >This term issue is a good example. It's a brilliant idea to tag terminology,
  >but terms in a document tend to be tagged already with, for example, an
  >index tag:
  >
  ><p>
  >Payment terms can range from simple to complex, depending on the policy of
  >your organization or <idx item="Cost Center" sortingstr="costcenter">cost
  >center</idx>.
  ></p>
  >
  >What we wish to tag as a term ("cost center" in this example) is a concept
  >pertaining to a specific domain or subject. I would say such a term should
  >have been tagged with either Index or Glossary tags, for example. Now if we
  >proposed some sort of a terminology element and attributes, I'm wondering
  >how it would live together with existing terminology-related tags. What's
  >happening here is that source writers claim "it's an INDEX item", while
  >localizers would say "it's a TERM item." ITS's intention may conflict with
  >DTD design, and now what do we do here....?

First, I think the result of this WG should be on two levels:
1) Advice to DTD designers(*) about what kind of things need tagging.
2) An actual set of tags

(*) please note that I'm using that in a rather wide sense, including
      all the people who may give advice to the person actually writing
      a DTD, people who create variants of DTDs (e.g. for localization),
      and so on.

2) would be seen as a 'ready-made, easy to use' way to do 1), but a DTD
designer could also say "well, I already have something like this".

In the specific case above, a DTD designer could say "okay, we need
something to identify terms, but we already have index items, so
we can reuse that for terms." (there are of course details to look
at, such as how to mark up terms that are not index items, or index
items that are not terms,...)

Another solution that a DTD designer could choose would be to say
"index items are index items, terms are terms, if it's both,
add both markups".

Because elements often need attributes, there may also be the case
where we provide a <term> element with some attributes, but we
design it so that the attributes also can be used on other
elements. As an example, a 'glossary' attribute pointing to
a glossary might then be taken as an indication that an index
item is also a term.
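
As a rough sketch of how a tool might read such an attribute: the rule below treats any <term> element, and any other element carrying a 'glossary' attribute, as a term. The attribute values, the second <idx> item, and the function name find_terms are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

def find_terms(root):
    """Collect the text of <term> elements and of any element that
    carries a 'glossary' attribute (assumed to mark it as a term)."""
    terms = []
    for el in root.iter():
        if el.tag == "term" or "glossary" in el.attrib:
            # Join nested text and normalize line breaks and spacing.
            terms.append(" ".join("".join(el.itertext()).split()))
    return terms

# Sample document; the glossary URIs are made up for this example.
DOC = """<p>Payment terms depend on the policy of your
<idx item="Cost Center" sortingstr="costcenter"
     glossary="glossary.xml#cost-center">cost
center</idx>, your <idx item="Invoices">invoices</idx>, and the
<term glossary="glossary.xml#net-30">net 30</term> rule.</p>"""

print(find_terms(ET.fromstring(DOC)))
```

Here the first index item counts as a term, while the second (no 'glossary' attribute) stays a plain index item.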

This is just an example, but I think that this should be our
general approach, namely to provide markup that can easily
be used 'ready-made' while also allowing users the flexibility
to use different solutions if that works better for them.

As for tool support, tools could support the markup we define
directly, but could also provide a mechanism to indicate what
specific role in the localization process some element plays.
[Please note that any such mechanisms are not in scope for the
charter of this WG; see
http://www.w3.org/2004/11/i18n-recharter/its-charter,
section 4]
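
Purely as an illustration of what such a tool-side mechanism could look like (and not something this WG would specify), a tool might keep a mapping from element names to localization roles outside the document. ROLE_MAP, the "gloss" element name, and elements_with_role are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical tool-side configuration: element name -> localization role.
ROLE_MAP = {"idx": "term", "gloss": "term"}

def elements_with_role(root, role, role_map=ROLE_MAP):
    """Return the elements whose tag is mapped to the given role."""
    return [el for el in root.iter() if role_map.get(el.tag) == role]

SAMPLE = (
    '<p>Depends on your <idx item="Cost Center" '
    'sortingstr="costcenter">cost center</idx> and <b>policy</b>.</p>'
)

for el in elements_with_role(ET.fromstring(SAMPLE), "term"):
    print(el.tag, "->", "".join(el.itertext()))
```

With this arrangement the source DTD stays untouched, and the "index item is also a term" decision lives in the tool configuration.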

  >Another example is Section 2.4 and 2.5. I would say that these will provide
  >a way to add localization-related annotations to either an element or a part
  >of element text. When you think about the documentation process, source
  >authors may not care about those comments and instructions. Psychologically
  >ITS tags are invisible, in a way, for those folks (unfortunately). Now maybe
  >(at least in the documentation process at a company I used to work for) final
  >editors could place some comments and instructions. Mostly translation leads
  >are the ones who do some pre-process content review and add clear comments on
  >gotchas for translators. Now what I'm wondering here is what the DTD design
  >process would be like there. As you don't know where those "downstream
  >people" add ITS <span>-like tags, all of the elements need to accommodate
  >the ITS tags (almost everywhere). Is there any good idea to avoid this sort
  >of overkill?

Well, first of all, there will be a lot of locations where one kind or
the other of additional markup is needed, so to some extent that's not
overkill at all, it's just a necessity.

In addition, while ITS tags are psychologically invisible to source authors,
it should be possible for translation leads and others to feed back their
needs to DTD designers. In many ways, making this much easier is one of the
goals of this group.

Another way to deal with this is that translation leads create their
own, more extended version of the DTD.

What we produce should easily work in various different scenarios.

  >I'm mainly thinking of XML documentation, rather than XML data storage, and
  >that may be why I would think of these issues. Maybe it's just simply that
  >ITS should propose some ways to accommodate I18N tag sets into existing data
  >structures. Hopefully we can touch on this point during the teleconference
  >tomorrow.

The charter is mostly written with an eye on document-oriented XML,
but doesn't exclude work on data-oriented XML. So these issues are
very relevant indeed.
(I'd tend to avoid the term 'documentation', because there are many
documents that have very little to do with documentation.)


Regards,    Martin. 
Received on Saturday, 26 February 2005 18:46:55 GMT
