[Minutes] ITS IG call 2014-10-20 from Felix Sasaki on 2014-10-20 (public-i18n-its-ig@w3.org from October 2014)

From: Felix Sasaki <fsasaki@w3.org>
Date: Mon, 20 Oct 2014 18:26:09 +0200
To: public-i18n-its-ig <public-i18n-its-ig@w3.org>
Message-Id: <5D9356CE-2B6A-41F0-9168-0330E8F4A030@w3.org>

See

http://www.w3.org/2014/10/20-i18nits-minutes.html
and below as text. Note: next call will be 10 November *4 p.m. UTC* (due to time zone changes)
http://www.timeanddate.com/worldclock/fixedtime.html?iso=20141110T16
The main topic will again be the XLIFF mapping.

- Felix

- DRAFT -

its ig

20 Oct 2014

[2]Agenda

[2] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0032.html

See also: [3]IRC log

[3] http://www.w3.org/2014/10/20-i18nits-irc

Attendees

Present
christian, yves, david, renat, felix

Regrets
Scribe
fsasaki

Contents

* [4]Topics
1. [5]action items
2. [6]XLIFF 2.0 mapping
3. [7]testing output
4. [8]rules file
5. [9]way to describe the transformations
6. [10]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0034.html
7. [11]next call
* [12]Summary of Action Items
__________________________________________________________

[13]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0032.html

[13] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0032.html

<scribe> scribe: fsasaki

action items

[14]http://www.w3.org/International/its/ig/track/actions/open

[14] http://www.w3.org/International/its/ig/track/actions/open

[15]http://www.w3.org/International/its/ig/track/actions/open?sort=due

[15] http://www.w3.org/International/its/ig/track/actions/open?sort=due

XLIFF 2.0 mapping

[16]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0006.html

[16] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0006.html

"ITS scope with sm/em"

yves: issue is: in XLIFF you can mark up things with starting
and ending empty elements
... these are used as markers
... content is not XML well formed content but between related
elements
... they are related through semantics, not syntax
... they can be converted to mrk
... issue is: in ITS we cannot describe that relation
... e.g. if "sm" has ITS information, the information would
apply to empty content
... Fredrik and Felix provided ways to solve the problem
... by reducing numbers of sm and em,
... but there would still be some cases
... in cases in which things overlap
... this cannot be resolved with ITS
... this is similar to NIF where we can have overlap as well
... so ITS cannot handle everything
... we can mitigate 98% of the cases with transformation
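
[Illustration of the overlap case described above. This is a minimal
sketch, not anything agreed on the call. It assumes the markers have
already been read into a list of ("sm", id) / ("em", id) pairs in
document order, where the "em" entry carries the id of the "sm" it
closes. The function name find_overlaps and the tuple layout are
hypothetical. Spans that nest properly can be rewritten as mrk
elements; the pairs reported here are the ones that cross each other.]

```python
def find_overlaps(markers):
    """Return pairs of span ids that cross each other without nesting.

    `markers` is a list of ("sm", id) / ("em", id) tuples in document
    order, a simplified stand-in for XLIFF <sm/> and <em/> elements.
    """
    open_spans = []  # ids of spans opened and not yet closed, in order
    overlaps = []
    for kind, span_id in markers:
        if kind == "sm":
            open_spans.append(span_id)
        elif kind == "em":
            # Any span opened after span_id and still open crosses it.
            idx = open_spans.index(span_id)
            for later in open_spans[idx + 1:]:
                overlaps.append((span_id, later))
            del open_spans[idx]
    return overlaps


nested = [("sm", "a"), ("sm", "b"), ("em", "b"), ("em", "a")]
crossed = [("sm", "a"), ("sm", "b"), ("em", "a"), ("em", "b")]
print(find_overlaps(nested))   # no crossing spans
print(find_overlaps(crossed))  # spans a and b cross
```

[An empty result would mean every sm / em pair could be turned into a
well-formed mrk. Any reported pair is a case where that is not
possible without splitting one of the spans.]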

david: fundamental issue
... richard / felix sometimes say that ITS is an abstract set
of data categories
... so far tech. has only been defined for XML and HTML
... these have the limitations that Yves described
... I agree that you can define simplification to reduce the
number of spans that will be marked with empty markers, in
XLIFF or other formats
... this does not solve the fundamental issue
... you can clash with structural XLIFF markup and so on
... not quite sure what the value of the exercise is
... of trying to reduce the number of sm / em marked spans
... if you start with perfectly valid HTML / XML you can add ITS
values
... you can't end up with spans that won't be possible to be
marked in the right way
... don't think that there is a solution
... you cannot enforce wellformed spans
... so all external ITS processors will be at a loss

yves: that type of issue applies only for em / sm that you
cannot split into separate mrks
... e.g. for "translate" you can split things up in several
mrks
... the issue is with "terminology" or "text analysis" where
you cannot split up things
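
[Illustration of the splitting yves mentions. This is a minimal sketch
under stated assumptions: spans are given as character ranges keyed by
a label, and the function name split_into_fragments and the labels are
hypothetical. Overlapping spans are cut at every boundary so that each
fragment could become its own mrk.]

```python
def split_into_fragments(spans):
    """Split possibly overlapping spans into non-overlapping fragments.

    `spans` maps a label to a (start, end) character range, end exclusive.
    Returns a list of (start, end, labels) tuples with labels sorted.
    """
    boundaries = sorted(
        {pos for start, end in spans.values() for pos in (start, end)}
    )
    fragments = []
    for start, end in zip(boundaries, boundaries[1:]):
        # Keep every label whose range fully covers this fragment.
        labels = sorted(
            name for name, (s, e) in spans.items() if s <= start and end <= e
        )
        if labels:
            fragments.append((start, end, labels))
    return fragments


spans = {"translate-no": (0, 10), "term": (5, 15)}
for fragment in split_into_fragments(spans):
    print(fragment)
```

[In this example the "term" span ends up in two fragments. For a flag
such as "translate" that is harmless, since each fragment can carry the
same value. For "terminology" or "text analysis" the two pieces no
longer identify one term, which is the limitation described above.]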

david: if the wellformed format has the requirements then we
can convert that

yves: this is a problem, not a major problem. it is a problem
on the ITS representation. should not stop us from using sm / em
... not a showstopper for the mapping

david: agree
... it is a limitation of what a generic ITS processor can do
... another reason for having separate XLIFF namespaces

christian: a few points: first, general issue of XML constraints
... what is the viewpoint of researchers on the overlap issue?
... second, we are looking at xliff
... the observation we have may have some impact on the future
version of xliff
... maybe we find that sm / em is not the only approach - again
an insight based on overlap research
... third - being able to cover 98%, like yves said
... we could also say: for certain flavours of XLIFF you are
ok, for others you have certain constraints
... that may call for a special variant of xliff
... e.g. variant X of xliff: OK, variant Y: may have issues

renat: want to add some comments on overlap aspect
... in xliff 2.0 there will be several modules
... e.g. a specific module for ITS metadata
... is that so?
... then we could resolve the scenario if we split overlapping
pieces of metadata between different instances of target text

david: useful to look at theoretical options from TEI

[see TEI options here
[17]http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html
]

[17] http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html

david: multiple instances was used in 1.2
... it was abandoned in XLIFF 2.0
... standoff markup is another option
... but also has issues
... agree with Yves, there is no problem on the XLIFF side
... the problem occurs during conversion to a format that has
wellformedness requirements
... a comment on what christian said about XLIFF flavours:
... wellformed spans are interconvertible with non-wellformed
spans
... that is true for annotation and quote markers
... there is a way to reduce the number of non wellformed spans
... you could define types of content that work with the
reduction and others that do not

[18]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0033.html

[18] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0033.html

felix: next step would be to do some tests with the conversion,
see yves' mail

<scribe> ACTION: felix to work on overlap example and to do
conversion [recorded in
[19]http://www.w3.org/2014/10/20-i18nits-minutes.html#action01]

<trackbot> Created ACTION-53 - Work on overlap example and to
do conversion [on Felix Sasaki - due 2014-10-27].

yves: things to be done:
... coming up with rules for processing the mapping
... using also an ITS processor
... output would be similar to the test output we generate
... but we also need to come up with a way to process the file
with an XLIFF processor
... but I don't have a format for that

testing output

[20]https://www.w3.org/International/its/wiki/XLIFF_2.0_Mapping#General_implementation_and_testing_considerations

[20] https://www.w3.org/International/its/wiki/XLIFF_2.0_Mapping#General_implementation_and_testing_considerations

yves: for XLIFF output, testing:
... every single element for which we can apply ITS
... all elements have IDs
... so we can generate an XLIFF location of the node
... instead of using XPath, using the XLIFF IDs
... most of the xliff processors should be able to process that
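
[Illustration of locating nodes by ID instead of XPath. This is only a
sketch: no test output format was defined on the call, so the
slash-separated locator built here is hypothetical, as are the sample
document and the function name id_locators. The path simply follows
element nesting and does not model the XLIFF rules for the scope of
IDs.]

```python
import xml.etree.ElementTree as ET

# Small hand-written sample, assumed for illustration only.
SAMPLE = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0" version="2.0" srcLang="en">
  <file id="f1">
    <unit id="u1">
      <segment id="s1">
        <source>Hello <mrk id="m1" type="term">world</mrk></source>
      </segment>
    </unit>
  </file>
</xliff>"""


def id_locators(xliff_text):
    """Return (element name, ID path) pairs for every element with an id."""
    root = ET.fromstring(xliff_text)
    locators = []

    def walk(element, id_path):
        element_id = element.get("id")
        if element_id is not None:
            id_path = id_path + [element_id]
            local_name = element.tag.split("}")[-1]  # drop the namespace part
            locators.append((local_name, "/".join(id_path)))
        for child in element:
            walk(child, id_path)

    walk(root, [])
    return locators


for name, locator in id_locators(SAMPLE):
    print(name, locator)
```

[A test file could then pair each locator with the expected ITS values
for that node, much as the existing ITS test output pairs an XPath with
its values.]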

felix: would one need to take the scope of the ID into account?

yves: good point
... technically you are testing only if the value of the ITS
information is correct
... applying the scope is only an XLIFF problem
... the tests for the ITS module don't need to test the scope

david: still the same issue
... the scope of the IDs can be non-wellformed

<scribe> ACTION: yves to try to come up with example of
xliff+its test format / output [recorded in
[21]http://www.w3.org/2014/10/20-i18nits-minutes.html#action02]

<trackbot> Created ACTION-54 - Try to come up with example of
xliff+its test format / output [on Yves Savourel - due
2014-10-27].

rules file

[22]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0000.html

[22] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0000.html

felix: discussion on xliff namespace - semantic of attribute
would affect spans

david: would affect also xliff
... so makes sense to have the namespace xliff hosted

christian: need to be clear what the issue is and to see where
we have the issue: in xliff or its or both

[23]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0023.html

[23] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0023.html

[24]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0024.html

[24] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0024.html

<dF>
[25]http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html

[25] http://www.w3.org/2001/tag/doc/URNsAndRegistries-50.html

yves: processing is fine for ITS processor, to do extra
processing before it processes it
... if it is well defined
... e.g. having an XSLT that does it back and forth
... it is a limitation too because you cannot use an XLIFF file
with an ITS processor
... for me it is something marginal - in most of the cases it
will be with an XLIFF processor, not an ITS processor

felix: agree

david: above links explain different approaches to namespaces
in W3C and OASIS - W3C uses http URIs, OASIS uses URNs
... xliff syntax expects urn type uri, not http type of uri
... another good reason to have oasis hosted namespace
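
[Illustration of the two namespace styles david contrasts. This is a
small sketch that only reads the URI scheme of the existing ITS and
XLIFF 2.0 core namespaces. The dictionary keys and the function name
namespace_scheme are hypothetical, and no namespace for the mapping
itself is assumed here.]

```python
from urllib.parse import urlparse

NAMESPACES = {
    "its": "http://www.w3.org/2005/11/its",           # W3C style: http URI
    "xliff": "urn:oasis:names:tc:xliff:document:2.0",  # OASIS style: URN
}


def namespace_scheme(uri):
    """Return the URI scheme of a namespace name, e.g. 'http' or 'urn'."""
    return urlparse(uri).scheme


for prefix, uri in NAMESPACES.items():
    print(prefix, namespace_scheme(uri))
```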

yves: advantage of not using directly ITS namespace:
... in some cases we will need to add attributes
... e.g. ITS does not define a local "domain"
... you need that at XLIFF
... we have only a global marker in ITS

david: that was the primary reason to use the additional
namespace

yves: exactly
... that allows you to put together all attributes in the
mapping
... validation is then easier
... there is one case with pre- and post-processing of the file
... we don't have a way to map "tools information"
... there is no way to map tools info in XLIFF and map that
into ITS
... which is ok since we have a preprocessing step

way to describe the transformations

idea to have an algorithm and implement that in different ways:
xslt and others

<dF> I agree that the algorithm should be defined independently

[26]http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0034.html

[26] http://lists.w3.org/Archives/Public/public-i18n-its-ig/2014Oct/0034.html

Provenance and Change Track Module

<dF> still will need the xslt example

action-9?

<trackbot> action-9 -- David Lewis to Look at the XLIFF 2.0
change tracking module for provenance -- due 2014-05-30 -- OPEN

<trackbot>
[27]http://www.w3.org/International/its/ig/track/actions/9

[27] http://www.w3.org/International/its/ig/track/actions/9

<dF> and preferably at least one more

yves: we thought about this before, but did not address this
yet

david: not very clear what the relation is

yves: you could end up with conflicts - which one is right?

david: one could use ctr for historical provenance
... current provenance on core elements should be encoded using
the ITS module

christian: sounds like a new concept / terminology
... "historical provenance"
... we need to define this properly

david: purpose of change track is to be able to tell who made
change

next call

10 november

adjourned

Summary of Action Items

[NEW] ACTION: felix to work on overlap example and to do
conversion [recorded in
[28]http://www.w3.org/2014/10/20-i18nits-minutes.html#action01]
[NEW] ACTION: yves to try to come up with example of xliff+its
test format / output [recorded in
[29]http://www.w3.org/2014/10/20-i18nits-minutes.html#action02]

[End of minutes]
__________________________________________________________

Minutes formatted by David Booth's [30]scribe.perl version
1.138 ([31]CVS log)
$Date: 2014-10-20 16:02:00 $
__________________________________________________________

[30] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
[31] http://dev.w3.org/cvsweb/2002/scribe/

Received on Monday, 20 October 2014 16:26:46 UTC