Re: [ACTION-160] (related to [ACTION-135] too) Summarize specialRequirements

Hi Yves,

thanks a lot for taking the time to explain this in detail, this makes
things much clearer for me. I still have some questions below.

2012/7/6 Yves Savourel <ysavourel@enlaso.com>

> Hi Felix, all,
>
> >> - It mixes executing the check with storing
> >> the info to check. Separating the max-size info
> >> from the text is cumbersome.
> >
> > No need to separate the check from the file - you can
> > add a link to a schematron file from the file to
> > be checked, just like with ITS linked global rules.
>
> Sorry if I was confusing. What I meant was: In your example, it's not good
> to have the 35 limit (or the reference to 'gui') hard-coded in the
> Schematron script: That information should come from the document.
> Hard-coding that would be the "quick and dirty" way to go. If you start
> really developing this, my guess is that quickly you'll want to use
> parameters/variables provided from the input.
>
>
>
> >> - It works with XML only. Interactive checks are
> >> more efficient than batch process for this (checks
> >> as you type the translation), and that happens in
> >> tools not on XML files.
> >
> > Isn't it the case for all data categories in ITS 1.0
> > and 2.0 that they only work in a markup world?
> > After all global rules rely on XPath.
>
> Sure. And most of those rules just pass along some information: they don't
> act on it: this is a term, this is LTR, this is translatable, this should
> be at most 35 chars, etc.
>
>
>
> >> IMO the goal is to provide the information about
> >> the maximum size, so it can be passed on to
> >> whatever system is used to do the translation/validation.
> >
> > Mm ... not sure:  assuming you have a global rule like
> > <itsx:lengthConstraintRule select="//gui" length="100"/>
> > the system using that rule needs to have at least a
> > pre-process that's in the HTML5/XML world: it needs
> > to process XPath.
>
> I think often the tool doing the actual check will have no relation with
> XPath at all: Nowadays we work with components a lot more: For example, A
> filter extracts the data from a HTML5/XML document and stores that in--for
> instance--XLIFF or PO, then another tool imports that extracted data and
> this is where translation and validation occurs. That second component
> probably does not know anything about ITS or even XML.
>


So I think - if I understand you correctly - what you want to achieve is
that tools can make use of metadata coming from various sources
(XLIFF, ITS, PO, ...). Currently that metadata is not passed along at all,
or only in proprietary ways. The aim now is to have one agreed metadata
definition for max-size, right?
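
To check that I understand the division of labour, here is a minimal sketch
in Python. It is only an illustration under my own assumptions, not a
proposal: the names Segment, extract_segments and check_lengths are
hypothetical, the rule is a plain dictionary standing in for something like
the lengthConstraintRule example, and the limit of 35 is the one from your
example. The first component knows about XML and selectors; the second one
only sees text and a number.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Illustrative input; element names and the 35-character limit are assumptions.
DOC = """<doc>
  <gui>Save file</gui>
  <p>Press the button to store your work.</p>
  <gui>Open preferences</gui>
</doc>"""

# Stand-in for a global rule such as
# <itsx:lengthConstraintRule select="//gui" length="35"/>
RULE = {"select": "//gui", "length": 35}


@dataclass
class Segment:
    """Extracted text plus its optional max-size metadata."""
    text: str
    max_length: Optional[int]  # None means "no constraint"


def extract_segments(doc_xml: str, rule: dict) -> List[Segment]:
    """Component 1: the only part that knows about XML and selectors."""
    root = ET.fromstring(doc_xml)
    # ElementTree implements only a subset of XPath, so the absolute
    # "//gui" is rewritten to the relative ".//gui" for this sketch.
    selected = {id(e) for e in root.findall("." + rule["select"])}
    segments = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text:
            limit = rule["length"] if id(elem) in selected else None
            segments.append(Segment(text, limit))
    return segments


def check_lengths(segments, translations) -> List[Tuple[str, int, int]]:
    """Component 2: sees only text and numbers, no XML and no XPath."""
    problems = []
    for seg, target in zip(segments, translations):
        if seg.max_length is not None and len(target) > seg.max_length:
            problems.append((target, len(target), seg.max_length))
    return problems


segments = extract_segments(DOC, RULE)
translations = [
    "Datei speichern",
    "Klicken Sie auf die Schaltfläche, um Ihre Arbeit zu speichern.",
    "Das Fenster mit den Einstellungen jetzt öffnen",
]
problems = check_lengths(segments, translations)
```

Only the third translation is reported: the second one is longer, but it
carries no constraint. The point of the sketch is that check_lengths could
just as well receive its segments from a PO or XLIFF reader.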



>
> In addition Localization environments are obviously not restricted to
> HTML5/XML input: they will provide this same type of length check for many
> resource-type formats.
>
> Sure, one localization system can choose to do all the work in XML and use
> Schematron for the checks. No harm in that. But that would be happening
> separately from the original documents, at a stage where all the data have
> been extracted and are somewhere at the middle of the process.
>
>
>
> > I think Schematron has also the benefit that you can do
> > general constraint checks - basically everything that
> > can be expressed in an XPath expression.
>
> Sure. For HTML5/XML documents and on the original data. And I'm sure you
> could work out a Schematron script that uses the ITS rules to provide such
> check in a generic way (taking its data from the document, rather than
> hard-coding them).
> That is a very valid way to go--in the right context.
>
>
>
> > Advertising tool makers to use that mechanism instead
> > of inventing (potentially a set of) several new ones
> > (length check, character restriction, languages
> > allowed in the process etc.) might actually lower
> > implementation efforts.
>
> But my point is that all this stuff is already implemented one way or
> another :)
>

Sure - and I understand that a tool chain which does the check in the XML
context, and then passes only the outcome of a Schematron check to other
modules rather than the requirements of what to check, would probably need
more implementation effort.


> Localization tools haven't waited for XML/XLIFF/ITS/etc. to do this.
>
> We just need a standard way to pass them a meaningful set of information.
> Some tools may need to be adapted a bit to consume it, but it's not a big
> endeavor.
>
> Also: checker tools don't just look at length, they do a lot more, that
> Schematron may or may not be able to do. We have to consider the big
> picture: checking max size is a tiny portion of translation verification.
> You don't want it to be a special case.
>

What worries me then is that we aim to create a single piece of metadata,
which is not part of the big picture. That raises several questions /
requirements:

- If we go that route, we need to make sure that the solution is compatible
with what is being developed, at least in XLIFF. Otherwise I see roundtripping
problems with competing definitions. Frederik, do you see max-length as
compatible with what you are going to develop? Are there any public drafts
that we can
review? Note that I am *not* talking about a formal liaison between the
XLIFF TC and the MLW-LT group - I just want to make sure that the technical
details of potential solutions fit together.

- How will our special purpose solution for length relate to what tools do
in other areas? In a sense this is similar to "quality": I'm sure other
tools have their own checks too, like what you describe in the "questions
about OKAPI errors" thread. As with quality, it is hard to find a
definition of the big picture that fits all tools. So is it a good approach
to have separate small pieces defined for the things we can easily identify,
or should we put more effort into the "big picture"?
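
To illustrate the roundtripping concern from the first point, here is a
small Python sketch. It assumes the constraint is carried in the maxwidth
and size-unit attributes that XLIFF 1.2 defines on trans-unit; whatever the
XLIFF TC is drafting now may look different, which is exactly what I would
like to review. The function names to_trans_unit and from_trans_unit are
hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def to_trans_unit(unit_id, source_text, max_length):
    """Write a segment and its optional character limit as a trans-unit."""
    tu = ET.Element(f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
    if max_length is not None:
        # Assumption: the limit is counted in characters.
        tu.set("maxwidth", str(max_length))
        tu.set("size-unit", "char")
    source = ET.SubElement(tu, f"{{{XLIFF_NS}}}source")
    source.text = source_text
    return tu


def from_trans_unit(tu):
    """Read the text and the character limit back; None if absent."""
    source = tu.find(f"{{{XLIFF_NS}}}source")
    raw = tu.get("maxwidth")
    if raw is not None and tu.get("size-unit") == "char":
        limit = int(raw)
    else:
        limit = None
    return source.text, limit


# Serialize, parse again, and compare with what went in.
serialized = ET.tostring(to_trans_unit("1", "Save file", 35), encoding="unicode")
text, limit = from_trans_unit(ET.fromstring(serialized))
```

If the definition on our side used, say, a different unit or a storage size
instead of a character count, the value would not survive this round trip
unchanged, and that is the kind of mismatch I would like to avoid.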

I'm not sure yet if I would agree with max-size, if we don't have the
relation to the "big picture" and esp. to XLIFF made clear.

Best,

Felix


>
> Just my 2 cents.
> -yves
>
>
>
>


-- 
Felix Sasaki
DFKI / W3C Fellow

Received on Sunday, 8 July 2012 16:05:36 UTC