
Re: Processing question for HTML default behavior

From: Phil Ritchie <philr@vistatec.ie>
Date: Thu, 21 Feb 2013 06:42:38 +0000
To: "Yves Savourel" <ysavourel@enlaso.com>
Message-ID: <05EACE76-1ED1-4022-BC39-7BD6AF47AE1B@vistatec.ie>
Cc: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>

I think a best practice document is the optimal way to address this.

Phil



On 20 Feb 2013, at 22:54, "Yves Savourel" <ysavourel@enlaso.com> wrote:

> Hi Felix, Karl,
>
> Thanks for the feedback.
>
> Basically I think, like both of you, that this could be addressed in some
> BP document.
> But if so, I think it needs to be very strong and clear.
>
> The bottom line is: how MUST an ITS processor behave on an HTML file if
> you give it no rules?
> Currently it behaves in a way that is not really usable: one has to add
> rules to properly process the file.
> So having an official set of rules to complement the specification is
> good, but I'm almost thinking it should be mentioned in the specification
> itself (pointing to the rules file, for example?).
>
> Karl: the work-around you described would probably work, but it would be
> quite a hack. I don't think we'd like to go in that direction.
>
> -ys
>
> -----Original Message-----
> From: Karl Fritsche [mailto:karl.fritsche@cocomore.com]
> Sent: Wednesday, February 20, 2013 12:26 PM
> To: public-multilingualweb-lt@w3.org
> Subject: Re: Processing question for HTML default behavior
>
> Hi Yves,
>
> I thought the best practice document should address exactly this topic.
> It should describe such a default rule set that authors can use.
>
> But with the problem you describe, you make a very good point. And I'm
> not sure you can solve this as long as you automatically add rules to the
> content. The author has to be aware of these rules and should send them
> to you, so he can disable the rules when nothing should be translated.
>
> The only thing I can currently think of would be a very hackish solution
> on your side: if you automatically add rules, then local attributes have
> more weight than any rule, even for inheritance AND for attributes. As
> long as a local attribute applies, directly or through inheritance, the
> translate rules don't apply.
> I think this would only work for the translate data category and not for
> other categories, as those don't have only yes or no values. And it would
> be nothing we could describe well in a standard, because from the
> standard's point of view, if you have these rules then they apply, which
> is totally correct. But not in the case where they are added
> automatically, like in your case, because you are nice to your customers
> and aware of people who don't know much about ITS.
>
> But in my mind you can't handle this the correct way when you're
> changing content and adding rules there. They have to send the rules to
> you, rather than you adding rules for them. Because when the author
> doesn't want you to translate any title tag, he also has to be aware that
> you add rules to translate the title tag, in order to write a rule which
> disallows it and overrides your rule.
>
> Cheers,
> Karl
>
> On 20.02.2013 18:54, Yves Savourel wrote:
> > Hi all,
> >
> > I'm running into a processing issue in our HTML filter because I'm
> > trying to provide a set of default rules.
> > Maybe some of you have run into the same issue and have found a
> > solution.
> >
> > The problem:
> >
> > When our filter processes an HTML file we set a list of default ITS
> > rules that correspond to what a user would expect from a normal
> > extraction of HTML. For example, title or alt attributes should be
> > translatable; b, i, u, em, and many more elements should be seen as
> > inline; etc.
> >
> > The user does not have to define those rules. They can modify them, but
> > usually they would not.
> >
> > The issue comes when there is local ITS markup. For example, a
> > translate='no' on <html>. Such a document, when you look at it, should
> > be completely non-translatable. But in our case, because we have default
> > rules, anything that is defined globally as translatable in those rules
> > is not inheriting the top-level translate='no' and therefore is seen as
> > translatable.
> >
> > The problem then is that an author doesn't necessarily know what our
> > default HTML rules are and therefore is not able to mark up his HTML
> > accordingly.
> >
> > How do other people work with default-ITS behavior vs default
> > HTML-expected behaviors?
> >
> > To some degree there is a disconnect between some of the default ITS
> > behavior and the HTML reality. For example, the specification explicitly
> > says an HTML id attribute is the same as an ITS id attribute, so there
> > is an expectation that you don't have to set a rule for it. But what
> > about many other things, like for example the title and alt attributes?
> > They should normally be translated, but ITS does not say that, so it's
> > up to the tool to provide a way to do it.
> >
> > I think we really need to have a more formal way to define what the
> > expectations on HTML are. Maybe not normative, but something written in
> > stone that processors can rely on, otherwise we'll end up with different
> > tool behaviors on the same input HTML.
> >
> > Cheers,
> > -yves
> >
> >
> >
>
>
>
>
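
The precedence problem in Yves's original message, and the work-around Karl
describes, can be sketched in a few lines. This is only an illustration
under stated assumptions, not how any actual ITS processor is implemented:
the Node class, the resolve() function and the local_wins flag are
hypothetical names, rules are keyed by element name instead of real
selectors, and only the translate data category on elements is modelled.
The usual ITS order is assumed: local markup, then global rules, then
inheritance, then the default.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Node:
    """A minimal stand-in for an HTML element (hypothetical, for illustration)."""
    name: str
    translate: Optional[str] = None      # local translate attribute, if any
    parent: Optional["Node"] = None


def inherited_local(node: Node) -> Optional[str]:
    """Return the nearest local translate value set on an ancestor, or None."""
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.translate is not None:
            return ancestor.translate
        ancestor = ancestor.parent
    return None


def resolve(node: Node, rules: Dict[str, str], local_wins: bool = False) -> str:
    """Resolve the translate value for an element.

    rules maps an element name to "yes" or "no" (a simplification of global
    rules). With local_wins=True, a value inherited from a local attribute
    on an ancestor beats any global rule (the work-around from the thread).
    """
    if node.translate is not None:           # 1. local markup on the node
        return node.translate
    if local_wins:                           # work-around: inherited local value
        inherited = inherited_local(node)
        if inherited is not None:
            return inherited
    if node.name in rules:                   # 2. global rule
        return rules[node.name]
    if node.parent is not None:              # 3. inheritance from the parent
        return resolve(node.parent, rules, local_wins)
    return "yes"                             # 4. default for elements


# <html translate="no"><head><title>..</title></head><body><p>..</p></body></html>
html = Node("html", translate="no")
head = Node("head", parent=html)
title = Node("title", parent=head)
body = Node("body", parent=html)
p = Node("p", parent=body)

# Assumed tool-supplied default rule; the author never sees it.
default_rules = {"title": "yes"}

print(resolve(title, default_rules))                   # rule beats inheritance
print(resolve(p, default_rules))                       # inherits from <html>
print(resolve(title, default_rules, local_wins=True))  # work-around behaviour
```

With the assumed default rule, the title element resolves to translatable
even though <html> carries translate="no", which is the surprise described
in the thread; with local_wins=True it resolves to non-translatable, at the
cost of departing from the order the specification describes.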


Received on Thursday, 21 February 2013 06:43:40 UTC
