Re: Processing question for HTML default behavior

Hi Yves,

I thought the best-practice document should address exactly this topic.
It should describe a default rule set like this that authors can use.

But with the problem you describe you raise a very good point. And I'm 
not sure you can solve it as long as you automatically add rules to the 
content. The author has to be aware of these rules and should send them 
to you, so he can disable the rules when everything shouldn't be 
translated.

The only thing I can currently think of would be a very hackish solution 
on your side. If you add rules automatically, then local attributes get 
more weight than any rule, even for inherited values AND for attributes. 
As long as a value is set by a local attribute, or inherited from a 
local attribute, translate rules don't apply.
I think this would only work for the translate data category and not for 
other categories, as they don't have only yes or no. And it would be 
nothing we could describe well in a standard, because from the 
standard's point of view, if you have these rules then they apply, which 
is totally correct. But not when they are added automatically, as in 
your case, because you are being nice to your customers and considerate 
of people who don't know much about ITS.
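
A rough sketch of what I mean, in Python and limited to the translate 
data category. Everything here is an assumption for illustration only: 
the two default sets stand in for whatever rules your filter injects, 
the function names are made up, and the input is treated as well-formed 
XML rather than real HTML. The point is only that a value coming from a 
local translate attribute, directly or through inheritance, is never 
overridden by an injected default.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the rules a tool might inject by default.
DEFAULT_TRANSLATABLE_ATTRS = {"title", "alt"}
DEFAULT_NONTRANSLATABLE_ELEMS = {"script", "style"}


def walk(elem, translate, from_local, out):
    """Record (tag, attribute-or-None, translatable) for elem and its subtree."""
    local = elem.get("translate")
    if local in ("yes", "no"):
        # A local attribute always wins and marks the subtree as locally set.
        translate, from_local = (local == "yes"), True
    elif not from_local and elem.tag in DEFAULT_NONTRANSLATABLE_ELEMS:
        # Injected element defaults apply only where no local markup is in force.
        translate = False
    out.append((elem.tag, None, translate))

    # A local (or locally inherited) "no" also switches off the attribute defaults.
    locally_blocked = from_local and not translate
    for name in elem.attrib:
        if name == "translate":
            continue
        translatable = name in DEFAULT_TRANSLATABLE_ATTRS and not locally_blocked
        out.append((elem.tag, name, translatable))

    for child in elem:
        walk(child, translate, from_local, out)


def resolve(root):
    """Resolve translate values for a whole tree; ITS default is translatable."""
    out = []
    walk(root, True, False, out)
    return out
```

With this, a document whose root carries translate="no" comes out 
completely non-translatable, title and alt attributes included, while a 
document without local markup still gets the injected defaults.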

But in my mind you can't handle this correctly when you are changing the 
content and adding rules to it. They have to send the rules to you; you 
should not have to add rules for them. If the author doesn't want you to 
translate any title tag, he also has to be aware that you add a rule to 
translate the title tag, so that he can write a rule which disallows it 
and overrides yours.
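
To make the override case concrete, here is a small Python sketch. It 
assumes a heavily simplified model in which a rule is just an element 
name plus a yes/no value (real ITS rules use selectors such as XPath), 
and the names effective_translate, tool_defaults and author_rules are 
made up for this example. It relies only on the ITS behaviour that, 
among global rules, a later matching rule takes precedence over an 
earlier one.

```python
def effective_translate(tag, rules, default=True):
    """Return the translate value for tag; the last matching rule wins."""
    value = default
    for selector, translate in rules:
        if selector == tag:
            value = translate  # a later matching rule replaces an earlier one
    return value


# Hypothetical rule added automatically by the tool.
tool_defaults = [("title", True)]
# Rule the author has to write, knowing that the default above exists.
author_rules = [("title", False)]

print(effective_translate("title", tool_defaults))
print(effective_translate("title", tool_defaults + author_rules))
```

The author's rule only helps if it is applied after the tool's rule, 
which is exactly why he needs to know that the tool's rule exists.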


On 20.02.2013 18:54, Yves Savourel wrote:
> Hi all,
> I'm running into a processing issue in our HTML filter because I'm trying to provide a set of default rules.
> Maybe some of you have run into the same issue and have found a solution.
> The problem:
> When our filter processes an HTML file we set a list of default ITS rules that correspond to what a user would expect from a normal extraction of HTML. For example title or alt attributes should be translatable, b, i, u, em, and many more elements should be seen as inline, etc.
> The user does not have to define those rules. They can modify them, but usually they would not.
> The issue comes when there is local ITS markup. For example a translate='no' on <html>. Such a document, when you look at it, should be completely non-translatable. But in our case, because we have default rules, anything that is defined globally as translatable in those rules is not inheriting the top-level translate='no' and therefore is seen as translatable.
> The problem then is that an author doesn't necessarily know what our default-HTML rules are and therefore is not able to mark up his HTML accordingly.
> How do other people work with default-ITS behavior vs default HTML-expected behaviors?
> To some degree there is a disconnect between some of the default ITS behavior and the HTML reality. For example the specification explicitly says an HTML id attribute is the same as an ITS id attribute, so there is an expectation that you don't have to set a rule for it. But what about many other things like for example the title and alt attributes? They should be normally translated, but ITS does not say that, so it's up to the tool to provide a way to do it.
> I think we really need to have a more formal way to define what the expectations for HTML are. Maybe not normative, but something written in stone that processors can rely on, otherwise we'll end up with different tools behaving differently on the same input HTML.
> Cheers,
> -yves

Received on Wednesday, 20 February 2013 19:26:33 UTC