- From: Daniel Weck <daniel.weck@gmail.com>
- Date: Thu, 4 Dec 2008 16:08:37 +0000
- To: "John Birch" <john.birch@screen.subtitling.com>
- Cc: "Hayes Sean" <Sean.Hayes@microsoft.com>, "Glenn A. Adams" <gadams@xfsi.com>, "Public TTWG List" <public-tt@w3.org>
On 4 Dec 2008, at 15:07, John Birch wrote:
> JB>> Generic XML can be processed using internal content and external
> criteria. I personally view switches as being a way of pre-coding
> common
> processing operations - but I view it as ~dangerous~ to only allow
> those
> pre-coded choices to be made in order to remain 'conformant'.
I see what you mean: you see it as some kind of "anti-pattern", in
reference to software development :)
Now, let's consider this fictitious, yet relevant sample:
<text xml:lang="en">
  <sequence xml:lang="fr" title="Titre en français">
    <p>Texte en français.</p>
    <p xml:lang="fr-CA">Texte en québécois.</p>
    <p xml:lang="en-GB">Text in British English.</p>
  </sequence>
  <p>Text in (unspecified) English.</p>
</text>
If "xml:lang" were to be processed by user-agents as a content
selection criterion, there would be a number of issues:
1) Clearly, content selection wasn't the original intent of the
author. Here, the "xml:lang" attributes obviously decorate the
elements merely to indicate the locale of the content. With the
above XML snippet, XPath and its lang() function can be used, for
example, to pre-process the content (e.g. with an XSLT transform) or
to alter it dynamically (e.g. "highlight any English text in bright
yellow"). This kind of processing by the user-agent seems perfectly
reasonable. On the other hand, my instinctive, subjective assumption
is that content pruning is not the desired goal. To remove this
ambiguity, the TT/DFXP distribution format for captions should provide
more than just a hint: it should clearly specify the intent (IMHO).
This would promote re-use of content across multiple processors.
2) The "xml:lang" attribute applies to an entire XML fragment, until
it is overridden. In a content selection scenario, this nesting
ability prompts a number of questions. For example, what happens if
the user-agent locale is set to "fr": should the top-level "text"
element be totally ignored/pruned, or should the "sequence" be
processed and the following "p" ignored? My systematic, scientific
mind is in favor of the former, but I know authors who would "feel"
that the latter is right.
3) What about more complex selection criteria? Let's say that I want
to mark a piece of text as "suitable for all flavors of French except
Canadian": using a (fictitious) 'matchLanguage' attribute, I could
write matchLanguage="fr AND NOT fr-CA". Note: the comma-separated
values in the SMIL systemLanguage attribute represent an OR boolean
logic, so there are limitations in the selection model.
4) What about fallback logic, so that if no suitable language is
matched, a specific XML fragment is enabled? In SMIL, the
'switch' element offers this mechanism, which enriches the default
selection model based on the combinatory attribute value.
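To make the ambiguity in points 1) and 2) concrete, here is a minimal Python sketch (standard library only) run against the sample above. It is only an illustration under my own assumptions: the function names are hypothetical, the prefix match merely imitates the spirit of the XPath lang() function, and the two selection strategies are my reading of the "former" and "latter" interpretations, not anything TT/DFXP defines.

```python
import xml.etree.ElementTree as ET

# Key under which ElementTree exposes the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = """<text xml:lang="en">
  <sequence xml:lang="fr" title="Titre en français">
    <p>Texte en français.</p>
    <p xml:lang="fr-CA">Texte en québécois.</p>
    <p xml:lang="en-GB">Text in British English.</p>
  </sequence>
  <p>Text in (unspecified) English.</p>
</text>"""


def lang_matches(effective, wanted):
    """Prefix match in the spirit of lang(): 'fr' matches 'fr' and 'fr-CA'."""
    effective, wanted = effective.lower(), wanted.lower()
    return effective == wanted or effective.startswith(wanted + "-")


def paragraphs_with_lang(elem, inherited=""):
    """Yield (effective_lang, text) for each <p>, resolving xml:lang inheritance."""
    lang = elem.get(XML_LANG, inherited)
    if elem.tag == "p":
        yield lang, elem.text
    for child in elem:
        yield from paragraphs_with_lang(child, lang)


def select_top_down(elem, wanted, inherited=""):
    """Assumed 'former' reading: a non-matching element is pruned with its subtree."""
    lang = elem.get(XML_LANG, inherited)
    if not lang_matches(lang, wanted):
        return []  # everything below this element is lost, whatever its language
    found = [elem.text] if elem.tag == "p" else []
    for child in elem:
        found += select_top_down(child, wanted, lang)
    return found


def select_per_paragraph(root, wanted):
    """Assumed 'latter' reading: each <p> is judged on its own effective language."""
    return [text for lang, text in paragraphs_with_lang(root)
            if lang_matches(lang, wanted)]


root = ET.fromstring(SAMPLE)
print(select_top_down(root, "fr"))       # [] : the root is "en", so all is pruned
print(select_per_paragraph(root, "fr"))  # the two French paragraphs
print(select_top_down(root, "en"))       # only the last paragraph; en-GB is lost
print(select_per_paragraph(root, "en"))  # both English paragraphs
```

The same document gives four different results depending on the locale and the strategy, which is the kind of divergence between user-agents that I am worried about.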
I feel that a proper "content control" mechanism would address these
concerns. Otherwise, I am not convinced that TT/DFXP will sufficiently
eliminate the ambiguities that user-agent implementors and content
authors (or developers of production tools) will face, and I would
recommend clearly stating that xml:lang is not designed for content
selection, and reflecting that in the user-agent conformance
guidelines.
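As a rough illustration of the kind of content control discussed in points 3) and 4), the following Python sketch models an OR-only language list and a 'switch'-like fallback. It is a simplified model written for this discussion, not the normative SMIL algorithm: the function names and the sample data are hypothetical, and the matching rule (a user language matches an entry that equals it or extends it with "-") is my simplified reading of systemLanguage.

```python
def system_language_ok(attr_value, user_lang):
    """OR logic: true if any comma-separated entry equals the user
    language or extends it with a '-' subtag (simplified assumption)."""
    user = user_lang.strip().lower()
    for entry in attr_value.split(","):
        entry = entry.strip().lower()
        if entry == user or entry.startswith(user + "-"):
            return True
    return False


def pick_from_switch(candidates, user_lang):
    """Return the content of the first acceptable candidate, in order.

    candidates is a list of (language_list_or_None, content) pairs;
    a None language list is unconditional, so it acts as the fallback
    when placed last.
    """
    for languages, content in candidates:
        if languages is None or system_language_ok(languages, user_lang):
            return content
    return None


# Hypothetical alternatives, listed in document order.
SWITCH = [
    ("fr, fr-BE", "Texte en français."),
    ("en", "Text in English."),
    (None, "[no suitable language]"),
]

print(pick_from_switch(SWITCH, "fr"))  # Texte en français.
print(pick_from_switch(SWITCH, "de"))  # [no suitable language]
```

Note that nothing in this OR-only model can express a negation such as "fr AND NOT fr-CA"; the closest approximation is to enumerate every accepted variant by hand, which is exactly the limitation mentioned in point 3).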
> JB>> If we did not have existent implementations then I would be
> proposing two language attributes. One to allow a language specific
> instance of a DFXP document (i.e. the true xml:lang sense) and
> another -
> perhaps ttm:lang, to define the language used in sections of the
> document.
The "xml:lang" attribute from XML 1.0 and 1.1 can handle both scenarios
you mention. "xml:lang" is not meant to be limited to the document
instance, as far as I know. The "lang" versus "xml:lang" mess has been
fixed in XHTML 1.1 IIRC; isn't that a good trend to follow?
Regards, Dan
Received on Thursday, 4 December 2008 16:09:15 UTC