[Bug 2878] "Element within text" data category

http://www.w3.org/Bugs/Public/show_bug.cgi?id=2878





------- Comment #13 from ysavourel@translate.com  2006-03-15 23:31 -------

Hi Andrzej,

I agree with you that ITS should be able to specify the "subflow" elements
(elements that contain an independent text run, like a footnote).

But my understanding was that it was addressed by not listing such elements in
the withinText list and relying on the process to identify them.

As I see it (so far) we can do this because when an element can be "subflow" it
does not matter whether it is inside a text run or outside: from a processing
viewpoint there is little or no difference.

Maybe examples would make my thoughts clearer. I start with these principles:

- If an element is set as being "within text", it (and its content) goes with
the text unit where it occurs.

- If an element is not set as being "within text", it is either an element with
no text content, or an element with text content that is a subflow. And in both
cases you have to do (almost) the same.
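
A minimal sketch of these two principles, assuming a processor that walks the
tree with Python's xml.etree.ElementTree. The names content() and text_units()
and the way inline tags are serialized are hypothetical, for illustration
only; nothing here is defined by ITS.

```python
import xml.etree.ElementTree as ET

def content(elem, within_text, units):
    """Return the text run of elem; spin off other children as new units."""
    parts = [elem.text or ""]
    for child in elem:
        inner = content(child, within_text, units)
        if child.tag in within_text:
            # "within text": the element and its content stay in this run
            parts.append(f"<{child.tag}>{inner}</{child.tag}>")
        elif inner.strip():
            # not "within text": its text (if any) becomes its own unit
            units.append(inner.strip())
        parts.append(child.tail or "")
    return "".join(parts)

def text_units(root, within_text):
    """Collect all text units under root (nested units come first)."""
    units = []
    top = content(root, within_text, units)
    if top.strip():
        units.append(top.strip())
    return units

li = ET.fromstring(
    "<li>Some <b>text</b> and some more. <p>And some more text</p></li>"
)
print(text_units(li, within_text={"b"}))
# ['And some more text', 'Some <b>text</b> and some more.']
```

Note that the sketch treats an element with no text and a subflow element the
same way, which is the point of the second principle.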

My thought is that an its:subflow attribute would work, but also would force us
to have more rules, while that same information could be gathered during the
process.

For example, in XHTML an <li> element can contain PCDATA and %Flow; which
includes things like <p> and <b>.
That means we can have:

<li>Some <b>text</b> and some more. <p>And some more text</p></li>

<b> is withinText, no issue. But <p> is also found outside of <li> and should
be treated as not inline then. And when inside <li> it should be treated as
subflow. So we certainly could do something like:

<its:documentRules>
 <its:withinText its:selector="//b" its:withinText="yes"/>
 <its:withinText its:selector="//p" its:withinText="no"/>
 <its:withinText its:selector="//li/p" its:withinText="yes" its:subflow="yes"/>
 ...
</its:documentRules>

But things very quickly get out of control as far as the number of rules and
overrides you have to write: the case of <p> is also true for <h1> and many
more elements (and it can be worse if you move to formats such as DocBook).

On the other hand, the difference between handling "//p" and "//li/p" is
minimal: in both cases you start a new text unit. The only difference is
whether it's a subflow text unit or not, and that can be detected by checking
whether the parent of <p> has text nodes or not.
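
That check can be sketched as follows, assuming Python's
xml.etree.ElementTree, where the text nodes of a parent are its own .text plus
the .tail of each child. The function names and the three labels are
hypothetical, not part of ITS.

```python
import xml.etree.ElementTree as ET

def parent_has_text(parent):
    """True if parent has at least one non-whitespace text node of its own."""
    if parent.text and parent.text.strip():
        return True
    return any(child.tail and child.tail.strip() for child in parent)

def classify(parent, child, within_text):
    """Label child as 'within-text', 'subflow', or a plain 'text-unit'."""
    if child.tag in within_text:
        return "within-text"
    if parent_has_text(parent):
        # a new text unit that starts inside an existing text run
        return "subflow"
    return "text-unit"

li = ET.fromstring(
    "<li>Some <b>text</b> and some more. <p>And some more text</p></li>"
)
body = ET.fromstring("<body><p>And some more text</p></body>")

print(classify(li, li.find("p"), {"b"}))      # subflow
print(classify(body, body.find("p"), {"b"}))  # text-unit
print(classify(li, li.find("b"), {"b"}))      # within-text
```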

Thinking about all the normal withinText cases like <b>, someone is going to
ask me:

"Then why can't you replace withinText by simply checking if the parent has
text nodes? Like for <p>? No need for that data category then."

That is because there are cases where it cannot be detected:

<li><b>Some text </b><i>Some text</i></li>

Then, Andrzej you are going to say:

"Then it means in cases like that <p> as a subflow cannot be detected either!"

And I'm going to say:

Yes, like here:

<li><p>Some text </p><p>Some text</p></li>

But that is OK, because if such a case is treated as a normal text run instead
of a subflow, it does not matter. For <b> and <i>, on the other hand, the order
of the elements may need to be changed during translation, and therefore it is
important that <b> is detected as withinText in that case.
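
To illustrate why the parent check alone is not enough, here is a small sketch
(again assuming Python's xml.etree.ElementTree; has_own_text() is a
hypothetical helper) applied to both examples. In each case the <li> has no
text nodes of its own, so the check gives the same answer and cannot tell the
inline <b>/<i> pair from the two <p> blocks.

```python
import xml.etree.ElementTree as ET

def has_own_text(parent):
    """True if parent has at least one non-whitespace text node of its own."""
    if parent.text and parent.text.strip():
        return True
    return any(child.tail and child.tail.strip() for child in parent)

inline_only = ET.fromstring("<li><b>Some text </b><i>Some text</i></li>")
blocks_only = ET.fromstring("<li><p>Some text </p><p>Some text</p></li>")

# Same result for both: the structure alone does not say which children
# are inline, so <b> and <i> still need to be listed as withinText.
print(has_own_text(inline_only))  # False
print(has_own_text(blocks_only))  # False
```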

In other words, in the case of an "inline" element that is "subflow", I think
the "subflowness" information can be obtained by simply not listing the element
as withinText, and the fact that such an element is actually "inline" can be
detected by simply knowing whether or not you are already within a text run
when you reach that element.

All this obviously needs to be validated by implementations...
Working on it.

Cheers,
-yves

Received on Wednesday, 15 March 2006 23:32:07 UTC