Re: [MathML4] Whitespace and attributes canonicalization in MathML VS HTML5/CSS

On 01/08/2016 17:31, Frédéric Wang wrote:
> Hi Math WG,
>
> Continuing on feedback for a future MathML specification, here is a
> (probably non-exhaustive) list of inconsistencies between MathML and
> HTML5/CSS regarding whitespace and attributes canonicalization. As a
> rule of thumb, it would be better for web engines if MathML aligned with
> HTML5 so that we can reuse as much code as possible and avoid extra code
> to handle MathML special cases. Also people familiar with HTML5 will be
> less surprised when handling MathML.
>
> 1) Whitespace collapsing/trimming
>    https://www.w3.org/TR/MathML/chapter2.html#fund.collapse
>
>    Whitespace collapsing is consistent with the default CSS property
> "white-space" and people are familiar with it.
>
>    Removing "whitespace at the beginning and end of the content" is less
> expected. Gecko has some code to handle this but it would be very
> helpful to avoid this additional complexity. WebKit does not handle it
> at the moment and it's not clear it's worth doing it... Except in the
> MathML spec/test, everybody seems to just write <mo>(</mo> and not <mo>
> ( </mo>. Can we deprecate this behavior in MathML4? Or maybe you should
> work with the HTML5 WG to define such collapsing rules during document
> parsing, so that the MathML rendering code no longer needs to handle it?
>
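To make the distinction in point 1 concrete, here is a minimal Python sketch that separates collapsing from trimming. The helper names `collapse` and `collapse_and_trim` are hypothetical, and the whitespace set is the MathML one listed in point 2; this is an illustration, not code from Gecko or WebKit.

```python
import re

# Whitespace characters as listed for MathML in point 2: space, tab, LF, CR.
MATHML_SPACE = " \t\n\r"


def collapse(text: str) -> str:
    """Replace each run of whitespace with a single space (no trimming)."""
    return re.sub(f"[{MATHML_SPACE}]+", " ", text)


def collapse_and_trim(text: str) -> str:
    """Collapse runs, then also drop leading and trailing whitespace."""
    return collapse(text).strip(MATHML_SPACE)
```

With only collapsing, the content of <mo> ( </mo> keeps one space on each side of the parenthesis; the extra trimming step is what reduces it to "(".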
> 2) In MathML, white spaces are understood as XML spaces (U+0020), tabs
> (U+0009), line feeds (U+000A), and carriage returns (U+000D) while HTML5
> also includes "form feed" (U+000C).
>
>     https://www.w3.org/TR/html5/infrastructure.html#space-character
>    
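The difference in point 2 comes down to a single character. A small Python sketch, using hypothetical names (`MATHML_SPACE`, `HTML5_SPACE`, `is_blank`), shows how a string containing a form feed is classified differently under the two definitions:

```python
# Code points taken from point 2 above.
MATHML_SPACE = frozenset("\u0020\u0009\u000A\u000D")
# HTML5 adds form feed (U+000C) to the same set.
HTML5_SPACE = MATHML_SPACE | {"\u000C"}


def is_blank(text: str, space_chars: frozenset) -> bool:
    """Return True if every character of text is in space_chars."""
    return all(ch in space_chars for ch in text)
```

A string such as " \f" counts as whitespace-only with `HTML5_SPACE` but not with `MATHML_SPACE`, so an engine sharing one helper between HTML and MathML would need a special case for it.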
> 3) MathML attributes are case-sensitive while HTML5 attributes are
> case-insensitive. Case sensitivity is probably not a problem for users
> and it makes parsing easier. However, WebKit developers writing or
> reviewing patches have often considered doing case-insensitive
> comparisons as that's consistent with the rest of the code base.
>
> 4) MathML boolean attributes take value "true" and "false". In HTML5,
> the boolean value is given by the presence/absence of the attribute and
> the only allowed values are the empty string and the name of the
> attribute. This allows a more compact syntax like <mo largeop stretchy>
> instead of <mo
> largeop="true" stretchy="true">. However, Web engines and authoring
> tools will continue to support the true/false syntax anyway, so it's
> probably not worth adding complexity here...
>
>    https://www.w3.org/TR/html5/infrastructure.html#boolean-attributes
>
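The two conventions in point 4 can be contrasted with a short Python sketch. It assumes attributes are available as a plain dict of name to value; the function names are hypothetical, and falling back to the default on an unrecognized MathML value is an assumption made for the illustration.

```python
def mathml_boolean(attrs: dict, name: str, default: bool = False) -> bool:
    """MathML style: the value must be the string "true" or "false"."""
    value = attrs.get(name)
    if value == "true":
        return True
    if value == "false":
        return False
    # Missing or unrecognized value: fall back to the default (assumption).
    return default


def html5_boolean(attrs: dict, name: str) -> bool:
    """HTML5 style: presence of the attribute means true, whatever its value."""
    return name in attrs
```

Note that the two readings disagree on largeop="false": the MathML-style check yields false, while the HTML5-style check yields true because the attribute is present. That conflict is one reason supporting both syntaxes at once adds complexity.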
> 5) As I said in a previous message, the values "small", "normal", "big"
> of mathsize do not exist for CSS font-size. Removing them would simplify
> the parsing code a bit.
>
> 6) The definition of numbers is also not very accurate in the MathML
> recommendation compared to HTML5. One has to check the RelaxNG schemas
> and the predefined RelaxNG types to know the exact syntax. Again, I
> think it would be best to rely on the HTML5 definitions. For example,
> <math><mspace width="1E1em" height="10em" mathbackground="red"/></math>
> draws a red square in WebKit but Gecko says "1E1em" is invalid.
>
>    https://www.w3.org/TR/html5/infrastructure.html#numbers
>
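The "1E1em" disagreement in point 6 can be reproduced with two deliberately simplified patterns in Python. These regular expressions are assumptions made for illustration only; they are not the exact grammar of the MathML schemas or of the HTML5 number rules, and the unit list is abbreviated.

```python
import re

# Abbreviated, case-sensitive unit list (an assumption for this sketch).
UNITS = r"(?:em|ex|px|in|cm|mm|pt|pc|%)"

# Plain decimal number followed by a unit, no exponent allowed.
PLAIN = re.compile(rf"-?(?:\d+\.?\d*|\.\d+){UNITS}")

# Same as PLAIN, but with an optional exponent such as E1 or e-3.
WITH_EXPONENT = re.compile(rf"-?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?{UNITS}")


def accepts(pattern: re.Pattern, value: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None
```

Both patterns accept "10em", but only `WITH_EXPONENT` accepts "1E1em". That matches the outcome described above, where one engine draws the square and the other rejects the width, without claiming either engine is implemented this way.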
> Frédéric
>
Moved to individual GitHub issues:

https://github.com/mathml-refresh/mathml/issues/28

https://github.com/mathml-refresh/mathml/issues/21

https://github.com/mathml-refresh/mathml/issues/22

https://github.com/mathml-refresh/mathml/issues/33

https://github.com/mathml-refresh/mathml/issues/7

https://github.com/mathml-refresh/mathml/issues/23

-- 
Frédéric Wang

Received on Friday, 22 February 2019 14:11:17 UTC