[Bug 14858] [FO30] format-integer picture string

https://www.w3.org/Bugs/Public/show_bug.cgi?id=14858

--- Comment #4 from Michael Kay <mike@saxonica.com> 2012-01-19 11:13:04 UTC ---
I think the regex for the format modifier should be as follows: if the string
ends with

(([co](\([^()]+\))?)?[at]?)

then this is taken as the format modifier. This differs from the published
regex by the addition of the "+" quantifier. There may be better ways of
expressing this (or paraphrasing it).
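
As a rough illustration only, the "ends with" test can be sketched in Python.
The function name trailing_modifier is hypothetical, and Python's re module
stands in for the XPath regex dialect, so this is a sketch under those
assumptions rather than a reference implementation.

```python
import re

# The proposed format-modifier pattern, anchored at the end of the picture.
FORMAT_MODIFIER = re.compile(r"(([co](\([^()]+\))?)?[at]?)$")

def trailing_modifier(picture: str) -> str:
    """Return the longest suffix of picture that matches the modifier pattern.

    The result may be the empty string, because every part of the pattern
    is optional.
    """
    # search() tries start positions from left to right, so the first
    # position that can reach the end of the string gives the longest suffix.
    return FORMAT_MODIFIER.search(picture).group(1)

if __name__ == "__main__":
    for picture in ["Wwo", "1o(-en)t", "1"]:
        print(repr(picture), "->", repr(trailing_modifier(picture)))
```

The "+" quantifier is what allows a parenthesised argument of more than one
character, such as "(-en)", to be accepted as part of the modifier.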

The regex for the first part of the picture is wrong. The part preceding the
format modifier must either be a decimal digit pattern, which matches

((\p{Nd}|#|[^\p{N}\p{L}])+?)

or some other format token defined in the specification, which matches

(A|a|I|i|W|w|Ww)

or an implementation-defined format token, which I propose should be
unrestricted except that it must not be empty.
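
For illustration, the character-level test expressed by the decimal digit
pattern regex can be approximated in Python. The re module has no \p{..}
support, so this sketch uses unicodedata.category instead; the function names
are hypothetical, and the check covers only which characters are allowed, not
the further rules for decimal digit patterns.

```python
import unicodedata

def is_digit_pattern_char(ch: str) -> bool:
    """True if ch matches \\p{Nd}, '#', or [^\\p{N}\\p{L}]."""
    category = unicodedata.category(ch)
    if category == "Nd" or ch == "#":
        return True
    # Anything that is neither a number (N*) nor a letter (L*) is allowed,
    # which is how grouping separators such as "," get through.
    return not (category.startswith("N") or category.startswith("L"))

def matches_digit_pattern_regex(token: str) -> bool:
    """True if token is non-empty and every character is allowed."""
    return len(token) > 0 and all(is_digit_pattern_char(ch) for ch in token)

if __name__ == "__main__":
    for token in ["1", "#,##0", "Ww", "\u2163"]:
        print(repr(token), matches_digit_pattern_regex(token))
```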

I think we should express this as follows.

(1) The value of $picture must match the regular expression

^ (.+) ( ([co](\([^()]+\))? )? [at]? )$

according to the rules of the matches() function with flags "xs". (This is not
a severe restriction: only the zero-length string fails to match this pattern.)

(2) Following this match, the content of captured group 1 is referred to as the
primary format token, and the content of captured group 2 is referred to as the
format modifier. The semantics of the format modifier are covered by existing
rules.

(3) The primary format token is classified as follows:

(3a) if it contains a decimal digit or "#" then it is taken as a
decimal-digit-pattern and must follow the rules for decimal digit patterns
(otherwise, error)

(3b) if it is one of (A|a|I|i|W|w|Ww) then it is handled as defined in the
specification for that format token

(3c) otherwise, its meaning (if any) is implementation-defined; if the
implementation does not attach any other meaning to the format token then it is
handled in the same way as the primary format token "1". (Currently the spec
says it must use a format token of "1", which discards the modifier: this is a
change.)
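
Rules (1) to (3) can be sketched in Python as follows. This is an illustration
under stated assumptions, not a reference implementation: the names
split_picture and classify are hypothetical, whitespace is removed from the
regex rather than relying on the "x" flag, and re.DOTALL stands in for the "s"
flag. In Python's re module a greedy (.+) would consume the whole picture and
leave group 2 empty, so the sketch uses a reluctant (.+?) instead; that
substitution is an assumption about the intent of the rule.

```python
import re
import unicodedata

# Rule (1): primary format token followed by an optional format modifier.
PICTURE = re.compile(r"(.+?)(([co](\([^()]+\))?)?[at]?)", re.DOTALL)

# Rule (3b): format tokens defined in the specification.
NAMED_TOKENS = {"A", "a", "I", "i", "W", "w", "Ww"}

def split_picture(picture: str):
    """Rule (2): return (primary format token, format modifier)."""
    m = PICTURE.fullmatch(picture)
    if m is None:
        # Only the zero-length string fails to match.
        raise ValueError("picture must not be empty")
    return m.group(1), m.group(2)

def classify(primary: str) -> str:
    """Rule (3): decide how the primary format token is to be handled."""
    if any(ch == "#" or unicodedata.category(ch) == "Nd" for ch in primary):
        return "decimal-digit-pattern"   # (3a) must then be validated
    if primary in NAMED_TOKENS:
        return "named"                   # (3b)
    return "implementation-defined"      # (3c) falls back to "1"

if __name__ == "__main__":
    for picture in ["1", "Wwo", "()Wwo", "boo"]:
        primary, modifier = split_picture(picture)
        print(repr(picture), repr(primary), repr(modifier), classify(primary))
```

The sketch does not validate a decimal digit pattern beyond detecting it; the
error case in (3a) would need the full rules for decimal digit patterns.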

These rules change the outcome of some tests that are currently deemed errors:
they become implementation-defined, with a fallback that uses format picture
"1". Examples of such tests include format-integer-038, whose picture is
"()Wwo", and format-integer-057, whose picture is "boo".

Received on Thursday, 19 January 2012 11:13:07 UTC