Re[2]: Testcases for HTTP location grammar [CR130]

From: <georgi.georgiev.pv@hitachi.com>
Date: Fri, 12 Jan 2007 15:20:30 +0900
Message-ID: <XNM1$2$0$3$$8$3$2$A$5000228U45a7289a@hitachi.com>
To: <www-ws-desc@w3.org>
Cc: <matsuki.yoshino.pw@hitachi.com>, <jakaputin@gmail.com>, <plh@w3.org>

When deciding on the grammar, please consider a method that would not restrict the possible content of the processed result in any way.

With John's town=Paris example (quoted near the end of this mail):

- inner-most pair first method:
  There is no way of producing a literal "{town}" or "{randomstring}" in the result, as any matching pair of braces will be tried for expansion.
- double braces first method:
  There is no way of producing "Paris}", as the closing brace of "{town}" would pair with a following "}" to form "}}".

And regarding Tony's method with the stacking:

- Does "{{" have to be stacked? Double braces do not have to come in pairs.
- Similarly, should a lone "}}" without an opening equivalent really be treated as an error? Input like "/foo}}bar" looks perfectly legal.
- How would "{coun{town}try}" be parsed? This should be illegal input.
- If nested braces are not allowed, why is the stack necessary? Could its use be avoided if mismatched braces are treated as errors?

To me (for what it is worth), left-to-right greedy parsing sounds like the obvious approach, but as John mentioned, it is "not as simple as 'looking for...'". It is very likely that I am overlooking something.

John, could you please elaborate on your statement?

Of course, when I say "left-to-right greedy parsing" I assume the following:
- Nested braces are not allowed
- The parsing is performed from the left to the right. Therefore:
  - If a "{" is encountered, it is considered to be one of the following (tried in this order)
    1) an error if a "{" has already been encountered (and is not the previous character)
    2) the first or second of two braces "{{"
    3) an opening brace
  - If a "}" is encountered, it is considered to be one of the following (tried in this order)
    1) a match for a previous "{"
    2) the first or second of double braces "}}"
    3) an error (no matching opening brace)
So, "{{{town}" and "{town}}}" are O.K. but "{town{{}" is invalid (the brace after the "n" is illegal).
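
The rules above can be sketched roughly as follows. This is only my reading of them, not a proposal for the spec text: the function name expand and the values dictionary are made up for illustration, "{{" and "}}" are assumed to produce a single literal brace each, and an unknown name is simply left to raise KeyError.

```python
def expand(template, values):
    """Left-to-right greedy expansion of {name} references.

    Assumptions (not from the spec): "{{" and "}}" each yield one literal
    brace, nested braces are rejected, and unknown names raise KeyError.
    """
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "{":
            if template.startswith("{{", i):
                # Doubled opening brace: emit a literal "{".
                out.append("{")
                i += 2
                continue
            # Single opening brace: the next "}" must close it.
            end = template.find("}", i + 1)
            if end == -1:
                raise ValueError(f"unmatched '{{' at index {i}")
            name = template[i + 1:end]
            if "{" in name:
                # A "{" before the closing brace means nesting, which is illegal.
                bad = i + 1 + name.index("{")
                raise ValueError(f"illegal '{{' at index {bad}")
            out.append(str(values[name]))
            i = end + 1
        elif ch == "}":
            if template.startswith("}}", i):
                # Doubled closing brace: emit a literal "}".
                out.append("}")
                i += 2
            else:
                raise ValueError(f"unmatched '}}' at index {i}")
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With town=Paris this gives "{Paris" for "{{{town}" and "Paris}" for "{town}}}", and it rejects "{town{{}" at the brace after the "n". No stack is needed because a "{" seen while a reference is still open is always an error.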

I am starting to have the feeling that using a backslash to escape literal braces would have been less confusing...

>I think the parser needs to have a stack for braces - I don't believe even a state machine can hold all the information we need - when we match up a pair we need to know what our state was before we opened that pair. My sketch of the processing would go:
>if the next character is {
>a. if previous character was { and top of stack is { then change top of stack to {{
>b. otherwise stack {  (remembering where it was seen)
>if the next character is }
>a. if top of stack is {{ look for another } immediately following
>    i. if next char is }, unstack the {{  - we have a matching pair  {{}}
>    ii. if next char is not }, throw error or treat as literal }
>b. if top of stack is {, unstack the {  - we have a matching pair {}
>c. if stack is empty, throw error or treat as literal }
>at the end, the stack should be empty, assuming all { matched }, otherwise unstack the extras and treat as literals (which is why we remembered their locations)
>To put it into words, I see } or }} as matching to the nearest unpaired { or {{, but always respecting nesting. I also see longer sequences of { taken as pairs until there's one or none left.
>So to my mind {{{{X}}}} parses as {{  {{  X  }}  }}  - even though that's a questionable construct.  Or do we want to add another rule saying that {{ cannot be nested inside {{ ?
>How does that sound?
>Tony Rogers
>CA, Inc
>Senior Architect, Development
>co-chair UDDI TC at OASIS
>co-chair WS-Desc WG at W3C
>From: www-ws-desc-request@w3.org on behalf of John Kaputin (gmail)
>Sent: Fri 12-Jan-07 8:50
>To: Philippe Le Hegaret; www-ws-desc@w3.org
>Subject: Testcases for HTTP location grammar [CR130]
>Today's working group call concluded that a grammar should define how the http location is parsed and you have the action, so as discussed I'm sending you some of my testcases. My post [1] is now captured as CR130.
>In deciding on the grammatical rules, things to consider include the precedence of double curly braces versus single braces and how to match pairs of single braces - e.g. by scanning from left to right, by 'inner most pair' (or whatever the terminology is), etc.
>When trying several approaches in Woden I found it's not as simple as 'find a left curly brace, check for a double brace, then scan for a right curly brace'. Also, it appeared from my initial interpretation of the spec that double curly braces should take precedence over single braces, but this produced some unexpected results. A better approach seems to be 'inner most pair' takes precedence, then double curly braces, then other single braces.
>Below are some test cases using different approaches. "Valid/invalid" simply indicates whether non-paired single braces end up in the parsed string (literal single braces are okay).
>Inner-most pair, then doubles, then unpaired singles. town=Paris:
>"{town}"       > {town}           > "Paris"     > valid
>"{{town}}"     > {,{town},}       > "{Paris}"   > invalid
>"{{{town}}}"   > {{,{town},}}     > "{Paris}"   > valid
>"{{{{town}}}}" > {{,{,{town},}},} > "{{Paris}}" > invalid
>"{{town}"      > {,{town}         > "{Paris"    > invalid
>"{{{town}"     > {{,{town}        > "{Paris"    > valid
>"{town}}"      > {town},}         > "Paris}"    > invalid
>"{town}}}"     > {town},}}        > "Paris}"    > valid
>Double braces first, then pairs of singles left-to-right. town=Paris:
>"{town}"       > {town}           > "Paris"     > valid
>"{{town}}"     > {{,town,}}       > "{town}"    > valid
>"{{{town}}}"   > {{,{,town,}},}   > "{{town}}"  > invalid
>"{{{{town}}}}" > {{,{{,town,}},}} > "{{Paris}}" > invalid
>"{{town}"      > {{,town,}        > "{town}"    > invalid
>"{{{town}"     > {{,{town}        > "{Paris"    > valid
>"{town}}"      > {,town,}}        > "{town}"    > invalid
>"{town}}}"     > {,town,}},}      > "{town}}"   > invalid
>Other test cases:
>""                      (is an empty string location valid?)
>It would be good if the spec could include similar examples and/or if the test suite covered the grammar.
>John Kaputin
>[1] http://lists.w3.org/Archives/Public/www-ws-desc/2007Jan/0045.html


Best regards,
Georgi Georgiev
Received on Friday, 12 January 2007 06:21:01 UTC
