RE: Testcases for HTTP location grammar [CR130] from Jonathan Marsh on 2007-01-12 (www-ws-desc@w3.org from January 2007)

From: Jonathan Marsh <jonathan@wso2.com>
Date: Thu, 11 Jan 2007 21:42:56 -0800
To: "'Rogers, Tony'" <Tony.Rogers@ca.com>, "'John Kaputin (gmail)'" <jakaputin@gmail.com>, "'Philippe Le Hegaret'" <plh@w3.org>, <www-ws-desc@w3.org>
Message-ID: <005b01c7360c$842d7eb0$3501a8c0@DELLICIOUS>
I think I agree.  I don't like either of the result sets John presents.
FWIW, here's what I would expect:

 

"{town}"       > {town}           > "Paris"     > valid
"{{town}}"     > {{,town,}}       > "{Paris}"   > valid
"{{{town}}}"   > {{,{town},}}     > "{Paris}"   > valid
"{{{{town}}}}" > {{,{{,town,}},}} > "{{Paris}}" > valid
"{{town}"      > {,{town}         > "{Paris"    > invalid
"{{{town}"     > {{,{town}        > "{Paris"    > valid
"{town}}"      > {town},}         > "Paris}"    > invalid
"{town}}}"     > {town},}}        > "Paris}"    > valid



I don't know if it's Tony's algorithm, but I mentally parse it as reading
left to right, consuming {{ and emitting {, consuming {name} and replacing
it with the data, and consuming }} and emitting }.  I'm sure Philippe will
come up with a clean formulation with the maximum number of "valid" results.
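
For what it's worth, that left-to-right reading can be sketched in a few
lines of Python. This is only an illustration: the function name
expand_location, the values mapping, and raising an error on a lone brace
are all assumptions, not anything the spec or Woden defines. Read literally,
it consumes both doubled braces in "{{town}}" as escapes before any name is
seen, so it returns "{town}" for that input.

```python
def expand_location(template, values):
    """Scan left to right: '{{' -> '{', '}}' -> '}', '{name}' -> values[name]."""
    out = []
    i = 0
    while i < len(template):
        if template.startswith("{{", i):
            out.append("{")                  # escaped left brace
            i += 2
        elif template.startswith("}}", i):
            out.append("}")                  # escaped right brace
            i += 2
        elif template[i] == "{":
            end = template.find("}", i + 1)  # nearest closing brace
            if end == -1:
                raise ValueError(f"unmatched '{{' at offset {i}")
            name = template[i + 1:end]
            out.append(str(values[name]))    # KeyError if the name is unknown
            i = end + 1
        elif template[i] == "}":
            # Assumption: a lone '}' is an error rather than a literal.
            raise ValueError(f"unmatched '}}' at offset {i}")
        else:
            out.append(template[i])
            i += 1
    return "".join(out)


print(expand_location("{{{town}}}", {"town": "Paris"}))  # {Paris}
print(expand_location("{town}}}", {"town": "Paris"}))    # Paris}
```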

 

Jonathan Marsh - http://www.wso2.com -
http://auburnmarshes.spaces.live.com

 

  _____  

From: www-ws-desc-request@w3.org [mailto:www-ws-desc-request@w3.org] On
Behalf Of Rogers, Tony
Sent: Thursday, January 11, 2007 5:50 PM
To: John Kaputin (gmail); Philippe Le Hegaret; www-ws-desc@w3.org
Subject: RE: Testcases for HTTP location grammar [CR130]

 

I think the parser needs to have a stack for braces - I don't believe even a
state machine can hold all the information we need - when we match up a pair
we need to know what our state was before we opened that pair. My sketch of
the processing would go:

 

if the next character is {

a. if previous character was { and top of stack is { then change top of
stack to {{

b. otherwise stack {  (remembering where it was seen)

 

if the next character is }

a. if top of stack is {{ look for another } immediately following

    i. if next char is }, unstack the {{  - we have a matching pair  {{}}

    ii. if next char is not }, throw error or treat as literal }

b. if top of stack is {, unstack the {  - we have a matching pair {}

c. if stack is empty, throw error or treat as literal }

 

at the end, the stack should be empty, assuming all { matched }, otherwise
unstack the extras and treat as literals (which is why we remembered their
locations)
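
The steps above can be sketched as follows, in Python. The function name
match_braces and the shape of the return value are made up for illustration,
and wherever the steps say "throw error or treat as literal" this version
picks "treat as literal". It only classifies braces; it doesn't substitute
any values.

```python
def match_braces(text):
    """Pair up braces with a stack; unmatched braces are reported as literals."""
    stack = []     # open entries: [token, position], token is "{" or "{{"
    pairs = []     # matched pairs: (open token, open position, close position)
    literals = []  # unmatched braces: (token, position)
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "{":
            if i > 0 and text[i - 1] == "{" and stack and stack[-1][0] == "{":
                stack[-1][0] = "{{"            # second brace of a double
            else:
                stack.append(["{", i])         # remember where it was seen
            i += 1
        elif ch == "}":
            if stack and stack[-1][0] == "{{":
                if text.startswith("}}", i):
                    token, pos = stack.pop()   # matching pair {{ }}
                    pairs.append((token, pos, i))
                    i += 2
                else:
                    literals.append(("}", i))  # assumption: literal, not error
                    i += 1
            elif stack:
                token, pos = stack.pop()       # matching pair { }
                pairs.append((token, pos, i))
                i += 1
            else:
                literals.append(("}", i))      # nothing open: literal }
                i += 1
        else:
            i += 1
    # Whatever is still on the stack never found a partner.
    literals.extend((token, pos) for token, pos in stack)
    literals.sort(key=lambda item: item[1])
    return pairs, literals


print(match_braces("{{{town}}}"))  # ([('{', 2, 7), ('{{', 0, 8)], [])
```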

 

To put it into words, I see } or }} as matching to the nearest unpaired { or
{{, but always respecting nesting. I also see longer sequences of { taken as
pairs until there's one or none left.

 

So to my mind {{{{X}}}} parses as {{  {{  X  }}  }}  - even though that's a
questionable construct.  Or do we want to add another rule saying that {{
cannot be nested inside {{ ?

 

How does that sound?

 

Tony Rogers

CA, Inc

Senior Architect, Development

tony.rogers@ca.com

co-chair UDDI TC at OASIS

co-chair WS-Desc WG at W3C

 

  _____  

From: www-ws-desc-request@w3.org on behalf of John Kaputin (gmail)
Sent: Fri 12-Jan-07 8:50
To: Philippe Le Hegaret; www-ws-desc@w3.org
Subject: Testcases for HTTP location grammar [CR130]

Philippe,
Today's working group call concluded that a grammar should define how the
http location is parsed and you have the action, so as discussed I'm sending
you some of my testcases. My post [1] is now captured as CR130.

In deciding on the grammatical rules, things to consider include the
precedence of double curly braces versus single braces and how to match
pairs of single braces - e.g. by scanning from left to right, by 'innermost
pair' (or whatever the terminology is), etc.

When trying several approaches in Woden I found it's not as simple as 'find
a left curly brace, check for a double brace, then scan for a right curly
brace'. Also, it appeared from my initial interpretation of the spec that
double curly braces should take precedence over single braces, but this
produced some unexpected results. A better approach seems to be 'innermost
pair' takes precedence, then double curly braces, then other single braces.

Below are some test cases using different approaches. "Valid/invalid" simply
indicates whether non-paired single braces end up in the parsed string
(literal single braces are okay).

Innermost pair, then doubles, then unpaired singles. town=Paris:

"{town}"       > {town}           > "Paris"     > valid
"{{town}}"     > {,{town},}       > "{Paris}"   > invalid
"{{{town}}}"   > {{,{town},}}     > "{Paris}"   > valid
"{{{{town}}}}" > {{,{,{town},}},} > "{{Paris}}" > invalid
"{{town}"      > {,{town}         > "{Paris"    > invalid
"{{{town}"     > {{,{town}        > "{Paris"    > valid
"{town}}"      > {town},}         > "Paris}"    > invalid
"{town}}}"     > {town},}}        > "Paris}"    > valid

Double braces first, then pairs of singles left-to-right. town=Paris:

"{town}"       > {town}           > "Paris"     > valid
"{{town}}"     > {{,town,}}       > "{town}"    > valid
"{{{town}}}"   > {{,{,town,}},}   > "{{town}}"  > invalid
"{{{{town}}}}" > {{,{{,town,}},}} > "{{Paris}}" > invalid
"{{town}"      > {{,town,}        > "{town}"    > invalid
"{{{town}"     > {{,{town}        > "{Paris"    > valid
"{town}}"      > {,town,}}        > "{town}"    > invalid
"{town}}}"     > {,town,}},}      > "{town}}"   > invalid

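The first stage of the "double braces first" approach, taking doubled braces
greedily from the left, can be sketched as a small Python tokenizer. The name
tokenize is hypothetical, and the sketch only splits the string; it doesn't
pair up the remaining single braces or substitute values.

```python
import re

# Alternation order matters: doubled braces are tried before single ones.
_TOKEN = re.compile(r"\{\{|\}\}|\{|\}|[^{}]+")


def tokenize(template):
    """Split a template into '{{', '}}', '{', '}' and runs of other text."""
    return _TOKEN.findall(template)


print(tokenize("{{{town}}}"))  # ['{{', '{', 'town', '}}', '}']
print(tokenize("{town}}"))     # ['{', 'town', '}}']
```
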
Other test cases:

""                      (is an   empty string location valid?)
"/temperature/"
"/temperature/{town}/"
"/temperature/{town}/{state}/{country}"
"/temperature/{town}/{{{state}}}/{country}"

It would be good if the spec could include similar examples and/or if the
test suite covered the grammar.

regards,
John Kaputin

[1] http://lists.w3.org/Archives/Public/www-ws-desc/2007Jan/0045.html
Received on Friday, 12 January 2007 05:49:36 UTC