Re: String Templates (was RE: ACTION 614-12: Smart Quotes)

On Sep 16, 2015, at 1:27 PM, Robie, Jonathan wrote:

> At the last telcon, I was asked to provide a modified proposal for string templates (formerly known as smart quotes).
> 
> How's this?

Looks like a good start; the grammar seems a bit squishy, though.

> 
> Jonathan
> 
> [128]  PrimaryExpr	   ::=   ... | StringTemplate
> 
> 3.16.5.1 String Templates
> 
> A <term>String Template</term> is a string constructor that allows
> embedded expressions, which are evaluated and used to create a string.
> 
> The syntax is suitable for containing fragments of JavaScript, JSON,
> CSS, SPARQL or other languages that might use curly braces, double and
> single quotes, or be difficult or tedious to construct directly in
> XQuery using other mechanisms.

For 'curly braces' read just 'braces', I think.

And I wonder if we can avoid saying that construction techniques
which some readers will find perfectly normal are difficult or tedious.
Perhaps something like

  The syntax makes it simpler to construct strings containing
  expressions in JavaScript, JSON, CSS, SPARQL, XQuery, XPath
  itself, or other languages that might use braces, quotation marks, 
  or other strings used as delimiters in XPath.


> 
> [201] StringTemplate ::= "<[" . StringTemplateText "]>"
> [203] StringTemplateText := ((Char* - "<[") | StringTemplateExpr )*
> [xxx] StringTemplateExpr := "<{" Expr "}>"

In the grammar syntax used by the XML spec (which I think we are
using), rule 203 means that String Template Text may be zero or more
occurrences of

   - any string which is not the string '<{'
   - a delimited string template expression

I think there are two problems here:

1 the string ']>' must also be forbidden to appear; otherwise we have
ambiguity issues (and, I think, some nasty lookahead problems even in
cases with no ambiguity).

With the grammar given, the expression

  <[ aaa ]> || <[ bbb ]>

can be parsed either as

  Expr 
  . ExprSingle
  .. OrExpr
  ... AndExpr
  .... ComparisonExpr
  ..... StringConcatExpr

  ..... - RangeExpr 
  ..... - . AdditiveExpr
  ..... - .. MultiplicativeExpr
  ..... - ... UnionExpr
  ..... - .... IntersectExceptExpr
  ..... - ..... InstanceOfExpr
  ..... - ..... . TreatExpr
  ..... - ..... .. CastableExpr
  ..... - ..... ... CastExpr
  ..... - ..... .... UnaryExpr
  ..... - ..... ..... ValueExpr
  ..... - ..... ..... . SimpleMapExpr
  ..... - ..... ..... .. PathExpr
  ..... - ..... ..... ... RelativePathExpr
  ..... - ..... ..... .... StepExpr
  ..... - ..... ..... ..... PostfixExpr
  ..... - ..... ..... ..... . PrimaryExpr
  ..... - ..... ..... ..... .. StringTemplate
  ..... - ..... ..... ..... .. -- '<[' 
  ..... - ..... ..... ..... .. -- StringTemplateText 
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. ' aaa '
  ..... - ..... ..... ..... .. -- ']>' 
  ..... - '||'
  ..... - RangeExpr 
  ..... - . [same chain as above, down to PrimaryExpr]
  ..... - ..... ..... ..... .. StringTemplate
  ..... - ..... ..... ..... .. -- '<[' 
  ..... - ..... ..... ..... .. -- StringTemplateText 
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. ' bbb '
  ..... - ..... ..... ..... .. -- ']>' 

or as 

  Expr 
  . ExprSingle
  .. OrExpr
  ... AndExpr
  .... ComparisonExpr
  ..... StringConcatExpr

  ..... - RangeExpr 
  ..... - . AdditiveExpr
  ..... - .. MultiplicativeExpr
  ..... - ... UnionExpr
  ..... - .... IntersectExceptExpr
  ..... - ..... InstanceOfExpr
  ..... - ..... . TreatExpr
  ..... - ..... .. CastableExpr
  ..... - ..... ... CastExpr
  ..... - ..... .... UnaryExpr
  ..... - ..... ..... ValueExpr
  ..... - ..... ..... . SimpleMapExpr
  ..... - ..... ..... .. PathExpr
  ..... - ..... ..... ... RelativePathExpr
  ..... - ..... ..... .... StepExpr
  ..... - ..... ..... ..... PostfixExpr
  ..... - ..... ..... ..... . PrimaryExpr
  ..... - ..... ..... ..... .. StringTemplate
  ..... - ..... ..... ..... .. -- '<[' 
  ..... - ..... ..... ..... .. -- StringTemplateText 
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. ' aaa '
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. ']>'
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. ' || '
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. '<['
  ..... - ..... ..... ..... .. -- . Char* - '<{'
  ..... - ..... ..... ..... .. -- .. ' bbb '
  ..... - ..... ..... ..... .. -- ']>' 
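
To make the two readings concrete, here is a minimal sketch in Python (an illustration only, not part of the proposal; the function name template_ends is hypothetical). It models the draft rule, in which template text may contain ']>', and ignores embedded expressions. Every occurrence of ']>' is then a candidate closing delimiter.

```python
def template_ends(src: str) -> list[int]:
    """Return every offset at which a template opening at src[0:2] could end.

    Models the draft grammar, where ']>' is not excluded from template
    text, so each ']>' in the input may or may not close the template.
    Embedded '<{ ... }>' expressions are ignored in this sketch.
    """
    if not src.startswith("<["):
        return []
    ends = []
    pos = src.find("]>", 2)
    while pos != -1:
        ends.append(pos + 2)          # offset just past this candidate ']>'
        pos = src.find("]>", pos + 1)
    return ends


source = "<[ aaa ]> || <[ bbb ]>"
candidates = template_ends(source)
for end in candidates:
    # Show what the template would be and what is left for the parser.
    print(repr(source[:end]), "followed by", repr(source[end:]))
```

The first candidate corresponds to the concatenation of two templates, the second to a single template whose text happens to contain ']> || <['.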


2 The grammar needs to forbid not just any string equal to either '<{'
or ']>', but any string *containing* either of them.  So a first
rewrite would give us:

[201] StringTemplate ::= '<[' StringTemplateText ']>'
[202] StringTemplateText ::= (StringTemplateChars | StringTemplateExpr)*
[nnn] StringTemplateChars ::= Char* - (Char* ('<{' | ']>') Char*)
[nnn] StringTemplateExpr ::= '<{' Expr '}>'

But this is also ambiguous, because it allows two adjacent occurrences
of StringTemplateChars, which means we can parse

  <[ <{ $s }> fish ]>

as

  StringTemplate
  . '<['
  . StringTemplateText
  .. StringTemplateChars
  ... Char* - '<{'
  .... ' <' 
  .. StringTemplateChars
  ... Char* - '<{'
  .... '{ $s }> fish ' 
  . ']>'

as well as in the expected way (which I will leave as an exercise for
the reader).
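
As a rough illustration of how many such readings there are, the following Python sketch counts the ways a stretch of template text can be cut into adjacent StringTemplateChars pieces. It rests on simplifying assumptions that are not in the proposal: empty pieces are not counted (Char* also matches the empty string, which would make the count unbounded), and readings that use StringTemplateExpr are left out. The names DELIMS and count_char_only_parses are hypothetical.

```python
DELIMS = ("<{", "]>")


def count_char_only_parses(text: str) -> int:
    """Count the splits of text into non-empty chunks with no delimiter inside.

    Each chunk stands for one StringTemplateChars; a chunk is allowed
    only if it contains neither '<{' nor ']>'.
    """
    n = len(text)
    ways = [0] * (n + 1)
    ways[0] = 1                       # one way to split the empty prefix
    for end in range(1, n + 1):
        for start in range(end):
            chunk = text[start:end]
            if not any(d in chunk for d in DELIMS):
                ways[end] += ways[start]
    return ways[n]


body = " <{ $s }> fish "
# The split shown above (' <' followed by '{ $s }> fish ') is one of them.
assert count_char_only_parses(body) > 1
```

Any count above one is already an ambiguity; the only cut the rule forces is the one between '<' and '{'.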

3 I think that what we intend is that StringTemplateText be a sequence
of unbroken stretches of characters, separated by StringTemplateExpr
expressions:

[201] StringTemplate ::= '<[' StringTemplateText ']>'
[202] StringTemplateText ::= StringTemplateChars
                             (StringTemplateExpr StringTemplateChars)*
[nnn] StringTemplateChars ::= Char* - (Char* ('<{' | ']>') Char*)
[nnn] StringTemplateExpr ::= '<{' Expr '}>'
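
A scanner for this last set of rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a definitive implementation: it assumes an embedded expression never contains '}>' itself, so nested templates and string literals containing '}>' are not handled, and the name split_template is hypothetical.

```python
def split_template(src: str) -> list[tuple[str, str]]:
    """Split '<[ ... ]>' into ('chars', text) and ('expr', source) parts.

    Follows the shape Chars (Expr Chars)*: the result always starts and
    ends with a 'chars' part, which may be empty.
    Assumption: an embedded expression contains no '}>' of its own.
    """
    if not src.startswith("<["):
        raise ValueError("template must start with '<['")
    parts = []
    pos = 2
    while True:
        open_expr = src.find("<{", pos)
        close = src.find("]>", pos)
        if close == -1:
            raise ValueError("missing closing ']>'")
        if open_expr == -1 or close < open_expr:
            # The first ']>' always closes the template.
            parts.append(("chars", src[pos:close]))
            if close + 2 != len(src):
                raise ValueError("text after the closing ']>'")
            return parts
        parts.append(("chars", src[pos:open_expr]))
        end_expr = src.find("}>", open_expr + 2)
        if end_expr == -1:
            raise ValueError("missing closing '}>'")
        parts.append(("expr", src[open_expr + 2:end_expr]))
        pos = end_expr + 2


print(split_template("<[ <{ $s }> fish ]>"))
```

Because each stretch of characters runs up to the next delimiter, there is exactly one way to split a given template, and the first ']>' outside an expression ends it.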


> 
> An string template allows embedded expressions, which are called
> string template expressions. 

Perhaps s/An string template allows/A string template can contain/ ?

> The string value of each string template expression $e is computed
> using the expression string-join($e ! string(.), ' ').  
> Thus, <[ <{ 1 to 3 }> ]> evaluates to the string "1 2 3".
> 
> When a string template is evaluated, string template text is treated
> as literal text.  Line endings are processed as elsewhere in XQuery;
> no other processing is performed on string template text. Each string
> template expression is evaluated and converted to its string value,
> then concatenated with string value text to create one string, which
> is the value of the string template expression.
> 
> Note: 
> 
>  In string template text, & is not recognized as special, and < is
>  only recognized when immediately followed by "{".  Thus, <[ &lt; ]>
>  evaluates to the string "&lt;", not the < character, and <[ <[ ]>
>  evaluates to the string "<[".

The rules given in this draft seem to me to say that <[ &lt; ]>
evaluates to the string " &lt; ", not "&lt;".  Otherwise, what is happening
to the whitespace?
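
The reading I have in mind can be sketched in Python as follows. This is only an approximation of the draft's rule string-join($e ! string(.), ' '): the helper xquery_string is a hypothetical stand-in for fn:string() on a few atomic values, and the template is assumed to be already split into literal text and evaluated sequences.

```python
def xquery_string(item) -> str:
    """Rough stand-in for fn:string() on simple atomic values (assumption)."""
    if isinstance(item, bool):
        return "true" if item else "false"
    return str(item)


def evaluate(parts) -> str:
    """Build the template's value from its parts.

    Each part is either a str (literal template text, kept verbatim,
    whitespace included) or a list (the sequence an embedded expression
    evaluated to, joined with single spaces).
    """
    out = []
    for part in parts:
        if isinstance(part, str):
            out.append(part)
        else:
            out.append(" ".join(xquery_string(i) for i in part))
    return "".join(out)


# <[ &lt; ]> : the literal text is ' &lt; ', spaces included.
assert evaluate([" &lt; "]) == " &lt; "
# <[ <{ 1 to 3 }> ]> : literal ' ', then the joined sequence, then ' '.
assert evaluate([" ", [1, 2, 3], " "]) == " 1 2 3 "
```

On this reading the draft's first example would likewise yield " 1 2 3 " rather than "1 2 3", unless some rule for trimming whitespace is added.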


> 
> Example:
> 
> for $s in ("one", "two", "red", "blue")
> return <[ <{ $s }> fish ]>
> 
> Example:
> 
> The following example from json.org adds a session ID
> member that is generated dynamically:
> 
>    declare variable $json := <[ {"menu": {
>      "id": "file",
>      "value": "File",
>      "popup": {
>        "menuitem": [
>          {"value": "New", "onclick": "CreateNewDoc()"},
>          {"value": "Open", "onclick": "OpenDoc()"},
>          {"value": "Close", "onclick": "CloseDoc()"},
>          {"callback": null },
>          {"session-id": <{ get-session-id() }> }
>        ]
>      }
>    }} ]>;
> 
> Example:
> 
> Embedded expressions can contain string templates to created nested
> string templates.
> 
> Example:
> 
> <[ 
>   <{ 
>      $i, <[ literal text ]>, 
>      $j, <[ more literal text ]> 
>   }> 
> ]>


I think the examples should probably show exactly what string the expressions
evaluate to.

Michael

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com 
* http://cmsmcq.com/mib                 
* http://balisage.net
****************************************************************

Received on Wednesday, 16 September 2015 20:25:23 UTC