RE: XPath grammar from Pratik Datta on 2010-12-06 (public-xmlsec@w3.org from December 2010)

From: Pratik Datta <pratik.datta@oracle.com>
Date: Mon, 6 Dec 2010 14:06:31 -0800 (PST)
To: Pratik Datta <pratik.datta@oracle.com>, Meiko Jensen <Meiko.Jensen@ruhr-uni-bochum.de>, XMLSec WG Public List <public-xmlsec@w3.org>
Message-ID: <3c0108db-8103-4c80-b4f9-7d9dd99fc4ea@default>
I have modified Meiko's proposal by putting back all the functions, boolean expressions, arithmetic expressions, etc., and removing the distinction between Included/Excluded and Final/NonFinal steps.

Take a look:

=== Grammar for top Level expression === 

// At the top level, the expression is either an id() call or a union of AbsoluteLocationPaths
XPathSubsetForStreaming ::= 
   IDXPath 
   | (AbsoluteLocationPath '|' )* AbsoluteLocationPath


IDXPath ::= 'id(' IDValue ')'

IDValue ::= NCName

AbsoluteLocationPath ::= '/' RelativeLocationPath? | AbbreviatedAbsoluteLocationPath

AbbreviatedAbsoluteLocationPath ::=  '//' RelativeLocationPath

RelativeLocationPath ::= Step | RelativeLocationPath '/' Step | AbbreviatedRelativeLocationPath

AbbreviatedRelativeLocationPath ::=  RelativeLocationPath '//' Step

Step ::= AxisSpecifier NameTest RestrictedPredicate* | '.'

AxisSpecifier ::=  (AxisName '::')?

AxisName ::= 'attribute' | 'child' | 'descendant' | 'descendant-or-self' | 'following' | 'following-sibling' | 'self'

NameTest ::= '*' | NCName ':' '*' | QName
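
As a rough illustration of the top-level grammar above, here is a minimal Python sketch of a recognizer built from regular expressions. It is not part of the proposal. It assumes an ASCII-only approximation of NCName, allows no whitespace, and treats predicates as opaque bracketed text (so a ']' inside a string literal would confuse it). The name is_streamable_path is made up for this example.

```python
import re

# Assumption: ASCII-only approximation of NCName (the real production is wider).
NCNAME = r"[A-Za-z_][A-Za-z0-9_.-]*"
QNAME = rf"(?:{NCNAME}:)?{NCNAME}"
# NameTest ::= '*' | NCName ':' '*' | QName
NAME_TEST = rf"(?:\*|{NCNAME}:\*|{QNAME})"
# AxisName: only the forward axes listed in the grammar.
AXIS = (r"(?:attribute|child|descendant-or-self|descendant"
        r"|following-sibling|following|self)")
# Step ::= AxisSpecifier NameTest RestrictedPredicate* | '.'
# Predicates are matched as opaque "[...]" text; their grammar is separate.
STEP = rf"(?:(?:{AXIS}::)?{NAME_TEST}(?:\[[^\]]*\])*|\.)"
# RelativeLocationPath: steps joined by '/' or '//' (left recursion unrolled).
REL_PATH = rf"{STEP}(?://?{STEP})*"
# AbsoluteLocationPath ::= '/' RelativeLocationPath? | '//' RelativeLocationPath
ABS_PATH = rf"(?:/(?:{REL_PATH})?|//{REL_PATH})"
# Top level: IDXPath, or a union of absolute location paths.
TOP = re.compile(rf"(?:id\({NCNAME}\)|{ABS_PATH}(?:\|{ABS_PATH})*)\Z")


def is_streamable_path(expr: str) -> bool:
    """Return True if expr matches the top-level grammar sketched above."""
    return TOP.match(expr) is not None
```

For example, is_streamable_path("/a/b[@id='x']") is accepted, while "/a/parent::b" and the relative path "a/b" are rejected, because the parent axis and top-level relative paths are not in the grammar.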

=== Grammar for Predicate ===

// A predicate is basically an expression involving AttributeReference 

RestrictedPredicate ::= '[' AttributeExpr ']'

AttributeExpr::= OrExpr

OrExpr ::= AndExpr | OrExpr 'or' AndExpr

AndExpr ::= EqualityExpr | AndExpr 'and' EqualityExpr

EqualityExpr ::= RelationalExpr
    | EqualityExpr '=' RelationalExpr
    | EqualityExpr '!=' RelationalExpr

RelationalExpr ::= AdditiveExpr
    | RelationalExpr '<' AdditiveExpr
    | RelationalExpr '>' AdditiveExpr
    | RelationalExpr '<=' AdditiveExpr
    | RelationalExpr '>=' AdditiveExpr

AdditiveExpr ::=
    MultiplicativeExpr
    | AdditiveExpr '+' MultiplicativeExpr
    | AdditiveExpr '-' MultiplicativeExpr

MultiplicativeExpr ::= UnaryExpr
    | MultiplicativeExpr MultiplyOperator UnaryExpr
    | MultiplicativeExpr 'div' UnaryExpr
    | MultiplicativeExpr 'mod' UnaryExpr

UnaryExpr ::=
   PrimaryExpr
    | AttributeReference
    | '-' UnaryExpr

AttributeReference::=  
    'attribute' '::' NameTest 
    | '@' NameTest


PrimaryExpr ::= 
    VariableReference
    | '(' AttributeExpr ')'
    | Literal
    | Number
    | FunctionCall

FunctionCall ::= 
      FunctionName '(' ( Argument ( ',' Argument )* )? ')'

Argument ::= AttributeExpr

NameTest ::= 
    '*' | NCName ':' '*' | QName

Literal ::= '"' [^"]* '"' | "'" [^']* "'"

Number ::= Digits ('.' Digits?)?  | '.' Digits

Digits ::= [0-9]+


MultiplyOperator ::= '*'

FunctionName ::= QName - NodeType     

VariableReference ::= '$' QName
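
The predicate grammar above is left-recursive, so a hand-written parser would typically rewrite each rule of the form E ::= T | E op T as T (op T)*. The following Python sketch shows one way to do that. It is only an illustration under assumptions: the tokenizer is simplified, names are only loosely checked, and parse_attribute_expr, Parser and the tuple-shaped tree are names invented for this example.

```python
import re

TOKEN = re.compile(r"""\s*(?:
    (?P<num>\d+(?:\.\d*)?|\.\d+)
  | (?P<lit>"[^"]*"|'[^']*')
  | (?P<op><=|>=|!=|::|[=<>+\-*(),@$])
  | (?P<name>[A-Za-z_][\w.-]*(?::(?!:)(?:[A-Za-z_][\w.-]*|\*))?)
)""", re.VERBOSE)
NODE_TYPES = {"comment", "text", "processing-instruction", "node"}


def tokenize(text):
    text, pos, out = text.rstrip(), 0, []
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"bad token at offset {pos}")
        out.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return out


class Parser:
    def __init__(self, text):
        self.toks, self.i = tokenize(text), 0

    def peek(self, offset=0):
        j = self.i + offset
        return self.toks[j] if j < len(self.toks) else (None, None)

    def take(self, value=None):
        kind, val = self.peek()
        if kind is None or (value is not None and val != value):
            raise SyntaxError(f"expected {value!r}, got {val!r}")
        self.i += 1
        return kind, val

    def binary(self, operand, ops):
        # E ::= T | E op T   rewritten as   T (op T)*, kept left-associative.
        node = operand()
        while self.peek()[1] in ops:
            node = (self.take()[1], node, operand())
        return node

    def expr(self):            # AttributeExpr ::= OrExpr
        return self.binary(self.and_expr, ("or",))

    def and_expr(self):
        return self.binary(self.equality, ("and",))

    def equality(self):
        return self.binary(self.relational, ("=", "!="))

    def relational(self):
        return self.binary(self.additive, ("<", ">", "<=", ">="))

    def additive(self):
        return self.binary(self.multiplicative, ("+", "-"))

    def multiplicative(self):
        return self.binary(self.unary, ("*", "div", "mod"))

    def unary(self):
        kind, val = self.peek()
        if val == "-":
            self.take()
            return ("neg", self.unary())
        if val == "@":
            self.take()
            return ("attr", self.name_test())
        if kind == "name" and val == "attribute" and self.peek(1)[1] == "::":
            self.take()
            self.take("::")
            return ("attr", self.name_test())
        return self.primary()

    def name_test(self):
        kind, val = self.take()
        if kind != "name" and val != "*":
            raise SyntaxError(f"expected a NameTest, got {val!r}")
        return val

    def primary(self):
        kind, val = self.take()
        if val == "$":
            kind, val = self.take()
            if kind != "name":
                raise SyntaxError("expected a variable name")
            return ("var", val)
        if val == "(":
            node = self.expr()
            self.take(")")
            return node
        if kind in ("lit", "num"):
            return (kind, val)
        if kind == "name" and self.peek()[1] == "(" and val not in NODE_TYPES:
            self.take("(")
            args = []
            if self.peek()[1] != ")":
                args.append(self.expr())
                while self.peek()[1] == ",":
                    self.take()
                    args.append(self.expr())
            self.take(")")
            return ("call", val, args)
        raise SyntaxError(f"unexpected {val!r}")


def parse_attribute_expr(text):
    """Parse the text between '[' and ']' and return a nested-tuple tree."""
    p = Parser(text)
    tree = p.expr()
    if p.i != len(p.toks):
        raise SyntaxError(f"trailing input: {p.peek()[1]!r}")
    return tree
```

With this sketch, "@a = 'x' and @b > 3" parses, whereas "parent::x = 1" and "child/@a" raise SyntaxError, since the only node references the predicate grammar permits are attribute references.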

=== Functions ===

These functions can only take AttributeExpr as arguments. Some functions take either an AttributeReference or no argument: local-name(), namespace-uri(), and name().

Node set functions

    * position()
    * id()
    * local-name(AttributeReference or no-argument)     // no argument implies self::node()
    * namespace-uri(AttributeReference or no-argument)  // no argument implies self::node()
    * name(AttributeReference or no-argument)           // no argument implies self::node()

String functions

    * string(object)                    // argument not optional
    * concat(string, string, string*)
    * starts-with(string, string)
    * contains(string, string)
    * substring-before(string, string)
    * substring-after(string, string)
    * substring(string, number, number)
    * string-length(string)             // argument not optional
    * normalize-space(string)           // argument not optional

Boolean functions

    * boolean(object)
    * true()
    * false()
    * lang(string)

Number functions

    * number(object)                  // argument not optional
    * sum(node-set)
    * floor(number)
    * ceiling(number)
    * round(number)
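
The function list above could be captured in a small table for argument-count checking. The Python sketch below is an illustration only: the arities are read off the signatures as listed here (for example, three arguments for substring), the single argument for id is an assumption taken from XPath 1.0 because the list shows only "id()", and the names ARITY and check_call are made up.

```python
# (min_args, max_args); max_args of None means "no upper bound".
ARITY = {
    # Node set functions
    "position": (0, 0),
    "id": (1, 1),  # assumption: one argument, as in XPath 1.0
    "local-name": (0, 1),
    "namespace-uri": (0, 1),
    "name": (0, 1),
    # String functions
    "string": (1, 1),            # argument not optional
    "concat": (2, None),
    "starts-with": (2, 2),
    "contains": (2, 2),
    "substring-before": (2, 2),
    "substring-after": (2, 2),
    "substring": (3, 3),         # as listed: (string, number, number)
    "string-length": (1, 1),     # argument not optional
    "normalize-space": (1, 1),   # argument not optional
    # Boolean functions
    "boolean": (1, 1),
    "true": (0, 0),
    "false": (0, 0),
    "lang": (1, 1),
    # Number functions
    "number": (1, 1),            # argument not optional
    "sum": (1, 1),
    "floor": (1, 1),
    "ceiling": (1, 1),
    "round": (1, 1),
}


def check_call(name: str, nargs: int) -> bool:
    """Return True if `name` is in the list and `nargs` fits its arity."""
    if name not in ARITY:
        return False
    lo, hi = ARITY[name]
    return nargs >= lo and (hi is None or nargs <= hi)
```

A call such as check_call("string", 0) is rejected because the argument is marked as not optional, and any function missing from the list is rejected outright.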


Pratik

-----Original Message-----
From: Pratik Datta 
Sent: Saturday, December 04, 2010 10:39 PM
To: Meiko Jensen; XMLSec WG Public List
Subject: RE: XPath grammar

Meiko,

This subset that you have proposed is far more restrictive than the subset that we already have. That makes it not very useful: many of the examples that we currently have are no longer allowed by your grammar. I see no reason for removing all the arithmetic operators, relational operators, functions, variable references, etc. They do not affect streamability.


I was intending this XPath subset to be reusable for other streaming applications too, because when it comes to implementation it is very unlikely that anybody will create a separate XPath implementation just for XML Signature. Rather, I expect people will create a single streaming XPath implementation and use it for many things. So it is better if we do not put the IncludedXPath and ExcludedXPath constructs into the XPath subset; rather, we should state this as an additional limitation imposed by the C14N 2.0 data model, which cannot accept attributes without their owner elements. Similarly, we shouldn't distinguish between final and non-final steps. Basically, I am saying that we should go with your option c).
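
To illustrate the streamability point, here is a minimal Python sketch using the standard xml.sax module. It is not taken from any implementation discussed on this list. It assumes a fixed child-axis path and a predicate expressed as a Python callable over the attribute dictionary; AttrPredicateSelector and select are invented names. The point it tries to show is that a predicate over attributes only can be decided as soon as the start tag is seen, whatever operators or functions it uses.

```python
import xml.sax


class AttrPredicateSelector(xml.sax.ContentHandler):
    """Collect elements on a fixed path whose attributes satisfy a predicate."""

    def __init__(self, path, predicate):
        super().__init__()
        self.path = path            # e.g. ["doc", "item"] for /doc/item
        self.predicate = predicate  # callable taking a dict of attributes
        self.stack = []             # names of the currently open elements
        self.selected = []          # attribute dicts of selected elements

    def startElement(self, name, attrs):
        self.stack.append(name)
        attributes = dict(attrs.items())
        # All attributes arrive with the start tag, so the predicate can be
        # evaluated here, without buffering or looking ahead in the stream.
        if self.stack == self.path and self.predicate(attributes):
            self.selected.append(attributes)

    def endElement(self, name):
        self.stack.pop()


def select(xml_text, path, predicate):
    """Stream over xml_text and return attributes of the selected elements."""
    handler = AttrPredicateSelector(path, predicate)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.selected
```

For instance, something like /doc/item[@type='a' and @n < 10] would correspond to select(text, ["doc", "item"], lambda a: a.get("type") == "a" and float(a["n"]) < 10); the arithmetic and comparison add no buffering requirement.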

Pratik


-----Original Message-----
From: Meiko Jensen [mailto:Meiko.Jensen@ruhr-uni-bochum.de] 
Sent: Friday, December 03, 2010 6:34 AM
To: XMLSec WG Public List
Subject: XPath grammar

I just tried to create the BNF grammar of the streamable XPath subset as specified during the F2F meeting. Please review carefully, since I'm not convinced I got it right, and especially since I was rather restrictive on what I allowed to be used within the predicates. I think there would be several spots where additional operators or functions might be useful, but I didn't see the grammar getting easier, hence decided to cut them off anyway.

Also, I decided to do two separate grammars, one for IncludedXPath and one for ExcludedXPath. The problem of merging both into one grammar is that this would require most of the tokens to be relabeled to "IncludedXPathFoo" and "ExcludedXPathFoo", which I already found annoying for the "FinalStepFoo" and "NonFinalStepFoo" tokens.

After all, I'm not happy with that solution either, since the only difference between IncludedXPath and ExcludedXPath grammar is the "attribute" AxisName being allowed in the latter only. Hence, we have three options here:

a) merge Included and Excluded into one grammar
b) leave as is
c) simplify the grammar even more, stating the intended differences between Included and Excluded, or non-final step and final step, respectively, as textual comments for each definition (as done in the spec document by now).

However, this should close my Action-687, Action-688, and Action-690 for now.

best regards

Meiko

--
Dipl.-Inf. Meiko Jensen
Chair for Network and Data Security
Horst Görtz Institute for IT-Security
Ruhr University Bochum, Germany
_____________________________
Universitätsstr. 150, Geb. ID 2/411
D-44801 Bochum, Germany
Phone: +49 (0) 234 / 32-26796
Telefax: +49 (0) 234 / 32-14347
http://www.nds.rub.de
Received on Monday, 6 December 2010 22:07:53 UTC