Streamable XPath subset

To define the Streamable XPath subset more formally, I have taken the 
grammar from the XPath 1.0 spec, and shown the modifications to it. 
Additions are in green, and deletions are in red strikeout. I have kept 
the rule numbers the same, so that we can correlate with the spec.

I am planning to put this in the draft spec.


Grammar
 Explanation
[0a] IncludedXPath ::=
    ( LocationPath '|' )* LocationPath
[0b] ExcludedXPath ::=
    ( LocationPath '|' )* LocationPath

 The IncludedXPath and ExcludedXPath do not use the generic XPath Expr. 
Instead, each is just a union of LocationPaths. There is one slight 
difference between them: ExcludedXPath can select attributes and 
elements, whereas IncludedXPath can only select elements.
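
As a rough illustration of the IncludedXPath/ExcludedXPath distinction, here is a minimal Python sketch. The helper names (split_top_level, selects_attribute, valid_included) are hypothetical and not part of any spec or library; the sketch only splits the union on '|' and looks at the last step of each LocationPath.

```python
def split_top_level(expr: str, sep: str) -> list[str]:
    """Split expr on sep, ignoring sep inside [...] or string literals."""
    parts, buf, depth, quote = [], [], 0, None
    for ch in expr:
        if quote:                      # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append("".join(buf).strip())
            buf = []
            continue
        buf.append(ch)
    parts.append("".join(buf).strip())
    return parts


def selects_attribute(location_path: str) -> bool:
    """True if the last step of the path is an attribute step."""
    last = split_top_level(location_path, "/")[-1]
    return last.startswith("@") or last.startswith("attribute::")


def valid_included(expr: str) -> bool:
    """IncludedXPath may only select elements, so no branch may end in an attribute."""
    return not any(selects_attribute(p) for p in split_top_level(expr, "|"))
```

Under these assumptions, valid_included('//para/@type') is false, while the same expression would still be acceptable as an ExcludedXPath.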

[1]    LocationPath    ::=    RelativeLocationPath 
   | AbsoluteLocationPath 

[2]    AbsoluteLocationPath    ::=    '/' RelativeLocationPath? 
   | AbbreviatedAbsoluteLocationPath 

[3]    RelativeLocationPath    ::=    Step 
   | RelativeLocationPath '/' Step 
   ( StepNoPredicate '/')* Step 
   | AbbreviatedRelativeLocationPath 
 

    * A relative LocationPath is not allowed; only absolute paths are
      allowed. A relative path would probably mean relative to the
      <Signature> element, and you would then have to support the
      ancestor axis to reach the other nodes, but that axis is not
      streamable.

    * Only the last Step can have a predicate, so I have created a
      non-terminal "StepNoPredicate".

[4]    Step    ::=    AxisSpecifier NodeTest Predicate* 
   | AbbreviatedStep
 
[4a] StepNoPredicate    ::=    AxisSpecifier NodeTest
   | AbbreviatedStep

[4b] StepAttributeOnly ::=
   'attribute' '::' NameTest
  | '@' NameTest
 Added two new versions of Step: one is a Step with no Predicate 
(StepNoPredicate), and the other is an attribute-only step 
(StepAttributeOnly).
e.g. in this XPath expression

/doc/chapter[@type="warning"]

    * doc is StepNoPredicate
    * chapter[@type="warning"] is Step
    * @type is StepAttributeOnly
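
The classification above can be sketched in Python as follows. This is only an illustration: split_steps and classify are hypothetical names, and the sketch does not parse the predicate itself, so it does not report StepAttributeOnly.

```python
def split_steps(path: str) -> list[str]:
    """Split a location path on '/' outside predicates and string literals."""
    steps, buf, depth, quote = [], [], 0, None
    for ch in path:
        if quote:                      # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "/" and depth == 0:
            steps.append("".join(buf))
            buf = []
            continue
        buf.append(ch)
    steps.append("".join(buf))
    return [s for s in steps if s]     # drop empty pieces left by '/' and '//'


def classify(path: str) -> list[tuple[str, str]]:
    """Label every step; reject a predicate anywhere but the last step."""
    steps = split_steps(path)
    if not steps:
        return []
    for step in steps[:-1]:
        if "[" in step:
            raise ValueError(f"predicate not allowed before the last step: {step}")
    return [(s, "StepNoPredicate") for s in steps[:-1]] + [(steps[-1], "Step")]
```

For the expression above, classify returns doc as a StepNoPredicate and chapter[@type="warning"] as a Step, while /doc/chapter[5]/section[2] raises ValueError.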


[5]    AxisSpecifier    ::=    AxisName '::' 
   | AbbreviatedAxisSpecifier 
 unchanged
[6]    AxisName    ::=    'ancestor' 
   | 'ancestor-or-self' 
   | 'attribute' 
   | 'child' 
   | 'descendant' 
   | 'descendant-or-self' 
   | 'following' 
   | 'following-sibling' 
   | 'namespace' 
   | 'parent' 
   | 'preceding' 
   | 'preceding-sibling' 
   | 'self'

 All the non-streamable axes have been removed: ancestor, 
ancestor-or-self, following, following-sibling, namespace, parent, 
preceding, preceding-sibling.
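
A quick way to check an expression against the remaining axes might look like the Python sketch below. The set and function names are made up for illustration, and the regular expression is a simplification: it would also match an axis-like token inside a string literal, and it does not catch the abbreviated parent step '..'.

```python
import re

# Axes left in rule [6] after the non-streamable ones are removed.
STREAMABLE_AXES = {"attribute", "child", "descendant", "descendant-or-self", "self"}


def non_streamable_axes(expr: str) -> list[str]:
    """Return the explicitly named axes in expr that are not streamable."""
    used = re.findall(r"([a-z-]+)\s*::", expr)
    return sorted({axis for axis in used if axis not in STREAMABLE_AXES})
```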
[7]    NodeTest    ::=    NameTest 
   | NodeType '(' ')' 
   | 'processing-instruction' '(' Literal ')' 

 
 The processing-instruction test is not allowed. 
Only the node() NodeTest is allowed, not comment(), text() or 
processing-instruction().
[8]    Predicate    ::=    '[' PredicateExpr ']' 
[9]    PredicateExpr    ::=    Expr
 unchanged
but the definition of Expr has changed, so it is only an 
additive/relational expression over StepAttributeOnly and Literals.
[10]    AbbreviatedAbsoluteLocationPath    ::=    '//' 
RelativeLocationPath 
[11]    AbbreviatedRelativeLocationPath    ::=    RelativeLocationPath 
'//' Step
[12]    AbbreviatedStep    ::=    '.' 
   | '..'
[13]    AbbreviatedAxisSpecifier    ::=    '@'?
 unchanged
[14]    Expr    ::=    OrExpr 
[15]    PrimaryExpr    ::=   VariableReference
   | '(' Expr ')' 
   | Literal 
   | Number 
   | FunctionCall

  unchanged
[16]    FunctionCall    ::=    FunctionName '(' ( Argument ( ',' 
Argument )* )? ')' 
[17]    Argument    ::=    Expr

 unchanged
[18]    UnionExpr    ::=    PathExpr 
   | UnionExpr '|' PathExpr 
[19]    PathExpr    ::=    LocationPath 
   | FilterExpr 
   | FilterExpr '/' RelativeLocationPath 
   | FilterExpr '//' RelativeLocationPath 
[20]    FilterExpr    ::=    PrimaryExpr 
   | FilterExpr Predicate


 UnionExpr, PathExpr and FilterExpr have been removed.
[21]    OrExpr    ::=    AndExpr 
   | OrExpr 'or' AndExpr 
[22]    AndExpr    ::=    EqualityExpr 
   | AndExpr 'and' EqualityExpr 
[23]    EqualityExpr    ::=    RelationalExpr 
   | EqualityExpr '=' RelationalExpr 
   | EqualityExpr '!=' RelationalExpr 
[24]    RelationalExpr    ::=    AdditiveExpr 
   | RelationalExpr '<' AdditiveExpr 
   | RelationalExpr '>' AdditiveExpr 
   | RelationalExpr '<=' AdditiveExpr 
   | RelationalExpr '>=' AdditiveExpr 

 unchanged
[25]    AdditiveExpr    ::=    MultiplicativeExpr 
   | AdditiveExpr '+' MultiplicativeExpr 
   | AdditiveExpr '-' MultiplicativeExpr 
[26]    MultiplicativeExpr    ::=    UnaryExpr 
   | MultiplicativeExpr MultiplyOperator UnaryExpr 
   | MultiplicativeExpr 'div' UnaryExpr 
   | MultiplicativeExpr 'mod' UnaryExpr 
[27]    UnaryExpr    ::=    UnionExpr
   PrimaryExpr
   |  StepAttributeOnly
   | '-' UnaryExpr

 UnaryExpr is changed to allow only a PrimaryExpr or a StepAttributeOnly.
[28]    ExprToken    ::=    '(' | ')' | '[' | ']' | '.' | '..' | '@' | 
',' | '::' 
   | NameTest 
   | NodeType 
   | Operator 
   | FunctionName 
   | AxisName 
   | Literal 
   | Number 
   | VariableReference 
[29]    Literal    ::=    '"' [^"]* '"' 
   | "'" [^']* "'" 
[30]    Number    ::=    Digits ('.' Digits?)? 
   | '.' Digits 
[31]    Digits    ::=    [0-9]+ 
[32]    Operator    ::=    OperatorName 
   | MultiplyOperator 
   | '/' | '//' | '|' | '+' | '-' | '=' | '!=' | '<' | '<=' | '>' | '>=' 
[33]    OperatorName    ::=    'and' | 'or' | 'mod' | 'div' 
[34]    MultiplyOperator    ::=    '*' 
[35]    FunctionName    ::=    QName - NodeType  
[36]    VariableReference    ::=    '$' QName 
[37]    NameTest    ::=    '*' 
   | NCName ':' '*' 
   | QName 
[38]    NodeType    ::=    'comment' 
   | 'text' 
   | 'processing-instruction' 
   | 'node' 
[39]    ExprWhitespace    ::=    S

 unchanged, except for the NodeTest
Node set functions

    * last()
    * position()
    * count(node-set)
    * id(object)
    * local-name(node-set?)
    * namespace-uri(node-set?)
    * name(node-set?)

String functions

    * string(object?)
    * concat(string, string, string*)
    * starts-with(string, string)
    * contains(string, string)
    * substring-before(string, string)
    * substring-after(string, string)
    * substring(string, number, number?)
    * string-length(string?)
    * normalize-space(string?)

Boolean functions

    * boolean(object)
    * true()
    * false()
    * lang(string)

Number functions

    * number(object?)
    * sum(node-set)
    * floor(number)
    * ceiling(number)
    * round(number)

 Note:
As mentioned before, only the last Step can have a Predicate, and this 
predicate's expression can only involve attribute nodes of the current 
element. Functions can only be used inside this last step's predicate, 
and a function can only accept a single attribute as an argument. 
There is no way to use element names, text nodes, comments or 
processing instructions in functions.

The "string-value" becomes just the attribute's value.

Functions involving context position and context size are not 
supported, i.e. last(), position() and count(), or their shortcut 
versions, e.g. foo[1], because the streaming parser cannot maintain 
counts.

String, number and boolean functions are all supported.
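
To make the positional restriction concrete, here is a small Python sketch that flags a predicate as positional. The name predicate_is_positional is hypothetical, and the check is deliberately crude: it works on the predicate text with regular expressions rather than a real parse, so a string literal containing something like "count(" would be flagged too.

```python
import re

# Functions that depend on context position or size, per the note above.
POSITIONAL_CALL = re.compile(r"\b(last|position|count)\s*\(")
# A bare number as the whole predicate, e.g. foo[1], is shorthand for position()=1.
BARE_NUMBER = re.compile(r"\d+(\.\d*)?|\.\d+")


def predicate_is_positional(predicate: str) -> bool:
    """True if the predicate relies on position or counting."""
    body = predicate.strip()
    if body.startswith("[") and body.endswith("]"):
        body = body[1:-1].strip()
    if BARE_NUMBER.fullmatch(body):
        return True
    return bool(POSITIONAL_CALL.search(body))
```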




It might also be easier to understand this with some examples of what 
is a valid streamable XPath and what is not.


*Non-streamable XPaths*
 Reason
.//para
 Relative paths are not allowed.
/doc/chapter[5]/section[2]
 Predicates are only allowed in the last step.
 Also, the position() function is not allowed.
//chapter[title="Introduction"]
 Cannot use a child element in the predicate. Only attributes are 
 allowed in predicates.
following-sibling::chapter[position()=1]
 The following-sibling axis is not allowed, the position() function is 
 not allowed, and relative paths are not allowed.
../title
 The parent axis is not allowed, and relative paths are not allowed.
id("foo")
 The id() function is not allowed. Functions can only be used inside a 
 predicate.
/doc/chapter/child::text()
 Not allowed to select text nodes.


*Streamable XPath examples*

    * //olist/item
    * //para[@type="warning"]
    * //employee[@secretary and @assistant]
    * /soap:Envelope/soap:Header/*[@actor = "msh"]
    * /*/para
    * //para/@type   (this XPath is valid in ExcludedXPath, but not in
      IncludedXPath)
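
To show why expressions of this shape are streamable, here is a minimal Python sketch using the standard library's xml.etree.ElementTree.iterparse. It is an illustration only, not a proposed implementation: stream_select and its parameters are invented for this example, it handles only child steps from the root (no '//'), and it supports at most one attribute-equals-literal test on the last step. The point is that the match decision is made at the start-tag event, using only the stack of open element names and the current element's attributes.

```python
import io
import xml.etree.ElementTree as ET


def stream_select(xml_bytes, names, attr=None):
    """Yield the path of each element matching /names[0]/names[1]/...

    names: element names from the root down; "*" matches any name.
    attr:  optional (name, value) test on the last step's attributes.
    """
    stack = []                                   # names of currently open elements
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start", "end"))
    for event, elem in events:
        if event == "start":
            stack.append(elem.tag)
            same_depth = len(stack) == len(names)
            if same_depth and all(n in ("*", t) for n, t in zip(names, stack)):
                # Attributes are already available at the start event.
                if attr is None or elem.get(attr[0]) == attr[1]:
                    yield "/" + "/".join(stack)
        else:
            stack.pop()
            elem.clear()                         # discard content we no longer need
```

For example, with names ["doc", "chapter"] and attr ("type", "warning"), the sketch behaves like /doc/chapter[@type="warning"]. Namespaced documents would need the "{uri}local" tag form that ElementTree uses.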


Pratik

Received on Friday, 4 September 2009 18:37:29 UTC