Streamable XPath subset

To define the Streamable XPath subset more formally, I have taken the 
grammar from the XPath 1.0 spec, and shown the modifications to it. 
Additions are in green, and deletions are in red strikeout. I have kept 
the rule numbers the same, so that we can correlate with the spec.

I am planning to put this in the draft spec.

[0a] IncludedXPath ::=
    ( LocationPath '|' )* LocationPath
[0b] ExcludedXPath ::=
    ( LocationPath '|' )* LocationPath

 The IncludedXPath and ExcludedXPath do not use the generic XPath Expr. 
Instead they are just a union of LocationPaths. There is a slight 
difference between the two: ExcludedXPath can select attributes and 
elements, whereas IncludedXPath can only select elements.

[1]    LocationPath    ::=    RelativeLocationPath 
   | AbsoluteLocationPath 

[2]    AbsoluteLocationPath    ::=    '/' RelativeLocationPath? 
   | AbbreviatedAbsoluteLocationPath 

[3]    RelativeLocationPath    ::=    Step 
   | RelativeLocationPath '/' Step 
   ( StepNoPredicate '/')* Step 
   | AbbreviatedRelativeLocationPath 

    * RelativeLocationPath is not allowed, only Absolute is allowed.
      This is because if you use relative, it would probably mean
      relative to the <Signature> element, and then you would have to
      support the ancestor axis to reach the other nodes, but that axis
      is not streamable.

    * Only the last Step can have a predicate. So I have created a
      non-terminal "StepNoPredicate"
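
As a rough illustration (not proposed spec text), the two restrictions above
could be checked along the following lines. This is only a minimal Python
sketch: split_steps and check_path_shape are hypothetical names, and the
splitting is a simplification that only tracks brackets and quoted literals
rather than parsing the full grammar.

```python
def split_steps(path):
    """Split a location path on '/' found outside predicates and literals."""
    steps, current, depth, quote = [], [], 0, None
    for ch in path:
        if quote:                       # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "/" and depth == 0:  # a step separator
            steps.append("".join(current))
            current = []
            continue
        current.append(ch)
    steps.append("".join(current))
    return steps


def check_path_shape(path):
    """Return a list of problems; an empty list means both rules hold."""
    problems = []
    if not path.startswith("/"):
        problems.append("relative path")
    # Empty pieces come from the leading '/' and from '//'; ignore them.
    steps = [s for s in split_steps(path) if s]
    for step in steps[:-1]:
        if "[" in step:
            problems.append("predicate before last step: " + step)
    return problems
```

For example, check_path_shape('olist/item') reports a relative path, while
check_path_shape('//para[@type="warning"]') reports nothing.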

[4]    Step    ::=    AxisSpecifier NodeTest Predicate* 
   | AbbreviatedStep
[4a] StepNoPredicate    ::=    AxisSpecifier NodeTest
   | AbbreviatedStep

[4b] StepAttributeOnly ::=
   'attribute' '::' NameTest
  | '@' NameTest
 Added two new versions of Step. One is a Step with no Predicate 
(StepNoPredicate), and the other is an attribute-only Step 
(StepAttributeOnly).

e.g. in this XPath expression


    * doc is StepNoPredicate
    * chapter[@type="warning"] is Step
    * @type is StepAttributeOnly

[5]    AxisSpecifier    ::=    AxisName '::' 
   | AbbreviatedAxisSpecifier 
[6]    AxisName    ::=    'ancestor' 
   | 'ancestor-or-self' 
   | 'attribute' 
   | 'child' 
   | 'descendant' 
   | 'descendant-or-self' 
   | 'following' 
   | 'following-sibling' 
   | 'namespace' 
   | 'parent' 
   | 'preceding' 
   | 'preceding-sibling' 
   | 'self'

 All the non-streamable axes have been removed: ancestor, 
ancestor-or-self, following, following-sibling, namespace, parent, 
preceding and preceding-sibling.
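
To make the remaining set of axes explicit, here is a small Python sketch
that derives it from the XPath 1.0 list and flags any removed axis named in
a path. The function names are hypothetical, and the check is deliberately
naive: it only looks for explicit "axis::" syntax, so it does not notice the
abbreviated ".." form, and it would also match text inside a string literal.

```python
import re

XPATH_1_0_AXES = {
    "ancestor", "ancestor-or-self", "attribute", "child", "descendant",
    "descendant-or-self", "following", "following-sibling", "namespace",
    "parent", "preceding", "preceding-sibling", "self",
}

# The axes listed above as removed from the streamable subset.
NON_STREAMABLE_AXES = {
    "ancestor", "ancestor-or-self", "following", "following-sibling",
    "namespace", "parent", "preceding", "preceding-sibling",
}

STREAMABLE_AXES = XPATH_1_0_AXES - NON_STREAMABLE_AXES


def axes_used(path):
    """Return the set of explicitly written axis names (name before '::')."""
    return set(re.findall(r"([a-z-]+)::", path))


def non_streamable_axes(path):
    """Return the removed axes that the path names explicitly, sorted."""
    return sorted(axes_used(path) & NON_STREAMABLE_AXES)
```

What is left in STREAMABLE_AXES is attribute, child, descendant,
descendant-or-self and self.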
[7]    NodeTest    ::=    NameTest 
   | NodeType '(' ')' 
   | 'processing-instruction' '(' Literal ')' 

 The processing-instruction test is not allowed. Only the node() 
NodeTest is allowed, not comment(), text() and processing-instruction().
[8]    Predicate    ::=    '[' PredicateExpr ']' 
[9]    PredicateExpr    ::=    Expr
 These two rules are unchanged, but the definition of Expr has changed, 
so it is only an additive/relational expression of StepAttributeOnly and 
Literals.
[10]    AbbreviatedAbsoluteLocationPath    ::=    '//' 
[11]    AbbreviatedRelativeLocationPath    ::=    RelativeLocationPath '//' Step
[12]    AbbreviatedStep    ::=    '.' 
   | '..'
[13]    AbbreviatedAxisSpecifier    ::=    '@'?
[14]    Expr    ::=    OrExpr 
[15]    PrimaryExpr    ::=   VariableReference
   | '(' Expr ')' 
   | Literal 
   | Number 
   | FunctionCall

[16]    FunctionCall    ::=    FunctionName '(' ( Argument ( ',' Argument )* )? ')' 
[17]    Argument    ::=    Expr

[18]    UnionExpr    ::=    PathExpr 
   | UnionExpr '|' PathExpr 
[19]    PathExpr    ::=    LocationPath 
   | FilterExpr 
   | FilterExpr '/' RelativeLocationPath 
   | FilterExpr '//' RelativeLocationPath 
[20]    FilterExpr    ::=    PrimaryExpr 
   | FilterExpr Predicate

 UnionExpr, PathExpr and FilterExpr have been removed.
[21]    OrExpr    ::=    AndExpr 
   | OrExpr 'or' AndExpr 
[22]    AndExpr    ::=    EqualityExpr 
   | AndExpr 'and' EqualityExpr 
[23]    EqualityExpr    ::=    RelationalExpr 
   | EqualityExpr '=' RelationalExpr 
   | EqualityExpr '!=' RelationalExpr 
[24]    RelationalExpr    ::=    AdditiveExpr 
   | RelationalExpr '<' AdditiveExpr 
   | RelationalExpr '>' AdditiveExpr 
   | RelationalExpr '<=' AdditiveExpr 
   | RelationalExpr '>=' AdditiveExpr 

[25]    AdditiveExpr    ::=    MultiplicativeExpr 
   | AdditiveExpr '+' MultiplicativeExpr 
   | AdditiveExpr '-' MultiplicativeExpr 
[26]    MultiplicativeExpr    ::=    UnaryExpr 
   | MultiplicativeExpr MultiplyOperator UnaryExpr 
   | MultiplicativeExpr 'div' UnaryExpr 
   | MultiplicativeExpr 'mod' UnaryExpr 
[27]    UnaryExpr    ::=    UnionExpr
   |  StepAttributeOnly
   | '-' UnaryExpr

 UnaryExpr is changed to only allow a PrimaryExpr or StepAttributeOnly.
[28]    ExprToken    ::=    '(' | ')' | '[' | ']' | '.' | '..' | '@' | ',' | '::' 
   | NameTest 
   | NodeType 
   | Operator 
   | FunctionName 
   | AxisName 
   | Literal 
   | Number 
   | VariableReference 
[29]    Literal    ::=    '"' [^"]* '"' 
   | "'" [^']* "'" 
[30]    Number    ::=    Digits ('.' Digits?)? 
   | '.' Digits 
[31]    Digits    ::=    [0-9]+ 
[32]    Operator    ::=    OperatorName 
   | MultiplyOperator 
   | '/' | '//' | '|' | '+' | '-' | '=' | '!=' | '<' | '<=' | '>' | '>=' 
[33]    OperatorName    ::=    'and' | 'or' | 'mod' | 'div' 
[34]    MultiplyOperator    ::=    '*' 
[35]    FunctionName    ::=    QName - NodeType  
[36]    VariableReference    ::=    '$' QName 
[37]    NameTest    ::=    '*' 
   | NCName ':' '*' 
   | QName 
[38]    NodeType    ::=    'comment' 
   | 'text' 
   | 'processing-instruction' 
   | 'node' 
[39]    ExprWhitespace    ::=    S

 unchanged, except for the NodeTest
Node set functions

    * last()
    * position()
    * count(node-set)
    * id(object)
    * local-name(node-set?)
    * namespace-uri(node-set?)
    * name(node-set?)

String functions

    * string(object)
    * concat(string, string, string*)
    * starts-with(string, string)
    * contains(string, string)
    * substring-before(string, string)
    * substring-after(string, string)
    * substring(string, number, number?)
    * string-length(string?)
    * normalize-space(string?)

Boolean functions

    * boolean(object)
    * true()
    * false()
    * lang(string)

Number functions

    * number(object?)
    * sum(node-set)
    * floor(number)
    * ceiling(number)
    * round(number)

As mentioned before, only the last Step can have a Predicate, and this 
predicate's expression can only involve attribute nodes of the current 
element. Functions can only be used inside this last step's predicate, 
and a function can only accept a single attribute as an argument. 
There is no way to use element names, text nodes, comments or 
processing instructions in functions.
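
To show why attribute-only predicates stay streamable: a SAX-style parser
hands over all of an element's attributes together with its start tag, so a
match can be decided on the spot without buffering anything. Below is a
minimal sketch using Python's xml.sax. The class and helper names are
hypothetical, and it assumes a deliberately narrow case: an absolute path of
plain child steps with exact names, and a single attribute = literal test on
the last step.

```python
import xml.sax


class AttributeFilterHandler(xml.sax.ContentHandler):
    """Matches /step/.../name[@attr = "value"] in a single forward pass."""

    def __init__(self, steps, attr_name, attr_value):
        super().__init__()
        self.steps = steps            # e.g. ["doc", "chapter"]
        self.attr_name = attr_name
        self.attr_value = attr_value
        self.stack = []               # names of the currently open elements
        self.matches = []             # attributes of each matched element

    def startElement(self, name, attrs):
        self.stack.append(name)
        # Everything needed for the decision is available right here:
        # the path of open elements and the attributes of this element.
        if (self.stack == self.steps
                and attrs.get(self.attr_name) == self.attr_value):
            self.matches.append(dict(attrs.items()))

    def endElement(self, name):
        self.stack.pop()


def match_stream(xml_bytes, steps, attr_name, attr_value):
    """Parse xml_bytes once and return the attributes of matching elements."""
    handler = AttributeFilterHandler(steps, attr_name, attr_value)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches
```

A predicate on a child element or on text content could not be decided in
startElement, because that content has not been seen yet when the start tag
arrives.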

The "string-value" becomes just the attribute's value.

All functions involving context position and context size are not 
supported, i.e. last, position and count, or their shortcut versions, 
e.g. foo[1]. The streaming parser cannot maintain counts.
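
A checker could reject such predicates with something like the sketch
below. It is only an approximation under stated assumptions: the function
names are hypothetical, the input is the text between '[' and ']', and the
test is done with regular expressions after blanking out string literals,
not with a real XPath parser.

```python
import re

# Matches a double-quoted or single-quoted XPath literal.
LITERAL = re.compile(r''' "[^"]*" | '[^']*' ''', re.VERBOSE)

# last(, position( or count( not preceded by a name character, so that a
# longer function name ending in one of these words is not flagged.
POSITIONAL_CALL = re.compile(r"(?<![\w.-])(?:last|position|count)\s*\(")

# A predicate that is just a number, i.e. the shortcut form such as foo[1].
BARE_NUMBER = re.compile(r"\d+(\.\d*)?|\.\d+")


def strip_literals(expr):
    """Replace every string literal with an empty one."""
    return LITERAL.sub('""', expr)


def uses_context_position(predicate):
    """True if the predicate text depends on context position or size."""
    text = strip_literals(predicate).strip()
    if BARE_NUMBER.fullmatch(text):
        return True
    return POSITIONAL_CALL.search(text) is not None
```

With this, uses_context_position('1') and
uses_context_position('position() = 2') are both flagged, while
uses_context_position('@type="warning"') is not.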

String, number and boolean functions are all supported.

It might also be easier to understand this with some examples of what 
is a valid streamable XPath and what is not.

*Non streamable XPaths*

    * Relative paths are not allowed.
    * Predicates are only allowed in the last step. Also, the position()
      function is not allowed.
    * Cannot use a child element in the predicate. Only attributes are
      allowed in a predicate.
    * The following-sibling axis is not allowed, the position function is
      not allowed, and relative paths are not allowed.
    * The parent axis is not allowed, and relative paths are not allowed.
    * The id function is not allowed. Functions can only be used inside a
      predicate.
    * Not allowed to select text nodes.

*Streamable XPath examples*

    * //olist/item
    * //para[@type="warning"]
    * //employee[@secretary and @assistant]
    * /soap:Envelope/soap:Header/*[@actor = "msh"]
    * /*/para
    * //para/@type   (this XPath is valid in ExcludedXPath, but not in
      IncludedXPath)
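
The last example turns on whether the final step selects an attribute. A
rough Python sketch of that test follows. It assumes, as the note on
//para/@type suggests, that a path ending in an attribute step is acceptable
only as an ExcludedXPath; last_step, selects_attribute and allowed_in are
hypothetical names, and the step splitting only tracks brackets and quoted
literals.

```python
def last_step(path):
    """Return the text after the last '/' outside predicates and literals."""
    depth, quote, start = 0, None, 0
    for i, ch in enumerate(path):
        if quote:                       # inside a string literal
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "/" and depth == 0:
            start = i + 1
    return path[start:]


def selects_attribute(path):
    """True if the final step uses the attribute axis ('@' or 'attribute::')."""
    step = last_step(path).strip()
    return step.startswith("@") or step.startswith("attribute::")


def allowed_in(path):
    """Assumption: attribute-selecting paths are only valid as ExcludedXPath."""
    if selects_attribute(path):
        return ["ExcludedXPath"]
    return ["IncludedXPath", "ExcludedXPath"]
```

Under that assumption, allowed_in('//para/@type') gives only ExcludedXPath,
while allowed_in('/*/para') gives both.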


Received on Friday, 4 September 2009 18:37:29 UTC