Re: MicroXPath from Uche Ogbuji on 2016-07-09 (public-microxml@w3.org from July 2016)

From: Uche Ogbuji <uche@ogbuji.net>
Date: Sat, 9 Jul 2016 08:35:06 -0600
To: Michael Kay <mike@saxonica.com>
Cc: "public-microxml (public-microxml@w3.org)" <public-microxml@w3.org>
Message-ID: <CAPJCua0v8_nYXWs-HORYka58X9J-evUBHdydC6MDdVE7-tJ4ZQ@mail.gmail.com>
On Sat, Jul 9, 2016 at 4:16 AM, Michael Kay <mike@saxonica.com> wrote:

> > After adding a few more features here and there I realized that I should
> > just implement something more rigorous, so I started to reason about how a
> > proper MicroXPath might look. Why not start with the great simplicity of
> > XPath 1 but borrow a few of the neat tricks from XPath 2+ (read sequences)?
>
>
> I'm missing any sense of what the guiding design principles are.
>

There's a bit of that in the paper. They're pretty much the same guiding
design principles as XPath 1's; what has changed are some details informed
by experience implementing and using the language. However, you say some
things below that inspire me to expand on my design ideas a bit.



> Why are you doing regular expressions differently from the way XPath 2.0
> does them, for example?
>

I took regex directly from EXSLT, largely because I had implementations
thereof I could readily plug in. I'll have a look at the XPath 2.0 approach.



> I think it's definitely important that the language should be relationally
> complete (i.e. allow you to do arbitrary joins), and that requires range
> variables.
>

With regard to accessing collections of nodes, largely to feed a host
environment to do the heavy lifting, could you give some example use cases
where the lack of range variables is a major problem? I should point out
that I had written the following in the paper:

"XPath 2.0/3.0 range expressions are not supported, but there are core
functions to provide similar features."

But I forgot to add those functions to the outline, or to implement them.
The most likely reason is that my experience doesn't lend urgency to that,
but it's certainly something I'll rethink. I'll have another look at some
of the XPath 2/3 use-cases.


> If you're aiming for maximum power with minimum syntax then I think you
> definitely want a basic set of functional programming primitives: the
> ability to define functions and use them to map and filter sequences. With
> that capability you can actually drop an awful lot of XPath 2.0 extensions
> without losing much.
>

I'm not really aiming for all the power of XPath 2. Again, much of that
isn't needed in my experience, where I'm largely using XPath as a
convenience for real work done in C, JavaScript, Python, or Go (my own
personal mix). That said, I did come very close to adding most of EXSLT's
Dynamic module [1]. In the end I thought: could that be done without? and
concluded: yes. Worth a rethink, and another look at XSLT 2's approach.


> If you're starting from scratch, I've always felt that redefining axes as
> functions and node-tests as predicates ought to give some mileage in terms
> of reducing the number of concepts; perhaps use the concept of "abbreviated
> syntax" to map following-sibling::x to following-sibling(.)[name(.)='x'].
>

Interesting. My first reaction is that the basic navigation of node trees
is at the heart of XPath, and has enough conceptual heft to warrant its own
syntax. I guess I'd rather keep that degree of complexity and leave out
others.
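
To make the suggested mapping concrete, here is a rough Python sketch of
following-sibling::x rewritten as following-sibling(.)[name(.)='x']. It is
only an illustration, not part of MicroXPath or of any implementation
mentioned in this thread: the helper names (following_sibling, name_is,
step) and the parent_of map are assumptions, and Python's standard
xml.etree.ElementTree stands in for a MicroXML data model.

```python
import xml.etree.ElementTree as ET


def following_sibling(node, parent_of):
    """Axis as a function: siblings that come after `node`."""
    parent = parent_of.get(node)
    if parent is None:
        return []  # the root element has no siblings
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


def name_is(name):
    """Node test as a predicate: true when the element has this name."""
    return lambda node: node.tag == name


def step(nodes, axis, test, parent_of):
    """One location step: apply the axis function, then filter by the test."""
    return [m for n in nodes for m in axis(n, parent_of) if test(m)]


doc = ET.fromstring("<r><a/><x id='1'/><b/><x id='2'/></r>")
# ElementTree keeps no parent pointers, so build a child-to-parent map.
parent_of = {child: parent for parent in doc.iter() for child in parent}

context = doc.find("a")
# Equivalent in spirit to following-sibling::x from the <a> element.
result = step([context], following_sibling, name_is("x"), parent_of)
ids = [n.get("id") for n in result]
```

In this form the dedicated axis syntax would be sugar over one function
call plus one predicate, which is the reduction in concepts being
suggested; the cost is that tree navigation loses its own notation.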



> The way that XPath 2.0 extended "/" to allow atomic values on the right
> but not on the left has always seemed deeply unsatisfactory, and I'm not
> sure how you are tackling that problem. You seem to have extended "|" to
> apply to atomic values, and to have extended the concept of document order
> to be an ordering over all items, which might be the answer: you can then
> have "!" as a simple map/apply operator, and "/" to mean "!" with ordering
> and deduplication of the result, but with no constraints on the type of the
> operands.
>

Yes, I think you've divined that I chose to solve as much of that problem
as seemed pragmatic to me. Using "|" as a more universal ordering was a key
A-HA moment for me in terms of how to reconcile sequences with the heart of
XPath 1. Your thoughts on "!" and "/" are intriguing, and I'll be thinking
about that as I ponder the list/functional processing as mentioned above.
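
As a rough sketch of the distinction being proposed, the Python below
treats "!" as a plain map over a sequence and "/" as the same map followed
by deduplication and sorting into a universal order. The function names
(bang, slash) and the order_key parameter are assumptions made for
illustration; how MicroXPath would actually define a total order over
nodes and atomic values is not settled here, so the caller supplies it.

```python
def bang(items, f):
    """'!' as a simple map: apply f to each item and concatenate results."""
    return [out for item in items for out in f(item)]


def slash(items, f, order_key):
    """'/' as '!' plus deduplication and ordering of the result.

    order_key is a stand-in for a universal document order over all
    items; two items with the same key count as duplicates.
    """
    seen = set()
    unique = []
    for item in bang(items, f):
        key = order_key(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return sorted(unique, key=order_key)


def expand(n):
    """Hypothetical step function: each integer yields itself and n + 1."""
    return [n, n + 1]


# Plain integers stand in for items; their value doubles as the order key.
mapped = bang([3, 1, 2], expand)
stepped = slash([3, 1, 2], expand, order_key=lambda n: n)
```

Neither function places any constraint on the type of its operands, which
is the point of the suggestion: the only difference between the two
operators is whether the result is normalized.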



> Things like date and time handling are of course very important to users
> but there's no reason they need support in the language, they can just be
> function libraries, and you don't lose much by representing date/time
> values as strings rather than with custom data types.
>

Right. I thought the host language could provide stuff such as current
date/time as variables, as well as a few other things. I also pondered
adding functions for date comparison, etc. It won't surprise you to hear
that I agree that leaving dates as strings within MicroXPath meets the
80/20 rule, and anything more sophisticated can be kicked to the host
environment.
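
A minimal sketch of that division of labor, in Python as the host: the
host binds the current date as a string variable, and comparison is plain
string comparison. The variable name "current-date" and both helper
functions are hypothetical, not part of any MicroXPath draft. The
comparison relies on ISO 8601 dates in the same format (YYYY-MM-DD)
sorting lexicographically in chronological order; mixed formats or time
zones would need the host's date handling instead.

```python
from datetime import date


def host_variables(today=None):
    """Variables the host would bind before evaluating an expression."""
    today = today or date.today()
    return {"current-date": today.isoformat()}  # e.g. "2016-07-09"


def date_before(a, b):
    """Compare two YYYY-MM-DD strings; no date type is needed."""
    return a < b


# A fixed date keeps the example reproducible.
variables = host_variables(date(2016, 7, 9))
is_past = date_before("2016-07-01", variables["current-date"])
is_future = date_before(variables["current-date"], "2016-12-01")
```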

Thanks so much for this, Mike. Even your rapid reactions have given me a
lot to chew on, and given me some energy to reconsider things. For the past
week I'd been juggling work with trying to get this proposal thingy
(whatever it might turn out to be) out to the list, but I'd been flagging a
bit.

[1] http://exslt.org/dyn/index.html


-- 
Uche Ogbuji                                       http://uche.ogbuji.net
Founding Partner, Zepheira                  http://zepheira.com
Author, _Ndewo, Colorado_                 http://uche.ogbuji.net/ndewo/
Founding editor, Kin Poetry Journal      http://wearekin.org
http://copia.ogbuji.net    http://www.linkedin.com/in/ucheogbuji
http://twitter.com/uogbuji
Received on Saturday, 9 July 2016 14:35:37 UTC