Re: Refactor of ixml grammar from C. M. Sperberg-McQueen on 2021-11-04 (public-ixml@w3.org from November 2021)

From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
Date: Thu, 4 Nov 2021 10:02:26 -0600
To: Dave Pawson <dave.pawson@gmail.com>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, Steven Pemberton <steven.pemberton@cwi.nl>, ixml <public-ixml@w3.org>
Message-Id: <35C5410C-139C-4B54-BF37-9D14178277B0@blackmesatech.com>

Dave, good point.

Both Tom Hillman (Jay Parser) and I (Aparecium) are working on ixml
implementations that are intended to be callable from XSLT.  The
current initial implementation of Aparecium is XQuery, but I expect to
get around to making an XSLT version ‘real soon now’.  (And if memory
serves, Saxon allows XQuery functions to be called from XSLT, though
I’ve never actually done that …)

And yes, running an ixml parser that provides a coarse analysis, then
processing the result, and then running an ixml parser with a different
grammar on individual segments, is a style of usage I think makes sense.
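
A minimal sketch of that coarse-then-fine style, in Python rather than
ixml or XSLT. Everything here is an assumption made for illustration:
the function names are hypothetical, str.splitlines stands in for the
coarse grammar, and the csv module stands in for the finer grammar run
on each segment. No ixml processor is involved.

```python
import csv


def coarse_pass(text: str) -> list[str]:
    """Coarse analysis: cut the input into line-level segments.

    splitlines() accepts LF, CR LF and CR terminators and drops them.
    """
    return text.splitlines()


def fine_pass(segment: str) -> list[str]:
    """Fine analysis: parse a single segment with a different 'grammar'.

    Here the stand-in grammar is CSV, handled by the standard csv module.
    """
    return next(csv.reader([segment]), [])


def two_pass(text: str) -> list[list[str]]:
    """Run the coarse pass, then the fine pass on each non-empty segment."""
    return [fine_pass(segment) for segment in coarse_pass(text) if segment]


if __name__ == "__main__":
    sample = 'a,b\r\nc,"d,e"\n'
    print(two_pass(sample))  # [['a', 'b'], ['c', 'd,e']]
```

In an ixml setting each pass would instead be a parse against its own
grammar, with ordinary processing of the first result in between.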

At the moment, though, Aparecium is too slow to be anything but a toy
for playing with, while dreaming of the day when we have something
usable in a production system.

Michael

> On 4,Nov2021, at 6:26 AM, Dave Pawson <dave.pawson@gmail.com> wrote:
> 
> Sorry Steven, I meant an XSLT tokenize. I 'presume' that XSLT may eventually
> take in iXML?
> 
> regards
> 
> On Thu, 4 Nov 2021 at 10:39, Steven Pemberton <steven.pemberton@cwi.nl> wrote:
>> 
>> I'm not sure if I exactly understand your comment, but ixml doesn't tokenise at all. It only recognises characters.
>> Steven
>> 
>> On Thursday 04 November 2021 09:45:32 (+01:00), Dave Pawson wrote:
>> 
>>> A common XSLT processing sequence for plain text to XML (e.g. CSV) is to
>>> tokenize by eol first, then
>>> within each line?
>>> 
>>> I'd hope you might support this form of processing.
>>> 
>>> regards
>>> 
>>> On Thu, 4 Nov 2021 at 08:31, Steven Pemberton <steven.pemberton@cwi.nl> wrote:
>>>> 
>>>> Good points.
>>>> 
>>>> ABC completely denies the existence of end-of-line characters. It delivers input as an array of lines, where the line terminators have been elided. This is because different operating systems use different line end conventions, and the language hides these differences. So there is no way to get a LF delivered.
>>>> 
>>>> Steven
>>>> 
>>>> On Thursday 04 November 2021 01:32:13 (+01:00), C. M. Sperberg-McQueen wrote:
>>>> 
>>>>> I like most of these changes.
>>>>> 
>>>>> But having
>>>>> 
>>>>> ixml: s, rule+.
>>>>> rule: (mark, s)?, name, s, -[":="], s, -alts, -".", s.
>>>>> 
>>>>> instead of
>>>>> 
>>>>> ixml: s, rule+s.
>>>>> rule: (mark, s)?, name, s, -[":="], s, -alts, -".".
>>>>> 
>>>>> has the unfortunate effect that a grammar like
>>>>> 
>>>>> { Section 1: …}
>>>>> a: … .
>>>>> b: … .
>>>>> 
>>>>> 
>>>>> { Section 2: …}
>>>>> z: … .
>>>>> y: … .
>>>>> 
>>>>> produces XML in which the comment ‘ Section 2: … ’ turns up not
>>>>> between the last rule of section 1 and the first rule of section 2, but
>>>>> within the last rule of section 1.
>>>>> 
>>>>> Also, I’m curious what the bug involving lf was.
>>>>> 
>>>>> Michael
>>>>> 
>>>>>> On 3,Nov2021, at 5:02 PM, Steven Pemberton <steven.pemberton@cwi.nl> wrote:
>>>>>> 
>>>>>> In an idle moment, I refactored the grammar. Comments gladly received.
>>>>>> Changes:
>>>>>> * I hid all nonessential terminals. I know above all Tom was asking for this.
>>>>>> * I moved the spaces from the rule for ixml into the rule for rule. Tidier and more consistent.
>>>>>> * I renamed S to s.
>>>>>> * I simplified 'namestart', since I realised class L covered all the cases.
>>>>>> 
>>>>>> I think that's all.
>>>>>> 
>>>>>> See attachment.
>>>>>> 
>>>>>> Steven
>>>>>> <ixml-new.ixml>
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>>> 
> 
> 
> 
> -- 
> Dave Pawson
> XSLT XSL-FO FAQ.
> Docbook FAQ.
>

Received on Thursday, 4 November 2021 16:02:52 UTC