Re: Draft iXML minutes, 4 March 2025

The world is full of people who sometimes spell "Section" as "Article" in their data and I have been very much looking forward to XPath 4 and the invisible-xml() function as a better means of handling such cases. If an invisible XML parser is not guaranteed to return well-formed XML, that function becomes a complex means of generating error messages and I'll have to build a full alternative anyway, which turns the (real!) effort of writing a grammar into hard-to-justify overhead.

I think it's important to keep the constraint that you're guaranteed to get well-formed XML back from the parser for any conformant grammar.

-- Graydon

On Tue, Mar 18, 2025, at 10:12, Steven Pemberton wrote:
> "I think the spec implies that if a grammar could produce not well-formed XML, the grammar isn't conformant."
> 
> I think you may be right, depending on what "any" means here:
> 
> Grammars *must* be written so that any serialization of a parse tree produced from the grammar is well-formed XML
> 
> and one interpretation would disallow my ABC example, which I would consider wrong; so we may have to review that sentence.
> 
> Steven
> On Tuesday 18 March 2025 15:04:47 (+01:00), Steven Pemberton wrote:
> 
>> 
>> 
>> "I think a single root element is relatively easy to work out; but suppressed elements make it a little harder.
>> … I think we should think about removing the statement from the spec or we should talk about changing those to static errors."
>> 
>> I am a strong believer in static error checking, and we should make the class of static errors as large as possible. I try to give warnings about potential errors (such as orphan rules), and I even warn about certain classes of ambiguity that can be spotted statically.
>> 
>> But I think while there are some errors that are clearly in the static domain, such as undefined rules, and some that are probably dynamic, such as ambiguity, some are in a murky could-be-static, could-sometimes-be-dynamic area, and we should avoid making them normatively static, and leave that to smart implementations that can spot subclasses that are discoverable statically.
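
To make the static end of that spectrum concrete, here is a minimal sketch of the two checks named above: undefined rules and orphan rules. It does not work on real ixml; it assumes a toy representation in which a grammar is a dict from rule name to a list of alternatives, each alternative is a list of symbols, and a symbol written in angle brackets is a nonterminal. The function names and the reading of "orphan" as "unreachable from the start rule" are assumptions made for illustration.

```python
def is_nonterminal_name(symbol):
    # Assumption: nonterminals are written like '<sentence>'.
    return symbol.startswith("<") and symbol.endswith(">")


def referenced(grammar, rule):
    """Nonterminals mentioned on the right-hand side of one rule."""
    return {
        symbol
        for alternative in grammar[rule]
        for symbol in alternative
        if is_nonterminal_name(symbol)
    }


def undefined_rules(grammar):
    """Nonterminals that are used somewhere but never defined."""
    used = set()
    for rule in grammar:
        used |= referenced(grammar, rule)
    return used - set(grammar)


def orphan_rules(grammar, start):
    """Defined rules that cannot be reached from the start rule."""
    seen, todo = set(), [start]
    while todo:
        rule = todo.pop()
        if rule in seen or rule not in grammar:
            continue
        seen.add(rule)
        todo.extend(referenced(grammar, rule))
    return set(grammar) - seen


# Hypothetical toy grammar: '<b>' is undefined, '<c>' is an orphan.
toy = {
    "<s>": [["<a>", "x"]],
    "<a>": [["<b>"]],
    "<c>": [["y"]],
}
```

With this toy grammar, `undefined_rules(toy)` gives `{'<b>'}` and `orphan_rules(toy, "<s>")` gives `{'<c>'}`. Both checks need only the grammar, never an input, which is what puts them in the static domain.
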
>> 
>> Steven
>> 
>> 
>> 
>> On Tuesday 18 March 2025 14:15:49 (+01:00), Steven Pemberton wrote:
>> 
>>> 
>>>> 
>>>> which I would rather not have to do.
>>> By which I mean that this error should at best be a warning.
>>> 
>>> Steven
>>> 
>>>> 
>>>> 
>>>> On Tuesday 18 March 2025 13:50:26 (+01:00), Steven Pemberton wrote:
>>>> 
>>>>> 
>>>>> Looks like you had a great discussion in the status reports section. Sorry I missed it.
>>>>> 
>>>>> Bethan says: "What I'm interested in working on are tools that will treat your grammar as a generator rather than a recognizer."
>>>>> 
>>>>> I've written several of these in the past. For instance, when I wrote a version of Eliza, the Rogerian psychotherapist, I wrote another program to generate random paranoid ramblings for Eliza to respond to (https://cwi.nl/~steven/Talks/2024/09-oxford/ai.html#L2734).
>>>>> 
>>>>> In fact, they are quite easy to write, since it is just a recursive random path through the grammar tree. This is the complete code, where 'thing' is either a terminal or nonterminal ('choice' returns a random element of a sequence, in this case returning a random alternative from a rule):
>>>>> 
>>>>> HOW TO GENERATE thing FROM grammar:
>>>>>     SELECT:
>>>>>         nonterminal(thing):
>>>>>             FOR symbol IN choice grammar[thing]:
>>>>>                 GENERATE symbol FROM grammar
>>>>>         ELSE:
>>>>>             WRITE thing, " "
>>>>> 
>>>>> And you generate one rambling with "GENERATE '<sentence>' FROM sentences"
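
For readers who do not know ABC, the same random walk can be sketched in Python. This is an illustration under assumptions, not the original program: the grammar is assumed to be a dict from nonterminal to a list of alternatives, each alternative a list of symbols, and the `sentences` table below is an invented toy grammar. The terminals are collected in a list instead of being written out directly, so the result can be inspected.

```python
import random


def generate(thing, grammar, rng=random, out=None):
    """Collect the terminals of one random derivation of `thing`."""
    if out is None:
        out = []
    if thing in grammar:
        # Nonterminal: pick one alternative at random and expand
        # each of its symbols in order.
        for symbol in rng.choice(grammar[thing]):
            generate(symbol, grammar, rng, out)
    else:
        # Terminal: emit it as is.
        out.append(thing)
    return out


# Hypothetical toy grammar; anything that is not a key is a terminal.
sentences = {
    "<sentence>": [["<subject>", "<verb>", "<object>"]],
    "<subject>": [["they"], ["my neighbours"]],
    "<verb>": [["watch"], ["follow"]],
    "<object>": [["me"], ["my car"]],
}

if __name__ == "__main__":
    # One rambling, with terminals separated by spaces.
    print(" ".join(generate("<sentence>", sentences)))
```

Passing a seeded `random.Random` instance as `rng` makes a run repeatable. A recursive grammar terminates only with some probability, so a real tool would probably want a depth limit; the sketch leaves that out.
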
>>>>> 
>>>>> Steven
>>>>> 
>>>>> On Tuesday 04 March 2025 16:42:47 (+01:00), Norm Tovey-Walsh wrote:
>>>>> 
>>>>> > Hi folks,
>>>>> >
>>>>> > Draft minutes are online:
>>>>> >
>>>>> > https://www.w3.org/2025/03/04-ixml-minutes.html
>>>>> >
>>>>> > Be seeing you,
>>>>> > norm
>>>>> >
>>>>> > --
>>>>> > Norm Tovey-Walsh
>>>>> > Saxonica
>>>>> >
>>>>> >
>>>>> 
>>>> 
>>> 
>> 
> 

Received on Friday, 21 March 2025 08:15:33 UTC