Re: Ambiguity (what else!?) question from David Birnbaum on 2025-01-28 (public-ixml@w3.org from January 2025)

From: David Birnbaum <djbpitt@gmail.com>
Date: Tue, 28 Jan 2025 12:35:40 -0500
To: graydonish@gmail.com
Cc: "Liam R. E. Quin" <liam@fromoldbooks.org>, LdBeth <andpuke@foxmail.com>, ixml <public-ixml@w3.org>
Message-ID: <CAP4v81rZmRm1Sv7VdoDRd6ewjM+jOjVhiUBJpVWu_PZyfhRhQg@mail.gmail.com>

Dear Graydon (cc public-ixml),

Thank you; your description of the distinction between "the plain text
conforms to a clear, consistent, and documented (or, at least,
documentable) structure" and "a human recognizes the structure of the plain
text but that structure is not represented in a clear, consistent, and
documented way, and must therefore be discovered, perhaps painfully" is
helpful.

Best,

David

On Tue, Jan 28, 2025 at 11:50 AM Graydon <graydonish@gmail.com> wrote:

> On Tue, Jan 28, 2025 at 12:40:04AM -0500, David Birnbaum scripsit:
> > This leaves me still wondering whether there are rules of thumb for
> > choosing between using regex (e.g., analyze-string()) and using ixml
> > when both are available.
>
> Have you got rules, or do you need rules?
>
> E.g., "this string is a citation conforming to a known set of written
> rules" or "I need to contract with my client that the generated text
> will conform to productions of an agreed grammar". (Or "these are 80
> column records with known fields".)
>
> iXML is good for those.
>
> If it's "the rules must be discovered", as in the sort of conversion
> project where you've used fifteen passes to walk source to target in
> comprehensible steps, iXML is NOT good for those. (In theory, yes,
> a grammar could be written, but the cognitive load of writing it makes
> doing so not especially practical.)
>
> I am really looking forward to being able to pass those citation strings
> to a grammar; having to tweeze them apart with regular expressions fails
> to delight.
>
> -- Graydon
>
> --
> Graydon Saunders  | graydonish@fastmail.com
> Þæs oferéode, ðisses swá mæg.
> -- Deor  ("That passed, so may this.")
>

Received on Tuesday, 28 January 2025 17:35:56 UTC