- From: David Birnbaum <djbpitt@gmail.com>
- Date: Tue, 28 Jan 2025 12:35:40 -0500
- To: graydonish@gmail.com
- Cc: "Liam R. E. Quin" <liam@fromoldbooks.org>, LdBeth <andpuke@foxmail.com>, ixml <public-ixml@w3.org>
- Message-ID: <CAP4v81rZmRm1Sv7VdoDRd6ewjM+jOjVhiUBJpVWu_PZyfhRhQg@mail.gmail.com>
Dear Graydon (cc public-ixml),

Thank you; your description of the distinction between "the plain text conforms to a clear, consistent, and documented (or, at least, documentable) structure" and "a human recognizes the structure of the plain text but that structure is not represented in a clear, consistent, and documented way, and must therefore be discovered, perhaps painfully" is helpful.

Best,

David

On Tue, Jan 28, 2025 at 11:50 AM Graydon <graydonish@gmail.com> wrote:

> On Tue, Jan 28, 2025 at 12:40:04AM -0500, David Birnbaum scripsit:
> > This leaves me still wondering whether there are rules of thumb for
> > choosing between using regex (e.g., analyze-string()) and using ixml
> > when both are available.
>
> Have you got rules, or do you need rules?
>
> E.g., "this string is a citation conforming to a known set of written
> rules" or "I need to contract with my client that the generated text
> will conform to productions of an agreed grammar". (Or "these are 80
> column records with known fields".)
>
> iXML is good for those.
>
> If it's "the rules must be discovered", as in the sort of conversion
> project where you've used fifteen passes to walk source to target in
> comprehensible steps, iXML is NOT good for those. (In theory, yes,
> a grammar could be written, but the cognitive load to write it is not
> an especially practical undertaking.)
>
> I am really looking forward to being able to pass those citation strings
> to a grammar; having to tweeze them apart with regular expressions fails
> to delight.
>
> -- Graydon
>
> --
> Graydon Saunders | graydonish@fastmail.com
> Þæs oferéode, ðisses swá mæg.
> -- Deor ("That passed, so may this.")
>
Received on Tuesday, 28 January 2025 17:35:56 UTC