Re: Some checklink WIP stuff from Ville Skyttä on 2004-09-23 (public-qa-dev@w3.org from September 2004)

From: Ville Skyttä <ville.skytta@iki.fi>
Date: Thu, 23 Sep 2004 20:09:47 +0300
To: QA-dev <public-qa-dev@w3.org>
Message-Id: <1095959387.2960.56.camel@bobcat.mine.nu>

Sorry about being late with this, I've been extremely busy, and the
cream on top of that cake is an ongoing hardware "upgrade" with my main
devel box which hasn't quite gone as planned :I  (Still "under
construction", up now temporarily in order to purge some of the piled up
mail.)

Anyway, RFC re: WIP link checker stuff:

0) I assume the SAX-like event-based interface for extracting links and
anchors has been sort of agreed on.  We may need more than that for
dynamic fragment extraction (e.g. XPointer), but unless someone has good
ideas on how to handle that elegantly now, let's think about it later.
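
To make the shape of that interface concrete, here is a rough sketch of
what "SAX-like" could mean: a parser pushes link and anchor events to a
handler instead of returning a list.  It is in Python purely for
illustration; the names LinkExtractor, RecordingHandler, link() and
anchor() are made up for this example, and the choice of which
attributes count as links or anchors is a simplifying assumption.

```python
from html.parser import HTMLParser


class RecordingHandler:
    """Hypothetical handler: just records the events it receives."""

    def __init__(self):
        self.links = []
        self.anchors = []

    def link(self, value, position):
        self.links.append((value, position))

    def anchor(self, value, position):
        self.anchors.append((value, position))


class LinkExtractor(HTMLParser):
    """Fires handler.link() / handler.anchor() while parsing HTML."""

    def __init__(self, handler):
        super().__init__()
        self.handler = handler

    def handle_starttag(self, tag, attrs):
        position = self.getpos()  # (line, offset) of the current tag
        for name, value in attrs:
            if value is None:
                continue  # attribute without a value, nothing to report
            # Assumption: only href/src are links, only id and <a name>
            # are anchors.  A real checker would need a fuller table.
            if name in ("href", "src"):
                self.handler.link(value, position)
            elif name == "id" or (name == "name" and tag == "a"):
                self.handler.anchor(value, position)


handler = RecordingHandler()
extractor = LinkExtractor(handler)
extractor.feed('<a name="top" href="#end">x</a>\n<p id="end">y</p>')
extractor.close()
```

The handler never sees the parser, only the events, so other sources of
links (HTTP headers, CSS) could feed the same handler.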

1) We should try to come up with a sane representation of a "link" and a
"fragment" in terms of what gets passed to the respective handlers.
Both of them have at least a "value", as well as a locator of some kind.

The "value" part is pretty trivial I think, but the locator needs some
thought.  This is related to Bjoern's earlier m12n post where he
outlined a similar design task for "problem" representation in a wider
scope, so it seems we'd better be consistent here.

A locator could include something like line and column numbers if
applicable, as well as some info about the context.  For example, in XML
documents, this context in the locator could be the element (+ attribute
if applicable) where the "thing" (link|fragment|problem|$whatever)
corresponding to the locator occurred.  What needs discussing is how to
design the API for that locator so that it would "work" almost anywhere,
for example for "things" in HTTP headers, CSS, binary $whatever
documents, etc.  Flattening the base locator interface into a single
simple as_string() method is one possibility (e.g.
line:column:element/@attribute for XML), but it would limit how the
locator can be represented in human readable form in different contexts
(plain text, HTML, XML, etc).  Maybe a URI should be included in the
locator as well, or maybe the handlers can always get it from somewhere
else, dunno.
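
As a strawman for the trade-off above, a locator could keep its parts as
separate optional fields and offer as_string() only as one of several
renderings.  The sketch below is Python for illustration only; the class
and field names, the as_html() helper and the exact output formats are
all assumptions, not an agreed design.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass(frozen=True)
class Locator:
    """Hypothetical locator: every part is optional, so it can describe
    "things" in formats that have no lines, columns or elements."""

    line: Optional[int] = None
    column: Optional[int] = None
    element: Optional[str] = None    # XML context, if applicable
    attribute: Optional[str] = None
    uri: Optional[str] = None        # open question: include it or not

    def context(self) -> str:
        """element or element/@attribute, empty if there is no element."""
        if not self.element:
            return ""
        if self.attribute:
            return self.element + "/@" + self.attribute
        return self.element

    def as_string(self) -> str:
        """Flattened form: line:column:element/@attribute."""
        line = "" if self.line is None else str(self.line)
        column = "" if self.column is None else str(self.column)
        return ":".join([line, column, self.context()])


def as_html(loc: Locator) -> str:
    """One alternative rendering, possible only because the fields are
    kept separate rather than flattened into a string up front."""
    bits = []
    if loc.line is not None:
        bits.append(f"line {loc.line}")
    if loc.column is not None:
        bits.append(f"column {loc.column}")
    if loc.element:
        bits.append(f"<code>{escape(loc.context())}</code>")
    return ", ".join(bits)


loc = Locator(line=12, column=7, element="a", attribute="href")
flat = loc.as_string()
rich = as_html(loc)
```

With the structured fields kept around, plain text, HTML and XML output
can each pick their own formatting, and as_string() stays available as
the lowest common denominator.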

2) My initial implementation of the XML Base stuff modified the event
stream, by adding things to it.  What do people think, does it matter if
those are left in it?  One cleanup possibility would be to hook another
handler into the event stream (right after the things relying on
xml:base always being populated), and have that filter out the things
that were not in the original stream.
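
The cleanup idea could look roughly like the following.  This is an
illustrative Python sketch, not the actual implementation: it assumes
the "things" added are xml:base attributes on start-element events, and
the Event class, the injected bookkeeping set and both function names
are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, Set
from urllib.parse import urljoin

XML_BASE = "xml:base"


@dataclass
class Event:
    kind: str                                        # "start" or "end"
    name: str
    attrs: Dict[str, str] = field(default_factory=dict)
    injected: Set[str] = field(default_factory=set)  # attrs we added


def populate_xml_base(events: Iterable[Event], document_uri: str) -> Iterator[Event]:
    """Give every start event an xml:base attribute, remembering which
    ones were not present in the original stream."""
    bases = [document_uri]
    for event in events:
        if event.kind == "start":
            if XML_BASE in event.attrs:
                # Original attribute: leave it alone, track the new base.
                base = urljoin(bases[-1], event.attrs[XML_BASE])
            else:
                base = bases[-1]
                event.attrs[XML_BASE] = base
                event.injected.add(XML_BASE)
            bases.append(base)
        elif event.kind == "end":
            bases.pop()
        yield event


def strip_injected(events: Iterable[Event]) -> Iterator[Event]:
    """Filter hooked in after the xml:base consumers: removes whatever
    populate_xml_base() added, restoring the original attributes."""
    for event in events:
        for name in event.injected:
            event.attrs.pop(name, None)
        event.injected.clear()
        yield event


stream = [
    Event("start", "doc", {XML_BASE: "http://example.org/a/"}),
    Event("start", "p"),
    Event("end", "p"),
    Event("end", "doc"),
]
populated = list(populate_xml_base(stream, "http://example.org/index.xml"))
seen_by_consumers = populated[1].attrs[XML_BASE]
cleaned = list(strip_injected(populated))
```

Whether the extra bookkeeping is worth it depends on whether anything
downstream actually cares about seeing the original stream.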

That's about it for the things off the top of my head now, probably more
RFCs will follow.

Received on Friday, 24 September 2004 07:09:52 UTC