Re: The Mire 'twixt Documents And Data

With XLink, topic maps, and RDF, we have plenty of possibilities for annotating
documents, even third-party documents.  Provided, that is, they are marked up
in some useful way.  XHTML isn't usually enough for that.  Now if we want to
annotate documents that aren't marked up in XML, well, that's what groves are
supposed to be good for, isn't it?  (If you can somehow parse the non-markup
documents, anyway.)

What we need are usable tools, preferably GUI editor-like tools, to let us do
these things.  We've got the standards infrastructure, I think.  I want to be
able to take a document, highlight parts and add notes, comments (like, dare I
say it, you can do in MS Word), and links to other documents, and have an XML
document someone else can work with and read too.
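
To make that concrete, here is a rough sketch in Python of what such a tool
might write out behind the GUI: a stand-off annotation file that points into
the original document by character offsets.  The element and attribute names
(annotations, note, target, comment, start, end) and the example.org URI are
made up for illustration; they don't come from XLink, topic maps, or RDF.

```python
import xml.etree.ElementTree as ET


def annotate(source_uri, text, notes):
    """Build a stand-off annotation document for `text`.

    `notes` is a list of (start, end, comment) character ranges.
    All element and attribute names are hypothetical.
    """
    root = ET.Element("annotations", {"source": source_uri})
    for start, end, comment in notes:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"bad range {start}-{end}")
        note = ET.SubElement(
            root, "note", {"start": str(start), "end": str(end)}
        )
        # Copy the highlighted span so a reader can check the offsets.
        ET.SubElement(note, "target").text = text[start:end]
        ET.SubElement(note, "comment").text = comment
    return root


doc_text = "Groves are supposed to be good for this."
root = annotate(
    "http://example.org/doc.txt",  # placeholder URI
    doc_text,
    [(0, 6, "Needs a parser for the source format.")],
)
print(ET.tostring(root, encoding="unicode"))
```

The original document is left untouched, which is the point for third-party
documents; the cost is that offsets break if the source text changes.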


Cheers,

Tom P

Sean B. Palmer wrote about mixing xhtml with annotation markup -

...
> I believe that one of the best ways to transition into RDF, if not a
> long-term deployment strategy for RDF, is to manage the information in
> human-consumable form (XHTML) annotated with just enough info to extract the
> RDF statements that the human info is intended to convey. [...] We all know
> that we have to produce a human-readable version of the thing... why not use
> that as the primary source?
> ]]] - [2]
> Or in other words, using XHTML [3] as a repository for data, but one that
> can still be marked up with annotations, explanations, and summaries...aha!
> The key concepts we have here is the following: Data can be stored somehow
> in XHTML, and annotated with two different types of further data -
> annotation intended to facilitate the machine transformation and extraction
> of that data into machine (RDF?) form, and annotation to assist humans in
> the interpretation of that data [4].
...
> If we added those simple tags etc. to a kind of XHTML slurry, then we would
> have a lot more power to walk through the mire 'twixt documents and data.
> But this is all an abstract conversation isn't it? Not really. Browsers
> worldwide grok XHTML, and a few can use CSS to style other forms of XML. At
> the moment, to cleanly extract data from XHTML, we have to pepper it (i.e.
> annotate it) with hundreds of "classes" - class attributes [5] to imply our
> meaning, for example as discussed in the semantic design principles [6], and
> so instead we could just add a few custom based annotation and logic based
> tags (like the ones above) to (e.g.) m12n, and create a transformable form
> of XHTML, to bridge the gap.
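
For anyone who hasn't tried the class-attribute approach described in the
quoted text, a minimal sketch of the extraction side might look like the
following.  The class names (resource, creator, date), the sample page, and
the convention that a div with class "resource" is the subject are all my
assumptions for illustration, not anything defined by XHTML or RDF.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical page: class attributes carry the intended meaning.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <div class="resource" id="report">
      <p>Written by <span class="creator">A. Author</span> in
         <span class="date">2000</span>.</p>
    </div>
  </body>
</html>"""


def extract_statements(xhtml):
    """Return (subject, property, value) tuples from annotated XHTML."""
    root = ET.fromstring(xhtml)
    statements = []
    for resource in root.iter(XHTML_NS + "div"):
        if resource.get("class") != "resource":
            continue
        subject = "#" + resource.get("id", "")
        # Each classed span inside the resource becomes one statement.
        for span in resource.iter(XHTML_NS + "span"):
            prop = span.get("class")
            if prop:
                value = "".join(span.itertext()).strip()
                statements.append((subject, prop, value))
    return statements


statements = extract_statements(SAMPLE)
for s in statements:
    print(s)
```

Even this toy version shows the problem: every fact needs its own class, and
the extractor has to know the conventions in advance.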

Received on Saturday, 2 December 2000 12:57:41 UTC