Re: Streaming OWL Parsers from Jonathan Borden on 2003-02-14 (www-webont-wg@w3.org from February 2003)

From: Jonathan Borden <jonathan@openhealth.org>
Date: Fri, 14 Feb 2003 16:48:24 -0500
To: "Sean Bechhofer" <seanb@cs.man.ac.uk>, "WebOnt WG" <www-webont-wg@w3.org>
Message-ID: <004f01c2d472$d0760170$b6f5d3ce@L565>

Sean Bechhofer wrote:

>
> We've been trying to build a streaming parser, but this is proving
> difficult with OWL represented as RDF triples. Given an OWL-RDF ontology,
> the various triples that make up any expression or assertion could be
> scattered liberally throughout the document, so even if I try and build
> things in a streaming fashion, there's a load of stuff that I'm going to
> have to cache or remember or make assumptions about and clean up later.

Yep. Though the idea about using RDF/XML as the exchange syntax for OWL is
that you ought not need to write your own OWL parser ... you pick or write
an RDF/XML parser.

Now you can always develop a specialized presentation syntax for OWL that is
optimized for a particular application ... do you find the same issues with
the OWL XML syntax?

>
> If we're talking about the kinds of example model that we've seen in
> things like the test sets, this is, of course, not an issue. Just build
> the data structure. Big deal. But what happens when I've got an ontology
> with 10^8 concepts/individuals in it and I want to do some simple
> processing on it, that doesn't necessarily warrant me building the whole
> data structure?

That would be called a database :-))

>
> <flippant>
> An analogy that springs to mind is that it's like building a model of the
> Eiffel Tower out of matchsticks. Except that what's happened is that
> someone's done it already, labelled each pair of touching ends of each
> matchstick with a unique number, dismantled it, and then given it to you
> in a box saying "what do you think that is then?".
> </flippant>
>
> Ok, perhaps an over-exaggeration, but it's what it feels like sometimes.
>

I think by "parser" you really mean that you are looking for an algorithm
that can iterate over a sequence of RDF triples and match the productions in
the OWL Abstract Syntax.
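
To make that concrete, here is a rough sketch of such an iteration, matching a single production (an existential restriction) over triples that arrive in arbitrary order. The triple representation (plain tuples of prefixed-name strings), the function name and the example data are all assumptions for illustration, and a real matcher would have to cover every production in the Abstract Syntax. It shows the caching you describe: partial matches have to wait in `pending` until their remaining triples turn up.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples of strings.
REQUIRED = {"rdf:type", "owl:onProperty", "owl:someValuesFrom"}

def stream_restrictions(triples):
    """Match someValuesFrom restrictions over triples in arbitrary order.

    Returns (matches, max_pending), where max_pending is the largest number
    of partially seen nodes that had to be held in memory at any one time.
    """
    pending = {}      # subject -> {predicate: object} for incomplete nodes
    matches = []
    max_pending = 0
    for s, p, o in triples:
        if p not in REQUIRED:
            continue  # not part of this production
        if p == "rdf:type" and o != "owl:Restriction":
            continue  # some other kind of node
        seen = pending.setdefault(s, {})
        seen[p] = o
        max_pending = max(max_pending, len(pending))
        if REQUIRED.issubset(seen):
            # All three pieces have arrived, so the production is complete.
            del pending[s]
            matches.append((s, seen["owl:onProperty"], seen["owl:someValuesFrom"]))
    return matches, max_pending

scattered = [
    ("_:r1", "owl:someValuesFrom", "ex:Cheese"),
    ("ex:Pizza", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("ex:Pizza", "rdf:type", "owl:Class"),
    ("_:r1", "owl:onProperty", "ex:hasTopping"),
]
matches, max_pending = stream_restrictions(scattered)
```

With nothing known about triple order, `pending` can in the worst case grow to hold every incomplete node in the document, which is the cost you are worried about.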

That is an interesting issue. If I were doing that, I'd look for a transform
that would reorder/rearrange the sequence of triples to allow a "one pass"
algorithm to match OWL productions. We might call such a transform some type
of "normalization" of a given triple store or ordered knowledge base. If it
were possible to develop such an algorithm it would be a terrific candidate
for an OWL "canonicalization" -- I expect that this work item won't make the
list for the current WG given time constraints :-(
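
A minimal sketch of the kind of reordering I have in mind, assuming triples are plain tuples of prefixed-name strings and taking "normalization" to mean nothing more than sorting by subject with rdf:type first. The function names and the sort key are my own assumptions, not a worked-out canonical form, and only one production is matched.

```python
from itertools import groupby

RDF_TYPE = "rdf:type"

def normalize(triples):
    """Reorder triples so each subject's triples are contiguous, rdf:type first."""
    return sorted(triples, key=lambda t: (t[0], t[1] != RDF_TYPE, t[1], t[2]))

def match_restrictions(ordered):
    """Single pass over normalized triples.

    Yields (node, property, filler) for each someValuesFrom restriction.
    Only the triples of the current subject are held at any one time.
    """
    for subject, group in groupby(ordered, key=lambda t: t[0]):
        props = {}
        for _, p, o in group:
            props.setdefault(p, []).append(o)
        if "owl:Restriction" in props.get(RDF_TYPE, []):
            on_prop = props.get("owl:onProperty")
            filler = props.get("owl:someValuesFrom")
            if on_prop and filler:
                yield (subject, on_prop[0], filler[0])

triples = [
    ("_:r1", "owl:someValuesFrom", "ex:Cheese"),
    ("ex:Pizza", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("ex:Pizza", "rdf:type", "owl:Class"),
    ("_:r1", "owl:onProperty", "ex:hasTopping"),
]
restrictions = list(match_restrictions(normalize(triples)))
```

The in-memory `sorted` call is only for illustration. For an ontology of the size you mention, the reordering would presumably have to be an external sort or be done by whatever store holds the triples. Sorting on blank node labels is also not stable across documents, so this falls well short of a true canonical form.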

Jonathan

Received on Friday, 14 February 2003 16:48:31 UTC