
Java RDF Parsers

From: Aaron Swartz <aswartz@swartzfam.com>
Date: Mon, 04 Jun 2001 23:59:07 -0500
To: RDF Interest <www-rdf-interest@w3.org>
CC: Douglas Campbell <Douglas.Campbell@natlib.govt.nz>
Message-ID: <B741D149.D27B%aswartz@swartzfam.com>
I'm forwarding this message on behalf of Douglas. Please keep him in the CC
list of any responses, as he is not on the list.
    - [ Aaron Swartz | me@aaronsw.com | http://www.aaronsw.com ]

I've had a look at RDFFilter and SiRPAC (and a couple of others on Dave
Beckett's list) and yes, it's easy to get the tuples.  It's what to do with
the tuples that's harder.  My aim is to extract the DC, DCQ, and possibly
other non-DC elements so I can put them in a database and make them
searchable.  This means navigating through the generated tuples and
recombining them into "useful" chunks of data.
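
One way to picture the recombining step is to group the tuples by subject, so
that each resource becomes a single record of predicate/value pairs ready to
load into a database. The following is only a minimal sketch under stated
assumptions: the Triple class and the groupBySubject method are hypothetical
names invented for illustration, not part of RDFFilter, SiRPAC, or any other
parser, and a real parser would hand over its statements through its own
callback or model interface.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: not part of any RDF parser's API.
final class TripleGrouper {

    /** One parser-emitted statement; the field names are illustrative. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /**
     * Groups triples into one record per subject.
     * Each record maps a predicate URI to the list of values seen for it,
     * because a property such as dc:creator may legitimately repeat.
     */
    static Map<String, Map<String, List<String>>> groupBySubject(List<Triple> triples) {
        // LinkedHashMap keeps subjects and predicates in first-seen order.
        Map<String, Map<String, List<String>>> records = new LinkedHashMap<>();
        for (Triple t : triples) {
            records
                .computeIfAbsent(t.subject, k -> new LinkedHashMap<>())
                .computeIfAbsent(t.predicate, k -> new ArrayList<>())
                .add(t.object);
        }
        return records;
    }
}
```

Each inner map then corresponds roughly to one database row (or one set of
rows, for repeated properties), keyed by the full predicate URI rather than
by a prefix such as "dc:", since prefixes can differ between documents.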

Some of the grey areas for me:
- When recombining tuples, how can I be sure which predicate is an RDF
value and which is a dc:title, etc.?  If predicates from other namespaces
were used instead of "http://www.w3.org/1999/02/22-rdf-syntax-ns#value" and
"http://purl.org/dc/elements/1.1/title" etc., I'd miss them altogether.
- I haven't worked through the Dumb-Down algorithm in "Expressing Qualified
Dublin Core in RDF" in detail but it looks like it only dumbs down to Simple
DC, not bare Qualified DC.
- I haven't quite worked out how to incorporate RDF schema lookups in the
Java tools yet :-(
- I'm a tad confused which RDF tool I should use anyway - from what I can
gather there are a number of proposals for an RDF API (namely Jena and
SiRPAC) but no consensus yet, and SiRPAC has 2 flavours - the W3C download
or the Stanford download.  I know there should never be only one choice for
a tool, but it's useful to know which are popular and which have a
compatibility future - I chose SiRPAC as DCMI's "EOR (Extensible Open Rdf)
Toolkit" seems to use it.
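
The first and third points above are related: RDF Schema lets a property be
declared an rdfs:subPropertyOf another, so a schema lookup could in principle
map an unfamiliar predicate back to a known DC element. The sketch below
illustrates that idea only. PredicateResolver, declareSubProperty, and
resolveToDc are hypothetical names, the sub-property table is filled in by
hand rather than fetched from a real schema, and each property is assumed to
have at most one parent, which is a simplification. It is not the Dumb-Down
algorithm from "Expressing Qualified Dublin Core in RDF".

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: illustrates sub-property resolution, not a real toolkit API.
final class PredicateResolver {

    /** Namespace URI of the Dublin Core element set, version 1.1. */
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Child property URI -> parent property URI, as an rdfs:subPropertyOf
    // declaration in a schema would state. Assumption: one parent per child.
    private final Map<String, String> parentOf = new HashMap<>();

    /** Records that 'child' is a sub-property of 'parent'. */
    void declareSubProperty(String child, String parent) {
        parentOf.put(child, parent);
    }

    /**
     * Follows the sub-property chain upward from 'predicate' and returns the
     * first URI in the DC namespace, or null if the chain never reaches one.
     * The 'seen' set guards against cycles in a malformed schema.
     */
    String resolveToDc(String predicate) {
        Set<String> seen = new HashSet<>();
        String current = predicate;
        while (current != null && seen.add(current)) {
            if (current.startsWith(DC_NS)) {
                return current;
            }
            current = parentOf.get(current);
        }
        return null;
    }
}
```

With a table like this, a predicate such as the made-up
"http://example.org/terms/mainTitle" would still be indexed as a title once
its schema declares it a sub-property of
"http://purl.org/dc/elements/1.1/title"; predicates with no such declaration
come back as null and can be stored separately or ignored.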

DC/DCQ is easier, RDF is richer, but it's a big leap of faith to RDF - e.g.
if I dumb it back down to DC/DCQ I'll probably lose data, but to keep the
full RDF richness, I suppose I'd really need to move to an RDF-aware
database system [which seems unjustified as at this stage I haven't
encountered any RDF yet in my pilot DC crawls of the New Zealand domain].

Does anyone know of any [Java] RDF tools which a. are DC/DCQ aware, and/or
b. look up RDF schemas, or c. incorporate the Dumb-Down algorithm?

Douglas Campbell
National Library of New Zealand
Received on Tuesday, 5 June 2001 00:59:18 UTC
