Different uses of RDF

I am working on an open-content non-profit project involving the publication
of U.S. court documents.

http://www.opengavel.com

I am new to RDF technologies and I was wondering if someone could give me
some advice and steer me in the right direction on where RDF may be
useful for my project.

The most obvious use of RDF would be to describe metadata for the court
documents the project posts on its website. Currently the documents are
converted from the original (usually PDF or HTML files) into XML markup,
and that XML is then transformed via XSL into a standardized HTML version
of the document. Since the site posts the original PDF, the raw XML
version, and the new HTML version, my main confusion concerns where to
place the metadata (in its own file, in each file, in multiple separate
files, etc.). I imagine I could generate the RDF in the new HTML during the
XSL transformation, but should I also generate a second version of the XML
file with the RDF markup? If the point is to be machine understandable, the
metadata on the HTML version would only identify a document that can be
displayed (not very useful, but good for search engine indexing). RDF on the
XML file, however, would present an opportunity for machine processing and
understanding of the underlying XML data, especially if the RDF also
defined what the document's other tags meant (e.g., citations, footnotes,
etc.) so that they could be more useful. Although the project's goal is
currently limited to publication or republication of existing court
documents, the ultimate goal is to create a foundation on which others can
build web-based legal research tools, so the ease with which others
can automatically read, process, and understand the documents is obviously
important.
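
To make the "metadata in its own file" option concrete, here is a minimal
sketch of what I have in mind, not a settled design. It uses Python's
standard library to write one RDF/XML description that ties a single
opinion to its PDF, XML, and HTML renditions with Dublin Core terms. The
example.org URIs, the placeholder title, and the describe_document helper
are all assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("dcterms", DCTERMS)


def describe_document(base_uri, title, formats):
    """Return RDF/XML describing one opinion and its renditions.

    `formats` maps a file suffix to a MIME type. The helper name and the
    URI layout (base_uri + "." + suffix) are hypothetical.
    """
    root = ET.Element(f"{{{RDF}}}RDF")
    doc = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": base_uri}
    )
    ET.SubElement(doc, f"{{{DC}}}title").text = title
    for suffix, mime in formats.items():
        rendition_uri = f"{base_uri}.{suffix}"
        # Link the abstract document to each concrete rendition.
        ET.SubElement(
            doc, f"{{{DCTERMS}}}hasFormat", {f"{{{RDF}}}resource": rendition_uri}
        )
        # Describe the rendition itself with its MIME type.
        rendition = ET.SubElement(
            root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": rendition_uri}
        )
        ET.SubElement(rendition, f"{{{DC}}}format").text = mime
    return ET.tostring(root, encoding="unicode")


metadata = describe_document(
    "http://example.org/opinions/sample-opinion",  # placeholder URI
    "Sample Opinion (placeholder title)",
    {"pdf": "application/pdf", "xml": "application/xml", "html": "text/html"},
)
print(metadata)
```

With this layout the three published files stay untouched and a crawler
only needs the one metadata file to discover all renditions. Whether that
is better than embedding the RDF in each file is exactly what I am unsure
about.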

I also found myself creating database tables that define things such
as proper abbreviations (U.S. court names, etc.). I assume that an RDF file
defining these relationships, which is then parsed and loaded into the
database, would be the best method because the knowledge (court name,
abbreviation, state, court circuit, etc.) could be reused by me or others
more easily, right? Unlike the metadata situation, this would be a more
purely knowledge-oriented use of RDF, right?
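
As a rough illustration of the "parse the RDF and load it into the
database" idea, here is a small sketch using only Python's standard
library. The court-vocab namespace, the example.org court URIs, the table
layout, and the load_courts helper are all hypothetical. The parsing also
only handles this one fixed RDF/XML shape; a real RDF parser would be
needed for general input.

```python
import sqlite3
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
COURT = "http://example.org/court-vocab#"  # hypothetical vocabulary

# Two sample courts; the URIs are placeholders.
COURTS_RDF = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:court="{COURT}">
  <court:Court rdf:about="http://example.org/courts/ca9">
    <court:name>United States Court of Appeals for the Ninth Circuit</court:name>
    <court:abbreviation>9th Cir.</court:abbreviation>
    <court:circuit>9</court:circuit>
  </court:Court>
  <court:Court rdf:about="http://example.org/courts/nysd">
    <court:name>United States District Court for the Southern District of New York</court:name>
    <court:abbreviation>S.D.N.Y.</court:abbreviation>
    <court:circuit>2</court:circuit>
  </court:Court>
</rdf:RDF>"""


def load_courts(rdf_xml, connection):
    """Copy each court:Court node into a `court` table (hypothetical schema)."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS court ("
        "uri TEXT PRIMARY KEY, name TEXT, abbreviation TEXT, circuit INTEGER)"
    )
    root = ET.fromstring(rdf_xml)
    for node in root.findall(f"{{{COURT}}}Court"):
        connection.execute(
            "INSERT OR REPLACE INTO court VALUES (?, ?, ?, ?)",
            (
                node.get(f"{{{RDF}}}about"),
                node.findtext(f"{{{COURT}}}name"),
                node.findtext(f"{{{COURT}}}abbreviation"),
                int(node.findtext(f"{{{COURT}}}circuit")),
            ),
        )
    connection.commit()


connection = sqlite3.connect(":memory:")
load_courts(COURTS_RDF, connection)
rows = connection.execute(
    "SELECT abbreviation, circuit FROM court ORDER BY circuit"
).fetchall()
print(rows)
```

The RDF file would be the reusable source of truth, and the database
table would just be a cache rebuilt from it.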

Thanks, any advice is appreciated.

Edward B.

Received on Friday, 6 October 2006 02:15:21 UTC