- From: Edward Bryant <edward.bryant@gmail.com>
- Date: Thu, 5 Oct 2006 10:57:37 -0500
- To: semantic-web@w3.org
- Message-ID: <4ccf35820610050857t31c1c126t91e54f21cd3f07d3@mail.gmail.com>

I am working on an open-content, non-profit project that publishes U.S. court documents: http://www.opengavel.com

I am new to RDF technologies and was wondering if someone could give me some advice and steer me in the right direction on where RDF may be useful for my project.

The most obvious use of RDF would be to describe metadata for the court documents the project posts on its website. Currently the documents are converted from the original (usually PDF or HTML files) into XML markup, and that XML is then transformed via XSL into a standardized HTML version of the document. Since the site posts the original PDF, the raw XML version, and the new HTML version, my main confusion concerns where to place the metadata (in its own file, in each file, in multiple separate files, etc.). I imagine I could generate the RDF in the new HTML during the XSL transformation, but should I also generate a second version of the XML file with the RDF markup?

If the point is to be machine understandable, the metadata on the HTML version would only identify a document that can be displayed (not very useful, but good for search engine indexing). RDF on the XML file, however, would present an opportunity for machine processing and understanding of the underlying XML data, especially if the RDF also defined what the document's other tags meant (e.g., citations, footnotes, etc.) so that they could be more useful. Although the project's goal is currently limited to publication or republication of existing court documents, the ultimate goal is to create a foundation for others to build web-based legal research tools, so the ease with which others can automatically read, process, and understand the documents is obviously important.

I also found myself creating database tables that define things such as proper abbreviations (U.S. court names, etc.). I assume an RDF file that defines these relationships, which is then parsed and loaded into the database, would be the best method, because the knowledge (court name, abbreviation, state, court circuit, etc.) could be reused by me or others more easily, right? Unlike the metadata situation, this would be a more purely knowledge-oriented use of RDF, right?

Thanks, any advice is appreciated.

Edward B.
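
To make the court-abbreviation idea concrete, here is a minimal sketch in Python (standard library only) of what "define the relationships in RDF, then parse and load them into the database" might look like. Everything specific in it is an assumption for illustration: the `http://example.org/...` namespaces, the property names (`name`, `abbreviation`, `circuit`), the sample rows, and the single-table SQLite layout are all hypothetical and not part of the project. The parser handles only the small N-Triples subset that the sketch itself emits; a real setup would more likely use a full RDF library.

```python
import re
import sqlite3

# Hypothetical namespaces, used only for illustration.
VOCAB = "http://example.org/court-vocab#"
COURT = "http://example.org/court/"

# Illustrative sample rows, not taken from the project's database.
COURTS = [
    {"id": "ca5",
     "name": "United States Court of Appeals for the Fifth Circuit",
     "abbreviation": "5th Cir.", "circuit": "5"},
    {"id": "txsd",
     "name": "United States District Court for the Southern District of Texas",
     "abbreviation": "S.D. Tex.", "circuit": "5"},
]


def escape_literal(value):
    """Escape backslash, quote, and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def to_ntriples(courts):
    """Serialize each court record as one N-Triples line per property."""
    lines = []
    for court in courts:
        subject = "<" + COURT + court["id"] + ">"
        for prop in ("name", "abbreviation", "circuit"):
            literal = escape_literal(court[prop])
            lines.append(f'{subject} <{VOCAB}{prop}> "{literal}" .')
    return lines


# Matches only: IRI subject, IRI predicate, plain string literal.
TRIPLE = re.compile(r'^<([^>]+)> <([^>]+)> "((?:[^"\\]|\\.)*)" \.$')
UNESCAPES = {"n": "\n", "r": "\r", '"': '"', "\\": "\\"}


def parse_ntriples(lines):
    """Parse the restricted N-Triples subset produced by to_ntriples()."""
    triples = []
    for line in lines:
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"unsupported line: {line!r}")
        subject, predicate, literal = match.groups()
        literal = re.sub(r"\\(.)",
                         lambda m: UNESCAPES.get(m.group(1), m.group(1)),
                         literal)
        triples.append((subject, predicate, literal))
    return triples


def load(triples):
    """Load triples into an in-memory SQLite table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)")
    conn.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)
    return conn


def abbreviation_for(conn, court_id):
    """Look up the abbreviation recorded for a court, or None if absent."""
    row = conn.execute(
        "SELECT object FROM triple WHERE subject = ? AND predicate = ?",
        (COURT + court_id, VOCAB + "abbreviation")).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    rdf_lines = to_ntriples(COURTS)
    print("\n".join(rdf_lines))
    db = load(parse_ntriples(rdf_lines))
    print(abbreviation_for(db, "txsd"))
```

The point of the round trip is that the N-Triples text, not the database, becomes the shareable source of truth: others could fetch the same file and load it into whatever store they prefer, while the database remains a local cache of it.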
Received on Friday, 6 October 2006 02:15:21 UTC