Legal citations for, e.g., usvms from Dan Connolly on 1999-11-09 (www-rdf-interest@w3.org from November 1999)

From: Dan Connolly <connolly@w3.org>
Date: Tue, 09 Nov 1999 12:51:42 -0600
To: www-rdf-interest@w3.org
CC: liberte@w3.org
Message-ID: <38286D3E.247DCDB4@w3.org>
If you'll excuse my use of the RDF IG for "bookmarking" ideas
that aren't really thought thru...

I read the Findings of Fact on Microsoft in the usvsms case[1],
and it reminded me of Philg's tutorial on legal citations[2] and it
seems to me that

	-- the promise of the "semantic web" is automating
	(parts of) social protocols, and those social protocols
	are often grounded in law

	-- there are established conventions for legal citations

	-- more and more legal proceedings are published via the web
	all the time

	-- those legal proceedings are often copied in many places,
	and there's no recognized canonical URI for them, so
		-- caches don't help
		-- my browser doesn't tell me I've been there before
		-- etc.

So... some ideas...
	-- an RDF schema for legal citations
		(probably one schema per jurisdiction, with lots of
		sharing and subclassing)

	-- a corresponding HTML form for each jurisdiction that, in effect,
	allows you to compute the address of a document

To take the example from philg's tutorial:

	Ford Motor Co. v. Lonon, 2117 Tenn 400, 398 S.W.2d 240 (1966)

Perhaps in RDF, I'd spell that:

	<RDF:Description xmlns="" xmlns:RDF="http:...I.forget..">
	 <plaintiff>Ford Motor Co.</plaintiff>
	 <defendant>Lonon</defendant>
	 <volume>2117</volume>
	 <jurisdiction>Tenn</jurisdiction> <!-- there should be a URI for this;
				if not for the jurisdiction, then for
				the (web projection of) the reporter -->
	 <page>400</page>
	 ...

Hmm... the other part:
	398 S.W.2d 240 (1966)

seems to have RDF:alternate semantics. And the year is related to the
dublin core notion of "coverage". Hmm...
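
To make the pieces of the cite concrete, here is a minimal Python sketch that
splits a citation of exactly this shape into the properties above and treats
the second reporter cite as an alternate. The pattern and the names (CITE,
Cite, Citation, parse_citation) are made up for illustration; they are not
part of any schema, and real citation formats vary by jurisdiction.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical pattern: "<plaintiff> v. <defendant>, <vol> <reporter> <page>
# [, <vol> <reporter> <page>] (<year>)". Other cite shapes need other patterns.
CITE = re.compile(
    r"^(?P<plaintiff>.+?) v\. (?P<defendant>.+?), "
    r"(?P<volume>\d+) (?P<reporter>.+?) (?P<page>\d+)"
    r"(?:, (?P<alt_volume>\d+) (?P<alt_reporter>.+?) (?P<alt_page>\d+))?"
    r" \((?P<year>\d{4})\)$"
)


@dataclass
class Cite:
    volume: str
    reporter: str
    page: str


@dataclass
class Citation:
    plaintiff: str
    defendant: str
    official: Cite
    alternate: Optional[Cite]  # the parallel (unofficial) reporter, if given
    year: str


def parse_citation(text: str) -> Citation:
    """Split one citation string into its parts, or raise ValueError."""
    m = CITE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognized citation: {text!r}")
    alternate = None
    if m["alt_volume"] is not None:
        alternate = Cite(m["alt_volume"], m["alt_reporter"], m["alt_page"])
    return Citation(
        plaintiff=m["plaintiff"],
        defendant=m["defendant"],
        official=Cite(m["volume"], m["reporter"], m["page"]),
        alternate=alternate,
        year=m["year"],
    )
```

Volumes and pages are kept as strings here so that nothing is lost before a
canonical representation has been agreed on.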

Anyway... to compute the canonical address, you need to know
	(1) the address of the reporter of the jurisdiction;
	The web site for the state of Tennessee is:
		http://www.state.tn.us/
	so let's call it:
		http://www.state.tn.us/reporter

	(ok... so it would probably be in a subdomain for the judicial
	branch of government, ala the TN supreme court:
http://tscaoc.tsc.state.tn.us/
	but let's gloss over that for now.)

	(2) the RDF schema for that jurisdiction; let's say it just
	has defendant, plaintiff, volume, and page number.

I'd make an HTML form ala:

	<form action="http://www.state.tn.us/reporter">
	<input name="defendant" />
	<input name="plaintiff" />
	<input name="volume"/>
	<input name="page"/>
	</form>

hm... we may need conventions for canonical representations of page
numbers (which we should be able to get from [3]). More tricky: canonical
spelling of plaintiffs and defendants. Those won't be computable; in the
general case, you'll have to look at the published document to be sure.
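
As a rough illustration of what such a convention might look like, here is a
small Python sketch. Both functions and their rules are assumptions of mine,
not anything taken from [3]: numbers are reduced to plain decimal digits with
no leading zeros, and party names only get their whitespace tidied, since
their canonical spelling isn't computable.

```python
import re


def canonical_number(text: str) -> str:
    """Reduce a volume or page number to decimal digits, no leading zeros."""
    digits = text.strip()
    if re.fullmatch(r"[0-9]+", digits) is None:
        raise ValueError(f"not a plain number: {text!r}")
    return str(int(digits))


def tidy_party(name: str) -> str:
    """Collapse runs of whitespace; this does NOT make the spelling canonical."""
    return re.sub(r"\s+", " ", name).strip()
```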

Anyway... the resulting address is:

http://www.state.tn.us/reporter?plaintiff=Ford%20Motor%20Co.&defendant=Lonon&volume=2117&page=400

and there would be another address for the unofficial reporter, and
an assertion relating them.
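
The same computation can be sketched in a few lines of Python, assuming the
made-up reporter address and the four field names used above. The function
name citation_address is hypothetical, and the field order is just one
possible convention.

```python
from urllib.parse import quote, urlencode

# The invented reporter address from this message, not a real service.
REPORTER = "http://www.state.tn.us/reporter"


def citation_address(plaintiff: str, defendant: str, volume: str, page: str,
                     base: str = REPORTER) -> str:
    """Compute the address of a case from the fields of its citation."""
    query = urlencode(
        [
            ("plaintiff", plaintiff),
            ("defendant", defendant),
            ("volume", volume),
            ("page", page),
        ],
        quote_via=quote,  # encode spaces as %20 rather than "+"
    )
    return f"{base}?{query}"


print(citation_address("Ford Motor Co.", "Lonon", "2117", "400"))
```

A fixed field order matters here: two clients that serialize the same fields
in different orders would end up with different addresses for the same case.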

Strictly speaking, we don't need the function from citation to address
to be computable locally; we can allow courts to publish an arbitrary
mapping, so that the canonical address of that case is something like:

	http://www.state.tn.us/archive/1966/32l4ij5203984u029384029

but my intuition says it's more cost-effective for the citation->
address mapping to be a globally deployed convention rather than
a web-site-private issue.

There are some thorny issues around copyright etc. of the actual page
numbers and such; I gather the Westlaw folks have defended their
ownership of this stuff rigorously. But it's hard for me to believe that
it's not best for all concerned for courts to publish authoritative
copies of their stuff from their own web sites.

Cornell has published a bunch of stuff... for example
	U.C.C. - ARTICLE 3 - § 3-104.
	http://www.law.cornell.edu/ucc/3/3-104.html

Related issues:
	-- authenticity, non-repudiation
		one mechanism is digital signatures, but another
		mechanism is massively redundant publishing, ala newspapers,
		which is effectively non-repudiable
		(of course, it takes revenue away from Westlaw)

		(I have some notes on authenticity at
		http://www.w3.org/Architecture/qos that may be relevant)

		hmm... this looks interesting:
		The Authority Public Key Distribution Protocol
		http://www.oasis-open.org/cover/publicKeyXML.html

	-- format for the content itself
		(the TN court uses some friggin Java applet to publish
		their content! I wonder if that's the easiest way they
		found to extract data from their legacy database,
		or if it's a copy restriction mechanism)

		e.g.
		TEI Extensions for Legal Text
		http://www.oasis-open.org/cover/finkeTEI10.html

		Legal XML Working Group
		http://www.oasis-open.org/cover/xml.html#legalXMLWG

		Legal XML
		http://www.legalxml.org/

	-- stable publishing
		guarantees of availability and persistence,
		perhaps with time-limits (ala DNS and ala phone
		company area code changes; they don't guarantee
		an address will work forever, but they tell you
		how much notice you'll get before a change, or
		how long you can cache a binding).
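
The "how long you can cache a binding" idea can be sketched as a small cache
from citations to addresses in which every entry carries a time limit, much
like a DNS TTL. This is only an illustration in Python: the class name, the
method names, and the idea that a publisher announces the time limit in
seconds are all assumptions.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class BindingCache:
    """Cache citation -> address bindings, each with its own time limit."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock  # injectable so expiry can be tested
        self._bindings: Dict[str, Tuple[str, float]] = {}

    def put(self, citation: str, address: str, ttl_seconds: float) -> None:
        """Remember a binding for as long as the publisher promised."""
        self._bindings[citation] = (address, self._clock() + ttl_seconds)

    def get(self, citation: str) -> Optional[str]:
        """Return the cached address, or None if unknown or expired."""
        entry = self._bindings.get(citation)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._bindings[citation]  # the promise has run out; re-resolve
            return None
        return address
```

Once get() returns None, the client has to go back to the publisher and
resolve the citation again.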

[1] United States of America v. Microsoft Corporation, C.A. 98-1232
http://usvms.gpo.gov/

[2] Reading Legal Citations
by Philip Greenspun 
http://photo.net/philg/litigation/reading-cites.html

[3] XML Schema Part 2: Datatypes
http://www.w3.org/TR/xmlschema-2/


[Hmm... these citation issues are closely related to URI design and
philosophy; I considered crossposting to uri@w3.org, but decided
against it, for now.]

-- 
Dan Connolly, W3C
http://www.w3.org/People/Connolly/
Received on Tuesday, 9 November 1999 13:51:44 UTC