Re: U.S. corporate ownership RDF data

From: Joshua Tauberer <jt@occams.info>
Date: Tue, 22 Apr 2008 20:04:07 -0400
Message-ID: <480E7CF7.7040502@occams.info>
To: Richard Cyganiak <richard@cyganiak.de>
CC: public-lod@w3.org

Richard Cyganiak wrote:
> 1. I tried to open the example URIs in the Tabulator data browser, but 
> for some reason Tabulator doesn't kick in, I just see Firefox's usual 
> RDF source view. The “Get Info” window in Firefox says that the RDF is 
> served as text/xml, which would explain the effect. But if I try to 
> confirm this with curl it seems like the results are actually correctly 
> served as application/rdf+xml. Any idea what's going on?

It's possible that if text/xml is specified in the Accept: request 
header (perhaps in addition to application/rdf+xml), text/xml ends up 
as the content type set in the response. I'll have to take a closer look.
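
A minimal sketch of that hypothesis, not the actual server code: it assumes 
a hypothetical server that can label the same RDF/XML body as either 
application/rdf+xml or text/xml and picks whichever the Accept: header ranks 
highest, breaking ties by header order. The function names and the list of 
supported types are made up for illustration.

```python
def parse_accept(header):
    """Parse an Accept header into (media_range, q) pairs, in header order."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        fields = [f.strip() for f in part.split(";")]
        q = 1.0  # q defaults to 1 when the parameter is absent
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((fields[0].lower(), q))
    return entries


def matches(media_range, media_type):
    """True if a media range such as text/* or */* covers media_type."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def choose_content_type(accept_header, supported):
    """Pick the supported type ranked highest by the Accept header.

    sorted() is stable, so entries with equal q keep their header order.
    Returns None when nothing acceptable is supported.
    """
    ranked = sorted(parse_accept(accept_header), key=lambda entry: -entry[1])
    for media_range, q in ranked:
        if q <= 0:
            continue
        for media_type in supported:
            if matches(media_range, media_type):
                return media_type
    return None


# Hypothetical server preference order (an assumption, not the real config).
SUPPORTED = ["application/rdf+xml", "text/xml"]
```

Under these assumptions, a client that sends only application/rdf+xml gets 
that type back, while a client that lists text/xml first (or with a higher q) 
gets text/xml, which would match what Richard saw in the browser versus curl.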

> 2. You use 302 redirects to get from the identifiers for people and 
> companies to their RDF descriptions. Is this an oversight? Shouldn't it 
> be 303, because the redirect goes from one resource (e.g. a person) to a 
> different resource (an RDF document about the person)?

It's an oversight (though I think the difference is rather silly).
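
For anyone who wants to check which status code a URI returns, here is a 
rough sketch using only the Python standard library. http.client does not 
follow redirects, so the first response is what you see. The helper names 
are made up for illustration, and the one-line descriptions paraphrase the 
distinction Richard draws above.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(uri, accept="application/rdf+xml"):
    """Issue one GET without following redirects; return (status, Location)."""
    parts = urlsplit(uri)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("GET", path, headers={"Accept": accept})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def describe_redirect(status):
    """Summarise what a 302 or 303 tells the client about the target."""
    if status == 303:
        return ("303 See Other: the target is a different resource, "
                "such as a document about the thing")
    if status == 302:
        return ("302 Found: a temporary redirect that says nothing about "
                "the target being a different resource")
    return "%d: not a 302/303 redirect" % status
```

Calling fetch_redirect() on one of the person or company URIs and passing 
the status to describe_redirect() would show which of the two is in use.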

> 3. Is the schema (especially the ussec namespace) documented anywhere? 
> Is there a list of all the available properties?

Nope. Except in the source code of the parser.

> I tried to get such a 
> list using "SELECT DISTINCT ?p WHERE { ?s ?p ?o }" but this seems to 
> exceed the endpoint's execution time limit.

Yeah. That's a difficult one to execute rapidly... Also note that it's 
the same endpoint that serves the Census data set and everything else I 
have, so that's potentially scanning a billion statements.
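
One possible workaround, sketched here as an assumption rather than 
something tested against the endpoint: ask for the properties used on a 
single known subject instead of on every statement. The helper names and the 
endpoint URL in the usage lines are placeholders (the real endpoint address 
is not given in this message); the subject is the News Corp URI from the 
original announcement, and "query" is the parameter name defined by the 
SPARQL protocol.

```python
from urllib.parse import urlencode


def properties_of_query(subject_uri):
    """Build a query listing the distinct properties used on one subject."""
    return "SELECT DISTINCT ?p WHERE { <%s> ?p ?o }" % subject_uri


def sparql_get_url(endpoint, query):
    """Encode a query into a GET URL using the SPARQL protocol's query parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder endpoint: an assumption, substitute the real one.
ENDPOINT = "http://example.org/sparql"
NEWS_CORP = "http://www.rdfabout.com/rdf/usgov/sec/id/cik0001308161"

query = properties_of_query(NEWS_CORP)
url = sparql_get_url(ENDPOINT, query)
```

This only shows the properties attached to that one resource, so it would 
need repeating for a person URI as well to get a fuller picture.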

> 4. As I'm not familiar with the U.S. environment: What is the coverage 
> of this data? Is it all publicly traded U.S. companies?

I'm not sure. Data entries only exist when stock holdings change, so 
really nothing is guaranteed to be in there.

> 5. I see you have a Semantic Sitemap at rdfabout.com -- can you add this 
> dataset to it? (This helps us index the dataset into Sindice.)

Ooops, yes.

Thanks.

-- 
- Josh Tauberer

http://razor.occams.info

"Yields falsehood when preceded by its quotation!  Yields
falsehood when preceded by its quotation!" Achilles to
Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)


> 
> Cheers, and keep up the great work,
> Richard
> 
> 
> On 19 Apr 2008, at 13:15, Joshua Tauberer wrote:
>>
>> (cross-posted to LOD and get-theinfo...)
>>
>> In response to a thread on Aaron Swartz's get-theinfo list, I
>> resurrected my RDF data for U.S. corporate ownership, derived from
>> records publicly filed in the U.S. Securities and Exchange Commission's
>> EDGAR database.
>>
>> It's 1 million triples, HTTP and SPARQL-accessible. More here 
>> (including source code, data dump, and examples):
>> http://rdfabout.com/demo/sec/
>>
>> The records establish board membership, officer positions, and 
>> 10%-or-more ownership relations. Note that people can enter into any 
>> of those relations with corporations, but additionally corporations 
>> can be 10% owners of other corporations. The records exist at time 
>> points when the interest (i.e. stock ownership) of an individual or 
>> corporation that is in one of the relations above with a corporation 
>> changes. It is thus possible (and likely) that individuals who are no 
>> longer in such a relation with a corporation are still listed as such 
>> in this data.
>>
>> Here are some starting points:
>>
>> News Corp (owner of FOX, WSJ, and other media things):
>> http://www.rdfabout.com/rdf/usgov/sec/id/cik0001308161
>>
>> Rupert Murdoch (media mogul behind News Corp):
>> http://www.rdfabout.com/rdf/usgov/sec/id/cik0001024835
>>
>> There are no links to other data sets.
>>
Received on Wednesday, 23 April 2008 00:04:41 UTC