Re: U.S. corporate ownership RDF data

Josh,

On 23 Apr 2008, at 01:04, Joshua Tauberer wrote:
> Richard Cyganiak wrote:
>> 1. I tried to open the example URIs in the Tabulator data browser,  
>> but for some reason Tabulator doesn't kick in, I just see Firefox's  
>> usual RDF source view. The “Get Info” window in Firefox says that  
>> the RDF is served as text/xml, which would explain the effect. But  
>> if I try to confirm this with curl it seems like the results are  
>> actually correctly served as application/rdf+xml. Any idea what's  
>> going on?
>
> It's possible that if text/xml is specified in the Accept: request  
> header (perhaps in addition to application/rdf+xml), that text/xml  
> may be the content type set in the response. I'll have to take a  
> closer look.

That's what I thought too, but I couldn't make the server reply with
text/xml when sending the Accept header that Tabulator uses (a little
helper for learning what header a particular RDF browser sends can be
found at [1]).
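
For anyone who wants to repeat the check, here is a rough sketch of
what I mean, in Python rather than curl. The Accept value below is a
made-up placeholder, not Tabulator's real header (take that from [1]),
and the function names are just mine for illustration.

```python
import urllib.request

# Placeholder Accept header (an assumption, NOT Tabulator's actual value).
ACCEPT = "application/rdf+xml, text/xml;q=0.5"

def build_request(url, accept):
    """Build a GET request that carries the given Accept header."""
    return urllib.request.Request(url, headers={"Accept": accept})

def media_type(content_type):
    """Strip parameters such as charset and normalise case."""
    return content_type.split(";", 1)[0].strip().lower()

def served_media_type(url, accept):
    """Fetch the URL and return the media type the server answered with."""
    with urllib.request.urlopen(build_request(url, accept)) as resp:
        return media_type(resp.headers.get("Content-Type", ""))
```

Calling served_media_type() on one of the example URIs, once with the
browser's header and once with a plain application/rdf+xml, should show
whether the response type really depends on the request header.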

>> I tried to get such a list using
>> "SELECT DISTINCT ?p WHERE { ?s ?p ?o }"
>> but this seems to exceed the endpoint's execution time limit.
>
> Yeah. That's a difficult one to execute rapidly... Also note that  
> it's the same end point that serves the Census data set and  
> everything else I have, so that's potentially scanning a billion  
> statements.

Yes, I appreciate the difficulty of making this kind of very general  
query run fast on the amount of data you serve. Having some vocabulary  
documentation around (such as an RDFS document) lessens the need for  
such queries. Without either one (documentation or ability to run the  
queries) it's fairly hard to get a feel for the data.
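
In the meantime, a cheaper probe might be to ask only for the
predicates used on one known resource. This is just a sketch under
assumptions: the endpoint URL is a placeholder (I have not checked the
real one), the helper names are hypothetical, and I have not verified
that the narrower query stays within the time limit.

```python
from urllib.parse import urlencode

# Placeholder endpoint URL (an assumption; substitute the real one).
ENDPOINT = "http://example.invalid/sparql"

def predicates_query(subject_uri, limit=100):
    """SPARQL query listing the distinct predicates used on one subject."""
    return "SELECT DISTINCT ?p WHERE { <%s> ?p ?o } LIMIT %d" % (
        subject_uri, limit)

def query_url(endpoint, query):
    """Build a GET URL that passes the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})
```

Running this against a handful of the example URIs would give at least
a partial picture of the vocabulary without scanning every statement.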

>> 5. I see you have a Semantic Sitemap at rdfabout.com -- can you add  
>> this dataset to it? (This helps us index the dataset into Sindice.)

>
> Ooops, yes.

I notice that there are still a bunch of example.org URIs in the
dataset; it would be great if you could fix or remove them.

Best,
Richard

[1] http://dowhatimean.net/2008/03/what-is-your-rdf-browsers-accept-header

>
>
> Thanks.
>
> -- 
> - Josh Tauberer
>
> http://razor.occams.info
>
> "Yields falsehood when preceded by its quotation!  Yields
> falsehood when preceded by its quotation!" Achilles to
> Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)
>
>
>> Cheers, and keep up the great work,
>> Richard
>> On 19 Apr 2008, at 13:15, Joshua Tauberer wrote:
>>>
>>> (cross-posted to LOD and get-theinfo...)
>>>
>>> In response to a thread on Aaron Swartz's get-theinfo list, I
>>> resurrected my RDF data for U.S. corporate ownership derived from
>>> publicly filed records to the U.S. Securities and Exchange
>>> Commission's EDGAR database.
>>>
>>> It's 1 million triples, HTTP and SPARQL-accessible. More here  
>>> (including source code, data dump, and examples):
>>> http://rdfabout.com/demo/sec/
>>>
>>> The records establish board membership, officer positions, and
>>> 10%-or-more ownership relations. Note that people can enter into any
>>> of those relations with corporations, but additionally  
>>> corporations can be 10% owners of other corporations. The records  
>>> exist at time points when the interest (i.e. stock ownership) of  
>>> an individual or corporation that is in one of the relations above  
>>> with a corporation changes. It is thus possible (and likely) that  
>>> individuals who are no longer in such a relation with a  
>>> corporation are still listed as such in this data.
>>>
>>> Here are some starting points:
>>>
>>> News Corp (owner of FOX, WSJ, and other media things):
>>> http://www.rdfabout.com/rdf/usgov/sec/id/cik0001308161
>>>
>>> Rupert Murdoch (media mogul behind News Corp):
>>> http://www.rdfabout.com/rdf/usgov/sec/id/cik0001024835
>>>
>>> There are no links to other data sets.
>>>
>>> -- 
>>> - Josh Tauberer
>>>
>>> http://razor.occams.info
>>>
>>> "Yields falsehood when preceded by its quotation!  Yields
>>> falsehood when preceded by its quotation!" Achilles to
>>> Tortoise (in "Godel, Escher, Bach" by Douglas Hofstadter)
>>>
>>>
>

Received on Wednesday, 23 April 2008 12:29:06 UTC