Re: Are there any datasets about companies? (DBpedia Open Data Initiative)

Dear Sebastian, dear Kay,

I'm not sure whether this has already been mentioned in the thread, but
another service providing bulk downloads is GLEIF (Global Legal Entity
Identifier Foundation):

https://www.gleif.org/en/lei-data/gleif-concatenated-file/lei-download

They now have around 400K records.

GLEIF also operates HTTP URIs for companies / organisations with a
LEI (Legal Entity Identifier). Those URIs return just HTML, but other
formats are available from their discovery page - the search results can
be exported as .xls, .csv, .xml, and .json - see:

https://www.gleif.org/lei/search
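
To illustrate how such a CSV export could be picked up for linking, here is
a minimal Python sketch. It rests on assumptions: the column names "LEI" and
"LegalName" are hypothetical placeholders (check the header of the real
export), and the sample rows are made up. The only fact it relies on is that
an LEI is a 20-character alphanumeric code; the check digits are not verified.

```python
import csv
import io
import re

# An LEI is a 20-character alphanumeric code (ISO 17442).
# This pattern checks the shape only, not the check digits.
LEI_PATTERN = re.compile(r"[A-Z0-9]{20}")


def index_by_lei(csv_text, lei_column="LEI", name_column="LegalName"):
    """Map each well-formed LEI in a CSV export to its legal name.

    The default column names are assumptions, not the documented
    GLEIF header; pass the real ones when calling.
    """
    records = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        lei = (row.get(lei_column) or "").strip().upper()
        # Skip rows whose identifier does not look like an LEI.
        if LEI_PATTERN.fullmatch(lei):
            records[lei] = (row.get(name_column) or "").strip()
    return records


# Made-up sample rows, not real GLEIF data.
sample = (
    "LEI,LegalName\n"
    "ABCDEFGHIJ0123456789,Example Holding AG\n"
    "TOO-SHORT,Broken Row Ltd\n"
)
print(index_by_lei(sample))
```

Keying the records on the LEI keeps the result directly usable as a lookup
table when cross-referencing with other company datasets.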

Cheers,

Andrea


On 03/11/2015 16:17, Sebastian Hellmann wrote:
> [Apologies for cross-posting]
>
> Dear all,
> this message is part announcement of an open data initiative and part
> call for feedback and support.
>
> We are considering creating a free, open and interoperable
> dataset on companies and organisations, which we are planning to
> integrate into DBpedia+ and offer as a dump download. As we are in a very
> early phase of the endeavour, we would like to know whether there is
> existing work in this area.
>
> We are looking for any available datasets which have information about
> companies and other organizations in any language and any country.
> Ideally, the datasets are:
> 1. downloadable as dump
> 2. openly licensed, e.g. CC-BY following the http://opendefinition.org/
> 3. in an easily parseable format, e.g. RDF or CSV and not PDF
>
> But hey! Send around anything you know, and we will look at it and see
> whether we can make use of it. You can reach us either by replying to
> this email or by sending feedback directly to me and Kay Müller
> <kay.mueller@informatik.uni-leipzig.de>.
> If you have any private/closed data, please contact us as well. We might
> use it to cross-reference and validate public/open data,
> or just learn from it to build a good schema.
>
> We started a link collection here (and attached the current status at
> the end of this email)
> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
> Also we started to collect potential identifiers for linking here:
> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
>
> Regards and thank you for any support on this,
> Sebastian and Kay
>
> ##############################
>
> https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
>
>   Open Company Data
>
> Open Company Data
> <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.buuo7dypfd9a>
>
> Identifiers for companies/organisation
> <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.qs150ivpio94>
>
> URIs (Linked Data/Semantic Web)
> <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.b9yeovqjeglz>
>
> Downloadable Datasets with Company info (confirmed)
> <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.7ihxrlrypp14>
>
> Portals with no bulk downloads
> <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.a95o85lqil72>
>
> Portals, we will still need to investigate
> <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.p50bjh96q3ok>
>
>
>
>     Identifiers for companies/organisation
>
> Table with identifiers:
>
> https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
>
>
>       URIs (Linked Data/Semantic Web)
>
>   * DBpedia/Wikipedia/Wikidata URIs - http://dbpedia.org
>   * LinkedGeoData - http://linkedgeodata.org/
>
>
>     Downloadable Datasets with Company info (confirmed)
>
>   * VIAF - http://viaf.org/viaf/data/
>   * DBpedia - http://downloads.dbpedia.org/current/core/
>   * Wikidata - http://downloads.dbpedia.org/current/ext/wikidata/
>   * LinkedGeoData - http://downloads.linkedgeodata.org/releases/
>   * Company Data Index: http://index.okfn.org/dataset/companies/
>       o e.g. UK company data:
>         http://download.companieshouse.gov.uk/en_output.html
>
>
>     Portals with no bulk downloads
>
>   * https://opencorporates.com/
>   * http://registries.opencorporates.com/
>
>
>     Portals, we will still need to investigate
>
>   * https://www.wlw.de/
>   * https://www.crunchbase.com
>   * http://data.crunchbase.com/v3/page/crunchbase-open-data-map-odm
>   * http://www.industrystock.de
>   * http://www.ebr.org/
>   * https://simfin.com/data/browse/companies
>   * http://c-lei.org/
>   * http://data.imf.org/
>   * http://worldbank.270a.info/.html
>   * http://datacatalog.worldbank.org/
>   * http://www.europages.com/
>   * http://www.sec.gov/data
>   * http://faculty.philau.edu/russowl/industry.html
>   * USA: http://www.corporationwiki.com/
>   * India: http://www.companywiki.in/
>   * Handelsregister: www.Handelsregister.de
>   * Creditreform: http://www.creditsafetrial.com/de/?country=DE
>   * Bürgel: https://www.buergel.de/en
>   * Factiva:
>     https://global.factiva.com/factivalogin/login.asp?productname=global
>
>
> Interesting Links:
>
>   * German:
>     http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-1/
>   * http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-2/
>
> --
> Sebastian Hellmann
> AKSW/KILT research group
> Institute for Applied Informatics (InfAI) at Leipzig University
> DBpedia Association
> Events:
> * *Nov 20th, 2015* Extended Deadline for Quality Management of Semantic
> Web Assets (Data, Services and Systems)
> <http://www.semantic-web-journal.net/blog/call-papers-special-issue-quality-management-semantic-web-assets-data-services-and-systems>
> Come to Germany as a PhD:
> http://bis.informatik.uni-leipzig.de/csf
> Projects: http://dbpedia.org, http://nlp2rdf.org,
> http://linguistics.okfn.org,
> https://www.w3.org/community/ld4lt
> Homepage: http://aksw.org/SebastianHellmann
> Research Group: http://aksw.org
> Thesis:
> http://tinyurl.com/sh-thesis-summary
> http://tinyurl.com/sh-thesis

-- 
Andrea Perego, Ph.D.
Scientific / Technical Project Officer
European Commission DG JRC
Institute for Environment & Sustainability
Unit H06 - Digital Earth & Reference Data
Via E. Fermi, 2749 - TP 262
21027 Ispra VA, Italy

https://ec.europa.eu/jrc/

Received on Monday, 7 December 2015 14:52:48 UTC