
Re: Are there any datasets about companies? ( DBpedia Open Data Initiative)

From: Gannon Dick <gannon_dick@yahoo.com>
Date: Fri, 6 Nov 2015 17:49:57 +0000 (UTC)
To: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>, Kay Müller <kay.mueller@informatik.uni-leipzig.de>, Chris Taggart <countculture@gmail.com>, <public-lod@w3.org>, Rolf Kleef <rolf@openforchange.info>
Message-ID: <138458771.780694.1446832197211.JavaMail.yahoo@mail.yahoo.com>
Hi all,

Organizational Identifiers are a bit dangerous for the little people to talk about :-)

1) First, some food for thought ... if FOAF identifies real people rigorously, one would expect less complexity and faster convergence for organizations, of which there are far fewer.  That would make no sense, unless (reads manual).
2) Second, an observation ... is the "Open World" assumption an HTML ordered list or an HTML unordered list? Who decides? Hint: Moses had 10 Commandments, but plainly meant an unordered list.  Even the most hardened agnostic developer should be able to admit that swapping 10 Commandments in an unordered list for 10 Items in an ordered list is not a valid substitution pattern. Learning to love Turtle is not a resolution to this dilemma, BTW.
3) Strategy Markup Language (StratML) Collections resolve these issues by using a compound key for the <StrategicPlanCore>:
a) Organization Name -> Acronym (caps of Proper Case)
b) Subdivision Name -> UUID, Acronym (caps of Proper Case)
c) StratML (XML) File Name -> (Acronym from a) DOT (Acronym from b) DOT xml
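
As a rough illustration of the compound key in a) to c), here is a minimal Python sketch. The function names, the name-based (version 5) UUID, the "urn:example:" prefix and the sample names are all assumptions made for illustration; nothing here is taken from actual StratML tooling.

```python
import uuid


def acronym(name: str) -> str:
    """Return the capital letters of the Proper Case form of a name."""
    # Proper Case: capitalise the first letter of every word, keep the rest.
    proper = " ".join(word[:1].upper() + word[1:] for word in name.split())
    return "".join(ch for ch in proper if ch.isupper())


def compound_key(organization: str, subdivision: str) -> dict:
    """Build the three parts of the key: a), b) and c)."""
    org_acronym = acronym(organization)   # a) Organization Name -> Acronym
    sub_acronym = acronym(subdivision)    # b) Subdivision Name -> Acronym
    # Assumption: a deterministic version 5 UUID; the message does not say
    # which kind of UUID is meant or how it is generated.
    sub_uuid = uuid.uuid5(
        uuid.NAMESPACE_URL, f"urn:example:{organization}/{subdivision}"
    )
    return {
        "organization_acronym": org_acronym,
        "subdivision_acronym": sub_acronym,
        "subdivision_uuid": str(sub_uuid),
        # c) (Acronym from a) DOT (Acronym from b) DOT xml
        "file_name": f"{org_acronym}.{sub_acronym}.xml",
    }


if __name__ == "__main__":
    print(compound_key("Department of Energy", "Office of Science"))
```

With these hypothetical inputs the file name comes out as DOE.OOS.xml. Two different names can collapse to the same acronym, which is presumably why the subdivision also carries a UUID.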

This can enable styling within the Core by CSS or XSLT while maintaining Collection integrity because an OUTER JOIN on Organization Name preserves a collection of right-directed graphs.  If this sounds like slavery to you, take a nap, "they" can't own your dreams ;-)
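
One possible reading of the OUTER JOIN remark, sketched with Python's built-in sqlite3 module. The table names, column names and sample rows are hypothetical. The only point is that a LEFT OUTER JOIN keyed on Organization Name keeps every organization in the result, including one that has no plan file yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organization (name TEXT PRIMARY KEY, acronym TEXT);
    CREATE TABLE plan (organization_name TEXT, subdivision TEXT, file_name TEXT);

    INSERT INTO organization VALUES
        ('Department Of Energy', 'DOE'),
        ('Example Org', 'EO');
    INSERT INTO plan VALUES
        ('Department Of Energy', 'Office Of Science', 'DOE.OOS.xml');
    """
)

# Every organization appears once per plan file; an organization without
# any plan still appears, with NULL (None in Python) in the plan columns.
rows = conn.execute(
    """
    SELECT o.name, p.file_name
    FROM organization AS o
    LEFT OUTER JOIN plan AS p ON p.organization_name = o.name
    ORDER BY o.name
    """
).fetchall()

for name, file_name in rows:
    print(name, file_name)
```

An inner join would silently drop 'Example Org' here, so the collection would no longer list every organization.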

--Gannon
--------------------------------------------
On Thu, 11/5/15, Rolf Kleef <rolf@openforchange.info> wrote:

 Subject: Re: Are there any datasets about companies? ( DBpedia Open Data  Initiative)
 To: "Sebastian Hellmann" <hellmann@informatik.uni-leipzig.de>, "Kay Müller" <kay.mueller@informatik.uni-leipzig.de>, "Chris Taggart" <countculture@gmail.com>, public-lod@w3.org
 Date: Thursday, November 5, 2015, 6:49 AM
 
 Hi Sebastian, Kay,
 
 If you haven't done it yet, I suggest getting in touch with Chris
 Taggart of Open Corporates (cc'd). He has years of experience doing
 this, and is also involved in cross-standards work on "organisational
 identifiers", crucial in the development of for instance the Open
 Contracting Data Standard and the International Aid Transparency
 Initiative:
 
 http://www.open-contracting.org/
 http://iatistandard.org/201/organisation-identifiers/
 
 ~~Rolf.
 
 On 03/11/15 16:17, Sebastian Hellmann wrote:
 > [Apologies for cross-posting]
 > 
 > Dear all,
 > this message is part announcement of an open data initiative and part
 > call for feedback and support.
 > 
 > We are considering creating a free, open and interoperable
 > dataset on companies and organisations, which we are planning to
 > integrate into DBpedia+ and offer as a dump download. As we are in a very
 > early phase of the endeavour, we would like to know whether there is
 > existing work in this area.
 > 
 > We are looking for any available datasets which have information about
 > companies and other organizations in any language and any country.
 > Ideally, the datasets are:
 > 1. downloadable as dump
 > 2. openly licensed, e.g. CC-BY following the http://opendefinition.org/
 > 3. in an easily parseable format, e.g. RDF or CSV and not PDF
 > 
 > But hey! Send around anything you know, and we will look at it and see
 > whether we can make use of it. You can reach us either by replying to
 > this email or by sending feedback directly to me and Kay Müller
 > <kay.mueller@informatik.uni-leipzig.de>.
 > If you have any private/closed data, please contact us as well. We might
 > use it to cross-reference and validate public/open data,
 > or just learn from it to build a good scheme.
 > 
 > We started a link collection here (and attached the current status at
 > the end of this email)
 > https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
 > Also we started to collect potential identifiers for linking here:
 > https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
 >
 > Regards and thank you for any support on this,
 > Sebastian and Kay
 > 
 >
 > ##############################
 >
 > https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit
 >
 > Open Company Data
 >
 >   * Open Company Data
 >     <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.buuo7dypfd9a>
 >   * Identifiers for companies/organisation
 >     <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.qs150ivpio94>
 >   * URIs (Linked Data/Semantic Web)
 >     <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.b9yeovqjeglz>
 >   * Downloadable Datasets with Company info (confirmed)
 >     <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.7ihxrlrypp14>
 >   * Portals with no bulk downloads
 >     <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.a95o85lqil72>
 >   * Portals, we will still need to investigate
 >     <https://docs.google.com/document/d/1IaWSSt4_SZVhypvB1QzBlCtBuMQHv-q5Ti0n8xoZFIQ/edit#heading=h.p50bjh96q3ok>
 >
 > Identifiers for companies/organisation
 >
 > Table with identifiers:
 > https://docs.google.com/spreadsheets/d/1EMqemA1BlqvyOXGLzYbvY0IcBCAhaRd5XgYLMWIxGsA/edit#gid=0
 > 
 > 
 > URIs (Linked Data/Semantic Web)
 >
 >   * DBpedia/Wikipedia/Wikidata URIs - http://dbpedia.org
 >   * LinkedGeoData - http://linkedgeodata.org/
 > 
 > 
 > Downloadable Datasets with Company info (confirmed)
 >
 >   * VIAF - http://viaf.org/viaf/data/
 >   * DBpedia - http://downloads.dbpedia.org/current/core/
 >   * Wikidata - http://downloads.dbpedia.org/current/ext/wikidata/
 >   * LinkedGeoData - http://downloads.linkedgeodata.org/releases/
 >   * Company Data Index: http://index.okfn.org/dataset/companies/
 >       o e.g. UK company data:
 >         http://download.companieshouse.gov.uk/en_output.html
 > 
 > 
 > Portals with no bulk downloads
 >
 >   * https://opencorporates.com/
 >   * http://registries.opencorporates.com/
 > 
 > 
 > Portals, we will still need to investigate
 >
 >   * https://www.wlw.de/
 >   * https://www.crunchbase.com
 >   * http://data.crunchbase.com/v3/page/crunchbase-open-data-map-odm
 >   * http://www.industrystock.de
 >   * http://www.ebr.org/
 >   * https://simfin.com/data/browse/companies
 >   * http://c-lei.org/
 >   * http://data.imf.org/
 >   * http://worldbank.270a.info/.html
 >   * http://datacatalog.worldbank.org/
 >   * http://www.europages.com/
 >   * http://www.sec.gov/data
 >   * http://faculty.philau.edu/russowl/industry.html
 >   * USA: http://www.corporationwiki.com/
 >   * India: http://www.companywiki.in/
 >   * Handelsregister: www.Handelsregister.de
 >   * Creditreform: http://www.creditsafetrial.com/de/?country=DE
 >   * Bürgel: https://www.buergel.de/en
 >   * Factiva:
 >     https://global.factiva.com/factivalogin/login.asp?productname=global
 > 
 > 
 > Interesting Links:
 >
 >   * German
 >     http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-1/
 >   * http://get.torial.com/blog/2014/02/die-besten-quellen-fuer-wirtschaftsjournalisten-teil-2/
 > 
 > -- 
 > Sebastian Hellmann
 > AKSW/KILT research group
 > Institute for Applied Informatics (InfAI) at Leipzig University
 > DBpedia Association
 > Events:
 > * *Nov 20th, 2015* Extended Deadline for Quality Management of Semantic
 > Web Assets (Data, Services and Systems)
 > <http://www.semantic-web-journal.net/blog/call-papers-special-issue-quality-management-semantic-web-assets-data-services-and-systems>
 > Come to Germany as a PhD:
 > http://bis.informatik.uni-leipzig.de/csf
 > Projects: http://dbpedia.org, http://nlp2rdf.org,
 > http://linguistics.okfn.org,
 > https://www.w3.org/community/ld4lt
 > Homepage: http://aksw.org/SebastianHellmann
 > Research Group: http://aksw.org
 > Thesis:
 > http://tinyurl.com/sh-thesis-summary
 > http://tinyurl.com/sh-thesis
 
 -- 
 Rolf Kleef                Open for Change, network for open development
 rolf@openforchange.info
 +31617232772 @rolfkleef www.openforchange.info

 Internet trailblazer. Weaving the web to help humanity. Implementing
 open data, open organisations and online collaboration in civil society.
 
 
Received on Friday, 6 November 2015 17:53:17 UTC
