Re: Ontology design for UK Parliament

Hi Fabricio

That all sounds excellent. Some thoughts inline.

On 24/08/2016, 22:40, "Fabricio Rocha de Sousa" <fabricio.rocha@camara.leg.br> wrote:

Hi Michael and all,

 This was a very opportune message from you, as I had just finished a message to send as well. We at the Brazilian Chamber of Deputies are also in charge of redesigning our “Open Data” service [1], which is not only small but also outdated, buggy and absolutely “unlinked” and “unstandard”. We know that we simply cannot reach perfection now, but we want to get as close to it as possible.

Sounds familiar ☺

 I have been talking with Andreas Kuckartz (OpenGov-DL), James McKinney (Popolo), the OParl team and professor Daniel Schwabe (PUC-RJ university, Brazil) about similar issues. I think it would be great to have a vocabulary specifically for describing parliamentary/legislative products and processes – with no compromises about data formats of the representations, and only the inevitable minimum about supposed data types and compositions. I think that ideally this vocabulary would be stored in a collaborative site, such as a wiki, so it would be possible to have contributors writing multilingual sections in each page describing, for each country, the usage of the term, peculiarities about the objects referenced by the term in this country, etc. And, much like Schema.org, each term used in our metadata would be dereferenceable to a page of this site. (I admit that I still can't imagine a better final purpose for concepts than being human-readable and human-oriented).

Designing the multilingual ontology (or ontologies) on the website would make sense, though we’d still need to publish a formal OWL ontology somewhere. That would have multiple representations, including HTML with human-meaningful labels. Something like
http://www.bbc.co.uk/ontologies/po
is, I think, what we have in mind.

And, as per the previous email, we’d need to make sure that similar labels in different jurisdictions actually mean the same thing, and mint new terms to capture the differences where they don’t.

 If I understood it right, Andreas is trying to create a dereferenceable vocabulary website ( opengov: ) working together with the Popolo project, which made a great effort to identify many common elements in parliamentary objects and agents and to describe them with existing, general-purpose vocabularies. James did not much like the idea of a wiki-based vocabulary.

Happy with a wiki-based vocabulary specification, so long as we can export and transform it into a formal OWL ontology.

 Anyway, I must admit that I am far from being a specialist in this area, and my main question now – the message that I was about to send, and, I believe, that some other people might have as well – is the following:

 “We have already defined some resources, which will be represented as a bunch of different data schemas, which must be delivered in a variety of formats such as CSV, JSON and XML. Some other legislative houses will have different data schemas for similar things, in different languages. To simplify development, our managers decided that we will provide machine-oriented metadata ‒ about the data structures, types and possible values for some properties ‒ as extra, optional resources (instead of mixed with the data), and in the same data formats as the served data (i.e., CSV, JSON, XML, etc). How should we design our metadata representations in a way that makes them standards-compliant and allows human- and machine-based integrations and comparisons of our data with the data delivered by other parliaments?”

Not sure if this answers your question but…

In the longer term we’d like to end the separation of the data platform and the website. The website will be built off the data platform, but at some point the data platform (currently http://data.parliament.uk) will be switched off to the public and data views will be surfaced alongside HTML pages in the website.

Discovery of RDF, JSON-LD, CSV, ICS etc. from the HTML will be done through content negotiation and rel-alternate links in the HTML.
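
To make that concrete, here is a minimal Python sketch of the idea, not a description of how our platform actually does it. The media-type table, the file extensions and the function names (negotiate, alternate_links) are all assumptions made up for illustration. It picks a representation from an Accept header and builds the rel-alternate link elements for an HTML page.

```python
# Hypothetical mapping of media types to file extensions.
# Illustrative only; not the real data.parliament.uk URL scheme.
REPRESENTATIONS = {
    "text/html": "html",
    "application/ld+json": "jsonld",
    "text/turtle": "ttl",
    "text/csv": "csv",
    "text/calendar": "ics",
}


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    parsed = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        parsed.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(parsed, key=lambda pair: pair[1], reverse=True)


def negotiate(accept_header, default="text/html"):
    """Pick the best media type we can serve for this Accept header.

    Simplification: partial wildcards such as text/* are ignored.
    """
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in REPRESENTATIONS:
            return media_type
        if media_type == "*/*":
            return default
    return default


def alternate_links(path):
    """Build <link rel="alternate"> elements for the non-HTML representations."""
    return [
        f'<link rel="alternate" type="{media_type}" href="{path}.{ext}">'
        for media_type, ext in REPRESENTATIONS.items()
        if media_type != "text/html"
    ]
```

A client asking for text/turtle would get Turtle from negotiate, and anything we can't match falls back to HTML. The output of alternate_links would sit in the head of the HTML page so that crawlers and scripts can find the other views.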

We’ll also surface the JSON-LD in a script element in the head of the HTML document, which for now seems to be Google’s preferred way of consuming schema.org:
https://developers.google.com/search/docs/guides/intro-structured-data
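
As a rough sketch of what that embedding could look like (the jsonld_script helper, the member record and the example.org URL are placeholders I have made up, not real data):

```python
import json


def jsonld_script(data):
    """Serialise a dict as a JSON-LD <script> element for the HTML head."""
    payload = json.dumps(data, ensure_ascii=False, indent=2)
    # Escape "</" so that a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the data still parses the same.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical member record; the name and URL are placeholders.
member = {
    "@context": "http://schema.org",
    "@type": "Person",
    "name": "Example Member",
    "url": "http://example.org/people/1",
}

html_head = (
    "<head>\n"
    "<title>Example Member</title>\n"
    f"{jsonld_script(member)}\n"
    "</head>"
)
```

The same JSON-LD would also be available at its own URL, so the script element is just one more surface for data we already publish.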

Our machine-oriented metadata about data structures and types will follow the same pattern. We’ll specify it as a formal RDF ontology, then use common tools to transform that into multiple representations (including HTML), again with conneg and rel-alternate links between them (as per the programmes ontology example further up the mail).
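
A toy illustration of the "one definition, several representations" idea, using only the Python standard library rather than a real RDF toolkit. The namespace, the term and its labels are invented for the example, and in practice we would expect to use established RDF tooling instead of hand-built strings.

```python
from html import escape

# Hypothetical namespace and term; the labels are illustrative only and do not
# claim that the two words mean the same thing in both jurisdictions.
BASE = "http://example.org/ontology/"
TERM = {
    "id": "Committee",
    "labels": {"en": "Committee", "pt": "Comissão"},
}


def turtle_string(text):
    """Quote a label as a Turtle string literal."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_turtle(term):
    """Render the term as an OWL class with language-tagged rdfs:label values."""
    labels = ", ".join(
        f"{turtle_string(text)}@{lang}"
        for lang, text in sorted(term["labels"].items())
    )
    return "\n".join([
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
        f"<{BASE}{term['id']}> a owl:Class ;",
        f"    rdfs:label {labels} .",
    ])


def to_html(term):
    """Render the same term as a small human-readable HTML fragment."""
    term_id = escape(term["id"])
    items = "\n".join(
        f'  <li lang="{escape(lang)}">{escape(text)}</li>'
        for lang, text in sorted(term["labels"].items())
    )
    return f'<h1 id="{term_id}">{term_id}</h1>\n<ul>\n{items}\n</ul>'
```

Both outputs come from the same TERM definition, which is the point: the formal ontology stays the single source and the HTML page is just another view of it.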



 Currently I can only guess that a good international vocabulary -- a specific one, or a specific extension to Schema.org -- would be 50% of the solution. Anyway, it seems that we are in a favourable moment to look for (at least some) standards in parliamentary data.

Hoping so. Feels like there’s at least a high-level abstraction that might work across multiple parliaments (and for schema.org), with more specific implementations for e.g. parliaments following a “Westminster system”, and more specific implementations still for individual jurisdictions.

Knowing where it’s applicable to agree on terms and where it’s better to allow for flex feels very important. No one wants to accidentally reinvent empire ☺

michael

 Thanks!

Fabricio Rocha
Informatics Center - Chamber of Deputies
Brasilia, Brazil



[1] - http://www2.camara.leg.br/transparencia/dados-abertos/dados-abertos-legislativo

Received on Friday, 26 August 2016 16:05:30 UTC