Re: Use machine-readable standardized data formats / Use non-proprietary data formats

hello makx.

On 2015-08-15 04:11, Makx Dekkers wrote:
> Just for context: my examples of online legislation come from work that I have been doing over the last year or so. Publication services around the world are moving from publishing PDFs on the Web to 'webby' publication with all the aspects that Erik lists.

great, and i think it would be good if you had some form of checklist 
that you could use to address the question: are we nicely webby? you 
might still publish horribly wrong data, but that would be squarely out 
of scope for answering the question "are we webby?"

> It became clear to me that basically all the Best Practices of this group directly apply to that environment too. Persistent identifiers, URI templates, multiple formats (XML, HTML, PDF), metadata based on standard predicate vocabularies, common controlled value vocabularies, versioning, linking within and between acts and between national and supranational level, quality, timeliness, etc. etc., the whole lot.

and that's how it should be. the fact that you might also have to answer 
some tricky questions (such as how to properly publish multilingual 
content and keep identifiers working across language versions) is just 
an added complication that maybe DWBP does not address, but that should 
not mean that any of the existing DWBP guidance doesn't apply.

> They even have more issues that have to do with legacy data, something we don't cover, and I don't suggest we do: moving from legacy identifier systems to persistent URIs taking into account citation practices; converting legacy data formats and PDF to 'webby' formats; scanning and OCR'ing medieval acts.

that's the slippery scope to "data management BP", where the main
questions have little to do with the web specifically. only after
you've sorted out those problems does the question arise of how to
publish whatever you came up with in a webby way.

> Just look at legislation.gov.uk, and it's all there. I would even say that lots of what they do could be used as real examples of several of our best practices.

portals like this exist in many countries now, and many of them are 
rather un-webby. giving them a focused and reasonable checklist of 
things they should do to become more webby, and explaining why, would be 
great guidance coming from the W3C and being applicable to many of the 
e-government activities going on nowadays.

btw, that's exactly how web data started: for document-driven domains, 
telling people to publish their data in RDF is nonsensical. but that's
what linked data tells them to do. web data is an attempt to encourage 
people to be webby, without prescribing the metamodel they have to use. 
it's about how to be webby without having to be semwebby.

thanks and cheers,

dret.

erik wilde | mailto:dret@berkeley.edu  -  tel:+1-510-2061079 |
            | UC Berkeley  -  School of Information (ISchool) |
            | http://dret.net/netdret http://twitter.com/dret |

Received on Saturday, 15 August 2015 17:46:21 UTC