- From: Doug Schepers <schepers@w3.org>
- Date: Sat, 11 Jan 2014 04:21:08 -0500
- To: WebPlatform Community <public-webplatform@w3.org>
Hi, folks,

We need some coding help to convert HTML tables into JSON, for our compatibility data project.

As I've explained elsewhere [1], we have several goals for our browser compatibility information:

1) collect the most accurate data we can, from multiple trusted sources
2) store the data all in JSON, available for anyone to use via our API
3) use a MediaWiki extension to automatically populate the right pages with their relevant data

We've made some progress on this, such as developing a data model [2], but we stalled as the holidays approached. I'd like to find help to bring us across the finish line.

We should do this in multiple passes. The first pass will simply be to populate the pages with at least one source of data; the best match for our page structure is MDN.

Unfortunately, MDN doesn't expose its compatibility data as JSON, so we'll need to convert its HTML tables into JSON that matches our data model [2]. We already have a script that collects the data (again, as HTML tables) from the MDN site, but we need someone who can reformat and normalize that data.

The language used for this task is not important: it could be JavaScript, Python, Ruby, Perl, PHP, or even C. I believe that the best approach may use RegEx, but there might be a better way.

So, I'm asking you all to help in one of a few ways:

1) If you think you might know how to do this, and have the time and energy to see it through, please let us know!
2) If you think you might know someone who can help, please introduce us!
3) If you can't do the task and don't know someone who could, please help me refine this message so we can put the call out, explaining what we are doing and what we need.

[1] http://lists.w3.org/Archives/Public/public-webplatform-tests/2013OctDec/0000.html
[2] http://www.ronaldmansveld.nl/webplatform/compat_tables_datamodel.html

Regards-
-Doug
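
To make the conversion task described in the message concrete, here is a minimal sketch in Python. It swaps in the standard library's HTML parser for the RegEx approach the message suggests, as one possible "better way". Everything specific in it is an assumption for illustration: the sample table, the TableParser and table_to_records names, and the output shape (feature name mapped to a browser-to-version dictionary). It does not reproduce the actual data model in [2] or the real MDN markup.

```python
import json
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments (e.g. link text plus trailing text) and
            # collapse runs of whitespace into single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def table_to_records(html):
    """Map each feature row to {browser: support value}.

    Assumes the first row is a header whose first cell labels the
    feature column; this output shape is hypothetical.
    """
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return {}
    header, *body = parser.rows
    browsers = header[1:]
    return {row[0]: dict(zip(browsers, row[1:])) for row in body if row}


# Illustrative input only; real compatibility tables may be structured differently.
SAMPLE = """
<table>
  <tr><th>Feature</th><th>Chrome</th><th>Firefox (Gecko)</th></tr>
  <tr><td>Basic support</td><td>1.0</td><td><a href="#">1.5</a> (1.8)</td></tr>
</table>
"""

print(json.dumps(table_to_records(SAMPLE), indent=2))
```

A parser tends to cope with nested markup inside cells (links, footnote markers) more predictably than a regular expression would, though the normalization of the cell values into the project's data model would still need to be written separately.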
Received on Saturday, 11 January 2014 09:21:15 UTC