Re: Converting MDN Compat Data to JSON

From: Pat Tressel <ptressel@myuw.net>
Date: Sat, 11 Jan 2014 16:10:06 -0800
Message-ID: <CABT-+2pD+O5O2=ca5XEkpcZs++5v198PxG8MPiS4x11TGE9Bbw@mail.gmail.com>
To: Doug Schepers <schepers@w3.org>
Cc: WebPlatform Community <public-webplatform@w3.org>
>> Unfortunately, MDN doesn't expose their compatibility data as JSON, so
>> we'll need to convert their HTML tables into JSON that matches our data
>> model [2]. We already have a script that collects the data (again, as HTML
>> tables) from their site, but we need someone who can reformat and normalize
>> that data.
>> The language used for this task is not important: it could be JavaScript,
>> Python, Ruby, Perl, PHP, or even C. I believe that the best approach may
>> use RegEx, but there might be a better way.
> ...
> I'd be inclined to use Python and Beautiful Soup.  The latter works on
> intact web pages -- I'm not sure about isolated elements, but it would be
> simple enough to tack on a minimal set of <html>, <head>, <body> tags.
> ...
> (I'd be equally inclined to use JavaScript and jQuery if I were set up to
> use them outside a browser. ...

The "right" tool is probably XSLT.  But it would probably be faster to get
it working in Python / Beautiful Soup.  ;-)
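
For what it's worth, here is a rough sketch of the table-to-JSON step in
Python. To keep it dependency-free it uses the standard library's
html.parser rather than Beautiful Soup, so treat it as a stand-in for the
same idea. The sample table, the TableRows and compat_table_to_dict names,
and the output shape are all made up for illustration; in particular the
output is not the WebPlatform data model from [2].

```python
import json
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td") and self._cell is not None:
            # Collapse runs of whitespace so cell text is normalized.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def compat_table_to_dict(html):
    """Map feature name -> {browser: support text} (hypothetical shape)."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows   # first row is assumed to be the header
    browsers = header[1:]         # first column is assumed to be the feature
    return {row[0]: dict(zip(browsers, row[1:])) for row in body}


# Invented sample input, loosely shaped like a compatibility table.
SAMPLE = """
<table>
  <tr><th>Feature</th><th>Chrome</th><th>Firefox (Gecko)</th></tr>
  <tr><td>Basic support</td><td>1.0</td><td>1.0 (1.7 or earlier)</td></tr>
</table>
"""

result = compat_table_to_dict(SAMPLE)
print(json.dumps(result, indent=2))
```

Note that the parser is fed a bare <table> fragment here, with no
<html>, <head>, or <body> wrapper. Real pages would also need handling
for nested markup, footnotes, and colspan/rowspan, which this skips.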

-- Pat
Received on Sunday, 12 January 2014 00:10:34 UTC