- From: Pat Tressel <ptressel@myuw.net>
- Date: Sat, 11 Jan 2014 16:10:06 -0800
- To: Doug Schepers <schepers@w3.org>
- Cc: WebPlatform Community <public-webplatform@w3.org>
>> Unfortunately, MDN doesn't expose their compatibility data as JSON, so
>> we'll need to convert their HTML tables into JSON that matches our data
>> model [2]. We already have a script that collects the data (again, as HTML
>> tables) from their site, but we need someone who can reformat and normalize
>> that data.
>>
>> The language used for this task is not important: it could be JavaScript,
>> Python, Ruby, Perl, PHP, or even C. I believe that the best approach may
>> use RegEx, but there might be a better way.
>>
>
> ...
> I'd be inclined to use Python and Beautiful Soup. The latter works on
> intact web pages -- I'm not sure about isolated elements, but it would be
> simple enough to tack on a minimal set of <html>, <head>, <body> tags.
> ...
> (I'd be equally inclined to use JavaScript and jQuery if I were set up to
> use them outside a browser. ...

The "right" tool is probably XSLT. But it would probably be faster to get
it working in Python / Beautiful Soup. ;-)

-- Pat
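P.S. For whoever picks this up, here's a rough sketch of what I mean by the
Beautiful Soup approach. It assumes each scraped fragment is a bare
MDN-style compat <table> whose first row names the browsers and whose first
column names the feature -- the function name and the JSON shape are just
illustrative, not our actual data model [2]:

    import json
    from bs4 import BeautifulSoup

    def compat_table_to_json(html_fragment):
        # Beautiful Soup copes with an isolated <table>; no need to wrap
        # it in <html>/<head>/<body> first.
        soup = BeautifulSoup(html_fragment, "html.parser")
        table = soup.find("table")
        rows = table.find_all("tr")

        # First row: "Feature", then one column per browser.
        headers = [c.get_text(strip=True) for c in rows[0].find_all(["th", "td"])]

        records = []
        for row in rows[1:]:
            cells = [c.get_text(" ", strip=True) for c in row.find_all(["th", "td"])]
            records.append({
                "feature": cells[0],
                # Browser name -> raw support text, e.g. "4.0", "(Yes)".
                "support": dict(zip(headers[1:], cells[1:])),
            })
        return json.dumps(records, indent=2)

Normalizing the raw cell text ("(Yes)", version numbers, footnote markers)
into whatever the data model actually wants would still be the bulk of the
work, but the HTML-to-structure part really is that short.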
Received on Sunday, 12 January 2014 00:10:34 UTC