Re: WebPlatform Browser Support phased approach? from Doug Schepers on 2013-11-01 (public-webplatform-tests@w3.org from October to December 2013)

From: Doug Schepers <schepers@w3.org>
Date: Fri, 01 Nov 2013 13:32:59 -0400
To: Ronald Mansveld <ronald@ronaldmansveld.nl>
CC: public-webplatform-tests@w3.org, Janet Swisher <jswisher@mozilla.com>
Message-ID: <5273E5CB.4000009@w3.org>

Hi, Ronald–

Thanks for the update!

First, I didn't realize that Janet Swisher (Mozilla, and one of the 
founders of this project) didn't know you were working on MDN data, or 
she would have introduced you to someone at Mozilla. Maybe that's the 
contact you made already. In any case, she can confirm that.

I'm okay with keeping the data as tables for now, if that makes your 
life easier. But I do want to note that it would be much better to have 
it as JSON, because that's what the MediaWiki extension is expecting.

If we keep the data as tables, we will need to rewrite the MediaWiki 
extension to deal with that instead; it would also make it difficult 
(maybe prohibitively so) to make the "icon view" at the top of the page, 
since we'd need to parse and reformat the data.

So, there is extra work to be done either way: either we rewrite the 
extension and lose some functionality (for now); or we find a way to 
parse the MDN tables into JSON. I don't know which would be more work. I 
do know that JSON is the final format we want the data in, so I'd like 
to shoot for that if we can.
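
For a rough sense of the "parse the tables into JSON" option, here is a
minimal sketch in Python using only the standard library. It is not the
extension's actual input format: the sample table, its column names, and
the helper names (TableToRows, table_to_records) are all made up for
illustration, and real MDN markup is likely messier.

```python
import json
from html.parser import HTMLParser


class TableToRows(HTMLParser):
    """Collect the text of each <th>/<td> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        elif tag == "br" and self._cell is not None:
            # Keep line breaks: they may be the only separator between values.
            self._cell.append("\n")

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)


def table_to_records(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableToRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample table, not real MDN data.
SAMPLE = """
<table>
  <tr><th>Feature</th><th>Browser A</th><th>Browser B</th></tr>
  <tr><td>example-feature</td><td>1.0</td><td>2.0<br>3.0</td></tr>
</table>
"""

if __name__ == "__main__":
    print(json.dumps(table_to_records(SAMPLE), indent=2))
```

The sketch keeps each cell as raw text, including any line break inside
it, so splitting a cell into separate values would be a second step.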

I don't want to put all the work on you, especially since you've been so 
awesome at driving this forward. How about, as a next step, you expose 
the data you've collected, and someone (me?) looks at making a regex 
that normalizes it, or at least assesses which approach will be more work?
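
To make the regex idea concrete, here is a small sketch in Python. It
rests on an assumption I have not checked against the scraped data: that
a cell's text has one version per line, and that the prefixed line also
carries the vendor prefix (for example "4.0 -webkit"). The function name
and the output keys are hypothetical, not the JSON schema the extension
expects.

```python
import re

# One line of a cell: a dotted version number, optionally followed by a
# vendor prefix such as "-webkit" or "-moz-". This format is an assumption.
CELL_LINE = re.compile(
    r"^\s*(?P<version>\d+(?:\.\d+)*)\s*(?P<prefix>-[a-z]+-?)?\s*$"
)


def normalize_cell(text):
    """Split a cell's text into prefixed and unprefixed since-versions."""
    result = {
        "since": None,            # first version with unprefixed support
        "prefixed_since": None,   # first version with prefixed support
        "prefix": None,           # the vendor prefix, if one was found
        "unparsed": [],           # lines the pattern could not handle
    }
    for line in text.splitlines():
        if not line.strip():
            continue
        match = CELL_LINE.match(line)
        if match is None:
            # Keep anything unexpected so a person can review it later.
            result["unparsed"].append(line.strip())
        elif match.group("prefix"):
            result["prefixed_since"] = match.group("version")
            result["prefix"] = match.group("prefix")
        else:
            result["since"] = match.group("version")
    return result


if __name__ == "__main__":
    print(normalize_cell("4.0 -webkit\n5.0"))
```

Counting how many cells end up in "unparsed" would give a first estimate
of how much manual cleanup this approach needs.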

Again, if we can get by with tables without much work, then I agree we 
should do that.

Regards-
-Doug

On 11/1/13 12:50 PM, Ronald Mansveld wrote:
> It has been a pretty productive few days, with both ups and downs.
>
> The data from both CIU and H5T has been pretty easy to parse, mostly
> because this data is already available in JSON format. The MDN data is
> a different story though.
>
> At this point, the MDN-data is _not_ available as JSON. I can get a
> JSON-feed, but that only states that a compatibility-section is
> available. It doesn't give the data. So, I had to resort to scraping.
>
> However, even though the data is in a table, which makes the general
> parsing pretty easy, some of the data isn't neatly wrapped in tags.
> For instance, the version numbers for prefixed use and non-prefixed
> use are only separated by a line break.
>
> I've come a long way, but it most certainly isn't yet at the level I
> want it to be. So I'm actually thinking of not even trying to parse the
> MDN data, and just using the HTML table as is.
>
> By parsing the CIU and H5T data into tables with the same formatting,
> we can still have a uniform layout on the site.
>
>
> I have been given a contact within MDN, so I'll try to work with them to
> make the data available as JSON, so we can do a better integration after
> this first phase.
>
>
> Any thoughts/comments?
>
> I'll continue working once I'm back in NL. If no one objects, I'd like
> to go for the table option, which could be up and running pretty soon. I
> don't see too many downsides, given that this is just a temporary
> solution so we can go live with the CSS part of the site, and a more
> future-proof solution will be built once this is up and running.
>
>
> Ronald
>
>
>
>
> Doug Schepers wrote on 2013-10-30 17:59:
>> Hi, Ronald–
>>
>> Thanks for the update! Looking forward to seeing it.
>>
>> Since we eventually plan to have tests for each assertion, and
>> results based on running those tests against browsers (versions, OSs,
>> etc.), it makes the most sense to expand the data from MDN to a
>> version-range, if that's doable. That will be the most consistent with
>> our plans.
>>
>> Note that in reality, there are regressions. For example, Chrome has
>> dropped support for MathML, and other browsers have dropped features
>> as well (e.g. some SVG stuff). But we'll deal with that once the
>> infrastructure for reporting test results is more mature.
>>
>> Regards-
>> -Doug
>>
>>
>> On 10/30/13 11:29 AM, Ronald Mansveld wrote:
>>> OK, I've come a long way so far. There is just one decision to be
>>> made:
>>>
>>> MDN provides the compat data not per version, but rather as a
>>> since-version.
>>>
>>> Both caniuse and html5test provide the data per version (where
>>> available).
>>>
>>>
>>> What do we want to use? I can collapse the data from caniuse and
>>> html5test to a since version pretty easily. Expanding the data from
>>> MDN from a since-version up to a complete version-range might be
>>> doable as well, although I have to rely on the browser-data provided
>>> in the feeds from CIU and H5T to determine what versions are
>>> available.
>>>
>>> Anyone with arguments for or against either option?
>>>
>>>
>>>
>>> Ronald
>>>
>>>
>>> Doug Schepers wrote on 2013-10-29 06:18:
>>>> Hi, Ronald–
>>>>
>>>> Since we're going with this phased approach (which I fully
>>>> support), I think we should do 2 things:
>>>>
>>>> 1) Use the MDN data as the baseline, since they have fairly
>>>> complete data and a feature level similar to WPD's (e.g., they have
>>>> basically the same page names as we do); this means you'll have to
>>>> collect this data via MDN's API;
>>>>
>>>> 2) Supplement that baseline data with CanIUse and HTML5Test data
>>>> where there is an equivalent feature name (e.g. "border-radius");
>>>> we'll have to wait for QuirksMode and MobileHTML5 data until we
>>>> have the source for that, but we will launch an "explainer" page
>>>> that tells about all our data sources and our timeline.
>>>>
>>>> Does this seem like a doable approach?
>>>>
>>>> Regards-
>>>> -Doug
>>>>
>>>> On 10/23/13 9:24 PM, Julee wrote:
>>>>> Thanks much, Ronald! And everyone who is sharing their data as
>>>>> is!
>>>>>
>>>>> I've sent feelers out regarding a work space in London next week.
>>>>>  Will let you know if I hear anything.
>>>>>
>>>>> In the meantime, do you have a sense of how long it might take
>>>>> to normalize this phase-1 data? No biggie, just looking to fill
>>>>> out the CSS-properties schedule.
>>>>>
>>>>> Regards!
>>>>>
>>>>> Julee
>>>>> ----------------------------
>>>>> julee@adobe.com
>>>>> @adobejulee
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ronald Mansveld <ronald@ronaldmansveld.nl>
>>>>> Date: Tuesday, October 22, 2013 3:47 PM
>>>>> To: Alex Komoroske <komoroske@google.com>
>>>>> Cc: Niels Leenheer <info@html5test.com>, julee <julee@adobe.com>,
>>>>>     "public-webplatform-tests@w3.org" <public-webplatform-tests@w3.org>
>>>>> Subject: Re: WebPlatform Browser Support phased approach?
>>>>>
>>>>>> Alex Komoroske wrote on 2013-10-22 17:48:
>>>>>>> I strongly support a phased approach. I'm very excited about
>>>>>>> the prospect of having a more robust system set up, but as
>>>>>>> far as the CSS Properties launch goes, it's more important to
>>>>>>> have _something_, even if it's just a one-time import from a
>>>>>>> couple of sources.
>>>>>>>
>>>>>>
>>>>>> I feel like there is support to do a phased approach, plus we
>>>>>> have access to a (basic) set of data to get started. Coupled
>>>>>> with the urgency to get CSS live (which I absolutely support,
>>>>>> we've been in alpha long enough now ;) ), I think this is
>>>>>> indeed the right path to follow. Plus, this buys us time to
>>>>>> come up with a good plan and schemata for the data-exchange we
>>>>>> want to use in the future.
>>>>>>
>>>>>>
>>>>>> Next week I'll be in London. If anyone knows of a place where I
>>>>>> can work, I can start building the first scripts to parse the
>>>>>> data. I've checked out the Mozilla Open Office, but to me it's
>>>>>> pretty unclear whether it is still in use, and if so, whether
>>>>>> and how I can use it. Do we have any Mozilla employees on the
>>>>>> list? Or do we have Googlers who know if perhaps the Google
>>>>>> office can be used? Or any Londoners who know of a place?
>>>>>>
>>>>>> In the worst case, I think I can use the City Business
>>>>>> Library, but my experience is that libraries are not always the
>>>>>> best place to work from, especially if you try to work full
>>>>>> office hours.
>>>>>>
>>>>>>
>>>>>> Ronald
>>>>>
>>>>>
>>>>>
>>>
>
Received on Friday, 1 November 2013 17:33:22 UTC