Core Mobile Web Platform Community Group Face to Face

03 Oct 2012


See also: IRC log


Bryan_Sullivan, Alan_Bird, Mounir_Lamouri, Wonsuk_Lee, Josh_Soref, Elika_Etemad(fantasai), Dominique_Hazael-Massieux, Jo_Rabin, Tomomi_Imura_(girlie_mac), Matt_Kelly, Shuhei_Hub, Gavin_Thomas, Tobie_Langel, Jean-Francois_Moy, Giridhar_Mandyam, Jonathan_Watt, Markus_Leutwyler_(hptomcat), Jennifer_Leong, Max_NTT, Natasha_GSMA, Robert_Shilston, Dan_Sun, Jet_Villegas


<scribe> Scribe: Josh_Soref

<trackbot> Date: 03 October 2012

<jo> Tobie's Paper on Test Frameworks

Brief Agenda

jo: to let Alan have some flexibility, we'll start with him
... and then we'll discuss our agenda

W3C Thank You

alan: Alan, W3C
... I came here today to talk about open platforms shared among mobile carriers
... my main purpose is to attend apps world
... surprisingly yesterday, most people asked me how they could develop portable-html based apps without upsetting Apple/Google
... and I responded that in fact, Apple and Google are both members of W3C
... I'm responsible for recruiting for W3C
... I'm a member of this CG
... To make something into a standard, it will have to move into a WG
... I spoke with GSMA yesterday afternoon
... If you have a question about W3C, membership, who they are, who they aren't, please feel free to ask me
... dom has more history, i'm the new kid on the block

jo: Thank you very much, alan


jo: i'd like to try to come back to CoreMob 2012 mid afternoon
... i'd like to do issue/action bashing
... it's always fun,
... the best part is when we say "oh, yeah, you did that, didn't you?"
... i hope everyone has read tobie's document

Test Approach

<dom> Coremob Test Approach

tobie: I was asked to do Robin's action, to discuss some options for building a test runner
... So I wrote this draft paper over the last few weeks
... I learned a lot in the process, so at least for me it was a useful exercise. Hope it's useful for you, too.
... First thing I had to spend time on was terminology
... Mostly because the JS library that W3C is using is referred to all over the place
... testharness.js
... Given the structure of testing on the web in general, comes way after testing for other environments
... Hard to understand how the pieces fit together
... So I decided on some arbitrary wording
... Any questions?

fantasai: I read it, and the terminology doesn't work for reftests or self-describing tests
... e.g. says a "test" is a single JS function, which isn't true for CSS tests
... and the test framework being defined as a JS program probably won't work for CSS tests

tobie: So, there are 3 categories of tests

fantasai: testharness.js tests (automated JS)
... reftests (automated visual)
... self-describing tests (manual)

tobie: This document only discusses automated JS tests

dom: The terminology section seems to be very specific to what this group is doing, not general to W3C
... e.g. a test being a single JS function is true of many W3C tests, but not all
... My question is whether you are trying to define these terms in the broader sense, or trying to focus on what's covered in the document
... So maybe you need a section that lists assumptions, scopes the document to what coremob will be testing

jo: Need to discuss that, e.g. seems premature to me to exclude non-automated testing
... So assumption to state for today, that we're only talking about automated testing

dom: What is the main goal for this?
... Is it for something like Ringmark, which is an impactful way to evaluate whether a browser conforms to things this group wants
... Or is it a generalized mobile test runner

tobie: It's a spectrum, e.g. Ringmark runs in under a minute
... On the other hand, Jet was saying that e.g. Mozilla test suite takes 24 hours to run.
... And maybe good to have a subset that runs in 15 minutes

jo: Maybe have a framework that can do both

tobie: Not sure you can do that with the same software
... I don't know what it is that vendors run for 24 hours

gmandyam: Talk about uploading results, if we're trying to crowdsource, then doing something that doesn't involve an automated test framework will be very difficult
... Some tests can be automated, some can't
... For this we can say it's just existence tests, not functional tests
... Anyone can download these tests and upload results

tobie: Because testharness.js is asynchronous, and because test runners are async, it is possible to add human intervention in there, or even reftests
... It needs to be plugged in
... this paper does not concentrate on explaining that

gmandyam: If you're talking about professional testers ...
... You have to minimize the intervention elements for common people

Josh_Soref: You don't have to eliminate human intervention to get results from untrained people
... e.g. a lot of CSS tests are asking is this red, is it green, click the right button. Anyone can do that
... We can have a system where you can load up the results, and mostly be looking at cached results
... And we can have a system where we tag things that are fast
... Can say that this set of tests, run these tests quickly
... It's certainly possible for tests to be self-instrumenting
... And say which ones are fast/slow

<dom> a "lite" version of test would run in a split second, and the more consequential would require more time/intervention

fantasai: you probably want to say which tests you want to do for the fast run
... you may have fast edge cases
... but you want to define which core, possibly slow tests are key

tobie: it's pretty easy to decide which class of tests to run

jo: we agree that we need bundles of tests that can be runnable in different circumstances

rob_shilston: ... some basic things like position: fixed, where vendors implement it on android phones
... We've not been able to determine that programmatically, whereas it's an important test that a human can determine
... You're talking about fast tests and slow tests
... Out of the last F2F, the ideal test framework would allow anyone to define a profile of tests to run
... There might be fast tests, or slow tests, or FB tests, or coremob tests, but the framework should be able to handle all these.
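
A minimal sketch of the "profile of tests" idea above, under stated assumptions: the Case class, the tag names, and the profile names are all hypothetical, and nothing here reflects an existing W3C or ringmark API. It only shows that a profile can be treated as a set of tags that a test must carry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A test plus free-form tags (hypothetical structure)."""
    name: str
    tags: frozenset


def select(tests, required_tags):
    """Return the tests that carry every tag in required_tags."""
    wanted = frozenset(required_tags)
    return [t for t in tests if wanted <= t.tags]


TESTS = [
    Case("position-fixed", frozenset({"coremob", "manual", "slow"})),
    Case("indexeddb-open", frozenset({"coremob", "automated", "fast"})),
    Case("canvas-basic", frozenset({"automated", "fast"})),
]

# A profile is just a named set of tags; the names are made up.
PROFILES = {
    "coremob-quick": {"coremob", "automated", "fast"},
    "coremob-full": {"coremob"},
}

quick = select(TESTS, PROFILES["coremob-quick"])
full = select(TESTS, PROFILES["coremob-full"])
```

Under this model a quick run and a full run are the same operation with a different tag set, so the framework would not need to know about any particular profile.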

Gavin_: We dived into this, wondering whether we have a clear view on the requirements and customers from the test results
... That would help shape some of these topics

fantasai: your audience is web developers
... and browser vendors
... for the latter to compete
... for the former to see what's supported where

mattkelly: One thing to keep in mind is the amount of effort it's going to take to do something like a 100,000-test test suite
... A ringmark-style thing vs. 100,000-test test suite
... Want to create a framework that can be added to to create the latter eventually
... The nice thing about ringmark is that it got traction
... One ...
... A lot of people have been using it, fixing bugs in browsers, evaluating products, etc.
... I think we should start with something simple like ringmark
... and then go towards having a large test suite, with device certification
... etc.

dom: what you're saying is the focus should be on browser vendors, and people selecting browsers?

<gmandyam> Response to Rob - QuIC's Vellamo automated framework automates testing such as scrolling which could also be verified using human intervention. There are tests that are not capable of being automated, but we can provide a clear delineation in an automated framework (e.g. differentiating between existence tests and functional tests).

mattkelly: yes, I'd like the group to focus on taking developers priorities and informing browsers
... Going back to ringmark, we're already doing that
... we've added to other groups that build frameworks to determine what ppl can build on with different browsers

jo: Informing devs is an essential part of the job, just not what we might do ourselves

tobie: Hearing lots of good ideas, great feedback, but haven't heard yet of anyone willing to commit resources to work on this
... Opening up the conversation and requirements to things like crowdsourcing reftests,
... I'm not going to do it
... Whoever wants that will have to step up and do it

<hptomcat> zooniverse.org is great for science crowd-sourcing

<hptomcat> slightly related

tobie: Given resources committed here to do this, most of the propositions I made are out-of-reach, and they're a 10th of what we're discussing right now
... So either we refocus discussion, or ppl step up and commit resources

<hptomcat> crowd-sourcing the analysis of science data (e.g astronomy)

dom: hearing that we should start simple, automatable, leave more complex stuff for later

fantasai: That works for other technologies, but not for CSS. Anything you can test via JS for CSS will be limited and not really testing the layout

dom: Would be limited to existence tests.

jo: First we need a formal statement of requirements that are in-scope for the charter
... Second is architecture that matches the requirements, but will not be implemented in full yet due to resources
... Third is something that works, soon

<hptomcat> if you could automate the testing, record the results and then have the "crowd" analyze it and report the result in an automated fashion

Gavin_: I can take an action to draft what I think the requirements should be
... I think our role is to apply some structure to this topic

<dom> ACTION: Gavin to start a draft of requirements of what our testing efforts should be [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action01]

<trackbot> Created ACTION-56 - Start a draft of requirements of what our testing efforts should be [on Gavin Thomas - due 2012-10-10].

Gavin_: We should be discussing the structure, and figure out [..] then resources will flow
... Requirements need to be real
... If this isn't the group that funds 100,000 tests, suppose they exist somewhere and we just need to find them and run them
... There's an assumption that this group will create a technical framework that will handle all this
... Not sure that's needed
... Would love to see a table of all the features in coremob, does a test for this exist?

tobie: This is my assumption of the requirements

fantasai: i'd like to raise an issue
... existing ringmark sometimes tests for existence of features
... but not correctness
... we've had issues with ACID
... and with other tests
... where people do a half assed implementation just to satisfy the test

mattkelly: we have limited resources at Facebook
... it's why we open sourced ringmark
... and contributed it to w3
... hoping that people would contribute to the tests

jo: i'm hearing css can't be tested

fantasai: it can be, but not with javascript

jo: i'm hearing the framework needs to go beyond existence tests

fantasai: you can't use the framework here to do it

tobie: when you talk about being able to automate ref tests
... it has hardware constraints
... i can't go to an arbitrary site

fantasai: you need something outside the browser
... it can be automated

tobie: there are two categories
... automated within the browser, automated outside the browser
... it seems it might be better not to test css at all than to test it
... since we're focusing on the former, not the latter here
... There are requirements that were expressed in the charter, which was to have a test suite for the coremob spec

<dom> fantasai, some CSS can be automatically tested beyond existence, I would think; like you can check that setting a CSS rule does affect the right DOM properties (which could reasonably be assumed to directly reflect how the browser would render the DOM)

tobie: Implicitly, this test suite uses testharness.js
... And then there's a bunch of nice-to-haves, that came up at the last F2F, which are 1. being able to run non-W3C tests
... we could do the shim ourselves
... otherwise, all the architectures I'm suggesting make it possible for a third party to write a shim

<hptomcat> sweet :)

<fantasai> dom, you can test the cascade and parsing that way, sure, but not layout and rendering which is the most significant part of the implementation

tobie: Run the test suite in a closed environment, e.g. if a carrier wants to test a non-released device -- internal QA

<fantasai> dom, as well, CSS has a number of implementations that don't support JS, so we wouldn't want to encourage JS tests over reftests, because that excludes those clients from being able to use those tests

<dom> fantasai, right re JS, but I think the focus of this group is very much on JS-enabled implementations

<fantasai> dom, which is fine, but doesn't enable sharing tests with the CSSWG so much

<dom> agreed :/

tobie: Testing at W3C, there's a massive Mercurial repository, and a lot of WGs have their own sub-directory in there
... in which they have their own organization of tests
... Every group has a very different architecture
... Group with clearest process I've seen is WebApps group
... A company submits tests, they end up in a repo, they need to get reviewed by someone from a different company
... then they become the test suite for a particular test

<jo> W3C Test Repository referred to by Tobie

tobie: The problem is this review process usually doesn't happen
... So very difficult to figure out which specs are tested, which aren't
... which tests are good, which are not
... ...
... Given a name of a test suite, need to figure out where the tests are
... Run the test suite
... Log the results and get them back

jo: What state is the JSON API today?

dom: ...

tobie: I played with it, it's alpha [the JSON API works, it is used as a back-end to the main UI of the test framework]
... I have a suggestion to bypass this API altogether
... Most basic thing, the way the current test runner at W3C works
... First you navigate to the test runner
... testrunner has an iframe
... First you go to testrunner, then you get the JSON list of resources, then pull the resources,
... then testrunner automatically figures out results and sends them back to W3C servers
... or you can post to e.g. browserscope.org
... Then someone can look at those results afterwards
... First basic proposition I'm making is that, to fulfill the most important req we have
... would be to just create the test suite on the w3c servers themselves
... and that just requires us to have a repo there
... pretty easy to do, low-cost, serves that requirement very well

dom: If it does, why look at alternative solutions?

tobie: It only suits our must-have reqs, not the nice-to-have ones
... wouldn't allow something like ringmark to work, e.g.
... and doesn't allow running on a closed network
... and makes it difficult to run 3rd-party tests
... eg. ECMAScript
... Option 2

gmandyam: why can't it run ringmark?

tobie: this system only works due to same-origin constraints
... b/c of iframe
... W3C would have to modify its stuff
... Option 2
... But instead of hosting testharness on w3c server, can host it anywhere
... e.g. coremob.org
... This architecture allows for the requirement to have tests hosted on different domains
... testrunner queries JSON API, runs the test pages in an iFrame
... but the iframe is not same-origin
... What it contains comes from W3C repo
... Solution there is to use post-message
... These are widely-supported
... but would require W3C servers to either whitelist the origin of the testrunner
... or allow posting messages to whatever server
... security implications are small
... Opening some form of DOS attack
... but basically, it requires a change in w3c server that needs to be vetted by someone
... My understanding is that the security implications are limited to leaking data of whoever is connected to the server at that time
... but that can be assessed
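
One way to picture the whitelist-versus-any-origin choice is as a small policy function. In a browser the actual send would be a window.postMessage(message, targetOrigin) call, where a target origin of "*" means any receiver; the Python below only models the policy decision. The allowed origin and the function names are hypothetical and were not discussed at the meeting.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist of origins that may host a test runner.
ALLOWED_RUNNER_ORIGINS = {"https://coremob.example"}


def origin_of(url):
    """Reduce a URL to its scheme://host[:port] origin."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def target_origin(runner_url, allow_any=False):
    """Return the origin results may be posted to, or None to stay silent."""
    if allow_any:
        return "*"  # post to whatever page embeds the test
    origin = origin_of(runner_url)
    return origin if origin in ALLOWED_RUNNER_ORIGINS else None


ok = target_origin("https://coremob.example/runner/index.html")
blocked = target_origin("https://other.example/runner.html")
open_mode = target_origin("https://other.example/runner.html", allow_any=True)
```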

rob_shilston: I thought it was just a difference in communicating data via iframe

tobie: It is pretty safe, just needs to be done right
... For it to be completely safe, need to whitelist targets
... but not necessary to do that in the particular case of what we're doing here
... because only message we're sending out is the results of running the tests
... Just need to make sure in that data, there is no personally-identifiable data
... W3C needs to do some work to make that happen and make sure it's not opening a security hole
... But it still has to be done, and explained properly
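
The point about keeping personally-identifiable data out of the posted results could be handled by only ever forwarding a fixed set of fields. This is a sketch under assumptions: the field names and the sample record are invented for illustration, not taken from testharness.js or the W3C test framework.

```python
import json

# Hypothetical result schema: only these keys are ever sent cross-origin.
ALLOWED_FIELDS = ("test", "status", "message")


def sanitize(result):
    """Keep whitelisted fields only; anything else is dropped."""
    return {key: result[key] for key in ALLOWED_FIELDS if key in result}


raw = {
    "test": "idb-open",
    "status": "PASS",
    "message": None,
    "cookie": "sid=123",  # must never leave the page
    "user": "alice",      # must never leave the page
}

payload = json.dumps(sanitize(raw), sort_keys=True)
```

An allow-list is used rather than a deny-list so that a field added later is excluded by default.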

dom: Another consequence is that it means to pass the test suite you have to implement postmessage
... Makes it an implicit requirement

tobie: There are some iframe tricks to get around this, but horrible mess
... That said if you look at caniuse,

<dom> http://caniuse.com/#feat=x-doc-messaging

tobie: The data for this, it seems safe

bryan: This chart isn't focused on mobile
... You have to really do testing for mobile version of browsers

tobie: And it is a coremob requirement
... Want ringmark to be able to run this test suite
... Want the UI for this runner to be runnable from anywhere
... Other cool thing is it allows you to run tests from other groups via same protocol
... You can run any test on a different origin, as long as it's able to send its results across origin in a format that is shimmable or that is standardized
... Then you have complete decoupling between the runner and where the tests are
... use cases: running ECMAScript test
... the only thing 262 has to do is to post results through postmessage API
... in a format that we can parse
... cons are it requires change to w3c code, and that it doesn't allow running on closed networks
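
A rough sketch of what "a format that is shimmable" might look like on the runner side: one small adapter per test origin, each turning that origin's messages into a common record. Both message shapes, the origins, and all function names are hypothetical; neither the W3C framework nor the ECMAScript suite is claimed to emit these formats.

```python
def shim_w3c(msg):
    """Hypothetical W3C-style message: {'test': str, 'status': 'PASS' | 'FAIL'}."""
    return {"suite": "w3c", "name": msg["test"], "passed": msg["status"] == "PASS"}


def shim_ecmascript(msg):
    """Hypothetical ECMAScript-style message: {'id': str, 'ok': bool}."""
    return {"suite": "ecmascript", "name": msg["id"], "passed": bool(msg["ok"])}


# One shim per origin the runner is willing to listen to (made-up origins).
SHIMS = {
    "https://w3c-tests.example": shim_w3c,
    "https://es-tests.example": shim_ecmascript,
}


def normalize(origin, msg):
    """Convert a cross-origin result message into the runner's common record."""
    shim = SHIMS.get(origin)
    if shim is None:
        raise ValueError(f"no shim registered for {origin}")
    return shim(msg)


a = normalize("https://w3c-tests.example", {"test": "idb-open", "status": "PASS"})
b = normalize("https://es-tests.example", {"id": "S15.1_A1", "ok": False})
```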

dom: Seems simple enough change to make

tobie: Next option is...
... There's a problem with the JSON API in its current state
... It lets you figure out where the test page is that you want to run, but it doesn't tell you what resources that test page relies on
... e.g. external CSS or JS or whatever
... or whether it's static or dynamic
... This is why it's important to run the tests where they are
... instead of getting them
... This third solution is to proxy requests through a remote server
... You go visit coremob server with your device
... it returns the runner
... asks for "give me address of test pages I want to run"
... Then instead of visiting those pages through the iframe, you provide a proxy that's hosted on coremob.org
... with the detail of which page it's supposed to proxy
... So it proxies your test page, and all subsequent requests made by that page
... that requires a bit of juggling
... and my understanding is it would only be possible because all resources on the w3c test framework are relative and are not absolute URLs
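
The dependence on relative URLs can be illustrated with standard URL resolution. The sketch assumes a proxy that exposes each test page under a path prefix on its own host; the prefix, host names, and proxy_url helper are hypothetical. A document-relative reference resolves back under the proxy, while an absolute one goes straight to the original server.

```python
from urllib.parse import urljoin

# Hypothetical proxy layout: <prefix><original host>/<original path>
PROXY_PREFIX = "https://coremob.example/proxy/"


def proxy_url(host, path):
    """Address of a test page as seen through the proxy."""
    return f"{PROXY_PREFIX}{host}/{path.lstrip('/')}"


page = proxy_url("tests.example", "/suite/test.html")

# What the browser would request for two kinds of reference in that page.
relative = urljoin(page, "support/helper.js")
absolute = urljoin(page, "https://tests.example/suite/support/helper.js")

stays_proxied = relative.startswith(PROXY_PREFIX)
escapes_proxy = not absolute.startswith(PROXY_PREFIX)
```

In this layout a root-relative reference such as /resources/x.js would also leave the prefix, so a real proxy would likely need to rewrite those or serve them from its own root.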

dom: Any testcase that doesn't match that requirement would be buggy
... Any testcase that doesn't use relative URLs won't work

fantasai: That would be true of some tests of relative URL / absolute URL handling

Josh_Soref: There are some testcases like this, but you have to run a custom harness for them anyway

tobie: The benefit of this is that the device itself never has to visit the W3C server, or any other server that's hosting tests
... Which leads us to architecture where the tests can be run on a closed network
... The device itself never visits the W3C webserver
... So that enables testing devices without the device's UA string being all over the web
... Might notice that e.g. weeks before iPhone 5 was released, browserscope already had results for it
... this avoids that problem

dom: Problem with specs that require specific server-side components

tobie: Tests that require server-side components would not [pause]
... Why could those not be proxied too?

dom: Could, but the proxy needs to be aware of this

<jo> ACTION: Tobie to investigate tests that requie server side components [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action02]

<trackbot> Created ACTION-57 - Investigate tests that requie server side components [on Tobie Langel - due 2012-10-10].

tobie: Just getting resources. Except for websockets

dom: WebRTC
... Also if you're dealing with websockets, also dealing with absolute URLs

tobie: Ok, and then last solution is to avoid the JSON api altogether
... You just download the whole repository
... and then have a runner that's able to do that on the local server
... that satisfies all requirements, except you do need a websocket server on your local server
... but you lose the JSON Api capability
... so you're stuck with understanding the W3C test repo and maintaining that understanding
... I think browser vendors already do this

fantasai: mozilla runs tests on every checkin
... We pull some of the tests into our repo, but they lose sync over time

Josh_Soref: mozilla/microsoft don't like running tests against remote servers, they prefer to have isolated machines w/ no network links (it avoids latency / connectivity variance)

fantasai: if we could automate the runs, we could probably run them against the w3c server on every release, to have the test results recorded there

[discussion of keeping things in sync]

tobie: the cron job is the easy part
... hard part is keeping up with changes to structure and systems
... Having the JSON API report all the resources required by a test would be a nice step up

dom: Think your other solutions rely on JSON API to get the right pieces of the tests
... Which assume that .. has the right organization of tests, which I don't believe is true today

jo: Is that a fundamental flaw?

dom: If it doesn't get solved, the other solutions [...]
... Not fundamentally unsolvable

<fantasai> fantasai, Josh_Soref: what was the problem

Josh_Soref: +1

dom: JSON API exposes what you see, which is a list of 40 test suites
... if you look at the details of that page
... There is no direct match between an entry on that page and a feature
... Coremob is interested in features, or specs,
... but it's not even at granularity of specifications here
... So we would face issue of matching whatever we wanted to what's in the JSON api
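
To make the matching problem concrete, here is a sketch of the kind of lookup the group would need if the API does not expose specs directly. The suite names and the table are entirely hypothetical; only the shape of the problem comes from the discussion.

```python
# Hypothetical hand-maintained table: test-suite entry -> specs it covers.
SUITE_TO_SPECS = {
    "suite-a": ["IndexedDB"],
    "suite-b": ["CSS Transforms", "CSS Animations"],
    "suite-c": [],  # an entry that matches no spec of interest
}


def suites_for(spec):
    """All suite entries that claim to cover the given spec."""
    return sorted(s for s, specs in SUITE_TO_SPECS.items() if spec in specs)


def coverage(wanted_specs):
    """Map each spec of interest to the suites that cover it."""
    return {spec: suites_for(spec) for spec in wanted_specs}


table = coverage(["IndexedDB", "CSS Animations", "Web Storage"])
untested = [spec for spec, suites in table.items() if not suites]
```

If the JSON API exposed this relationship itself, the table would not have to be maintained by hand.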

tobie: I agree with that
... that the JSON API is half-baked

Josh_Soref: Seems like a bug everyone wants fixed, though

tobie: Would be sad to build a solution to that problem with a slightly different API

dom: It is a fundamental flaw that if not fixed, we can't rely on this API
... But fundamental to so many use cases that we can surely bribe Robin to fix it
... First thing would be to discuss with Robin
... But we do depend on it

<dom> PROPOSED ACTION: SOMEBODY to work with Robin on getting the right granularity exposed through the JSON API of the W3C Test Framework

<dom> ACTION: Dom to work with Robin on getting the right granularity exposed through the JSON API of the W3C Test Framework [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action03]

<trackbot> Created ACTION-58 - Work with Robin on getting the right granularity exposed through the JSON API of the W3C Test Framework [on Dominique Hazaël-Massieux - due 2012-10-10].

tobie: First, we need an answer to whether W3C can make the changes we need
... Second issue is, I believe the JSON API is the right way to go, but we have to make sure it has the data we need

dom: agree
... Thinking about the JSON API again, what do we expect to get from that API that we would reuse?
... One thing we talked about this morning is that some version of coremob would target to be runnable in very short amount of time
... which is not something that I expect you would get from JSON API

Josh_Soref: I think it's ok for JSON API to be designed for that feature
... Should have a way to ask for the key tests to be run
... and by default, for databases that don't have it, default to none or all

dom: Big assumption is that whoever decides what are key tests
... have same definition of key tests as we want

fantasai: Just have tags. Different groups can tag different subsets

dom: Does the definition of key test have relationship to what we consider key tests?
... The picking of tests that we'd want to run would go through various parameters
... They are key in some sense of that word, that they are fast, that they don't overlap with other contexts
... Not convinced the selection of tests can be automated

tobie: My understanding from JSON APIs is they'll at least define what kind of test it is, testharness.js vs reftest
... I like to take ECMAScript test suite as an example
... It has 27,000 tests, and runs in ~10min
... Until we have such amount of tests in the W3C repo

dom: we have ~3000 tests right now

tobie: Don't think we're going much beyond 15 minutes
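
A back-of-the-envelope check of the figures just quoted (27,000 ECMAScript tests in about 10 minutes, about 3000 W3C tests). It assumes, purely for illustration, that a W3C test costs the same time as an ECMAScript test; that assumption may well not hold for tests that load pages, hit the network, or wait on animations.

```python
ES_TESTS = 27_000        # ECMAScript tests, as quoted above
ES_SECONDS = 10 * 60     # "~10min"
W3C_TESTS = 3_000        # "~3000 tests right now"

rate = ES_TESTS / ES_SECONDS            # tests per second
w3c_seconds = W3C_TESTS / rate          # time for today's W3C tests at that rate
tests_in_15_min = rate * 15 * 60        # how many tests fit in a 15-minute run

print(f"{rate:.0f} tests/s, {w3c_seconds:.0f} s for W3C, "
      f"{tests_in_15_min:.0f} tests in 15 min")
```

At that rate the current W3C tests would take roughly a minute, and a 15-minute budget would hold about 40,000 tests.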

Josh_Soref: Animation tests might run you over pretty quickly

tobie: If it's just those, then it's easy to not do
... I'm not sure if we need to solve this subsetting problem right now.

fantasai: I agree. Let's just run all the automatable tests.
... Maybe it doesn't take too long. Maybe it's not a problem that it takes long.
... Maybe ppl who want fast results are satisfied with pulling cached results, and ppl who want live results are ok waiting

mattkelly: Small use case of showing off devices at Mobile Web Congress

bryan: QA depts don't care if it takes 10 or 30 minutes

Josh_Soref: For the MWC, could solve that use case by priming the repo with data, and just pulling the cached data

<Zakim> dom, you wanted to note that whether subsetting can be done usefully at the JSON-API level is a fairly important aspect on deciding whether the JSON API is the right approach (and

dom: ... so that we should invest resources in it)

tobie: We don't even know if timing is a problem, so why discuss it now. Let's discuss it later.

dom: My concern is spending resources fixing JSON API for something we don't need

tobie: This can be a req down the line
... If we need this data later, we can figure it out later

<bryan> We have test departments responsible for app regression testing on devices, and while the difference between 10 min and 1 hour is significant, it's not a show stopper in reality, especially if the expectation is closer to 15-30 min.


Josh_Soref: You do need the API to say which tests belong to which specs

dom: So the action item I took is not a showstopper

RESOLUTION: we will subset only to the extent that we want to test the specs we're interested in, but not subset testing within the spec for time etc. until/unless it's shown that the tests take too long and it's a problem

jo: So which option do we take? I suggest getting cracking asap and not closing off options 3/4

<jo> PROPOSED RESOLUTION: We will attempt to proceed with Tobie's option2 on the basis that we want to do something soon, we don't close off Options 3 and 4 because we may want to come back to them later

jo: So how do we proceed with option 2? Saying that some changes are required in current API to do that

RESOLUTION: We will attempt to proceed with Tobie's option2 on the basis that we want to do something soon, we don't close off Options 3 and 4 because we may want to come back to them later

<tobie> ACTION: tobie to talk to Robin to get cross-origin messages baked into testharness.js/ test report. [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action04]

<trackbot> Created ACTION-59 - Talk to Robin to get cross-origin messages baked into testharness.js/ test report. [on Tobie Langel - due 2012-10-10].

tobie: There are 2 steps, depending on whether we support only W3C tests or others too
... Let's focus on W3C tests
... There has to be a piece of test runner that talks to JSON API
... to know which test cases to run
... then the runner has to create iframe, and stick test pages in there
... then runner has to listen to result of running that test, and collect the results
... then it needs to ship the results out
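
The four steps just listed can be sketched as a small pipeline. This is not the runner itself: the real thing would be browser JavaScript using iframes and message events, and every callable here (list_tests, run_in_frame, ship) is a hypothetical stand-in injected so the control flow can be shown on its own.

```python
def run_suite(list_tests, run_in_frame, ship):
    """Walk the four steps with injected, hypothetical callables."""
    results = []
    for url in list_tests():            # 1. ask the JSON API which pages to run
        outcome = run_in_frame(url)     # 2-3. load the page in a frame, await its result
        results.append({"url": url, "status": outcome})
    ship(results)                       # 4. ship the collected results out
    return results


# Stand-ins so the sketch runs without a browser or a server.
shipped = []
results = run_suite(
    list_tests=lambda: ["a.html", "b.html"],
    run_in_frame=lambda url: "PASS" if url.startswith("a") else "FAIL",
    ship=shipped.extend,
)
```

Keeping the three collaborators separate mirrors the split in the discussion: talking to the JSON API, running pages, and reporting are independent pieces that different people could write.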

jo: So someone needs to spec such a test runner, and someone needs to write it

dom: So first task is understanding in more detail what Tobie describes as option 2
... In practice I assume this is going to be using a JS library that takes data from the web, processes it, and sends it back
... Someone has to analyze what this means.
... Then someone has to write the JS code
... to make that actually happen
... For the second action item, need someone who can write code. First action item doesn't need that, but is a deeper dive into how to interact with the tests

jo: The task is to look into what this task entails
... in sufficient detail
... so that someone can then write the code



dom: Neither me nor Tobie should be the bottleneck in understanding what's going on, but I'm fine taking this initial task ...

tobie: Just to give a bit of background on this, when Jo asked me to write this document. I thought [...]
... It's time I spend not working on other things.
... Because ppl have the knowledge of things, they end up being the bottleneck

[more discussion of resources, not minuting this]

bryan: Who are the experts on the W3C infrastructure?

Josh_Soref: just ask around

tobie: Not good for same person to do all the work. What if I die tomorrow?

[ mattkelly volunteers to help ]

<hptomcat> i'll help too with coding (javascript stuff)

jo: We'll hope to have some forward motion in a month

<jo> ACTION: dom to write a wiki page with a breakdown of the tasks required to build the initial test frameworkj [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action05]

<trackbot> Created ACTION-60 - Write a wiki page with a breakdown of the tasks required to build the initial test frameworkj [on Dominique Hazaël-Massieux - due 2012-10-10].

<jo> ACTION: bryan to find resources to implement what Dom writes in ACTION-60 within 1 month [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action06]

<trackbot> Created ACTION-61 - Find resources to implement what Dom writes in ACTION-60 within 1 month [on Bryan Sullivan - due 2012-10-10].

<jo> ACTION: shilston to find resources to implement what Dom writes in ACTION-60 within 1 month [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action07]

<trackbot> Created ACTION-62 - Find resources to implement what Dom writes in ACTION-60 within 1 month [on Robert Shilston - due 2012-10-10].

<jo> ACTION: kelly to find resources to implement what Dom writes in ACTION-60 within 1 month [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action08]

<trackbot> Created ACTION-63 - Find resources to implement what Dom writes in ACTION-60 within 1 month [on Matt Kelly - due 2012-10-10].

<dom> ACTION-60: http://www.w3.org/community/coremob/wiki/Todos#Defining_the_requirements_for_the_Test_Runner

<trackbot> ACTION-60 Write a wiki page with a breakdown of the tasks required to build the initial test frameworkj notes added

<dom> bryan, rob_shilston, mattkelly, if you can look at http://www.w3.org/community/coremob/wiki/Todos#Defining_the_requirements_for_the_Test_Runner and see if that helps you

[ jet asks about ringmark ]

jet: Here's what I'd like to see. Currently ringmark.io lives in FB chunk of github
... Would like the tests accepted as coremob, with a policy of pulls accepted by coremob

tobie: ringmark tests, or ringmark runner?

jet: What is coremob taking from ringmark.io?

tobie: I think the project of what we've talked about testing here, and what the charter of the group says,
... First step is for this group to go look through the W3C test suites and find tests
... When tests are missing, go to WGs to write tests in that context
... Because that's the only way that the W3C accepts us
... The tests are related to a WG
... It's the WG that wants to decide whether a test is a good quality test or not

jet: Right now, rng.io is associated with coremob, whether correctly or not
... Would like group to say publicly whether rng.io is or is not associated with this group

tobie: The answer is no.

jet: Question then is, if it's not related to coremob, would coremob be, if they forked rng.io as the Level 0 of this

tobie: It could be done. it's open source
... It's perfectly ok for this group to fork the testrunner and use it however it wants
... for the tests themselves, a bit more complicated
... but the tests are licensed for use in w3c
... Not saying the runner we're working on should be a fork. Or not.

jo: that would be up to the people coding the testrunner, maybe some of it is useful maybe not

[ discussion of indexedDB test suites ]

tobie: Short term, make a decision, longer term bitch at WGs to decide which is the actual suite

mattkelly: This group essentially gathers requirements and tests and builds the test suite, and then ringmark would be the results page on top of that
... People could have their own results page
... Might want to brand pages with it, whatever
... Give flexibility in how ppl want to show results
... while maintaining consensus around the tests

jet: Why not start with what you've got?

Gavin_: The conversation has been focused around W3C tests, using that as center of gravity
... Wondering why dismissed the ringmark tests, then fill with more tests

fantasai: Because there are a lot more W3C tests, don't want to recreate them

jo: And they are authoritative

fantasai: You might want to reimplement ringmark using the testrunner we'll use, and convert the ringmark tests to W3C tests, and just have a W3C implementation of ringmark

tobie: The decision that this group made was to proceed this way

<dom> "High-quality, comprehensive and automated test suites are important interoperability drivers. The CG will compile accompanying test suites for each specification it releases. Where appropriate, the test suites will draw on pre-existing tests developed for the feature’s original specification."

<dom> Core Mobile Web Platform Community Group Charter

tobie: If we start to take tests from elsewhere, and these get included into the test suites of the relevant specifications
... then whoever made those tests will have to give proper licensing to do that
... and we don't want to handle all of the cross-licensing issues

Gavin_: For us to add value to all the different audiences
... Strikes me there's more than just availability of testcases
... That's the nuts and bolts of it, but where can they go to see the iOS level of support?

jo: Should be able to go run the tests yourself, or go to a repo and look up existing results

tobie: The results are getting collected, and some work at W3C to make those displayable
... We want to make that all much more discoverable and package it nicely
... the cool thing about Ringmark is that people can relate to it
... The W3C tests and how they're presented, it's crappy and ugly and no one knows where they are

Gavin_: Ringmark and HTML5test

tobie: the two are very different things

Gavin_: Would this group have more value by focusing on the front end

jo: the conclusion we came to was that it was sufficient for there to be a front end, but this group didn't have to write one itself
... The place to most expeditiously source tests is the w3c repos
... our job is to build the bridge between the tests and the ppl who want to use them
... The FB ringmark front end is their front end

Gavin_: And it's not the Coremob front end. But we could do that.

tobie: If the test runner is done properly, I can add a front end on it easily.

jo: Whether or not this group has its own front end, I think we need a framework you could put a front end on.


<rob_shilston> Travel status for London


tobie: testing...

<dom> Standards for Web Applications on Mobile: current state and roadmap

tobie: dom has a document about testing
... once we've identified the gaps
... the work is to prioritize them
... i like the plan2014 document
... not bothering about writing tests for stuff that's known to be interoperable in existing implementations
... but instead focus on the things which aren't known for this

jo: how do we know where there's interop without tests?

tobie: that's a hard problem
... but there are areas where there's agreement
... such as the html 5 parser

jo: we should record that we have a question here
... although it's desirable to not write tests for things known good
... it's hard to know that without tests

bryan: do we have a way to record which things we think are good

tobie: we should rely on html wg for html
... we should liaise for ietf/ecma
... for http, there's no tests

jo: can we treat http as known working?

Josh_Soref: there are browsers which get http things wrong

tobie: there's an instance in iOS 6 where POSTs are cached

<jo> TFL service status for folks planning to rely on getting to airports etc. on time

tobie: there are areas of http which have surprising behaviors
... where vendors are evolving their behaviors in ways that were unexpected
... the ECMAScript test suite is massive
... i don't know if we want to include it
... and there's pretty good interop
... there's little point in asking them to run those tests, they're already doing well
... you don't necessarily need to join a WG
... but for licensing, it's best if it's a company and not the CG that writes a test

jo: it's desirable that individual contributors create a test to say "this is what i mean that X doesn't work"

tobie: i gave the example of TestTheWebForward with fantasai

jo: do we see a need for a repository for people to contribute outside the w3c provided test repository?

tobie: i think that implies we'd handle the licensing

jo: which would be horrible

dom: there's a requirement that any test contribution be contributed by the contributor under the w3c license

jo: there's an element to "find the WG that owns the test"

tobie: and to get them to accept the test

dom: i agree with finding the group
... i'm not sure i agree on advocacy
... if it's wrong, then it won't be accepted but then we have no reason to push to get it adopted

jo: So if I write a test [...]
... I'd like to see if it passes or fails

tobie: Unless you're doing something special, you should be able to run the test on your own
... Main problem I've seen is that every group has some documentation on how to write and submit a test
... There is no central place where all of this is explained
... Maybe we should just point to what is the best explanation elsewhere

dan_: wrt the relationship between ringmark and w3c tests
... we talked about having ringmark be the front end for w3c, now I understand we need a test runner to be in between
... ... writing the test runner

tobie: effort needed to do that?

<jo> notes that explanation as to how to go about creating and submitting tests to the W3C repo may be something we need to address

tobie: to write the test runner? Depends on how much process you stick on top of it and how good the engineers are
... I would suggest it would take a reasonably good engineer a week to do
... There's a lot of extra complexity in the ringmark runner mainly because it does something we don't want to do anymore
... which is to compile a test page from sources
... the front end itself is ~ 2000 lines of JS
... it's reasonably well-written
... it could be reused pretty effectively
... need to wire it to cross-origin

jo: the existing ringmark tests are not referenced here

tobie: Impossible to know if they're useful until we have gap analysis against W3C tests

jo: So let's do the gap analysis and then see what needs to be ported over from ringmark
... what is the status of the ? that contains ringmark today
... Status is pending further investigation of what tests are covered and which are not

tobie: Best person to answer that question is Robin, he had an action to do the assessment
... His assessment was that the tests are not good enough to spend time analyzing them
... better to assess the W3C tests and see what the gaps are
... He was very polite about it, but said most of the tests were doing feature testing and some of them are really not good

<hptomcat> can we do the gap analysis here and now to avoid food coma? :)

tobie: This has to be from someone else

[ rob_shilston volunteers to look at some of the tests ]

rob_shilston: I'll do half of them

<mattkelly> hptomcat: jetlag+gap analysis = sleep for me

jo: we need a test meister here, who is going to basically knock this into some form of plan
... Can i convert your offer into a greater scope?

rob_shilston: Only certain areas are of interest to us. We're not interested in 2D gaming, so I don't have any knowledge of canvas stuff

<hptomcat> can the gap analysis be crowd-sourced?

rob_shilston: Could say these are areas of tests and ...

<hptomcat> or split into parts and assigned to different people?

rob_shilston: Reading through existing w3c test suites, selecting ones that are pertinent to coremob 2012
... I can do that for the areas that I know about, but hard for me to do areas I don't know about

[ discussion of finding expertise per spec ]

fantasai: Can we do that today, look at the list of specs and split it up?

<hptomcat> +1

tobie: Just need to flatten the list and put names on it
... Talk with Dom, he's done similar work together

rob_shilston: I'll just put a list together then, and we can go over it in 10-15 minutes

tobie: Then we have the gap analysis of W3C tests
... we can do the same with ringmark, and if there are tests there that will fill the W3C gaps, get those tests included in the relevant repositories

dan_: [... outside test suites ]

tobie: Given these test suites would have to be licensed in a way we can use

dan_: Can cover that, maybe look at the license issue later
... Need to identify what's available to be used

tobie: I don't know of other test suites that cover W3C tech and are not public

dom: For audio stuff, the guy behind [some service] has a fairly advanced set of audio testcases that don't use testharness.js, but are more useful and important than what we have today
... I expect that some projects have developed testcases in different formats

tobie: if there's stuff and it's available, we can try to shim it

<rob_shilston> https://docs.google.com/spreadsheet/ccc?key=0At0Ot6R4q4ZadHZ0QWdVakpZWHE0QmdoM3BOMXdsSVE&pli=1#gid=0

tobie: Integrating 3rd party tests is a good thing to do
... either we need to host them ourselves, or we can convince the 3rd parties to use a system that we can use to run the tests directly from their servers
... to convince them to post their results with postmessage
... would make it possible to integrate with coremob and other test suites around the web
... that would be really cool
... If we talk about small projects like Dom mentioned, value for those projects to
... Bring traction to their work
... Write a test suite to get problems fixed, so this is a win-win situation
... wrt Mozilla's tests, can't run them directly from Mozilla's servers

fantasai: no, they're in the mercurial repo

<dom> (the tests I was thinking of are the one from soundcloud https://github.com/soundcloud/areweplayingyet )

mounir: Easy to get ahold of and run, though

fantasai: Ideally would get them cron-synched to the WG repositories, then write shims or slowly convert them
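
Tobie's suggestion above, that third-party suites report their results with postMessage so a runner on another origin can collect them, might be sketched roughly as follows. This is only an illustration: the message shape, the example origins, and the names `ResultMessage`, `IncomingMessage` and `collect` are assumptions, not anything agreed at the meeting or provided by testharness.js. The event is modelled structurally so the logic can also run outside a browser.

```typescript
// Sketch only: message shape, origins and function names are hypothetical.
interface TestResult {
  test: string;               // test name
  status: "pass" | "fail";
}

interface ResultMessage {
  suite: string;
  results: TestResult[];
}

// Structural stand-in for the browser's MessageEvent (origin + data only).
interface IncomingMessage {
  origin: string;
  data: unknown;
}

// Minimal shape check before trusting the payload.
function isResultMessage(data: unknown): data is ResultMessage {
  if (typeof data !== "object" || data === null) return false;
  const d = data as { suite?: unknown; results?: unknown };
  return typeof d.suite === "string" && Array.isArray(d.results);
}

// Accept results only from origins the runner has chosen to trust.
function collect(event: IncomingMessage, trusted: string[], sink: ResultMessage[]): boolean {
  if (trusted.indexOf(event.origin) === -1) return false;
  if (!isResultMessage(event.data)) return false;
  sink.push(event.data);
  return true;
}

// In a browser, the hosted test page could report with something like
//   window.parent.postMessage({ suite, results }, "https://runner.example");
// and the runner could wire up
//   window.addEventListener("message", (e) => collect(e, trusted, collected));

const collected: ResultMessage[] = [];
const trusted = ["https://tests.example"];
const accepted = collect(
  { origin: "https://tests.example", data: { suite: "audio", results: [{ test: "plays", status: "pass" }] } },
  trusted,
  collected
);
const rejected = collect(
  { origin: "https://other.example", data: { suite: "x", results: [] } },
  trusted,
  collected
);
console.log(accepted, rejected, collected.length);
```

Checking the sender's origin before accepting a message is the usual precaution with postMessage, since any page that can reach the runner's window can otherwise inject results.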

[looking at spreadsheet]

tobie: Let's be clear, this is a W3C spreadsheet
... Do ringmark analysis later if necessary
... The authoritative source is W3C. Where that is lacking, then we intervene

<hptomcat> is there an example we can go through?

<hptomcat> a simple spec&test

<fantasai> CSS Style Attributes

<fantasai> CSS Style Attributes Module Test Suite By Chapter

jo: So let's have 4 levels: no tests, poorly tested, widely tested, comprehensive (done)

fantasai: You'll want to have this analysis per spec section, not just per spec

jo: the process is ...
... you open the sheet
... for each line in the summary, assigned to you
... you update your contact name to you
... you copy the template sheet
... rename the sheet to match the spec
... and for each section of the spec
... you assess test coverage
... as a row in your sheet
... and then you put a summary status on the summary sheet
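
The per-section sheet and summary status jo describes above could be modelled along these lines. This is a minimal sketch: the type names, the section labels, and in particular the roll-up rule (a spec is rated by its weakest section) are assumptions; the minutes do not say how the summary status is derived.

```typescript
// Hypothetical model of the coverage sheet; names and roll-up rule are assumptions.
type Coverage = "none" | "poor" | "wide" | "comprehensive";

// The four levels proposed in the meeting, ordered from worst to best.
const ORDER: Coverage[] = ["none", "poor", "wide", "comprehensive"];

interface SectionRow {
  section: string;    // spec section title (one row per section)
  coverage: Coverage; // assessed test coverage for that section
}

// Assumed roll-up rule: a spec is only as well tested as its weakest section.
function summarize(rows: SectionRow[]): Coverage {
  if (rows.length === 0) return "none";
  return rows.reduce<Coverage>(
    (worst, row) => (ORDER.indexOf(row.coverage) < ORDER.indexOf(worst) ? row.coverage : worst),
    "comprehensive"
  );
}

// Placeholder section names, not taken from any real spec.
const exampleSheet: SectionRow[] = [
  { section: "Section A", coverage: "comprehensive" },
  { section: "Section B", coverage: "poor" },
];
console.log(summarize(exampleSheet)); // prints "poor"
```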

<bryan> Scalable Vector Graphics (SVG) Tiny 1.2 Specification

<jo> ACTION: Bryan to supply reasons to reference SVG 1.2 Tiny rather than SVG 1.1 [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action09]

<trackbot> Created ACTION-64 - Supply reasons to reference SVG 1.2 Tiny rather than SVG 1.1 [on Bryan Sullivan - due 2012-10-10].

[ question of HTTP tests, do any exist anywhere ]

[ discussion of data urls ]

[ and http testing ]

<fantasai> http://www.hixie.ch/tests/adhoc/data/

<Gavin_> Standards for Web Applications on Mobile: current state and roadmap

<fantasai> So for cases where W3C doesn't have tests, we should survey vendors to see if they have tests

jo: Tobie and I chatted during the break and decided not to resume on coremob2012 as there's other items that ought to be done first. Once we're clearer on use-cases and requirements, we can revisit it.

dom: Several people were uncomfortable yesterday about the lack of progress on coremob-2012 that had been made. Can we get some feedback?

jo: What we've agreed is that, within a month, the requirements should be established. So, I'm hoping that everyone will be happy if the group is back on track with those by the end of October.

Gavin_: I've a concern that the requirements docs will be redoing work that should have been done in the working groups, and that that'll then need to be revisited. I see us as an umbrella spec and don't want us to get tied into the individual specs

jo: I think the spreadsheet initiated by Matt is the starting point for building the requirements document.
... I think we'll find that the features mentioned there should broadly map to existing specs. I think on a pragmatic basis, the majority of the things we come up with are broadly matched to the specs we're working on
... I think it'll be hard to work with each working group to find their requirement gathering process. Instead, we're producing the requirements for mobile use cases

Gavin_: Do we get the working groups to validate the requirements?

jo: I don't think so.

tobie: Having done some work in this area, I've found that it's difficult to express what exactly we want in the form of use cases. The high level is "we want stuff that works on other devices to work on mobile" - it's hard to have a specific use case for (say) CSS 2.1
... The second problem is the risk of getting stuck in a rat hole to dig into the specs on individual working groups, and that must be avoided. We need to define some high level scenarios / apps, and hopefully that'll avoid the problem.
... And this has to be done in a short amount of time, which also helps avoid Gavin's concerns.
... Overall, until there's a requirements document listing the use cases, then there'll be an ongoing argument as to whether a given spec is included or not.

Gavin_: But surely that's just moving the question

dom: No, because it's then clearer against the coremob charter

jo: I think we've been overly focusing on the standards doc as a primary deliverable, but I think all deliverables are important. We should realise that there's the possibility that there will only be specs for a certain percentage of our requirements, and tests for only some of those specs
... I'm pleased we've got names against actions, clarity over the methodology, and tight timescales against these actions.
... Speaking personally, the group is only six months old and I imagine that as the group continues and learns from its experiences, then the direction may change, which is fine. However, for now, we've got a route and we should follow that for the coming months.
... "You learn to swim by getting into the swimming pool and trying, not by just standing on the side looking at it"
... I'm happy that we're well positioned right now.
... Dom - would you like to comment on where we are?

dom: To me, this face to face has clarified a lot of things, and I now understand the obstacles that exist and the strategy for solving them.
... and there are many WGs in a less strong state.

jo: I'll take an action to summarise this meeting, aiming to have done that within a week.
... I propose that we basically spend the remaining time tidying up, rather than bashing through pending ACTIONs, which we can instead pick up in a forthcoming teleconference.

gmandyam: [Presenting about Vellamo - system level benchmarking from Qualcomm Innovation Center]
... This is a project that's been going on for about four years.
... The first version focussed on web runtime benchmarking
... It's evolved since then, and we're now looking into four broad categories: Page load, user experience (eg touch), video (performance, assessment of simultaneous playback) and device APIs
... We tried to consider multiple parts of the hardware platform - from the CPU, GPU, Modem and multimedia components.
... We use this tool extensively within our organisation. It is publicly available, and we're hoping to make it available as an open source project.
... You can go to Google Play and get Vellamo right now.
... First version (strictly HTML5 testing) included rendering, javascript engine benchmarking (eg SunSpider), user experience tests (automated tests with simulated user interactions for flinging text, images and complex webpages)
... and basic networking (such as 2G and wifi)

dom: How does this compare with EMBC (Embedded Microprocessor Benchmark Consortium)?

gmandyam: We're doing some metal-level testing (eg Dhrystone) which are closer to EMBC whereas we started with HTML5
... Webviews are not the same as the browser on the device, so you always need to test in multiple different ways.
... Our challenge was that, particularly for rendering, we had to be native to do accurate measurements. This is probably an action item we'll have to take from this group to explore how web performance benchmarks can be made available from the browser.
... We do crowdsource data. When people run the benchmarks, there's a bit of variability even on identical hardware platforms.
... This can be run on any Android device - contrary to blogosphere comments it's not just Qualcomm hardware that's supported.
... [Demos the app inside an emulator and shows the metal tests. HTML5 apps take a bit longer to run]
... There's also a series of additional tests that do involve user-interaction, where you start measuring touch screen responses and simultaneous media playback.
... We're willing to make these coremob-compatible tests. But, rendering tests might be hard because of javascript issues and variability in Date() implementations.

Josh_Soref: Web performance has created things similar to Date.now(). If you're just looking for a clock for timing, then it's not really related to the Date() object, and you can use the Navigation Timing API.
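
To illustrate the point about using a timing clock rather than the Date() object, here is a small sketch of a benchmark helper that takes its clock as a parameter. The names `timeIt` and `Clock` are hypothetical. The suggestion of `performance.now()` (from the High Resolution Time specification) in the comment is an assumption about what one would plug in; the minutes themselves only name the Navigation Timing API.

```typescript
// Sketch only: `timeIt` and the injected clock are hypothetical names.
type Clock = () => number; // returns a timestamp in milliseconds

// Run `fn` `runs` times and return the average elapsed time per run in milliseconds.
function timeIt(fn: () => void, now: Clock, runs: number = 1): number {
  const start = now();
  for (let i = 0; i < runs; i++) fn();
  return (now() - start) / runs;
}

// In a browser one might pass () => performance.now(), which is monotonic,
// rather than () => Date.now(), which follows the wall clock and can jump
// when the system time is adjusted.

// Deterministic fake clock so the example behaves the same everywhere:
// every read advances the time by 5 ms.
let ticks = 0;
const fakeClock: Clock = () => (ticks += 5);
console.log(timeIt(() => {}, fakeClock, 1)); // prints 5
```

Injecting the clock keeps the measurement code independent of any one timing source, so the same test could be run against different clocks to compare their variability.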

Gavin_: Is there a browser-based implementation?

gmandyam: No, and we're hoping to work with coremob to deliver that.

dan_: What's the relationship between this test when run in the webview and in the browser?

gmandyam: I don't currently have quantitative information about the performance. We know they run in the browser. This'll also change with Chrome on Android with Jellybean 4.1

dan_: What's the minimum supported Android version?

gmandyam: I don't believe there's anything specific.

dom: Google Play says Android 2.3

jo: Thank you Giri.

<girlie_mac> Showcase demo app URL

<girlie_mac> QR Code

<girlie_mac> Try it on your mobile browser :-)

tobie: I wrote a post on performance issues on the coremob mailing list. John Nealan(sp?) from Nokia said wouldn't it be good to build a real web app to assess the real performance issues. Rebuilding existing apps was discussed, but we decided in the end to build a camera app.

<hptomcat> John Kneeland

tobie: Building an app like this would be a good showcase to developers of what they could do, and how they can use the different specs. We could then explore the gaps and challenges of cross browser implementation

<girlie_mac> warning: it does not work on: < iOS 6, IE, etc

<girlie_mac> tap the "instagram" icon to take a pic

tobie: The TodoMVC project was considered to be a model of how a common app can be built on top of multiple frameworks.
... and so it's hoped that the camera app can in future be built on top of multiple frameworks, as each will also highlight performance issues that become apparent.
... This app covers a lot of use cases whilst being fairly simple. From media capture, swiping, upload, IndexedDB etc. It covers a lot of things that I think the coremob group cares about.

girlie_mac: It's just working on a few devices, mainly because of the media capture API support.

dom: It's a great project and really of excellent use. It seems to work fine in Chrome on Android, but not currently on Firefox. Is this supposed to be developed for cross-browser work, or targeted at a particular browser?

tobie: Tomomi had developed a bit of a prototype, and then we've hacked it into a mobile project over the last four days. So, it's very rough and ready around the edges.
... It's very much a proof-of-concept rather than even an alpha
... the goal is to have this open sourced on the coremob github account.
... We're looking to use Docco. It's a documentation tool and a good way of walking someone through a tutorial in code.

<tobie> docco.coffee

dan_: Going back to the test running framework - doesn't option 2 (which we've chosen to do) prevent the private running of tests?

tobie: Yes, we've not got promises of further resource to actually do the other options.

jo: It should be noted that we're not ruling out doing options three or four, but two is the starting point.

tobie: There's lots of unknowns and analysis that needs to be done - will the W3 JSON API support what's needed? How will the proxying work for some of the more unusual tests?
... Option 4 is definitely the long term desirable, but it's not something that's achievable right now.

dan_: How easy is it to scale from option 2 to option 4?

tobie: There'll be reusable components.

jo: Merits of option 2 is that it's a constrained engineering problem that can be solved with the available resources and can deliver useful results.

tobie: Until the W3 JSON API is enhanced, it's hard to determine whether to pursue option 3 or 4.

dan_: How can we use option 2 with pre-release devices?

tobie: You can't. Your device details will get out. You'll need to work out how to do this, and it'll probably depend on exactly the agreement you've got with OEMs.
... If user-agent can be seen to be visiting websites, then option 2 can work fine by copying the test runner and using it on your own servers.

jo: That's a wrap.

Josh_Soref: I'll try to have the minutes tidied in one week's time.

jo: Follow up meetings - we've been holding teleconferences approximately every fortnight. Shall we continue?

[implicit yes from the group]

jo: Time is 2pm UK time, as that's equally inconvenient for east Asia and west coast US. Should we vary the times?

[implicit 2pm on Wednesdays]

<hptomcat> Asia for the next F2F?

jo: Let's plan to meet again in January.

<hptomcat> since HP is all over the place, i'll try to find out for a location in Asia

Wonsuk: I'll see if Samsung can host in Seoul in late January.

<jo> ACTION: Jo to look at organising the next F2F late Jan in Asia [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action10]

<trackbot> Created ACTION-65 - Look at organising the next F2F late Jan in Asia [on Jo Rabin - due 2012-10-10].

RESOLUTION: Thank Mozilla for generous hosting and excellent catering.

[ Applause ]

RESOLUTION: to thank Tobie for preparing all the documentation in preparation for the meeting.

[ Applause ]

RESOLUTION: to thank the scribes for writing

[ Applause ]

RESOLUTION: to thank the chair for his excellent chairing.

Summary of Action Items

[NEW] ACTION: bryan to find resources to implement what Dom writes in ACTION-60 within 1 month [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action06]
[NEW] ACTION: Bryan to supply reasons to reference SVG 1.2 Tiny rather than SVG 1.1 [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action09]
[NEW] ACTION: Dom to work with Robin on getting the right granularity exposed through the JSON API of the W3C Test Framework [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action03]
[NEW] ACTION: dom to write a wiki page with a breakdown of the tasks required to build the initial test framework [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action05]
[NEW] ACTION: Gavin to start a draft of requirements of what our testing efforts should be [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action01]
[NEW] ACTION: Jo to look at organising the next F2F late Jan in Asia [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action10]
[NEW] ACTION: kelly to find resources to implement what Dom writes in ACTION-60 within 1 month [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action08]
[NEW] ACTION: shilston to find resources to implement what Dom writes in ACTION-60 within 1 month [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action07]
[NEW] ACTION: Tobie to investigate tests that require server side components [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action02]
[NEW] ACTION: tobie to talk to Robin to get cross-origin messages baked into testharness.js/ test report. [recorded in http://www.w3.org/2012/10/03-coremob-minutes.html#action04]
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.135 (CVS log)
$Date: 2009-03-02 03:52:20 $