[![W3C][1]][2]

# Core Mobile Web Platform Community Group Teleconference

## 26 Jun 2012

See also: [IRC log][3]

## Attendees

Present
Robin_Berjon, Jo_Rabin, Josh_Soref, Wonsuk_Lee, Andrea_Trasatti, Andrew_Hubbs, Brian_Kelley, Dan_Sun, Dong-Young_Lee, David_Dehghan, Eunjoo_Lim, Itai_Dadon, James_Pearce, Jet_Villegas, Yan_Yu, Jean-François_Moy, Soohong_Daniel_Park_(Daniel_Samsung), Julian_Shen, Harrison_Wu, Marcos_Lara, Tobie_Langel, Vidhya_Gholkar, Wes_Johnston, Koichi_Takagi, Chihiro_Ono, Robert_Shilston, Tomomi_Imura_(girlie_mac), Mansoor_Chistie, Wai_Seto, Chris_Ramos, Lars_Erik_Bolstad, Yinghau_Tsai, Matt_Kelly, Ming_Jin, Nima_Ghanavatian, Elika_(fantasai)

Regrets

Chair
Jo Rabin, Robin Berjon

Scribe
Josh_Soref

## Contents

  * [Topics][4]
    1. [Testing][5]
    2. [todo today][6]
    3. [Testing Goals][7]
    4. [Vendor Prefixes][8]
    5. [Beyond Level 1][9]
    6. [QoI Testing][10]
    7. [Wrap][11]
  * [Summary of Action Items][12]

* * *

Date: 26 June 2012

Scribe: Josh_Soref

### Testing

darobin: topic for today is Testing Testing Testing
... with maybe a little on vendor prefixing
... yesterday we talked about QoI tests
... conformance tests
... prioritizing interop issues
... testing the untestable
... we had a notion of testing for areas
... "categorizing testing/levels"

[ darobin live edits a text file ]

Robert_Shilston: you might be interested in building a web app that's primarily an audio player
... you might really care about ring 2+3 and only ring 1 of typography

tobie: Robert_Shilston's point goes in the direction of the point that Josh_Soref made yesterday
... leveling doesn't make sense for extra features

Dehghan: polling app developers
... "what features do you need for these themes"

DanSun: we might want a video category

[ Scribe isn't going to transcribe the text file ]

mattkelly: the need to automate tests....

[ chairs bicker at each other over testing the untestable ]

tobie: categorization is useful
... but a goal of this project is to fight fragmentation ...
having a device that's a good fit for some apps and not others
... is a problem
... i want to raise a flag about this

jo: surely it's legitimate to have devices with a specific purpose in mind

tobie: for the vast majority of mobile devices people are interested in
... I'd argue it's less so

jo: say you're building a navigation - car app

tobie: it's not mobile

jo: it's "mobile scoped, not mobile specific"
... rob, why don't you lead us on QoI?

Robert_Shilston: i don't know how to do this
... it's the thing that causes us the most problems:
... browsers not quite behaving right

jo: give us an example

Robert_Shilston: there are 2 examples that sum up the problems
... 1. password field
... if it has lots of DOM elements before it, it hangs when you press backspace
... we attach a DOM listener and clear it if it had one character
... 2. browser crashes if you have a thing to define a schema
... 3. browser clears local storage if you get a large calendar invite
... it took us 6 months to reach what we think is a reproducible test case for that last one

darobin: some of the tests you mention are egregious corner cases of one browser
... hopefully in a single version of the browser
... we could have a test suite for that
... but it would require automation driving
... and it's more in the field of regression testing
... than QoI

tobie: i agree w/ darobin
... you end up w/ test suites targeted at existing browser bugs
... and browser vendors don't like that

Robert_Shilston: absolutely
... and it makes the browsers you build for look like they're the worst
... conformance to spec is something we don't pay attention to
... we need to focus on real devices
... nuances that don't quite work
... we need to deliver now
... waiting for things to improve isn't an option

darobin: conformance testing brings a lessening
... of problems with time
... there's a reason no one's asking about GIFs or Tables

Josh_Soref: only in the last 5 years (gifs were crashing before) ...
(tables may have been problematic more recently)

darobin: performance... not hardware accelerated graphics
... CSS animations
... where the frame rate suddenly drops to 1/5 s
... those are more common
... i think fixing those things can help

Robert_Shilston: i think we're close to the problem of defining what a device is capable of
... and detecting if it's doing well enough
... or doing badly
... we have flags to detect "fastish" or "slowish"
... and vary how much we do based on how fast we perceive the device to be
... that isn't correlated to the absolute performance of the hardware
... it correlates to the browser

darobin: there's a relationship
... part of what we've talked about before wrt QoI
... is whether it's doable
... and people get performance testing wrong most of the time
... I'd like to find out if this group wants to do it
... and has the right resources to do it right

Josh_Soref: i want to praise FT for doing the right thing
... namely to detect performance
... and then adjusting what they do based on it

tobie: among the QoI issues
... are those that i added to the spec yesterday
... asked on and on again by game makers
... speed of canvas
... speed of CSS animation
... multiple sounds together
... latency
... - which is really terrible on some devices
... -- close to a second on some devices
... things which prevent the game industry from building html games

mattkelly: I'd add physics performance
... and GC pauses
... what i was focusing on in Ringmark early
... was page scrolling
... which affects everyone
... I'd assume including FT

darobin: page scrolling performance
... touch responsiveness is delayed to handle clicks

jo: people use native for touch reasons

darobin: it's deliberate and can be hackily disabled

Robert_Shilston: jet: can you talk about testing video output

jet: Mozilla has backdoors into firefox to do testing
... for fps
... for e.g. animations

darobin: there's the Browser Testing and Tools WG

jet: it may well be ...
i haven't seen a proposal from them

darobin: the scope is anything related to testing a browser
... they'd be allowed to produce technology we're not

tobie: we could write a note to that group

darobin: if you have requirements around that
... then talk to them

jet: for our needs, our requirements are largely met
... for this group you want to be able to test across all
... browsers

itai: just wondering if the answer to these tests is highly dependent on the hardware perf
... to test one compared to another
... maybe we need a way to have a combined grade for a hardware platform
... combining memory bandwidth, computing power, ...
... say "I'm a class B platform"

darobin: that's possible, but it's hard
... we talked about yesterday
... to draw a line and say "this is a typical platform"
... on anything like this or better, you need to do this or better
... if you do something piggishly on a high end hardware, good for you
... for feature phones, you can say you're below that

itai: the idea is captured

mattkelly: my opinion is in line with darobin
... we should have a baseline and go from there
... for level 1, 50 sprites @30fps, any phone should run
... even an iPhone 3
... no Device Capabilities are in the fold
... e.g. NFC
... no one is building apps for that

darobin: we're about to get an NFC WG
... i hear interest in this
... how do we make it actionable
... does someone want to pick a baseline hardware
... i want speed of CPU/GPU

bkelley: you can't quantify performance with a couple of numbers
... different architectures
... memory bandwidth
... cache size

darobin: can we cut corners in a way to be meaningful
... we know it's wrong, but good enough for our purposes

bkelley: by establishing that baseline, we exclude devices

tobie: one issue at the bottom of this is whether we can look at a browser outside the device it's running on
... as an end user, i care about how quickly it runs on my browser on my phone ...
they're tied together in a way much deeper than on desktop
... the other aspect is who the audience of these tests is
... for browser vendors, being able to compare matters
... for developers, it matters whether you can build to a phone

mattkelly: 500mhz, no memory
... and completely awesome browser, and does 50fps, and it passes
... maybe we can go w/ numbers for individual target bits
... don't worry about hardware

darobin: say targets for browser-device

Dong-Young: what matters is the combination of browser-hardware

darobin: we can test that
... it just makes more test results

tobie: you can do analysis to compare browsers on 200 different devices

jo: this conversation is going in the direction i want to talk about
... setting a particular hardware spec is the road to ruin
... many a young man has fallen on that road
... it's important to not talk about mobile phone
... say your purpose is to make a "video player"
... it should be testable
... relativistic measures
... are probably the only sensible way of testing
... if i produce a thing and it works abysmally on a device
... it's not useful

mattkelly: I'd argue we need very clear focus
... at least short term
... my opinion is the group should focus on where the market is
... to catch up w/ native
... enable 2d games
... and where people will buy in new markets
... when we hit critical mass
... then it's much easier to talk about more aspirational issues
... focus on current market
... where they're sold and why
... 2d games
... a/v apps
... camera apps

jo: i don't disagree
... I'd say categorizing in a limited and extensible way is a good thing
... i think relativistic measures is a good way

Josh_Soref, you wanted to say target UCs

Josh_Soref: I don't know if it's technically possible to count how many sprites are on the screen in Angry Birds, but a survey of the top N apps in the market, 2d games, video players...
... Top 3 devices, top 10 apps for a thing, see what they're using ...
Maybe 25 sprites at 30 frames per second
... You test at 15 frames, 30 frames, 60 frames
... Figure out how many sounds, test for that
... you build tests so it can test more than the target, so it can report that
... then the tests can naturally scale up
... you can go back and say "This year, we need twice as many sprites"
... we don't need to rewrite the tests, just change the benchmarks
... I don't think it's very hard to do most of this. Might be boring. Might be fun

jo: mattkelly you have done sprite counting, or you haven't done sprite counting?

mattkelly: we did this 8 months ago
... we were building jsgamebench
... we built a 2d game bench
... we launched sprite counting in ringmark about 2 weeks ago
... we measure sprites rendering @30fps
... bare minimum
... high games need @60fps
... but that's rare, even on XBox
... it's definitely testable
... but on devices, push notices inbound can lead to a pause
... causing a fail, same for GC()
... from my perspective, if the pause happens, fail the test anyway
... we're definitely doing sprite counting

tobie: jo, you were asking about type of sprites in a game

darobin: jo was asking if sprite counting was done

tobie: the answer to that was "yes"

jo: mattkelly just answered that at more length

tobie: a point of cory's research for jsgamebench
... was to define types of games and sprites per game
... cards have max of 5 sprites concurrently
... 25 for 2d platform games

jo: action to tobie to chat this into the public domain

**ACTION:** Tobie to provide numbers for required sprites/fps in games [recorded in [http://www.w3.org/2012/06/26-coremob-minutes.html#action01][13]]

Created ACTION-26 - Provide numbers for required sprites/fps in games [on Tobie Langel - due 2012-07-03].

jo: it seems publishing the numbers you're talking about
... it tells developers you need to target this
... and to browser vendors
... the test's job
... is to see if you can do 1fps, 2fps, 6...
... until it barfs ...
at that point, you say "you did 25fps", "but you can't do X/Y/Z @fps"
... that's all it should say, not pass/fail
... but there are external qualifiers
... it doesn't matter if you haven't reached that
... external contemporaneous events on a device
... in the event you get an SMS during audio, what happens
... ok, you can do 60fps
... but what happens to the battery
... there's a range of metrics that are testable
... no Pass/Fail criteria
... but perfectly testable

tobie: cory's jsgamebench
... brought to this discussion
... to have anything smooth enough, you need 30fps
... you don't need more than that, except hard core 3d games
... and less doesn't work
... about Battery
... how badly running a game drains the battery
... it goes back to browser-hardware combo
... good browser on bad hardware
... will have the same perf as a bad browser on good hardware
... but good browser will probably drain the battery less than bad browser
... adding that would be good to test

jo: and you can directly compare to find 'good' / 'bad' browser on a single device

darobin: trying to summarize to reach actions
... anyone want to write tests?
... since you joined this group to do testing

jo: i joined this group to talk about testing

mattkelly: the question is who wants to write these tests
... I'm happy to port over what we've done w/ ringmark

jo: can we reverse out the underlying bits
... to codify the tests we want to accomplish

mattkelly: we've done a bit of research for jsgamebench

an interesting study on browser battery consumption: [http://www2012.org/proceedings/proceedings/p41.pdf][14]

mattkelly: GC pauses can be guessed based on dramatic framerate drops

vidhya: what's a GC pause

**ACTION:** mattkelly to document JSGameBench and the approach behind it [recorded in [http://www.w3.org/2012/06/26-coremob-minutes.html#action02][15]]

Sorry, couldn't find user - mattkelly

mattkelly: sorry, Garbage Collection pause

Josh_Soref: GC pauses run a bit on the main thread ...
historically heavily there, recently less so

mattkelly: for