Minutes: WSC WG face-to-face 30 May 2007

Minutes from our face-to-face on 30 May were approved and are
available online:

  http://www.w3.org/2007/05/30-wsc-minutes

Regards,
-- 
Thomas Roessler, W3C  <tlr@w3.org>



   [1]W3C

                              WSC WG face-to-face
                                  30 May 2007

   [2]Agenda

   See also: [3]IRC log

Attendees

   Present (in order of registration)
          Thomas Roessler
          Mary Ellen Zurko
          Daniel Schutzer
          Johnathan Nightingale
          Phillip Hallam-Baker
          Hal Lockhart (by phone)
          Mike Beltzner
          Shawn Duffy
          Tyler Close
          Serge Egelman
          Maritza Johnson
          Luis Barriga
          Robert B. Yonaitis
          Audian Paxson
          Stephen Farrell
          Tim Hahn
          Stuart Schechter (by phone)
          Rachna Dhamija
          George Staikos (by phone)
          Bill Doyle
          Yngve Pettersen

   Chair
          Mez

   Scribe
          stephenF, beltzner, tjh

Contents

     * [4]Topics
         1. [5]Agenda Bashing
         2. [6]Conformance model
         3. [7]Intermediate agenda bashing
         4. [8]Robustness Testing
         5. [9]security usability testing
         6. [10]implementation / testing / etc
         7. [11]functional and conformance testing
         8. [12]InScopeByCategory
         9. [13]day One wrap-up
     * [14]Summary of Action Items
     __________________________________________________________________

Agenda Bashing

   agenda bashing

   next f2f discussion moved to later on

Conformance model

   tlr on what we need to do for our specs

   tlr: what does it mean for us to recommend something?
   ... wants us to make testable statements ...
   ... by structuring the rec text into requirements, good practice and
   implementation techniques ...
   ... also need to specify assumptions, e.g. icon display doesn't ...
   ... work with screen reader, have to say that visual display ...
   ... is an assumption ...

   robert: wants no device dependence
   ... we should assume everything is usable device independent including
   for accessibility, mobility etc

   tlr: yes, but some implementation techniques are device dependent and
   we do want to document those
   ... WCAG does content accessibility guidelines...
   ... but we're not talking so much about content, more about user agents
   (chrome etc)

   robert: US govt. accessibility rules != w3c ones

   tlr: let's consider some rec-text as a group now...
   ... the stuff about favicons.

   1st rec is that sites should not incorporate favicons at this time

   phb: for display that informs trust decisions, only display
   authenticated information

   text here is at:
   [15]http://www.w3.org/2006/WSC/drafts/rec/#favicon-favicons-rec

   beltzner: who are we expecting to read/conform to REC?

   (basically are we addressing UAs and/or sites)

   phb: some types of site may pay attention (e.g. FIs)

   mez: charter allows us to include both

   serge: any evidence about favicons being trusted?

   maritza: yep

   tyler: still in the dark about what's a useful REC, hard to have
   abstract discussion
   ... current bank practice shows that FIs may drag their feet anyway

   yngve: maybe move favicon to somewhere user normally doesn't trust?

   shawn: favicons being used, how likely is it that something else would
   be accepted?

   mez: back to what needs to be in the text for FPWD?

   tlr: shows that there are lots of favicons in use
   ... shows example of padlock favicon
   ... this applies to UAs that display bitmaps
   ... and for which favicon uses (e.g. bookmarks, desktop, location
   bar...)
   ... address bar is where favicon shows where we are now

   mike: tlr is asking us to be very specific about MUST/SHOULD/etc

   tlr: level of abstraction needs to be right, current text very far from
   being specific enough

   tyler: would like to document stuff we're planning to do to get
   feedback

   mez: we can do that

   tlr: says what REC means

   tyler: we're not competent yet to do MUST/SHOULD

   tlr: ok to use MUST/SHOULD in FPWD even if it's likely to change (don't
   worry)

   mez: FPWD is important to get attention/feedback, important that status
   is clear

   rob: MUST/SHOULD/MAY all exist?

   tlr: yes, we can use rfc 2119 or something else

   audian: back to tlr's bullets, is what tlr said what we want?

   tlr: types on screen

   phb: text equivalent is <title>
   ... lots of chat...

   serge: we need text about definitions, e.g. saying what we mean...
   ... when we say "verified sites"

   mez: we have a glossary

   phb: different logos should have different authentication types/levels

   <ses> Hi. Stuart isn't really awake right now but he'll be recording
   what he sees in the jabber room until he wakes up (most likely for the
   post-lunch discussion.)

   break for n/w

   <beltzner> tlr, [16]http://beltzner.ca/webdav/forthomas.txt

   <beltzner> Mez_,
   [17]http://www.w3.org/2002/09/wbs/39814/f2f3sched/results

   <Mez_> serge and sduffy, here is the glossary, which should include
   chrome

   <Mez_> [18]http://www.w3.org/2006/WSC/wiki/Glossary

   back now...

   tlr: postpone rec formatting discussion until usability testing

   tyler: why change the template?

   tlr: to be able to have conformance requirements
   ... tlr typing a new template..
   ... new template's most important bits are applicability, requirement
   and techniques
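
   As a rough illustration of such a template, the sketch below models
   one recommendation as a record with applicability, requirement and
   techniques fields. The class name, field names and the sample wording
   are hypothetical and are not the group's agreed text; the keyword
   list is only a subset of the RFC 2119 terms.

```python
from dataclasses import dataclass, field

# Subset of RFC 2119 requirement levels (assumption: the group uses RFC 2119).
RFC2119_LEVELS = ("MUST", "MUST NOT", "SHOULD", "SHOULD NOT", "MAY")


@dataclass
class Recommendation:
    """Hypothetical record for one recommendation in the new template."""
    applicability: str            # who or what the statement applies to
    level: str                    # RFC 2119 keyword
    requirement: str              # the testable statement itself
    techniques: list = field(default_factory=list)  # implementation techniques

    def __post_init__(self):
        # Reject anything that is not a recognised requirement level.
        if self.level not in RFC2119_LEVELS:
            raise ValueError(f"unknown requirement level: {self.level!r}")

    def statement(self):
        """Render the conformance statement as a single sentence."""
        return f"{self.applicability} {self.level} {self.requirement}"


# Sample wording is a placeholder loosely based on the favicon discussion.
favicon_rec = Recommendation(
    applicability="User agents that display bitmaps",
    level="SHOULD",
    requirement="display only authenticated information where the display "
                "informs trust decisions",
    techniques=["hypothetical technique: omit favicons from the location bar"],
)
print(favicon_rec.statement())
```

   Keeping the requirement level as a separate field is one way to make
   each statement individually testable and easy to revise after FPWD.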

   <tlr> ACTION: thomas to update template with material from discussion;
   notify e-mail list [recorded in
   [19]http://www.w3.org/2007/05/30-wsc-minutes.html#action01]

   <trackbot> Created ACTION-227 - Update template with material from
   discussion; notify e-mail list [on Thomas Roessler - due 2007-06-06].

   <tlr> [20]http://www.w3.org/2002/09/wbs/39814/f2f3sched/

   for next f2f - fill in questionnaire before lunch (next 2 hrs)

   <beltzner> scribe is beltzner

   <beltzner> tlr, ^ make that happen

   <johnath> scribenick beltzner

   <johnath> does that do it?

   <tlr> ScribeNick: beltzner

   <tlr> Chair: MEZ

Intermediate agenda bashing

   (there is some agenda bashing happening as we mitigate for
   technology-availability)

   Mez_: call to order, Rachna to start session on "Usability Testing"

   <Mez_> tlr, please action rachna to share the slides somehow; tx

   <scribe> ACTION: rachna to share slides about usability testing from
   dublin f2f [recorded in
   [21]http://www.w3.org/2007/05/30-wsc-minutes.html#action02]

   <trackbot> Created ACTION-228 - Share slides about usability testing
   from dublin f2f [on Rachna Dhamija - due 2007-06-06].

   Mez_: agenda reordering: s/Rachna/BillD

Robustness Testing

   billd: Robustness Testing
   ... current browser environments from a user standpoint bring a bunch
   of technologies together ...

   <Mez_> tlr, please action bill to share his slides too. tx.

   <scribe> ACTION: bill to share his slides on robustness testing from
   the dublin f2f [recorded in
   [22]http://www.w3.org/2007/05/30-wsc-minutes.html#action04]

   <trackbot> Created ACTION-229 - Share his slides on robustness testing
   from the dublin f2f [on Bill Doyle - due 2007-06-06].

   rachna: do we have a definition for robustness?

   Mez_: no, one should be added to the glossary

   <scribe> ACTION: bill to define robustness for WSC glossary [recorded
   in [23]http://www.w3.org/2007/05/30-wsc-minutes.html#action05]

   <trackbot> Created ACTION-230 - Define robustness for WSC glossary [on
   Bill Doyle - due 2007-06-06].

   <Mez_> tx beltzner

   bill: in IT circles, a "tiger team" was used to test effectiveness of
   IT security measures ...
   ... one side would attack, another side would try to detect and
   evaluate effectiveness of process and procedures being tested ..

   Mez_: I see some potential overlap between items to be tested for
   robustness vs. user understanding or usability ...

   <Mez_> things that web content can do that exactly emulates the
   security context information displays in our technical report are
   "pure" robustness attacks

   <Mez_> they will leverage either user agent vulnerabilities, or design
   gaps or issues in the user agents

   bill: the attacking team might go after the OS, plugins, user agent,
   network layer

   <stephenF> tlr has the magic zakim trick for the phone

   <tlr> stephen, context?

   bill: your browser actually sends a lot of information when you visit
   websites

   <stephenF> tlr - Mez wants the phone on

   bill: [demonstrates metasploit]

   <Mez_> jan vidar, is that you?

   <Mez_> can you hear us?

   beltzner: this seems to be out of scope, though, since we're talking
   about user agent and system exploits

   bill: well, exploits are out of scope, but the UI for security context
   is in scope

   beltzner: in so far as the user agent isn't exploited, yes

   bill: so if the patches are out of date, at the user agent, OS or
   plugin level, and the user is at risk

   PHB: so, this used to be a concern with things like macromedia, where
   the browser allowed plugins to take control easily and thus become
   exploited

   <Zakim> tlr, you wanted to ask what our question is

   tlr: our deliverables are about how browsers and site authors should do
   things, and I wonder what are we asking of a robustness testing
   process?

   rachna: it's hard to enumerate the robustness tests in advance, really

   <tlr> beltzner: should we have indicators that help people ensure they
   have the latest browser?

   rachna: thus far we've talked about security indicators in chrome about
   the web content, but not indicators about ensuring that the user is
   running with an unexploited browser
   ... is that in scope?

   Mez_: or in our goals? I don't have an immediate reaction

   ???: I'm hearing a lot of "the user can't determine", and want to
   remind people that the user isn't an administrator or security
   professional

   sduffy: this feels like an odd road to me, in terms of whether or not
   the user has an up-to-date browser

   rachna: but I think the up-to-date-ness has a larger impact on security

   Rob: when we look at testing (robustness, user, etc) we're talking
   about user-agents ...
   ... users are doing a lot of different things in the content-area, not
   just content, but applications, and sometimes applications in those
   applications ..
   ... shouldn't we be breaking out our testing per area/recommendation
   developed by this group?

   Mez_: The reason we have three one-hour discussions about testing is to
   lead into planning those three categories of testing.

   stephenF: if you're talking about application updates and such,
   there's an IETF working group on network endpoint assessment (NEA);
   they'll address most of these issues

   <tlr> [24]http://www.ietf.org/html.charters/nea-charter.html

   sduffy: my main objection to including user agent updates in scope is
   that it doesn't end up solving the problem since the OS could be out of
   date

   tlr: I don't even understand what it means for us to recommend that a
   user has the latest user agent, since the user isn't the compliance
   target

   <tlr> ... or is he?

   tlr: we have a bunch of items that describe current robustness
   practises that have not been migrated to the format required by our
   recommendation template
   ... in my opinion that should be a priority so that we can recommend
   appropriate robustness tests

   Mez_: that's not what we're talking about, IMO, so I'll call that out
   of scope for the current conversation as is all of browser-updating

   bill: how about the second issue of user agents divulging information
   about the user?

   Mez_: how does that fit in our scope?

   beltzner: it does in that if the user doesn't mean to provide
   information to non-trusted websites, and isn't aware of the information
   being provided by default

   Mez_: if we make a recommendation about privacy, then we should ensure
   that the recommendation is robust
   ... until we make a recommendation about that, though, discussing how
   to test its robustness seems wrong
   ... more conversation about scope definition ...

   tlr: what I hear you (Rob) saying is that robustness testing can help
   us identify weaknesses in all sorts of web applications, and while
   that's valuable, it's not within the charter of this working group

   <johnath> :)

   tlr: continues his point, driving it home with the force of a pile
   driver ...

   Mez_: anything else to share, billd?

   bill-d: these were the points I wanted to raise

   sduffy: we decided that some aspects of web apps were in scope ...
   ... there's a difference between SQL injection and website
   vulnerability where clicking on a URL results in XSS/untrusted content
   ... the former is out of scope, the latter is in scope ...
   ... so web-apps as a whole aren't out of scope, are they?

   johnath: it feels like we have enough work testing the robustness of
   user facing display of web security context
   ... so I'm excited enough without taking on extra worries about user
   agent/plugin/OS robustness

   bill-d: I'm still trying to lock down discussions of what's
   in-scope/out-of-scope, which is why I wanted to bring this up again

   Mez_: well, now you know: it's testing the recommendations, not the
   entire system

   bill-d: still not clear where we are on divulging of information to
   websites

   beltzner: that's not part of our problem statement yet, let alone our
   goals, let alone ...

   Mez_: proposes an action on starting a discussion about the information
   divulged to websites by user agents?

   tlr: IMO, it stretches the notion from communicating web security
   context to one about privacy

   johnath: I think it's doomed for other reasons, but I don't think that
   precludes the discussion on the group

   <scribe> ACTION: bill to start a discussion about including
   descriptions of the information divulged to websites by user-agents
   [recorded in
   [25]http://www.w3.org/2007/05/30-wsc-minutes.html#action06]

   <trackbot> Created ACTION-231 - Start a discussion about including
   descriptions of the information divulged to websites by user-agents [on
   Bill Doyle - due 2007-06-06].

   <johnath> meeting adjourned till 12:50pm local for lunch

   <ses> When does local lunch end?

   <johnath> ses: 5 minutes left

   Mez_: Rachna, take it away!

   <johnath> ses: starting back up now

   <tlr> jvkrey, we're restarting

security usability testing

   rachna: so, I'll start by defining what I mean by usability testing ...
   ... traditional security methodology of robustness is good but not
   sufficient ...
   ... HCI methodology isn't sufficient since the attackers are modifying
   along with us ...
   ... so I propose "red team" usability testing, where we actively attack
   the user
   ... so both "can we use the system" and "how can we attack the user to
   confound them"
   ... I have a bunch of questions
   ... 1. Will we test ideas or specific implementations of ideas?

   <ses> I just joined

   Mez_: how would we test a concept?

   rachna: so for example, we could test a variety of implementations
   instead of a specific one

   <maritzaj> [26]http://www.w3.org/2006/WSC/wiki/SharedBookmarks

   tyler: the tricky thing is once they make their implementation, they
   stopped testing the concept

   <maritzaj> #2 under usability studies about internet security

   <Mez_>
   [27]http://www.simson.net/ref/2006/CHI-security-toolbar-final.pdf

   serge: instead of testing a toolbar, the study in question (see
   maritzaj's link) tested the effectiveness of each indicator

   <ses> Is anyone else on the phone? I could barely hear Rachna and I
   can't hear Serge at all.

   ses, sec, I'll move the phone

   <ses> And he's got the phone so we should be able to hear him :)

   beltzner: one could test the concepts on which a design is founded,
   instead of the design itself

   maritzaj: the answers will likely vary per recommendation

   Mez_: so I don't understand how we'd decide whether to test a concept
   or a design

   rachna: yeah, I think we'll need to figure that out

   tyler: part of this might be recognizing patterns in our
   recommendations and test abstractions that cover aspects in each
   recommendation
   ... to what extent do you think we can/should rely on the literature
   instead of retesting some of those findings?

   johnath: we have a huge body of research that has led to some of these
   recommendations, I think it should be up to us to point to the
   foundation and identify areas for follow up testing

   rachna: well, where huge = from 2005 onwards
   ... 2. At what level of fidelity should we be testing?

   <ses> Calling the existing body of research huge is the kind of
   statement that could lead a small research area to turn downright
   anorexic.

   rachna: low-fi prototyping is sketches, medium-fi is flash or web
   mockups, high-fi is extensions or browser modifications

   maritzaj: previous research will come into play as references

   rachna: using lower-fidelity prototypes will increase our bandwidth
   ... 3. What should we be testing?
   ... learnability, efficiency, skills required, flexibility,
   satisfaction, errors, compliance rates

   Mez_: remembers a study, sort of, that might be about how instructions
   affected the results ...

   maritzaj: the Jackson/MSR one?

   Mez_: yes! and I haven't heard anything about that dimension

   rachna: I think you're referring to a problem that exists in that they
   had to describe EV certificates to one group

   tyler: yes, but I thought they controlled for that

   <PHB> PHB thought he was on the queue before, had disconnected when I
   powered up the VPN

   Audian: often helps to start with a list of assertions and
   verify/validate those first, then move onto low-fi mockups, and use
   those for the validation

   rachna: yes, that's an excellent way to do designs

   <ses> Can't hear Phil.

   <ses> I can't stay awake without some cursing in my direction.

   <ses> :)

   <Audian> what = a quality test base?

   <Audian> statistical relevance

   rachna: we need to identify the goals of a study, as well: why do users
   behave the way they do? what are users reliably capable of performing?
   does technology X protect against attack A? etc, etc.

   Mez_: 100% usable security is a dream, not a goal, and we should make
   sure that the studies aren't tasked with finding the 100% solution ...
   ... people who believe in a recommendation will always argue that the
   hit rate is "good enough", though, which worries me.

   tyler: do we want to provide a target hit rate, then?

   <Zakim> johnath, you wanted to respond to Mez, tyler, on quality of
   data

   johnath: it's easy to say "20%" is worse than "40%", but the trick will
   be showing statistical significance in the effectiveness of the
   recommendation
   ... the test you want is "this creates _an_ improvement"

   johnath: at which point we can defend the assertion
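
   To make the point about statistical significance concrete, here is a
   minimal sketch of a two-proportion z-test using only the Python
   standard library. The function name and the sample counts are
   illustrative assumptions, not data from any study discussed here; a
   real study would pick its test and sample size as part of its design.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two-sided p-value) for H0: both groups share one hit rate."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled rate under the null hypothesis that the groups do not differ.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 20% vs 40% "hit rate" with 20 and with 200 users per group.
z_small, p_small = two_proportion_z_test(4, 20, 8, 20)
z_large, p_large = two_proportion_z_test(40, 200, 80, 200)
print(f"n=20 per group:  z={z_small:.2f}, p={p_small:.3f}")
print(f"n=200 per group: z={z_large:.2f}, p={p_large:.6f}")
```

   With these made-up numbers the same 20% vs 40% gap is not significant
   at the 0.05 level with 20 participants per group, but is with 200,
   which is why the raw percentages alone do not settle the question.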

   <Mez_> better for the majority of web users

   beltzner: once a recommendation is established as significant, if it
   competes against another recommendation, then we can do a comparison
   test

   tyler: so what does it take for something like that to get put into
   Mozilla?

   beltzner: bring it forward to the Firefox product team (it's an open
   meeting) and propose it; in my experience, you need to prove the worth
   and value for the majority of the web

   rachna: should we test in-lab or in the wild?

   tlr: do we need to answer this now, or leave this to you?

   rachna: we can come back to this later, but it depends on the
   resources?

   PHB: if we can work out a way of doing it in the wild, it would be much
   better

   <Mez_> +1 to phb

   <Mez_> from an industry pov

   PHB: oftentimes in-lab participants already know something about
   security

   tyler: this intersects with the fidelity; deployable add-ons are more
   easily tested "in the wild"

   maritzaj: tradeoffs for both, in-wild experiments aren't easy to
   organize

   <ses> Everyone forgot to project. I haven't heard any of the speakers.

   <ses> Beltzner and Rachna are coming in clear.

   dan: in lab can be used to filter and then in the wild can be used to
   test larger populations

   tyler: bias is obviously a concern, but studies (like rachna and
   stuart's) have shown a non-significant correlation between bias and
   effectiveness

   rachna: well, that doesn't mean it didn't exist, since we were
   controlling for it, not testing for it

   serge: we didn't find any correlation either
   ... we discussed some of these issues at CHI, and the differences
   seemed to break down as wild being good for quantitative, lab being
   good for qualitative

   tim: there are various audiences for the user agents which will affect
   the demographics of in the wild testing

   dan: we could do tests through deploying in various banks, etc, to get
   a good cross section

   Audian: I've often seen domain experts being far more critical of a
   solution until the "would you recommend this to a friend" at which
   point they all decided they would

   rachna: another complication is that each proposal can be attacked in a
   different way

   stephenF: how do you figure out the workload?

   rachna: depends on the attack, and whether or not there are
   easily-reused exploits we can copy and paste

   stephenF: there's some shortcuts we can take

   PHB: we should be tactical about some things, like we know that w.r.t.
   phishing, a takedown service shifts the problem to another target, but
   doesn't kill the overall problem
   ... or in crypto, adding a bit to the key doubles the work for the
   attacker
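
   The crypto example is simple arithmetic, sketched below under the
   assumption that exhaustive key search is the attacker's best option
   (true for an ideal symmetric cipher, not for schemes such as RSA).
   The function names are illustrative.

```python
def keyspace_size(key_bits):
    """Number of possible keys for a key of the given length in bits."""
    return 2 ** key_bits


def expected_guesses(key_bits):
    """Average guesses for exhaustive search: half of the keyspace."""
    return keyspace_size(key_bits) // 2


for bits in (56, 57, 128, 129):
    print(bits, "bits ->", expected_guesses(bits), "guesses on average")

# Each extra bit doubles the attacker's expected work.
assert expected_guesses(57) == 2 * expected_guesses(56)
```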

   <PHB> Well it would be rather nice to know if the solution was intended
   to be tactical or strategic

   tlr: does this imply a different set of scenarios or can we just rely
   on our existing ones?

   <PHB> Rather a lot of solutions being sold a year ago as strategic
   turned out to be distinctly tactical

   tyler: our recommendations should each address the threat that they
   attempt to defeat

   <Mez_> [28]http://www.w3.org/2006/WSC/wiki/ThreatTrees

   <PHB> Sitekey

   rachna: does that mean modifying our threat trees to include these (on
   her slides) attacks?

   tyler: I think it would be a different section

   rachna: I'm very confused

   dan: we need to make sure the recommendations are significant at
   attacking the problems (or something? help?)

   rachna: another question is do we want to discover the usability
   problems, or do we want to assert significant effect?

   tlr: that is a question which we should try to answer today

   <johnath> actually

   <johnath> Serge is saying it

   serge: my feeling is that we should be trying to assert effectiveness

   <johnath> oh okay...

   maritzaj: small N might be useful in the early stages as we try to
   filter down

   <Mez_> +1 to johnath

   johnath: sprinkles +1s all over the PhD students

   serge: I'm on a grant, we might be able to get similar resources from
   other groups

   <Zakim> Mez_, you wanted to tell serge that we're testing
   recommendations

   Mez_: we're gonna be testing our recommendations, so you should be
   testing those

   <johnath> serge: I think Mez misinterpreted your "I have a grant to
   work on this stuff" to mean "I have stuff that I am already testing,
   that maybe I haven't mentioned here yet" instead of "I have money, I
   can probably help out with testing our recs."

   <serge> no, what I meant is that recommendations can be implemented and
   tested

   <johnath> right, yes, but not that you were just incidentally
   mentioning unrelated thesis work :)

   rachna: testing requires IRB approval for me, this adds to overhead

   Mez_: the W3C staffers won't be doing the testing, so there's no
   MIT/IRB requirement

   <serge> hey, if the recommendations can be included in that, great.
   Likewise, those in the corporate world have a vested interest in making
   the user studies happen because the results can be incorporated into
   products.

   <ses> Many IRBs demand to be involved if anyone in their institution
   will be an author on findings. If the note is considered a finding,
   that could cause issues.

   <PHB> No human subjects oversight? time to redo Milgram?

   <Mez_> tim andI would never redo milgram

   <johnath> Mez is too nice for Milgram

   <johnath> aww

   <Mez_> ses: author/editors only I presume; I pesume acknowledgements
   wouldn't be an issue

   <ses> I'm the one who couldn't spell.

   <Mez_> hahaha

   <tlr> ses, MEZ, I'd think it's most useful to aim at scholarly
   publication of original results and summarize / cite that in the W3C
   deliverables

   <ses> My spalling suks.

   <serge> I'm currently working on a related study, if anyone's
   interested I can summarize it

   <Mez_> we're running a bit long; how about in email serge?

   <serge> okay, in that case I might just wait until I have more results.

   <Mez_> what's the eta on that?

   <serge> maybe 2-3 weeks.

   <Mez_> ok, sounds just fine

   <Mez_> beltzner; want to give serge the action item?

   <scribe> ACTION: serge to share results from his study once he has them
   [recorded in
   [29]http://www.w3.org/2007/05/30-wsc-minutes.html#action07]

   <trackbot> Created ACTION-232 - Share results from his study once he
   has them [on Serge Egelman - due 2007-06-06].

   <tlr> ACTION: rdhamija2 to make sure Jagatic et al on social phishing
   is in SharedBookmarks [recorded in
   [30]http://www.w3.org/2007/05/30-wsc-minutes.html#action08]

   <trackbot> Created ACTION-233 - Make sure Jagatic et al on social
   phishing is in SharedBookmarks [on Rachna Dhamija - due 2007-06-06].

   <tlr> ACTION-232 due 2007-06-30

   <serge> 6/6 is not 2-3 weeks, unless we're on some crazy new calendar
   system...

   <Mez_>
   [31]http://www.indiana.edu/~phishing/social-network-experiment/phishing
   -preprint.pdf

   <tlr> serge, see my remark about due date

   <Mez_> is the jagatic study, and it's in our shared bookmarks

   <serge> thanks

   <Mez_> beltzner, give rachna an action on the ebay www2006 jakobsson
   paper; I can't find it in shared bookmarks or on the web

   <scribe> ACTION: rdhamija2 to add www2006 jakobsson, Florencio &
   Hursley MSR paper to our shared bookmarks list [recorded in
   [32]http://www.w3.org/2007/05/30-wsc-minutes.html#action09]

   <trackbot> Created ACTION-234 - Add www2006 jakobsson, Florencio &
   Hursley MSR paper to our shared bookmarks list [on Rachna Dhamija - due
   2007-06-06].

   <Audian> battery almost dead

   tyler: so it looks like there's two in the wild tactics: actively
   attack and measure effectiveness, or instrument existing browsers or our
   solutions

   <tlr> ACTION-234 confuses several papers

   <tlr> Jakobsson Ratkiewicz is what I meant, it's from WWW 2006.
   [33]http://www2006.org/programme/item.php?id=3533

   <Mez_> tlr, give whatever other actions are needed

   beltzner: it seems to me like active attacks are way of validating our
   threat trees, not our solutions

   tlr, want to take the action from rachna? I'm sure she wouldn't mind

   bill-d: if we make changes to the user experience, how do you test in
   the wild?

   rachna: right, and that's tough in lab as well, since sometimes users
   need to be trained, or the act of them being in the lab ends up
   training them

   Audian: there's ways of doing it almost at random by pulling people
   aside in schools and malls

   stephenF: possibility that some of the "hits" are false positives of
   users entering wrong passwords on purpose

   <Mez_> tlr or beltzner - an action on rachna to update timeline with
   things like irb turnaround

   <Mez_> please

   rachna: once we have proposals, we need to enter low-fi prototyping
   phase, then figure out what we're trying to prove, set up the studies,
   set up the infrastructure, etc, and this all requires resources and
   time

   <scribe> ACTION: rdhamija2 to update / create a user testing timeline
   with things like IRB turnaround, setup, etc. [recorded in
   [34]http://www.w3.org/2007/05/30-wsc-minutes.html#action10]

   <trackbot> Created ACTION-235 - Update / create a user testing timeline
   with things like IRB turnaround, setup, etc. [on Rachna Dhamija - due
   2007-06-06].

   serge: there have been cases of people in studies entering real
   information, in counterpoint to stephenF

   maritzaj: reiterating an earlier point, we should tie references in the
   shared bookmarks to recommendations

   <maritzaj>
   [35]http://www.w3.org/2006/WSC/wiki/StatusQuoUserStudyResults

   Audian: how do we know who's submitting resources for testing

   <scribe> ACTION: rdhamija2 to track donations of time and resources for
   usability testing [recorded in
   [36]http://www.w3.org/2007/05/30-wsc-minutes.html#action12]

   <trackbot> Created ACTION-236 - Track donations of time and resources
   for usability testing [on Rachna Dhamija - due 2007-06-06].

   <scribe> ACTION: maritza to drive process of tying recommendations to
   references in SharedBookmarks [recorded in
   [37]http://www.w3.org/2007/05/30-wsc-minutes.html#action13]

   <trackbot> Created ACTION-237 - Drive process of tying recommendations
   to references in SharedBookmarks [on Maritza Johnson - due 2007-06-06].

   <scribe> ACTION: rhdamija2 create and document user testing plan (with
   links to timeline, donations, prototypers, etc) [recorded in
   [38]http://www.w3.org/2007/05/30-wsc-minutes.html#action14]

   <trackbot> Sorry, couldn't find user - rhdamija2

   <scribe> ACTION: rdhamija2 create and document user testing plan (with
   links to timeline, donations, prototypers, etc) [recorded in
   [39]http://www.w3.org/2007/05/30-wsc-minutes.html#action15]

   <trackbot> Created ACTION-238 - Create and document user testing plan
   (with links to timeline, donations, prototypers, etc) [on Rachna
   Dhamija - due 2007-06-06].

   <ses> I'm signing off. I've had too hard a time keeping up and staying
   focused given the quality of the call. (I also need to drive into work
   at some point.)

   <johnath> break until 2:30 local time (12 minutes)

implementation / testing / etc

   beltzner: big believer in prototyping, sketching, whiteboarding - to
   build wireframes
   ... use that as a way of expressing things better than text
   ... allows communication and discussion
   ... but once that finishes, there should be no limits on what
   technology is used.
   ... should be something that enables testers to get what they want out
   of it - HTML, Flash, Firefox extensions
   ... all the way to an installable browser client which can be
   downloaded.

   tlr: might include changes to other browsers (e.g. Opera)
   ... on the one hand - prototypes for testing
   ... on the other hand - things taken up by user agent implementers.
   ... what recommendations are sufficiently spelled out to allow for
   implementation?
   ... what more do the browser vendors need to understand for each of the
   recommendations?
   ... what are the reasonable expectations for how long it would take to
   implement?

   beltzner: kind of putting the cart before the horse ... lets get the
   prototypes available so that anyone can run them (any browser vendor
   included)

   tyler: in terms of time it takes - have experience with add-ons for
   Firefox and IE.
   ... much easier (in Tyler's experience) in Firefox than in IE.
   ... IE requires use of the COM libraries
   ... Firefox lacks some documentation (source browsing required in some
   cases to understand Firefox operation)
   ... IE has support for .html and .hta - where HTA provides ALL window
   format to be under the control of the HTML (HTA) file. This allows
   testing out of toolbars and such.

   mez: not sure this covers all the testing types

   serge: everything that requires attacking the user will require a HIGH
   fidelity prototype

   RobY: hoping that what we want to test for can be tested
   programmatically
   ... all should be programmatically testable (i.e. in a high-fidelity
   agent)

   serge: clarify "doing tests" vs. "doing studies"

   RobY: example: assure that a C# file did not contain X ... this can be
   done by a program testing for this.
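
   A minimal sketch of the kind of programmatic check RobY describes,
   assuming the forbidden construct "X" can be expressed as a regular
   expression. The function name find_forbidden, the sample C# text and
   the pattern "Eval(" are hypothetical stand-ins, not something the
   group specified:

```python
import re


def find_forbidden(source_text, pattern):
    """Return (line_number, line) pairs where the forbidden pattern occurs."""
    regex = re.compile(pattern)
    return [
        (number, line)
        for number, line in enumerate(source_text.splitlines(), start=1)
        if regex.search(line)
    ]


# Hypothetical C# fragment; "Eval(" stands in for the "X" in the minutes.
sample = "class Login {\n    void Run() { Eval(userInput); }\n}\n"
hits = find_forbidden(sample, r"\bEval\(")

# An empty list would mean the file passes the check.
for number, line in hits:
    print(f"line {number}: {line.strip()}")
```

   A check like this only covers statements that can be verified
   mechanically against a file; it says nothing about the user studies
   serge distinguishes from tests.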

   mez: hold off on questions on how to do conformance testing for a half
   hour.
   ... have been assuming that someone will be testing EVERY
   recommendation we produce

   <serge> so how do we make claims about the recommendation without doing
   studies?

   tyler: the model is likely to be the "champion" model - whoever is most
   interested, will take up the flag

   <beltzner> serge, you must be new to W3C recommendations!

   <serge> I was on the P3P 1.1 group

   <beltzner> for your wounds, sir

   <serge> there's a difference between turning existing privacy standards
   into electronic form and recommending arbitrary design guidelines for
   user agents.

   <serge> if we're going to make recommendations, we need some data to
   support them.

   <beltzner> serge, I reject your argument entirely, but that's a topic
   for when we're drinking

   tlr: there is a candidate recommendation step of going ahead further
   ... this could include something that requires an implementation step
   in order to go further
   ... implementation might require coaxing and encouragement
   ... who, at this point, is in a position to say that they might be able
   to start doing something?

   RobY: one of the things mentioned was getting people to write the test.
   ... for best practices, we don't have to write the tests, we just need
   to let developers know what to test.

   <stephenF> dinner: looking at www.boxtyhouse.ie for 6pm

   <stephenF> coffee: more outside now

   Yngve: when there are more particulars, we can look at getting a team
   involved.
   ... cannot say when we will be able to test.

   mez: as part of going forward - there needs to be something about
   implementation/prototype and conformance, robustness, and usability
   test
   ... and for these three, there's going to have to be some sort of
   implementation
   ... would like to demonstrate something by our next face to face

   bill-d: what does it take to get someone to sign up?

   mez: people who are champions and are capable of doing it themselves
   (or coax someone else) are going to get their recommendations through
   first.

   <Zakim> johnath, you wanted to point out that this is a helpful step
   anyhow - particularly under a champion model

   johnath: echoing that some implementation is going to be a healthy
   thing. And having the champion make someone write it makes this a
   collaboration of more than one person

   serge: what's the point of doing conformance testing if we're only
   using this to bolster the recommendation

   mez: hold conformance testing for 15 minutes

   tlr: I agree on the importance of implementation
   ... we should not create an environment in which there are proposals
   which correspond to a person's personal burden to implement/push.
   ... must do this as a group - advance recommendations which WE agree
   should move forward

   <johnath> staikos: conversation is already underway - backscroll should
   give you a good enough idea as to whether you object horribly :)

   <staikos> one thing I hope will be covered, if it hasn't, is html5. I'm
   not sure how many of you have read the draft spec for this but it kind
   of turns our work on its head

   mez: is there someone who thinks we can get to consensus on a
   recommendation without appropriate testing - please speak up.

   <hal> dropping off to attend ws-sx tc call - back in 30 mins or less

   <staikos> you know, with things like web pages able to open files,
   sockets, register themselves as protocol handlers, etc

   tyler: we have to be sensitive that there are limited developer
   resources in the sky to work on this

   <Audian> maybe not "invisible" but the lack of a favicon is hard to
   test...it's been there for a long time and now it isn't

   tlr: there may be consensus on recommendations which should be looked
   at further
   ... don't want to get bogged down on waiting for an implementation step

   tyler: should we issue recommendations for things that have NOT been
   tested?

   tlr: recommendation is the final stage on the recommendation track ...
   and this means it HAS been subjected to tests
   ... what we're working on right now is "drafts for recommendation" ...
   which we can document now ... without testing having been done.

   <Zakim> johnath, you wanted to say that my support for the champion
   model didn't imply that implementation was *necessary*

   johnath: if no one touches a recommendation and it drops on the floor
   ... then maybe that is OK. Likes the idea of champions for a
   recommendation

   serge: seems weird to come up with draft recommendations before testing
   them out to see if they're useful.

   <Zakim> tlr, you wanted to make an ontological point

   tlr: perhaps we are again having some terminology conflicts

   mez: we need a noun for what was discussed in Lightning Discussions

   rachna: PROPOSALs is offered up

   stephenf: not everyone is in the room - public draft is useful to get
   wider input

   mez: believes we can put out a first draft with only expert opinions
   ... and hopefully by June/July
   ... this last discussion has been quite good
   ... break for 30 minutes.

   <johnath> bill-d: [40]http://www.boxtyhouse.ie/

   <johnath> staikos: any recommendations for or against?

   <staikos> well

   <staikos> the best food I had in Dublin was at an italian cafe :)

   starting up again

   stephenF: 6PM - Temple Bar (10-15 minute walk)

functional and conformance testing

   <stephenF> staikos: they're just getting what they asked for:-)

   RobY: conformance testing is for telling testers how to test what has
   been defined
   ... do we want to test compliance to this standard - or do we just want
   to put it out? It is a significant amount of work to do all of this
   testing.
   ... do we need conformance testing for the developers who implement
   these recommendations?
   ... unless we put things in that say what a developer cannot do, Rob
   doesn't see the need for adding in conformance testing

   mez: the bulk of our recommendations will be towards user agent
   developers ... though some recommendations will be pointed towards
   content providers
   ... but conformance testing not required for user agents/user agent
   developers?

   RobY: the set of user agent developers is not a huge community

   serge: agree with Rob that we should not focus on conformance testing

   <tlr> thomas: testability of work product vs. broad-scale testing vs.
   ability to test a limited population

   <tlr> ... when limited population, then need some tests, but don't need
   ability to automate that testing ...

   <Mez_> [41]http://www.w3.org/2005/10/Process-20051014/process.html

   <Mez_> "Part of a Working Group's activities is developing code and
   test suites "

   <Mez_> [42]http://www.w3.org/QA/WG/2005/01/test-faq

   <Mez_> Two types of testing are particularly helpful:

   <Mez_> Conformance testing

   <Mez_> Focuses on testing only what is formally required in the
   specification in order to verify whether an implementation conforms to
   its specifications. Conformance testing does not focus on performance,
   usability, the capability of an implementation to stand up under
   stress, or interoperability; nor does it focus on any
   implementation-specific details not formally required by the
   specification.

   <Zakim> stephenF, you wanted to ask can we have some examples of such
   tests? (in a minute)

   tyler: there are some well known test cases - how the browser renders
   certain things in certain ways
   ... one thing we may want to specify is that key sequences used on
   first authentication with a site should be different from a second or
   subsequent authentication to the same site.

   <serge> This seems to be a matter of charter, from 1.3 in the Process
   Document: "The Working Group charter sets expectations about each
   group's deliverables (e.g., technical reports, test suites, and
   tutorials)."

   <tlr> ... or not. ;-)

   tlr: as far as conformance testing is concerned - we are not expected
   to build an automated test suite
   ... we would be required to formulate a test suite that could be
   followed to evaluate conformance. the test MAY consist of manual work
   (like examining a user interface)

   dan: conformance testing is likely not as big a deal as some of the
   other parts of our recommendation

   tlr: this is conformance testing work, but not as detailed or involved
   as has seemed to be implied so far today
   ... let's get to writing recommendations and examples of using these
   (which should lead to conformance tests)

   stephenF: there could be quite a bit of testing needed - lots of
   configuration settings and such

   RobY: as it becomes more and more defined, more and more
   folks/companies will take interest

   tlr: critical piece is to have tests and examples. More critical to
   have an example and a test with it than to have an implementation.
   ... requirement -> example + testcase -> then implementation
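
   The ordering tlr gives (requirement, then example plus test case, then
   implementation) can be sketched as a small record type. This is only
   an illustration under stated assumptions: the class names, field
   names, the readiness rule and the sample requirement text are all
   hypothetical and were not agreed by the group:

```python
from dataclasses import dataclass, field


@dataclass
class Example:
    description: str
    conformant: bool  # expected verdict for this example


@dataclass
class Requirement:
    ident: str
    text: str
    examples: list = field(default_factory=list)
    test_steps: list = field(default_factory=list)  # steps may be manual

    def ready_for_implementation(self):
        """True once both kinds of example and at least one test step exist."""
        verdicts = {example.conformant for example in self.examples}
        return verdicts == {True, False} and bool(self.test_steps)


# Hypothetical requirement, used only to exercise the structure.
req = Requirement("R1", "The user agent MUST display the site identity.")
before = req.ready_for_implementation()

req.examples.append(Example("identity shown in chrome", conformant=True))
req.examples.append(Example("identity shown only in content", conformant=False))
req.test_steps.append("Load the test page and inspect the chrome by hand.")
after = req.ready_for_implementation()

print(before, after)
```

   Keeping the test steps as free text reflects the point made earlier
   in the discussion that a test may consist of manual work such as
   examining a user interface.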

   tyler: one place of potential problem - if something makes a
   request/requirement of a third party on the authenticity of something
   ... it would be difficult to find a conformant and a non-conformant
   example of this.

   tlr: two ways around that:

   <stephenF> ways 1 & 2?

   tlr: 1) if you speak about conformance, give a definition of "trusted".
   ... phrase like "There shall be a phrase or outside-managed list which
   is consulted."
   ... 2) the other way is to declare that "trusted" is defined as follows
   ....

   <johnath> ping

InScopeByCategory

   <Mez_> [43]http://www.w3.org/2006/WSC/Group/track/actions/179

   hal: should be able to just walk through the information in the wiki

   <tlr> [44]http://www.w3.org/2006/WSC/wiki/InScopebyCategory

   <johnath> rachna: don't forget your DVI->VGA donglything - tlr just
   unplugged it

   <Mez_> [45]http://www.w3.org/2006/WSC/drafts/note/#filters

   <Mez_> 5.5 Content based detection

   <Mez_> Techniques commonly used by intrusion detection systems, virus
   scanners and spam filters to detect illegitimate requests based on
   their content are out of scope for this Working Group. These techniques
   include recognizing known attacks by analyzing the served URLs,
   graphics or markup. The heuristics used in these tools are a moving
   target and so not a suitable subject for standardization. The Working
   Group will not recommend any checks on the content served by web si

   <Mez_> 5.5 is part of out of scope

   tyler: some of these seem to line up with what has been proposed in PII
   Editor work

   Bill-d: thought identity management systems are out of scope

   stephenF: there are things for Semantic Approaches that could be
   suggested which are out of scope (so they won't be suggested)

   group: first two under semantic approaches are deemed IN scope

   <hal> I am not able to hear most of the discussion

   <Mez_> hold on a bit

   rachna: even the third item (federated identity management) has
   elements in-scope (as a form-filler extension seems in-scope)

   <johnath> I won't be the one to recommend OpenID as a proposal. :)

   mez: our intent was to look at these today to see if there were
   concrete proposals which should be put forth
   ... hearing nothing from the group, it appears NO.
   ... onto the next category - What doesn't work

   hal: this has had much discussion already, so let us skip
   ... move on to Education category
   ... it is unresolved whether users understand that they are making
   "risk management" decisions
   ... next category General Principles
   ... some of these are conflicting
   ... next category New Indicators

   tyler: has Firefox reserved any "drawing modes" for itself?
   ... such as transparency?

   beltzner: only thing we've reserved is chrome
   ... one way is to have the element cross the information boundary

   tlr: we have talked about existing robustness practices
   ... still needs to be pulled together from raw material in the wiki
   ... existing practices need to be written up

   rachna: history and petnames still in?

   tyler: petnames are still in

   tlr: see here antipatterns for SSL certificate ... but not patterns

   mez: some of the positive is wrapped up in Johnathan's proposal

   johnath: both identity and what is a secure page are in the
   recommendations

   <tlr> +1 to skipping over process recommendations

   hal: skip over process indicators
   ... final section - technical recommendations
   ... comprehensive architecture for web authentication is out of scope
   ... incorporate viable authentication techniques - should be covered
   ... next several are really "motherhood"
   ... extensibility so authentication can be continuously improved - not
   sure how to write a recommendation
   ... specify infrastructure is out of scope
   ... metadata has already been discussed.

   <johnath> hal - ping

   tlr: if there are recommendations around trusted attention sequences -
   then there might be a deployment recommendation that sites include
   certain instructions

   <Mez_> welcome back hal

   <Mez_> taking a minute

   hal: petnames is in play
   ... matching certificate contents is in play
   ... user controlled notation is in the same vein
   ... default blocking mode - is like safe browsing mode proposal that is
   under discussion right now

   <hal> hello

   tyler: SSL can detect a suspected MITM attack - currently user agent
   pops a dialog box. Should this be switched to just being an Error 404
   not found?

   yngve: Opera indicates that potential "eavesdropping" may be underway -
   so similar dialog

   tlr: there is stuff in the wiki that needs to be pulled together

   <tlr> ACTION-177 closed without doing

   <tlr> ACTION: farrell to pick up on ACTION-177, complement with review
   of TLS spec and exceptions given there; goal is to limit user
   interaction when not needed - due 2007-06-19 [recorded in
   [46]http://www.w3.org/2007/05/30-wsc-minutes.html#action16]

   <trackbot> Created ACTION-240 - pick up on ACTION-177, complement with
   review of TLS spec and exceptions given there; goal is to limit user
   interaction when not needed [on Stephen Farrell - due 2007-06-19].

   <tlr> ACTION-240 due 2007-06-26

   <staikos> what time are you wrapping up?

   <johnath> tlr: what's the urlhack to allow editing when I don't have
   the "edit this action" link?

   <tlr> append /edit

   <beltzner> staikos: 12:30 EDT

   <staikos> heh guess it's not worth calling now

   <tlr> but if you don't have that link, it means you're looking at the
   public version

   hal: secure letterhead is something still in play

   <johnath> tlr: so how do I log in to the action tracker?

   hal: Service Security Requirement (SSR) record in DNS proposal - should
   we work on it?

   mez: appears to be no interest from the group.

   hal: leverage new features (from workshop)

   beltzner: xul:browsermessage - it's the markup which indicates what
   comes up when a "pop-up" is blocked or if something should be installed

   <Mez_> rachna

   <Mez_> is talking

   rachna: APIs for anti-phishing? This could be APIs for third party
   services

   mez: no comments or interest reflected by the group

day One wrap-up

   mez: agenda for tomorrow - lead off on logistics for next face-to-face
   ... tyler on remaining Note issues that we have
   ... bulk of the day walking through the editor's draft

   <beltzner> it doesn't appear that the xul:notificationbox can be used
   in content

   <beltzner> reference is here
   [47]http://developer.mozilla.org/en/docs/XUL:notificationbox

   adjourned for the day

   <Mez_> [48]http://www.boxtyhouse.ie/

Summary of Action Items

   ACTION-227 - Update template with material from discussion; notify
   e-mail list [on Thomas Roessler - due 2007-06-06].

   ACTION-228 - Share slides about usability testing from dublin f2f [on
   Rachna Dhamija - due 2007-06-06].

   ACTION-229 - Share his slides on robustness testing from the dublin f2f
   [on Bill Doyle - due 2007-06-06].

   ACTION-230 - Define robustness for WSC glossary [on Bill Doyle - due
   2007-06-06].

   ACTION-231 - Start a discussion about including descriptions of the
   information divulged to websites by user-agents [on Bill Doyle - due
   2007-06-06].

   ACTION-232 - Share results from his study once he has them [on Serge
   Egelman - due 2007-06-06].

   ACTION-233 - Make sure Jagatic et al on social phishing is in
   SharedBookmarks [on Rachna Dhamija - due 2007-06-06].

   ACTION-234 - Add www2006 jakobsson, Florencio & Hursley MSR paper to
   our shared bookmarks list [on Rachna Dhamija - due 2007-06-06].

   ACTION-235 - Update / create a user testing timeline with things like
   IRB turnaround, setup, etc. [on Rachna Dhamija - due 2007-06-06].

   ACTION-236 - Track donations of time and resources for usability
   testing [on Rachna Dhamija - due 2007-06-06].

   ACTION-237 - Drive process of tying recommendations to references in
   SharedBookmarks [on Maritza Johnson - due 2007-06-06].

   ACTION-238 - Create and document user testing plan (with links to
   timeline, donations, prototypers, etc) [on Rachna Dhamija - due
   2007-06-06].

   ACTION-240 - pick up on ACTION-177, complement with review of TLS spec
   and exceptions given there; goal is to limit user interaction when not
   needed [on Stephen Farrell - due 2007-06-19].

   [End of minutes]
     __________________________________________________________________


    Minutes formatted by David Booth's [49]scribe.perl version 1.127
    ([50]CVS log)
    $Date: 2007/06/17 22:03:44 $

References

   1. http://www.w3.org/
   2. http://lists.w3.org/Archives/Public/public-wsc-wg/2007May/0158.html
   3. http://www.w3.org/2007/05/30-wsc-irc
   4. http://www.w3.org/2007/05/30-wsc-minutes.html#agenda
   5. http://www.w3.org/2007/05/30-wsc-minutes.html#item01
   6. http://www.w3.org/2007/05/30-wsc-minutes.html#Conformanc
   7. http://www.w3.org/2007/05/30-wsc-minutes.html#Intermediat
   8. http://www.w3.org/2007/05/30-wsc-minutes.html#Robustness
   9. http://www.w3.org/2007/05/30-wsc-minutes.html#item02
  10. http://www.w3.org/2007/05/30-wsc-minutes.html#item03
  11. http://www.w3.org/2007/05/30-wsc-minutes.html#item04
  12. http://www.w3.org/2007/05/30-wsc-minutes.html#item05
  13. http://www.w3.org/2007/05/30-wsc-minutes.html#item06
  14. http://www.w3.org/2007/05/30-wsc-minutes.html#ActionSummary
  15. http://www.w3.org/2006/WSC/drafts/rec/#favicon-favicons-rec
  16. http://beltzner.ca/webdav/forthomas.txt
  17. http://www.w3.org/2002/09/wbs/39814/f2f3sched/results
  18. http://www.w3.org/2006/WSC/wiki/Glossary
  19. http://www.w3.org/2007/05/30-wsc-minutes.html#action01
  20. http://www.w3.org/2002/09/wbs/39814/f2f3sched/
  21. http://www.w3.org/2007/05/30-wsc-minutes.html#action02
  22. http://www.w3.org/2007/05/30-wsc-minutes.html#action04
  23. http://www.w3.org/2007/05/30-wsc-minutes.html#action05
  24. http://www.ietf.org/html.charters/nea-charter.html
  25. http://www.w3.org/2007/05/30-wsc-minutes.html#action06
  26. http://www.w3.org/2006/WSC/wiki/SharedBookmarks
  27. http://www.simson.net/ref/2006/CHI-security-toolbar-final.pdf
  28. http://www.w3.org/2006/WSC/wiki/ThreatTrees
  29. http://www.w3.org/2007/05/30-wsc-minutes.html#action07
  30. http://www.w3.org/2007/05/30-wsc-minutes.html#action08
  31. http://www.indiana.edu/~phishing/social-network-experiment/phishing-preprint.pdf
  32. http://www.w3.org/2007/05/30-wsc-minutes.html#action09
  33. http://www2006.org/programme/item.php?id=3533
  34. http://www.w3.org/2007/05/30-wsc-minutes.html#action10
  35. http://www.w3.org/2006/WSC/wiki/StatusQuoUserStudyResults
  36. http://www.w3.org/2007/05/30-wsc-minutes.html#action12
  37. http://www.w3.org/2007/05/30-wsc-minutes.html#action13
  38. http://www.w3.org/2007/05/30-wsc-minutes.html#action14
  39. http://www.w3.org/2007/05/30-wsc-minutes.html#action15
  40. http://www.boxtyhouse.ie/
  41. http://www.w3.org/2005/10/Process-20051014/process.html
  42. http://www.w3.org/QA/WG/2005/01/test-faq
  43. http://www.w3.org/2006/WSC/Group/track/actions/179
  44. http://www.w3.org/2006/WSC/wiki/InScopebyCategory
  45. http://www.w3.org/2006/WSC/drafts/note/#filters
  46. http://www.w3.org/2007/05/30-wsc-minutes.html#action16
  47. http://developer.mozilla.org/en/docs/XUL:notificationbox
  48. http://www.boxtyhouse.ie/
  49. http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
  50. http://dev.w3.org/cvsweb/2002/scribe/

Received on Sunday, 17 June 2007 22:11:51 UTC