                              WSC WG face-to-face
                                  30 May 2007


   Present (in order of registration)
          Thomas Roessler
          Mary Ellen Zurko
          Daniel Schutzer
          Johnathan Nightingale
          Phillip Hallam-Baker
          Hal Lockhart (by phone)
          Mike Beltzner
          Shawn Duffy
          Tyler Close
          Serge Egelman
          Maritza Johnson
          Luis Barriga
          Robert B. Yonaitis
          Audian Paxson
          Stephen Farrell
          Tim Hahn
          Stuart Schechter (by phone)
          Rachna Dhamija
          George Staikos (by phone)
          Bill Doyle
          Yngve Pettersen


   Scribe
          stephenF, beltzner, tjh


Agenda Bashing

   agenda bashing

   next f2f discussion moved to later on

Conformance model

   tlr on what we need to do for our specs

   tlr:what does it mean for us to recommend something?
   .. wants us to make testable statements ...
   ... by structuring the rec text into requirements, good practice and
   ... implementation techniques ...
   ... also need to specify assumptions, e.g. icon display doesn't ...
   ... work with screen reader, have to say that visual display ...
   ... is an assumption ...

   robert: wants no device dependence
   ... we should assume everything is usable independent of device,
   including for accessibility, mobility etc.

   tlr: yes, but some implementation techniques are device dependent and
   we do want to document those
   ... WCAG does content accessibility guidelines...
   ... but we're not talking so much about content, more about user agents
   (chrome etc)

   robert: US govt. accessibility rules != w3c ones

   tlr: let's consider some rec-text as a group now...
   ... the stuff about favicons.

   1st rec is that sites should not incorporate favicons at this time

   phb: for display that informs trust decisions, only display
   authenticated information

   text here is at:

   beltzner: who are we expecting to read/conform to REC?

   (basically are we addressing UAs and/or sites)

   phb: some types of site may pay attention (e.g. FIs)

   mez: charter allows us to include both

   serge: any evidence about favicons being trusted?

   maritza: yep

   tyler: still in the dark about what's a useful REC, hard to have
   abstract discussion
   ... current bank practice shows that FIs may drag their feet anyway

   yngve: maybe move favicon to somewhere user normally doesn't trust?

   shawn: favicons being used, how likely is it that something else would
   be accepted?

   mez: back to what needs to be in the text for FPWD?

   tlr: shows that there are lots of favicons in use
   ... shows example of padlock favicon
   ... this applies to UAs that display bitmaps
   ... and for which favicon uses (e.g. bookmarks, desktop, location
   ... address bar is where favicon shows where we are now

   mike: tlr is asking us to be very specific about MUST/SHOULD/etc

   tlr: level of abstraction needs to be right, current text very far from
   being specific enough

   tyler: would like to document stuff we're planning to do to get

   mez: we can do that

   tlr: says what REC means

   tyler: we're not competent yet to do MUST/SHOULD

   tlr: ok to use MUST/SHOULD in FPWD even if it's likely to change (don't

   mez: FPWD is important to get attention/feedback, important that status
   is clear

   rob: MUST/SHOULD/MAY all exist?

   tlr: yes, we can use RFC 2119 or something else
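
   [Illustrative sketch, not part of the discussion: one way to check
   whether draft rec text uses the RFC 2119 key words tlr mentions. The
   function names and the sample sentence are hypothetical; only the ten
   key words themselves come from RFC 2119.]

```python
import re

# The ten key words listed in RFC 2119. Multi-word forms come first in the
# alternation so that "MUST NOT" is not reported as a bare "MUST".
RFC2119_KEYWORDS = re.compile(
    r"\b(MUST NOT|SHALL NOT|SHOULD NOT|MUST|REQUIRED|SHALL|SHOULD|"
    r"RECOMMENDED|MAY|OPTIONAL)\b"
)


def find_keywords(statement):
    """Return the RFC 2119 key words in a statement, in order of appearance."""
    return RFC2119_KEYWORDS.findall(statement)


def is_normative(statement):
    """Hypothetical rule of thumb: normative text uses at least one key word."""
    return bool(find_keywords(statement))


if __name__ == "__main__":
    # Hypothetical phrasing, loosely based on the favicon discussion.
    draft = ("Sites SHOULD NOT incorporate favicons; user agents MUST "
             "display only authenticated information.")
    print(find_keywords(draft))
```

   [Only upper-case key words are matched, so ordinary prose such as
   "users may be confused" is not flagged as normative.]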

   audian: back to tlr's bullets, is what tlr said what we want?

   tlr: types on screen

   phb: text equivalent is <title>
   ... lots of chat...

   serge: we need text about definitions, e.g. saying what we mean...
   ... when we say "verified sites"

   mez: we have a glossary

   phb: different logos should have different authentication types/levels

Intermediate agenda bashing

   (there is some agenda bashing happening as we mitigate for

   Mez_: call to order, Rachna to start session on "Usability Testing"

   <Mez_> tlr, please action rachna to share the slides somehow; tx

   <scribe> ACTION: rachna to share slides about usability testing from
   dublin f2f [recorded in

   <trackbot> Created ACTION-228 - Share slides about usability testing
   from dublin f2f [on Rachna Dhamija - due 2007-06-06].

   Mez_: agenda reordering: s/Rachna/BillD

Robustness Testing

   billd: Robustness Testing
   ... current browser environments from a user standpoint bring a bunch
   of technologies together ...

   <Mez_> tlr, please action bill to share his slides too. tx.

   <scribe> ACTION: bill to share his slides on robustness testing from
   the dublin f2f [recorded in

   <trackbot> Created ACTION-229 - Share his slides on robustness testing
   from the dublin f2f [on Bill Doyle - due 2007-06-06].

   rachna: do we have a definition for robustness?

   Mez_: no, one should be added to the glossary

   <scribe> ACTION: bill to define robustness for WSC glossary [recorded
   in http://www.w3.org/2007/05/30-wsc-minutes.html#action05]

   <trackbot> Created ACTION-230 - Define robustness for WSC glossary [on
   Bill Doyle - due 2007-06-06].

   <Mez_> tx beltzner

   bill: in IT circles, a "tiger team" was used to test effectiveness of
   IT security measures ...
   ... one side would attack, another side would try to detect and
   evaluate effectiveness of process and procedures being tested ..

   Mez_: I see some potential overlap between items to be tested for
   robustness vs. user understanding or usability ...

   <Mez_> things that web content can do that exactly emulates the
   security context information displays in our technical report are
   "pure" robustness attacks

   <Mez_> they will leverage either user agent vulnerabilities, or design
   gaps or issues in the user agents

   bill: the attacking team might go after the OS, plugins, user agent,
   network layer

   <stephenF> tlr has the magic zakim trick for the phone

   <tlr> stephen, context?

   bill: your browser actually sends a lot of information when you visit

   <stephenF> tlr - Mez wants the phone on

   bill: [demonstrates metasploit]

   <Mez_> jan vidar, is that you?

   <Mez_> can you hear us?

   beltzner: this seems to be out of scope, though, since we're talking
   about user agent and system exploits

   bill: well, exploits are out of scope, but the UI for security context
   is in scope

   beltzner: in so far as the user agent isn't exploited, yes

   bill: so if the patches are out of date at the user agent, OS or
   plugin level, the user is at risk

   PHB: so, this used to be a concern with things like macromedia, where
   the browser allowed plugins to take control easily and thus become

   <Zakim> tlr, you wanted to ask what our question is

   tlr: our deliverables are about how browsers and site authors should do
   things, and I wonder what are we asking of a robustness testing

   rachna: it's hard to enumerate the robustness tests in advance, really

   <tlr> beltzner: should we have indicators that help people ensure they
   have the latest browser?

   rachna: thus far we've talked about security indicators in chrome about
   the web content, but not indicators about ensuring that the user is
   running with an unexploited browser
   ... is that in scope?

   Mez_: or in our goals? I don't have an immediate reaction

   ???: I'm hearing a lot of "the user can't determine", and want to
   remind people that the user isn't an administrator or security

   sduffy: this feels like an odd road to me, in terms of whether or not
   the user has an up-to-date browser

   rachna: but I think the up-to-date-ness has a larger impact on security

   Rob: when we look at testing (robustness, user, etc) we're talking
   about user-agents ...
   ... users are doing a lot of different things in the content-area, not
   just content, but applications, and sometimes applications in those
   applications ..
   ... shouldn't we be breaking out our testing per area/recommendation
   developed by this group?

   Mez_: The reason we have three one-hour discussions about testing is to
   lead into planning those three categories of testing.

   stephenF: if you're talking about application updates and such,
   there's an IETF working group on network endpoint assessment, they'll
   address most of these issues

   <tlr> http://www.ietf.org/html.charters/nea-charter.html

   sduffy: my main objection to including user agent updates in scope is
   that it doesn't end up solving the problem since the OS could be out of

   tlr: I don't even understand what it means for us to recommend that a
   user has the latest user agent, since the user isn't the compliance

   <tlr> ... or is he?

   tlr: we have a bunch of items that describe current robustness
   practises that have not been migrated to the format required by our
   recommendation template
   ... in my opinion that should be a priority so that we can recommend
   appropriate robustness tests

   Mez_: that's not what we're talking about, IMO, so I'll call that out
   of scope for the current conversation as is all of browser-updating

   bill: how about the second issue of user agents divulging information
   about the user?

   Mez_: how does that fit in our scope?

   beltzner: it does in that if the user doesn't mean to provide
   information to non-trusted websites, and isn't aware of the information
   being provided by default

   Mez_: if we make a recommendation about privacy, then we should ensure
   that the recommendation is robust
   ... until we make a recommendation about that, though, discussing how
   to test its robustness seems wrong
   ... more conversation about scope definition ...

   tlr: what I hear you (Rob) saying is that robustness testing can help
   us identify weaknesses in all sorts of web applications, and while
   that's valuable, it's not within the charter of this working group

   <johnath> :)

   tlr: continues his point, driving it home with the force of a pile
   driver ...

   Mez_: anything else to share, billd?

   bill-d: these were the points I wanted to raise

   sduffy: we decided that some aspects of web apps were in scope ...
   ... there's a difference between SQL injection and website
   vulnerability where clicking on a URL results in XSS/untrusted content
   ... the former is out of scope, the latter is in scope ...
   ... so web-apps as a whole aren't out of scope, are they?

   johnath: it feels like we have enough work testing the robustness of
   user facing display of web security context
   ... so I'm excited enough without taking on extra worries about user
   agent/plugin/OS robustness

   bill-d: I'm still trying to lock down discussions of what's
   in-scope/out-of-scope, which is why I wanted to bring this up again

   Mez_: well, now you know: it's testing the recommendations, not the
   entire system

   bill-d: still not clear where we are on divulging of information to

   beltzner: that's not part of our problem statement yet, let alone our
   goals, let alone ...

   Mez_: proposes an action on starting a discussion about the information
   divulged to websites by user agents?

   tlr: IMO, it stretches the notion from communicating web security
   context to one about privacy

   johnath: I think it's doomed for other reasons, but I don't think that
   precludes the discussion on the group

   <scribe> ACTION: bill to start a discussion about including
   descriptions of the information divulged to websites by user-agents
   [recorded in

   <trackbot> Created ACTION-231 - Start a discussion about including
   descriptions of the information divulged to websites by user-agents [on
   Bill Doyle - due 2007-06-06].

security usability testing

   rachna: so, I'll start by defining what I mean by usability testing ...
   ... traditional security methodology of robustness is good but not
   sufficient ...
   ... HCI methodology isn't sufficient since the attackers are modifying
   along with us ...
   ... so I propose "red team" usability testing, where we actively attack
   the user
   ... so both "can we use the system" and "how can we attack the user to
   confound them"
   ... I have a bunch of questions
   ... 1. Will we test ideas or specific implementations of ideas?

   Mez_: how would we test a concept?

   rachna: so for example, we could test a variety of implementations
   instead of a specific one

   <maritzaj> http://www.w3.org/2006/WSC/wiki/SharedBookmarks

   tyler: the tricky thing is once they make their implementation, they
   stopped testing the concept

   <maritzaj> #2 under usability studies about internet security


   serge: instead of testing a toolbar, the study in question (see
   maritzaj's link) tested the effectiveness of each indicator

   beltzner: one could test the concepts on which a design is founded,
   instead of the design itself

   maritzaj: the answers will likely vary per recommendation

   Mez_: so I don't understand how we'd decide whether to test a concept
   or a design

   rachna: yeah, I think we'll need to figure that out

   tyler: part of this might be recognizing patterns in our
   recommendations and test abstractions that cover aspects in each
   ... to what extent do you think we can/should rely on the literature
   instead of retesting some of those findings?

   johnath: we have a huge body of research that has led to some of these
   recommendations, I think it should be up to us to point to the
   foundation and identify areas for follow up testing

   rachna: well, where huge = from 2005 onwards
   ... 2. At what level of fidelity should we be testing?

   rachna: low-fi prototyping is sketches, medium-fi is flash or web
   mockups, high-fi is extensions or browser modifications

   maritzaj: previous research will come into play as references

   rachna: using lower-fidelity prototypes will increase our bandwidth
   ... 3. What should we be testing?
   ... learnability, efficiency, skills required, flexibility,
   satisfaction, errors, compliance rates

   Mez_: remembers a study, sort of, that might be about how instructions
   affected the results ...

   maritzaj: the Jackson/MSR one?

   Mez_: yes! and I haven't heard anything about that dimension

   rachna: I think you're referring to a problem that exists in that they
   had to describe EV certificates to one group

   tyler: yes, but I thought they controlled for that

   Audian: often helps to start with a list of assertions and
   verify/validate those first, then move onto low-fi mockups, and use
   those for the validation

   rachna: yes, that's an excellent way to do designs

   <Audian> what = a quality test base?

   <Audian> statistical relevance

   rachna: we need to identify the goals of a study, as well: why do users
   behave the way they do? what are users reliably capable of performing?
   does technology X protect against attack A? etc, etc.

   Mez_: 100% usable security is a dream, not a goal, and we should make
   sure that the studies aren't tasked with finding the 100% solution ...
   ... people who believe in a recommendation will always argue that the
   hit rate is "good enough", though, which worries me.

   tyler: do we want to provide a target hit rate, then?

   <Zakim> johnath, you wanted to respond to Mez, tyler, on quality of

   johnath: it's easy to say "20%" is worse than "40%", but the trick will
   be showing statistical significance in the effectiveness of the
   ... the test you want is "this creates _an_ improvement"

   johnath: at which point we can defend the assertion
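
   [Illustrative sketch, not part of the discussion: one common way to
   test johnath's "this creates _an_ improvement" claim is a two-proportion
   z-test on hit rates from a control group and a treatment group. The
   function name and the sample counts are hypothetical, and the normal
   approximation assumed here is only reasonable for fairly large groups.]

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Compare two hit rates with a pooled two-proportion z-test.

    Returns (z, p_value), where p_value is two-sided and z is positive
    when group B has the higher hit rate.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one participant")
    rate_a = hits_a / n_a
    rate_b = hits_b / n_b
    # Pooled rate under the null hypothesis that both groups behave alike.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical study: 20% vs 40% hit rate with 100 participants each,
    # then the same rates with only 10 participants each.
    print(two_proportion_z_test(20, 100, 40, 100))
    print(two_proportion_z_test(2, 10, 4, 10))
```

   [The same 20% vs 40% gap is significant at the usual 5% level with 100
   participants per group but not with 10, which is why sample size matters
   before an assertion can be defended.]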

   <Mez_> better for the majority of web users

   beltzner: once a recommendation is established as significant, if it
   competes against another recommendation, then we can do a comparison

   tyler: so what does it take for something like that to get put into

   beltzner: bring it forward to the Firefox product team (it's an open
   meeting) and propose it; in my experience, you need to prove the worth
   and value for the majority of the web

   rachna: should we test in-lab or in the wild?

   tlr: do we need to answer this now, or leave this to you?

   rachna: we can come back to this later, but it depends on the

   PHB: if we can work out a way of doing it in the wild, it would be much

   <Mez_> +1 to phb

   <Mez_> from an industry pov

   PHB: oftentimes in-lab participants already know something about

   tyler: this intersects with the fidelity; deployable add-ons are more
   easily tested "in the wild"

   maritzaj: tradeoffs for both, in-wild experiments aren't easy to

   dan: in lab can be used to filter and then in the wild can be used to
   test larger populations

   tyler: bias is obviously a concern, but studies (like rachna and
   stuart's) have shown a non-significant correlation between bias and

   rachna: well, that doesn't mean it didn't exist, since we were
   controlling for it, not testing for it

   serge: we didn't find any correlation either
   ... we discussed some of these issues at CHI, and the differences
   seemed to break down as wild being good for quantitative, lab being
   good for qualitative

   tim: there are various audiences for the user agents which will affect
   the demographics of in the wild testing

   dan: we could do tests through deploying in various banks, etc, to get
   a good cross section

   Audian: I've often seen domain experts being far more critical of a
   solution until the "would you recommend this to a friend" at which
   point they all decided they would

   rachna: another complication is that each proposal can be attacked in a
   different way

   stephenF: how do you figure out the workload?

   rachna: depends on the attack, and whether or not there are
   easily-reused exploits we can copy and paste

   stephenF: there are some shortcuts we can take

   PHB: we should be tactical about some things, like we know that w.r.t.
   phishing, a takedown service shifts the problem to another target, but
   doesn't kill the overall problem
   ... or in crypto, adding a bit to the key doubles the work for the

   <PHB> Well it would be rather nice to know if the solution was intended
   to be tactical or strategic

   tlr: does this imply a different set of scenarios or can we just rely
   on our existing ones?

   <PHB> Rather a lot of solutions being sold a year ago as strategic
   turned out to be distinctly tactical

   tyler: our recommendations should each address the threat that they
   attempt to defeat

   <Mez_> http://www.w3.org/2006/WSC/wiki/ThreatTrees

   <PHB> Sitekey

   rachna: does that mean modifying our threat trees to include these (on
   her slides) attacks?

   tyler: I think it would be a different section

   rachna: I'm very confused

   dan: we need to make sure the recommendations are significant at
   attacking the problems (or something? help?)

   rachna: another question is do we want to discover the usability
   problems, or do we want to assert significant effect?

   tlr: that is a question which we should try to answer today

   maritzaj: small N might be useful in the early stages as we try to
   filter down

   <Mez_> +1 to johnath

   johnath: sprinkles +1s all over the PhD students

   serge: I'm on a grant, we might be able to get similar resources from
   other groups

   <Zakim> Mez_, you wanted to tell serge that we're testing

   Mez_: we're gonna be testing our recommendations, so you should be
   testing those

   rachna: testing requires IRB approval for me, this adds to overhead

   Mez_: the W3C staffers won't be doing the testing, so there's no
   MIT/IRB requirement

   <serge> hey, if the recommendations can be included in that, great.
   Likewise, those in the corporate world have a vested interest in making
   the user studies happen because the results can be incorporated into

   interested I can summarize it

   <Mez_> we're running a bit long; how about in email serge?

   <serge> okay, in that case I might just wait until I have more results.

   <Mez_> what's the eta on that?

   <serge> maybe 2-3 weeks.

   <Mez_> ok, sounds just fine

   <Mez_> beltzner; want to give serge the action item?

   <scribe> ACTION: serge to share results from his study once he has them
   [recorded in

   <trackbot> Created ACTION-232 - Share results from his study once he
   has them [on Serge Egelman - due 2007-06-06].

   <tlr> ACTION: rdhamija2 to make sure Jagatic et al on social phishing
   is in SharedBookmarks [recorded in

   <trackbot> Created ACTION-233 - Make sure Jagatic et al on social
   phishing is in SharedBookmarks [on Rachna Dhamija - due 2007-06-06].

   <tlr> ACTION-232 due 2007-06-30

   <serge> 6/6 is not 2-3 weeks, unless we're on some crazy new calendar


   <tlr> serge, see my remark about due date

   <Mez_> is the jagatic study, and it's in our shared bookmarks

   <serge> thanks

   <Mez_> beltzner, give rachna an action on the ebay www2006 jakobsson
   paper; I can't find it in shared bookmarks or on the web

   <scribe> ACTION: rdhamija2 to add www2006 jakobsson, Florencio &
   Hursley MSR paper to our shared bookmarks list [recorded in

   <trackbot> Created ACTION-234 - Add www2006 jakobsson, Florencio &
   Hursley MSR paper to our shared bookmarks list [on Rachna Dhamija - due

   <Audian> battery almost dead

   tyler: so it looks like there are two in the wild tactics: actively
   attack and measure effectiveness, or instrument existing browsers or our

   <tlr> ACTION-234 confuses several papers

   <tlr> Jakobsson Ratkiewicz is what I meant, it's from WWW 2006.

   <Mez_> tlr, give whatever other actions are needed

   beltzner: it seems to me like active attacks are way of validating our
   threat trees, not our solutions

   tlr, want to take the action from rachna? I'm sure she wouldn't mind

   bill-d: if we make changes to the user experience, how do you test in
   the wild?

   rachna: right, and that's tough in lab as well, since sometimes users
   need to be trained, or the act of them being in the lab ends up
   training them

   Audian: there's ways of doing it almost at random by pulling people
   aside in schools and malls

   stephenF: possibility that some of the "hits" are false positives of
   users entering wrong passwords on purpose

   <Mez_> tlr or beltzner - an action on rachna to update timeline with
   things like irb turnaround

   <Mez_> please

   rachna: once we have proposals, we need to enter low-fi prototyping
   phase, then figure out what we're trying to prove, set up the studies,
   set up the infrastructure, etc, and this all requires resources and

   <scribe> ACTION: rdhamija2 to update / create a user testing timeline
   with things like IRB turnaround, setup, etc. [recorded in

   <trackbot> Created ACTION-235 - Update / create a user testing timeline
   with things like IRB turnaround, setup, etc. [on Rachna Dhamija - due

   serge: there have been cases of people in studies entering real
   information, in counterpoint to stephenF

   maritzaj: reiterating an earlier point, we should tie the references
   in the shared bookmarks to recommendations


   Audian: how do we know who's submitting resources for testing

   <scribe> ACTION: rdhamija2 to track donations of time and resources for
   usability testing [recorded in

   <trackbot> Created ACTION-236 - Track donations of time and resources
   for usability testing [on Rachna Dhamija - due 2007-06-06].

   <scribe> ACTION: maritza to drive process of tying recommendations to
   references in SharedBookmarks [recorded in

   <trackbot> Created ACTION-237 - Drive process of tying recommendations
   to references in SharedBookmarks [on Maritza Johnson - due 2007-06-06].

   <scribe> ACTION: rdhamija2 create and document user testing plan (with
   links to timeline, donations, prototypers, etc) [recorded in

   <trackbot> Created ACTION-238 - Create and document user testing plan
   (with links to timeline, donations, prototypers, etc) [on Rachna
   Dhamija - due 2007-06-06].

implementation / testing / etc

   beltzner: big believer in prototyping, sketching, whiteboarding - to
   build wireframes
   ... use that as a way of expressing things better than text
   ... allows communication and discussion
   ... but once that finishes, there should be no limits on what
   technology is used.
   ... should be something that enables testers to get what they want out
   of it - HTML, Flash, Firefox extensions
   ... all the way to an installable browser client which can be

   tlr: might include changes to other browsers (e.g. Opera)
   ... on the one hand - prototypes for testing
   ... on the other hand - things taken up by user agent implementers.
   ... what recommendations are sufficiently spelled out to allow for
   ... what more do the browser vendors need to understand for each of the
   ... what are the reasonable expectations for how long it would take to

   beltzner: kind of putting the cart before the horse ... let's get the
   prototypes available so that anyone can run time (any browser vendor

   tyler: in terms of time it takes - have experience with add-ons for
   Firefox and IE.
   ... much easier (in Tyler's experience) in Firefox than in IE.
   ... IE requires use of the COM libraries
   ... Firefox lacks some documentation (source browsing required in some
   cases to understand Firefox operation)
   ... IE has support for .html and .hta - where HTA puts ALL window
   formatting under the control of the HTML (HTA) file. This allows
   ... testing out of toolbars and such.

   mez: not sure this covers all the testing types

   serge: everything that requires attacking the user will require HIGH
   fidelity prototype

   RobY: hoping that what we want to test for can be tested
   ... all should be programmatically testable (i.e. in a high-fidelity

   serge: clarify "doing tests" vs. "doing studies"

   RobY: example: assure that a C# file did not contain X ... this can be
   done by a program testing for this.
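
   [Illustrative sketch, not part of the discussion: RobY's example of a
   program checking that a source file "did not contain X" could look
   roughly like this. The function name, the forbidden pattern and the C#
   snippet are all hypothetical placeholders.]

```python
import re


def find_forbidden(source_text, pattern):
    """Return the 1-based line numbers on which a forbidden pattern occurs."""
    regex = re.compile(pattern)
    return [
        number
        for number, line in enumerate(source_text.splitlines(), start=1)
        if regex.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical C# fragment and hypothetical forbidden call.
    sample = (
        "class Page {\n"
        "    void Show() { Chrome.HideAddressBar(); }\n"
        "}\n"
    )
    violations = find_forbidden(sample, r"HideAddressBar")
    print("conforms" if not violations else f"violations on lines {violations}")
```

   [A check like this only covers statements of the form "must not
   contain X"; anything about what the user perceives would still need
   manual inspection or a user study.]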

   mez: hold off on questions on how to do conformance testing for a half
   ... have been assuming that someone will be testing EVERY
   recommendation we produce

   <serge> so how do we make claims about the recommendation without doing

   tyler: the model is likely to be the "champion" model - whoever is most
   interested, will take up the flag

   <beltzner> serge, you must be new to W3C recommendations!

   <serge> I was on the P3P 1.1 group

   <beltzner> for your wounds, sir

   <serge> there's a difference between turning existing privacy standards
   into electronic form and recommending arbitrary design guidelines for
   user agents.

   <serge> if we're going to make recommendations, we need some data to
   support them.

   <beltzner> serge, I reject your argument entirely, but that's a topic
   for when we're drinking

   tlr: there is a candidate recommendation step of going ahead further
   ... this could include something that requires an implementation step
   in order to go further
   ... implementation might require coaxing and encouragement
   ... who, at this point, is in a position to say that they might be able
   to start doing something?

   RobY: one of the things mentioned was getting people to write the test.
   ... for best practices, we don't have to write the tests, we just need
   to let developers know what to test.

   Yngve: when there are more particulars, we can look at getting a team
   ... cannot say when we will be able to test.

   mez: as part of going forward - there needs to be something about
   implementation/prototype and conformance, robustness, and usability
   ... and for these three, there's going to have to be some sort of
   ... would like to demonstrate something by our next face to face

   bill-d: what does it take to get someone to sign up?

   mez: people who are champions and are capable of doing it themselves
   (or coax someone else) are going to get their recommendations through

   <Zakim> johnath, you wanted to point out that this is a helpful step
   anyhow - particularly under a champion model

   johnath: echoing that some implementation is going to be a healthy
   thing. And having the champion make someone write it gives this a > 1
   collaboration effort

   serge: what's the point of doing conformance testing if we're only
   using this to bolster the recommendation

   mez: hold conformance testing for 15 minutes

   tlr: I agree on the importance of implementation
   ... we should not create an environment in which there are proposals
   which correspond to a person's personal burden to implement/push.
   ... must do this as a group - advance recommendations which WE agree
   should move forward

   <staikos> one thing I hope will be covered, if it hasn't, is html5. I'm
   not sure how many of you have read the draft spec for this but it kind
   of turns our work on its head

   mez: is there someone who thinks we can get to consensus on a
   recommendation without appropriate testing - please speak up.

   <hal> dropping off to attend ws-sx tc call - back in 30 mins or less

   <staikos> you know, with things like web pages able to open files,
   sockets, register themselves as protocol handlers, etc

   tyler: we have to be sensitive that there are limited developer
   resources in the sky to work on this

   <Audian> maybe not "invisible" but the lack of a favicon is hard to
   test... it's been there for a long time and now it isn't

   tlr: there may be consensus on recommendations which should be looked
   at further
   ... don't want to get bogged down on waiting for an implementation step

   tyler: should we issue recommendations for things that have NOT been

   tlr: recommendation is the final stage on the recommendation track ...
   and this means it HAS been subjected to tests
   ... what we're working on right now is "drafts for recommendation" ...
   which we can document now ... without testing having been done.

   <Zakim> johnath, you wanted to say that my support for the champion
   model didn't imply that implementation was *necessary*

   johnath: if no one touches a recommendation and it drops on the floor
   ... then maybe that is OK. Likes the idea of champions for a

   serge: seems weird to come up with draft recommendations before testing
   them out to see if they're useful.

   <Zakim> tlr, you wanted to make an ontological point

   tlr: perhaps we are again having some terminology conflicts

   mez: we need a noun for what was discussed in Lightning Discussions

   rachna: PROPOSALs is offered up

   stephenf: not everyone is in the room - public draft is useful to get
   wider input

   mez: believes we can put out a first draft with only expert opinions
   ... and hopefully by June/July
   ... this last discussion has been quite good
   ... break for 30 minutes.

functional and conformance testing

   <stephenF> staikos: they're just getting what they asked for:-)

   RobY: conformance testing is for telling testers how to test what has
   been defined
   ... do we want to test compliance to this standard - or do we just want
   to put it out? It is a significant amount of work to do all of this
   ... do we need conformance testing for the developers who implement
   these recommendations?
   ... unless we put things in that say what a developer cannot do, Rob
   doesn't see the need for adding in conformance testing

   mez: the bulk of our recommendations will be towards user agent
   developers ... though some recommendations will be pointed towards
   content providers
   ... but conformance testing not required for user agents/user agent

   RobY: the set of user agent developers is not a huge community

   serge: agree with Rob that we should not focus on conformance testing

   <tlr> thomas: testability of work product vs. broad-scale testing vs.
   ability to test a limited population

   <tlr> ... when limited population, then need some tests, but don't need
   ability to automate that testing ...

   <Mez_> http://www.w3.org/2005/10/Process-20051014/process.html

   <Mez_> "Part of a Working Group's activities is developing code and
   test suites "

   <Mez_> http://www.w3.org/QA/WG/2005/01/test-faq

   <Mez_> Two types of testing are particularly helpful:

   <Mez_> Conformance testing

   <Mez_> Focuses on testing only what is formally required in the
   specification in order to verify whether an implementation conforms to
   its specifications. Conformance testing does not focus on performance,
   usability, the capability of an implementation to stand up under
   stress, or interoperability; nor does it focus on any
   implementation-specific details not formally required by the

   <Zakim> stephenF, you wanted to ask can we have some examples of such
   tests? (in a minute)

   tyler: there are some well known test cases - how the browser renders
   certain things in certain ways
   ... one thing we may want to specify is that key sequences used on
   first authentication with a site should be different from a second or
   subsequent authentication to the same site.

   <serge> This seems to be a matter of charter, from 1.3 in the Process
   Document: "The Working Group charter sets expectations about each
   group's deliverables (e.g., technical reports, test suites, and

   <tlr> ... or not. ;-)

   tlr: as far as conformance testing is concerned - we are not expected
   to build an automated test suite
   ... we would be required to formulate a test suite that could be
   followed to evaluate conformance. the test MAY consist of manual work
   (like examining a user interface)

   dan: conformance testing is likely not as big a deal as some of the
   other parts of our recommendation

   tlr: this is conformance testing work, but not as detailed or involved
   as it has seemed to be implied so far today
   ... let's get to writing recommendations and examples of using these
   (which should lead to conformance tests)

   stephenF: there could be quite a bit of testing needed - lots of
   configuration settings and such

   RobY: as it becomes more and more defined, more and more
   folks/companies will take interest

   tlr: critical piece is to have tests and examples. More critical to
   have an example and a test with it than to have an implementation.
   ... requirement -> example + testcase -> then implementation

   tyler: one place of potential problem - if something makes a
   request/requirement of a third party on the authenticity of something
   ... this would be difficult to find a non-conformant and conformant

   tlr: two ways around that:

   <stephenF> ways 1 & 2?

   tlr: 1) if you speak about conformance, give a definition of "trusted".
   ... phrase like "There shall be a phrase or outside-managed list which
   is consulted."
   ... 2) the other way is to declare that "trusted" is defined as follows

   <Mez_> http://www.w3.org/2006/WSC/Group/track/actions/179

   hal: should be able to just walk through the information in the wiki

   <tlr> http://www.w3.org/2006/WSC/wiki/InScopebyCategory

   <Mez_> http://www.w3.org/2006/WSC/drafts/note/#filters

   <Mez_> 5.5 Content based detection

   <Mez_> Techniques commonly used by intrusion detection systems, virus
   scanners and spam filters to detect illegitimate requests based on
   their content are out of scope for this Working Group. These techniques
   include recognizing known attacks by analyzing the served URLs,
   graphics or markup. The heuristics used in these tools are a moving
   target and so not a suitable subject for standardization. The Working
   Group will not recommend any checks on the content served by web si

   <Mez_> 5.5 is part of out of scope

   tyler: some of these seem to line up with what has been proposed in PII
   Editor work

   Bill-d: thought identity management systems are out of scope

   stephenF: there are things for Semantic Approaches that could be suggested
   which are out of scope (so they won't be suggested)

   group: first two under semantic approaches are deemed IN scope

   <hal> I am not able to hear most of the discussion

   <Mez_> hold on a bit

   rachna: even the third item (federated identity management) has
   elements in-scope (as a form-filler extension seems in-scope)

   <johnath> I won't be the one to recommend OpenID as a proposal. :)

   mez: our intent was to look at these today to see if there were
   concrete proposals which should be put forth
   ... hearing nothing from the group, it appears NO.
   ... onto the next category - What doesn't work

   hal: this has had much discussion already, so let us skip
   ... move on to Education category
   ... it is unresolved whether users understand that they are making
   "risk management" decisions
   ... next category General Principles
   ... some of these are conflicting
   ... next category New Indicators

   tyler: has Firefox reserved any "drawing modes" for itself?
   ... such as transparency?

   beltzner: only thing we've reserved is chrome
   ... one way is to have the element cross the information boundary

   tlr: we have talked about existing robustness practices
   ... still needs to be pulled together from raw material in the wiki
   ... existing practices need to be written up

   rachna: history and petnames still in?

   tyler: petnames are still in

   tlr: see here antipatterns for SSL certificate ... but not patterns

   mez: some of the positive is wrapped up in Johnathan's proposal

   johnath: both identity and what is a secure page are in the

   <tlr> +1 to skipping over process recommendations

   hal: skip over process indicators
   ... final section - technical recommendations
   ... comprehensive architecture for web authentication is out of scope
   ... incorporate viable authentication techniques - should be covered
   ... next several are really "motherhood"
   ... extensibility so authentication can be continuously improved - not
   sure how to write a recommendation
   ... specify infrastructure is out of scope
   ... metadata has already been discussed.

   tlr: if there are recommendations around trusted attention sequences -
   then there might be a deployment recommendation that sites include
   certain instructions

   hal: petnames is in play
   ... matching certificate contents is in play
   ... user controlled notation is in the same vein
   ... default blocking mode - is like safe browsing mode proposal that is
   under discussion right now

   tyler: SSL can detect a suspected MITM attack - currently user agent
   pops a dialog box. Should this be switched to just being an Error 404
   not found?

   yngve: Opera indicates that potential "eavesdropping" may be underway -
   so similar dialog

   tlr: there is stuff in the wiki that needs to be pulled together

   <tlr> ACTION-177 closed without doing

   <tlr> ACTION: farrell to pick up on ACTION-177, complement with review
   of TLS spec and exceptions given there; goal is to limit user
   interaction when not needed - due 2007-06-19 [recorded in

   <trackbot> Created ACTION-240 - pick up on ACTION-177, complement with
   review of TLS spec and exceptions given there; goal is to limit user
   interaction when not needed [on Stephen Farrell - due 2007-06-19].

   <tlr> ACTION-240 due 2007-06-26

   hal: secure letterhead is something still in play

   <johnath> tlr: so how do I log in to the action tracker?

   hal: Service Security Requirement (SSR) record in DNS proposal - should
   we work on it?

   mez: appears to be no interest from the group.

   hal: leverage new features (from workshop)

   beltzner: xul:browsermessage - it's the markup which indicates what
   comes up when a "pop-up" is blocked or if something should be installed

   rachna: APIs for anti-phishing? This could be APIs for third party

   mez: no comments or interest reflected by the group

day One wrap-up

   mez: agenda for tomorrow - lead off on logistics for next face-to-face
   ... tyler on remaining Note issues that we have
   ... bulk of the day walking through the editor's draft

Received on Sunday, 17 June 2007 22:11:51 UTC