- From: Thomas Roessler <tlr@w3.org>
- Date: Mon, 18 Jun 2007 00:11:43 +0200
- To: WSC WG <public-wsc-wg@w3.org>
Minutes from our face-to-face on 30 May were approved and are
available online:
http://www.w3.org/2007/05/30-wsc-minutes
Regards,
--
Thomas Roessler, W3C <tlr@w3.org>
[1]W3C
WSC WG face-to-face
30 May 2007
[2]Agenda
See also: [3]IRC log
Attendees
Present (in order of registration)
Thomas Roessler
Mary Ellen Zurko
Daniel Schutzer
Johnathan Nightingale
Phillip Hallam-Baker
Hal Lockhart (by phone)
Mike Beltzner
Shawn Duffy
Tyler Close
Serge Egelman
Maritza Johnson
Luis Barriga
Robert B. Yonaitis
Audian Paxson
Stephen Farrell
Tim Hahn
Stuart Schechter (by phone)
Rachna Dhamija
George Staikos (by phone)
Bill Doyle
Yngve Pettersen
Chair
Mez
Scribe
stephenF, beltzner, tjh
Contents
* [4]Topics
1. [5]Agenda Bashing
2. [6]Conformance model
3. [7]Intermediate agenda bashing
4. [8]Robustness Testing
5. [9]security usability testing
6. [10]implementation / testing / etc
7. [11]functional and conformance testing
8. [12]InScopeByCategory
9. [13]day One wrap-up
* [14]Summary of Action Items
__________________________________________________________________
Agenda Bashing
agenda bashing
next f2f discussion moved to later on
Conformance model
tlr on what we need to do for our specs
tlr: what does it mean for us to recommend something?
.. wants us to make testable statements ...
... by structuring the rec text into requirements, good practice and
implementation techniques ...
... also need to specify assumptions, e.g. icon display doesn't ...
... work with screen reader, have to say that visual display ...
... is an assumption ...
robert: wants no device dependence
... we should assume everything is usable device independent including
for accessibility, mobility etc
tlr: yes, but some implementation techniques are device dependent and
we do want to document those
... WCAG does content accessibility guidelines...
... but we're not talking so much about content, more about user agents
(chrome etc)
robert: US govt. accessibility rules != w3c ones
tlr: let's consider some rec-text as a group now...
... the stuff about favicons.
1st rec is that sites should not incorporate favicons at this time
phb: for display that informs trust decisions, only display
authenticated information
text here is at:
[15]http://www.w3.org/2006/WSC/drafts/rec/#favicon-favicons-rec
beltzner: who are we expecting to read/conform to REC?
(basically are we addressing UAs and/or sites)
phb: some types of site may pay attention (e.g. FIs)
mez: charter allows us to include both
serge: any evidence about favicons being trusted?
maritza: yep
tyler: still in the dark about what's a useful REC, hard to have
abstract discussion
... current bank practice shows that FIs may drag their feet anyway
yngve: maybe move favicon to somewhere user normally doesn't trust?
shawn: favicons being used, how likely is it that something else would
be accepted?
mez: back to what needs to be in the text for FPWD?
tlr: shows that there are lots of favicons in use
... shows example of padlock favicon
... this applies to UAs that display bitmaps
... and for which favicon uses (e.g. bookmarks, desktop, location
bar...)
... address bar is where favicon shows where we are now
mike: tlr is asking us to be very specific about MUST/SHOULD/etc
tlr: level of abstraction needs to be right, current text very far from
being specific enough
tyler: would like to document stuff we're planning to do to get
feedback
mez: we can do that
tlr: says what REC means
tyler: we're not competent yet to do MUST/SHOULD
tlr: ok to use MUST/SHOULD in FPWD even if it's likely to change (don't
worry)
mez: FPWD is important to get attention/feedback, important that status
is clear
rob: MUST/SHOULD/MAY all exist?
tlr: yes, we can use rfc 2119 or something else
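A minimal sketch of how RFC 2119 keywords could make rec text
mechanically checkable for testable statements. The draft wording and
the helper name requirement_levels are hypothetical, loosely modelled
on the favicon discussion above; they are not taken from the actual rec
text.

```python
import re

# Requirement-level keywords listed in the RFC 2119 abstract. Longer
# alternatives come first so "MUST NOT" is matched before "MUST".
RFC2119 = re.compile(
    r"\b(MUST NOT|SHALL NOT|SHOULD NOT|MUST|REQUIRED|SHALL|SHOULD|"
    r"RECOMMENDED|MAY|OPTIONAL)\b"
)


def requirement_levels(text):
    """Return (keyword, sentence) pairs for sentences using a keyword."""
    found = []
    # Naive sentence split on terminal punctuation; enough for a sketch.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = RFC2119.search(sentence)
        if match:
            found.append((match.group(1), sentence))
    return found


# Hypothetical draft wording, for illustration only.
draft = (
    "Sites SHOULD NOT incorporate favicons at this time. "
    "Displays that inform trust decisions MUST show only authenticated "
    "information. "
    "This sentence is informative only."
)

for keyword, sentence in requirement_levels(draft):
    print(keyword, "->", sentence)
```

Sentences without a keyword are skipped, which gives a rough list of
the statements a conformance test would need to cover.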
audian: back to tlr's bullets, is what tlr said what we want?
tlr: types on screen
phb: text equivalent is <title>
... lots of chat...
serge: we need text about definitions, e.g. saying what we mean...
... when we say "verified sites"
mez: we have a glossary
phb: different logos should have different authentication types/levels
<ses> Hi. Stuart isn't really awake right now but he'll be recording
what he sees in the jabber room until he wakes up (most likely for the
post-lunch discussion.)
break for n/w
<beltzner> tlr, [16]http://beltzner.ca/webdav/forthomas.txt
<beltzner> Mez_,
[17]http://www.w3.org/2002/09/wbs/39814/f2f3sched/results
<Mez_> serge and sduffy, here is the glossary, which should include
chrome
<Mez_> [18]http://www.w3.org/2006/WSC/wiki/Glossary
back now...
tlr: postpone rec formatting discussion until usability testing
tyler: why change the template?
tlr: to be able to have conformance requirements
... tlr typing a new template..
... new template's most important bits are applicability, requirement
and techniques
<tlr> ACTION: thomas to update template with material from discussion;
notify e-mail list [recorded in
[19]http://www.w3.org/2007/05/30-wsc-minutes.html#action01]
<trackbot> Created ACTION-227 - Update template with material from
discussion; notify e-mail list [on Thomas Roessler - due 2007-06-06].
<tlr> [20]http://www.w3.org/2002/09/wbs/39814/f2f3sched/
for next f2f - fill in questionnaire before lunch (next 2 hrs)
<beltzner> scribe is beltzner
<beltzner> tlr, ^ make that happen
<johnath> scribenick beltzner
<johnath> does that do it?
<tlr> ScribeNick: beltzner
<tlr> Chair: MEZ
Intermediate agenda bashing
(there is some agenda bashing happening as we mitigate for
technology-availability)
Mez_: call to order, Rachna to start session on "Usability Testing"
<Mez_> tlr, please action rachna to share the slides somehow; tx
<scribe> ACTION: rachna to share slides about usability testing from
dublin f2f [recorded in
[21]http://www.w3.org/2007/05/30-wsc-minutes.html#action02]
<trackbot> Created ACTION-228 - Share slides about usability testing
from dublin f2f [on Rachna Dhamija - due 2007-06-06].
Mez_: agenda reordering: s/Rachna/BillD
Robustness Testing
billd: Robustness Testing
... current browser environments from a user standpoint bring a bunch
of technologies together ...
<Mez_> tlr, please action bill to share his slides too. tx.
<scribe> ACTION: bill to share his slides on robustness testing from
the dublin f2f [recorded in
[22]http://www.w3.org/2007/05/30-wsc-minutes.html#action04]
<trackbot> Created ACTION-229 - Share his slides on robustness testing
from the dublin f2f [on Bill Doyle - due 2007-06-06].
rachna: do we have a definition for robustness?
Mez_: no, one should be added to the glossary
<scribe> ACTION: bill to define robustness for WSC glossary [recorded
in [23]http://www.w3.org/2007/05/30-wsc-minutes.html#action05]
<trackbot> Created ACTION-230 - Define robustness for WSC glossary [on
Bill Doyle - due 2007-06-06].
<Mez_> tx beltzner
bill: in IT circles, a "tiger team" was used to test effectiveness of
IT security measures ...
... one side would attack, another side would try to detect and
evaluate effectiveness of process and procedures being tested ..
Mez_: I see some potential overlap between items to be tested for
robustness vs. user understanding or usability ...
<Mez_> things that web content can do that exactly emulates the
security context information displays in our technical report are
"pure" robustness attacks
<Mez_> they will leverage either user agent vulnerabilities, or design
gaps or issues in the user agents
bill: the attacking team might go after the OS, plugins, user agent,
network layer
<stephenF> tlr has the magic zakim trick for the phone
<tlr> stephen, context?
bill: your browser actually sends a lot of information when you visit
websites
<stephenF> tlr - Mez wants the phone on
bill: [demonstrates metasploit]
<Mez_> jan vidar, is that you?
<Mez_> can you hear us?
beltzner: this seems to be out of scope, though, since we're talking
about user agent and system exploits
bill: well, exploits are out of scope, but the UI for security context
is in scope
beltzner: in so far as the user agent isn't exploited, yes
bill: so if the patches are out of date, at the user agent, OS or
plugin level, and the user is at risk
PHB: so, this used to be a concern with things like macromedia, where
the browser allowed plugins to take control easily and thus become
exploited
<Zakim> tlr, you wanted to ask what our question is
tlr: our deliverables are about how browsers and site authors should do
things, and I wonder what are we asking of a robustness testing
process?
rachna: it's hard to enumerate the robustness tests in advance, really
<tlr> beltzner: should we have indicators that help people ensure they
have the latest browser?
rachna: thus far we've talked about security indicators in chrome about
the web content, but not indicators about ensuring that the user is
running with an unexploited browser
... is that in scope?
Mez_: or in our goals? I don't have an immediate reaction
???: I'm hearing a lot of "the user can't determine", and want to
remind people that the user isn't an administrator or security
professional
sduffy: this feels like an odd road to me, in terms of whether or not
the user has an up-to-date browser
rachna: but I think the up-to-date-ness has a larger impact on security
Rob: when we look at testing (robustness, user, etc) we're talking
about user-agents ...
... users are doing a lot of different things in the content-area, not
just content, but applications, and sometimes applications in those
applications ..
... shouldn't we be breaking out our testing per area/recommendation
developed by this group?
Mez_: The reason we have three one-hour discussions about testing is to
lead into planning those three categories of testing.
stephenF: if you're talking about application updates and such,
there's an IETF working group on network endpoint assessment, they'll
address most of these issues
<tlr> [24]http://www.ietf.org/html.charters/nea-charter.html
sduffy: my main objection to including user agent updates in scope is
that it doesn't end up solving the problem since the OS could be out of
date
tlr: I don't even understand what it means for us to recommend that a
user has the latest user agent, since the user isn't the compliance
target
<tlr> ... or is he?
tlr: we have a bunch of items that describe current robustness
practises that have not been migrated to the format required by our
recommendation template
... in my opinion that should be a priority so that we can recommend
appropriate robustness tests
Mez_: that's not what we're talking about, IMO, so I'll call that out
of scope for the current conversation as is all of browser-updating
bill: how about the second issue of user agents divulging information
about the user?
Mez_: how does that fit in our scope?
beltzner: it does in that if the user doesn't mean to provide
information to non-trusted websites, and isn't aware of the information
being provided by default
Mez_: if we make a recommendation about privacy, then we should ensure
that the recommendation is robust
... until we make a recommendation about that, though, discussing how
to test its robustness seems wrong
... more conversation about scope definition ...
tlr: what I hear you (Rob) saying is that robustness testing can help
us identify weaknesses in all sorts of web applications, and while
that's valuable, it's not within the charter of this working group
<johnath> :)
tlr: tlr continues his point, driving it home with the force of a pile
driver ...
Mez_: anything else to share, billd?
bill-d: these were the points I wanted to raise
sduffy: we decided that some aspects of web apps were in scope ...
... there's a difference between SQL injection and website
vulnerability where clicking on a URL results in XSS/untrusted content
... the former is out of scope, the latter is in scope ...
... so web-apps as a whole aren't out of scope, are they?
johnath: it feels like we have enough work testing the robustness of
user facing display of web security context
... so I'm excited enough without taking on extra worries about user
agent/plugin/OS robustness
bill-d: I'm still trying to lock down discussions of what's
in-scope/out-of-scope, which is why I wanted to bring this up again
Mez_: well, now you know: it's testing the recommendations, not the
entire system
bill-d: still not clear where we are on divulging of information to
websites
beltzner: that's not part of our problem statement yet, let alone our
goals, let alone ...
Mez_: proposes an action on starting a discussion about the information
divulged to websites by user agents?
tlr: IMO, it stretches the notion from communicating web security
context to one about privacy
johnath: I think it's doomed for other reasons, but I don't think that
precludes the discussion on the group
<scribe> ACTION: bill to start a discussion about including
descriptions of the information divulged to websites by user-agents
[recorded in
[25]http://www.w3.org/2007/05/30-wsc-minutes.html#action06]
<trackbot> Created ACTION-231 - Start a discussion about including
descriptions of the information divulged to websites by user-agents [on
Bill Doyle - due 2007-06-06].
<johnath> meeting adjourned till 12:50pm local for lunch
<ses> When does local lunch end?
<johnath> ses: 5 minutes left
Mez_: Rachna, take it away!
<johnath> ses: starting back up now
<tlr> jvkrey, we're restarting
security usability testing
rachna: so, I'll start by defining what I mean by usability testing ...
... traditional security methodology of robustness is good but not
sufficient ...
... HCI methodology isn't sufficient since the attackers are modifying
along with us ...
... so I propose "red team" usability testing, where we actively attack
the user
... so both "can we use the system" and "how can we attack the user to
confound them"
... I have a bunch of questions
... 1. Will we test ideas or specific implementations of ideas?
<ses> I just joined
Mez_: how would we test a concept?
rachna: so for example, we could test a variety of implementations
instead of a specific one
<maritzaj> [26]http://www.w3.org/2006/WSC/wiki/SharedBookmarks
tyler: the tricky thing is once they make their implementation, they
stopped testing the concept
<maritzaj> #2 under usability studies about internet security
<Mez_>
[27]http://www.simson.net/ref/2006/CHI-security-toolbar-final.pdf
serge: instead of testing a toolbar, the study in question (see
maritzaj's link) tested the effectiveness of each indicator
<ses> Is anyone else on the phone? I could barely hear Rachna and I
can't hear Serge at all.
ses, sec, I'll move the phone
<ses> And he's got the phone so we should be able to hear him :)
beltzner: one could test the concepts on which a design is founded,
instead of the design itself
maritzaj: the answers will likely vary per recommendation
Mez_: so I don't understand how we'd decide whether to test a concept
or a design
rachna: yeah, I think we'll need to figure that out
tyler: part of this might be recognizing patterns in our
recommendations and test abstractions that cover aspects in each
recommendation
... to what extent do you think we can/should rely on the literature
instead of retesting some of those findings?
johnath: we have a huge body of research that has led to some of these
recommendations, I think it should be up to us to point to the
foundation and identify areas for follow up testing
rachna: well, where huge = from 2005 onwards
... 2. At what level of fidelity should we be testing?
<ses> Calling the existing body of research huge is the kind of
statement that could lead a small research area to turn downright
anorexic.
rachna: low-fi prototyping is sketches, medium-fi is flash or web
mockups, high-fi is extensions or browser modifications
maritzaj: previous research will come into play as references
rachna: using lower-fidelity prototypes will increase our bandwidth
... 3. What should we be testing?
... learnability, efficiency, skills required, flexibility,
satisfaction, errors, compliance rates
Mez_: remembers a study, sort of, that might be about how instructions
affected the results ...
maritzaj: the Jackson/MSR one?
Mez_: yes! and I haven't heard anything about that dimension
rachna: I think you're referring to a problem that exists in that they
had to describe EV certificates to one group
tyler: yes, but I thought they controlled for that
<PHB> PHB thought he was on the queue before, had disconnected when I
powered up the VPN
Audian: often helps to start with a list of assertions and
verify/validate those first, then move onto low-fi mockups, and use
those for the validation
rachna: yes, that's an excellent way to do designs
<ses> Can't hear Phil.
<ses> I can't stay awake without some cursing in my direction.
<ses> :)
<Audian> what = a quality test base?
<Audian> statistical relevance
rachna: we need to identify the goals of a study, as well: why do users
behave the way they do? what are users reliably capable of performing?
does technology X protect against attack A? etc, etc.
Mez_: 100% usable security is a dream, not a goal, and we should make
sure that the studies aren't tasked with finding the 100% solution ...
... people who believe in a recommendation will always argue that the
hit rate is "good enough", though, which worries me.
tyler: do we want to provide a target hit rate, then?
<Zakim> johnath, you wanted to respond to Mez, tyler, on quality of
data
johnath: it's easy to say "20%" is worse than "40%", but the trick will
be showing statistical significance in the effectiveness of the
recommendation
... the test you want is "this creates _an_ improvement"
johnath: at which point we can defend the assertion
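A minimal sketch of the significance check johnath describes, assuming
a two-sided two-proportion z-test is an acceptable stand-in for
whatever analysis a real study would use. The function name and the
participant counts are hypothetical; no study data is implied.

```python
from math import erfc, sqrt


def two_proportion_p_value(hits_a, n_a, hits_b, n_b):
    """Two-sided p-value for the difference between two hit rates.

    Uses the pooled two-proportion z-test (normal approximation), so it
    is only a rough guide when the groups are small.
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Hypothetical counts: the same 20% vs 40% rates at two sample sizes.
small_study = two_proportion_p_value(2, 10, 4, 10)
large_study = two_proportion_p_value(20, 100, 40, 100)
print(f"n=10 per group:  p = {small_study:.3f}")
print(f"n=100 per group: p = {large_study:.4f}")
```

The same 20% vs 40% gap falls below the conventional 0.05 threshold
with 100 participants per group but not with 10, which is why the raw
percentages alone do not support the assertion.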
<Mez_> better for the majority of web users
beltzner: once a recommendation is established as significant, if it
competes against another recommendation, then we can do a comparison
test
tyler: so what does it take for something like that to get put into
Mozilla?
beltzner: bring it forward to the Firefox product team (it's an open
meeting) and propose it; in my experience, you need to prove the worth
and value for the majority of the web
rachna: should we test in-lab or in the wild?
tlr: do we need to answer this now, or leave this to you?
rachna: we can come back to this later, but it depends on the
resources?
PHB: if we can work out a way of doing it in the wild, it would be much
better
<Mez_> +1 to phb
<Mez_> from an industry pov
PHB: oftentimes in-lab participants already know something about
security
tyler: this intersects with the fidelity; deployable add-ons are more
easily tested "in the wild"
maritzaj: tradeoffs for both, in-wild experiments aren't easy to
organize
<ses> Everyone forgot to project. I haven't heard any of the speakers.
<ses> Beltzner and Rachna are coming in clear.
dan: in lab can be used to filter and then in the wild can be used to
test larger populations
tyler: bias is obviously a concern, but studies (like rachna and
stuart's) have shown a non significant correlation between bias and
effectiveness
rachna: well, that doesn't mean it didn't exist, since we were
controlling for it, not testing for it
serge: we didn't find any correlation either
... we discussed some of these issues at CHI, and the differences
seemed to break down as wild being good for quantitative, lab being
good for qualitative
tim: there are various audiences for the user agents which will affect
the demographics of in the wild testing
dan: we could do tests through deploying in various banks, etc, to get
a good cross section
Audian: I've often seen domain experts being far more critical of a
solution until the "would you recommend this to a friend" at which
point they all decided they would
rachna: another complication is that each proposal can be attacked in a
different way
stephenF: how do you figure out the workload?
rachna: depends on the attack, and whether or not there are
easily-reused exploits we can copy and paste
stephenF: there's some shortcuts we can take
PHB: we should be tactical about some things, like we know that w.r.t.
phishing, a takedown service shifts the problem to another target, but
doesn't kill the overall problem
... or in crypto, adding a bit to the key doubles the work for the
attacker
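PHB's crypto remark can be illustrated with a short sketch, assuming
the attacker's work is an exhaustive search over all possible keys. The
function name and the key lengths are illustrative only.

```python
def keyspace_size(bits):
    """Number of candidate keys an exhaustive search must cover."""
    return 2 ** bits


# Each extra bit doubles the number of keys to try.
for bits in (56, 57, 58):
    ratio = keyspace_size(bits) // keyspace_size(bits - 1)
    print(f"{bits}-bit key: {keyspace_size(bits)} keys "
          f"({ratio}x the {bits - 1}-bit keyspace)")
```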
<PHB> Well it would be rather nice to know if the solution was intended
to be tactical or strategic
tlr: does this imply a different set of scenarios or can we just rely
on our existing ones?
<PHB> Rather a lot of solutions being sold a year ago as strategic
turned out to be distinctly tactical
tyler: our recommendations should each address the threat that they
attempt to defeat
<Mez_> [28]http://www.w3.org/2006/WSC/wiki/ThreatTrees
<PHB> Sitekey
rachna: does that mean modifying our threat trees to include these (on
her slides) attacks?
tyler: I think it would be a different section
rachna: I'm very confused
dan: we need to make sure the recommendations are significant at
attacking the problems (or something? help?)
rachna: another question is do we want to discover the usability
problems, or do we want to assert significant effect?
tlr: that is a question which we should try to answer today
<johnath> actually
<johnath> Serge is saying it
serge: my feeling is that we should be trying to assert effectiveness
<johnath> oh okay...
maritzaj: small N might be useful in the early stages as we try to
filter down
<Mez_> +1 to johnath
johnath: sprinkles +1s all over the PhD students
serge: I'm on a grant, we might be able to get similar resources from
other groups
<Zakim> Mez_, you wanted to tell serge that we're testing
recommendations
Mez_: we're gonna be testing our recommendations, so you should be
testing those
<johnath> serge: I think Mez misinterpreted your "I have a grant to
work on this stuff" to mean "I have stuff that I am already testing,
that maybe I haven't mentioned here yet" instead of "I have money, I
can probably help out with testing our recs."
<serge> no, what I meant is that recommendations can be implemented and
tested
<johnath> right, yes, but not that you were just incidentally
mentioning unrelated thesis work :)
rachna: testing requires IRB approval for me, this adds to overhead
Mez_: the W3C staffers won't be doing the testing, so there's no
MIT/IRB requirement
<serge> hey, if the recommendations can be included in that, great.
Likewise, those in the corporate world have a vested interest in making
the user studies happen because the results can be incorporated into
products.
<ses> Many IRBs demand to be involved if anyone in their institution
will be an author on findings. If the note is considered a finding,
that could cause issues.
<PHB> No human subjects oversight? time to redo Milgram?
<Mez_> tim andI would never redo milgram
<johnath> Mez is too nice for Milgram
<johnath> aww
<Mez_> ses: author/editors only I presume; I pesume acknowledgements
wouldn't be an issue
<ses> I'm the one who couldn't spell.
<Mez_> hahaha
<tlr> ses, MEZ, I'd think it's most useful to aim at scholarly
publication of original results and summarize / cite that in the W3C
deliverables
<ses> My spalling suks.
<serge> I'm currently working on a related study, if anyone's
interested I can summarize it
<Mez_> we're running a bit long; how about in email serge?
<serge> okay, in that case I might just wait until I have more results.
<Mez_> what's the eta on that?
<serge> maybe 2-3 weeks.
<Mez_> ok, sounds just fine
<Mez_> beltzner; want to give serge the action item?
<scribe> ACTION: serge to share results from his study once he has them
[recorded in
[29]http://www.w3.org/2007/05/30-wsc-minutes.html#action07]
<trackbot> Created ACTION-232 - Share results from his study once he
has them [on Serge Egelman - due 2007-06-06].
<tlr> ACTION: rdhamija2 to make sure Jagatic et al on social phishing
is in SharedBookmarks [recorded in
[30]http://www.w3.org/2007/05/30-wsc-minutes.html#action08]
<trackbot> Created ACTION-233 - Make sure Jagatic et al on social
phishing is in SharedBookmarks [on Rachna Dhamija - due 2007-06-06].
<tlr> ACTION-232 due 2007-06-30
<serge> 6/6 is not 2-3 weeks, unless we're on some crazy new calendar
system...
<Mez_>
[31]http://www.indiana.edu/~phishing/social-network-experiment/phishing-preprint.pdf
<tlr> serge, see my remark about due date
<Mez_> is the jagatic study, and it's in our shared bookmarks
<serge> thanks
<Mez_> beltzner, give rachna an action on the ebay www2006 jakobsson
paper; I can't find it in shared bookmarks or on the web
<scribe> ACTION: rdhamija2 to add www2006 jakobsson, Florencio &
Hursley MSR paper to our shared bookmarks list [recorded in
[32]http://www.w3.org/2007/05/30-wsc-minutes.html#action09]
<trackbot> Created ACTION-234 - Add www2006 jakobsson, Florencio &
Hursley MSR paper to our shared bookmarks list [on Rachna Dhamija - due
2007-06-06].
<Audian> battery almost dead
tyler: so it looks like there's two in the wild tactics: actively
attack and measure effectiveness, or instrument existing browsers or our
solutions
<tlr> ACTION-234 confuses several papers
<tlr> Jakobsson Ratkiewicz is what I meant, it's from WWW 2006.
[33]http://www2006.org/programme/item.php?id=3533
<Mez_> tlr, give whatever other actions are needed
beltzner: it seems to me like active attacks are a way of validating our
threat trees, not our solutions
tlr, want to take the action from rachna? I'm sure she wouldn't mind
bill-d: if we make changes to the user experience, how do you test in
the wild?
rachna: right, and that's tough in lab as well, since sometimes users
need to be trained, or the act of them being in the lab ends up
training them
Audian: there's ways of doing it almost at random by pulling people
aside in schools and malls
stephenF: possibility that some of the "hits" are false positives of
users entering wrong passwords on purpose
<Mez_> tlr or beltzner - an action on rachna to update timeline with
things like irb turnaround
<Mez_> please
rachna: once we have proposals, we need to enter low-fi prototyping
phase, then figure out what we're trying to prove, set up the studies,
set up the infrastructure, etc, and this all requires resources and
time
<scribe> ACTION: rdhamija2 to update / create a user testing timeline
with things like IRB turnaround, setup, etc. [recorded in
[34]http://www.w3.org/2007/05/30-wsc-minutes.html#action10]
<trackbot> Created ACTION-235 - Update / create a user testing timeline
with things like IRB turnaround, setup, etc. [on Rachna Dhamija - due
2007-06-06].
serge: there have been cases of people in studies entering real
information, in counterpoint to stephenF
maritzaj: reiterating an earlier point, we should tie references in the
shared bookmarks to recommendations
<maritzaj>
[35]http://www.w3.org/2006/WSC/wiki/StatusQuoUserStudyResults
Audian: how do we know who's submitting resources for testing?
<scribe> ACTION: rdhamija2 to track donations of time and resources for
usability testing [recorded in
[36]http://www.w3.org/2007/05/30-wsc-minutes.html#action12]
<trackbot> Created ACTION-236 - Track donations of time and resources
for usability testing [on Rachna Dhamija - due 2007-06-06].
<scribe> ACTION: maritza to drive process of tying recommendations to
references in SharedBookmarks [recorded in
[37]http://www.w3.org/2007/05/30-wsc-minutes.html#action13]
<trackbot> Created ACTION-237 - Drive process of tying recommendations
to references in SharedBookmarks [on Maritza Johnson - due 2007-06-06].
<scribe> ACTION: rhdamija2 create and document user testing plan (with
links to timeline, donations, prototypers, etc) [recorded in
[38]http://www.w3.org/2007/05/30-wsc-minutes.html#action14]
<trackbot> Sorry, couldn't find user - rhdamija2
<scribe> ACTION: rdhamija2 create and document user testing plan (with
links to timeline, donations, prototypers, etc) [recorded in
[39]http://www.w3.org/2007/05/30-wsc-minutes.html#action15]
<trackbot> Created ACTION-238 - Create and document user testing plan
(with links to timeline, donations, prototypers, etc) [on Rachna
Dhamija - due 2007-06-06].
<ses> I'm signing off. I've had too hard a time keeping up and staying
focused given the quality of the call. (I also need to drive into work
at some point.)
<johnath> break until 2:30 local time (12 minutes)
implementation / testing / etc
beltzner: big believer in prototyping, sketching, whiteboarding - to
build wireframes
... use that as a way of expressing things better than text
... allows communication and discussion
... but once that finishes, there should be no limits on what
technology is used.
... should be something that enables testers to get what they want out
of it - HTML, Flash, Firefox extensions
... all the way to an installable browser client which can be
downloaded.
tlr: might include changes to other browsers (e.g. Opera)
... on the one hand - prototypes for testing
... on the other hand - things taken up by user agent implementers.
... what recommendations are sufficiently spelled out to allow for
implementation?
... what more do the browser vendors need to understand for each of the
recommendations?
... what are the reasonable expectations for how long it would take to
implement?
beltzner: kind of putting the cart before the horse ... let's get the
prototypes available so that anyone can run them (any browser vendor
included)
tyler: in terms of time it takes - have experience with add-ons for
Firefox and IE.
... much easier (in Tyler's experience) in Firefox than in IE.
... IE requires use of the COM libraries
... Firefox lacks some documentation (source browsing required in some
cases to understand Firefox operation)
... IE has support for .html and .hta - where HTA puts ALL of the
window format under the control of the HTML (HTA) file. This allows
testing out toolbars and such.
mez: not sure this covers all the testing types
serge: everything that requires attacking the user will require a HIGH
fidelity prototype
RobY: hoping that what we want to test for can be tested
programmatically
... all should be programmatically testable (i.e. in a high-fidelity
agent)
serge: clarify "doing tests" vs. "doing studies"
RobY: example: assure that a C# file did not contain X ... this can be
done by a program testing for this.
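A minimal sketch of the kind of programmatic check RobY describes,
assuming the forbidden content can be expressed as a regular
expression. The minutes do not say what "X" is, so the pattern, the
sample source and the function name below are all hypothetical.

```python
import re


def find_violations(source_text, forbidden_pattern):
    """Return (line_number, line) pairs where the pattern occurs."""
    pattern = re.compile(forbidden_pattern)
    return [
        (number, line)
        for number, line in enumerate(source_text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Hypothetical stand-in for "X": any reference to a favicon file.
FORBIDDEN = r"favicon\.ico"

# Hypothetical two-line source file to check.
sample = 'var icon = LoadIcon("favicon.ico");\nvar title = page.Title;\n'

for number, line in find_violations(sample, FORBIDDEN):
    print(f"line {number}: {line}")
```

An empty result means the file passes this one check; it says nothing
about the usability questions that serge separates out as "studies".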
mez: hold off on questions on how to do conformance testing for a half
hour.
... have been assuming that someone will be testing EVERY
recommendation we produce
<serge> so how do we make claims about the recommendation without doing
studies?
tyler: the model is likely to be the "champion" model - whoever is most
interested, will take up the flag
<beltzner> serge, you must be new to W3C recommendations!
<serge> I was on the P3P 1.1 group
<beltzner> for your wounds, sir
<serge> there's a difference between turning existing privacy standards
into electronic form and recommending arbitrary design guidelines for
user agents.
<serge> if we're going to make recommendations, we need some data to
support them.
<beltzner> serge, I reject your argument entirely, but that's a topic
for when we're drinking
tlr: there is a candidate recommendation step of going ahead further
... this could include something that requires an implementation step
in order to go further
... implementation might require coaxing and encouragement
... who, at this point, is in a position to say that they might be able
to start doing something?
RobY: one of the things mentioned was getting people to write the test.
... for best practices, we don't have to write the tests, we just need
to let developers know what to test.
<stephenF> dinner: looking at www.boxtyhouse.ie for 6pm
<stephenF> coffee: more outside now
Yngve: when there are more particulars, we can look at getting a team
involved.
... cannot say when we will be able to test.
mez: as part of going forward - there needs to be something about
implementation/prototype and conformance, robustness, and usability
test
... and for these three, there's going to have to be some sort of
implementation
... would like to demonstrate something by our next face to face
bill-d: what does it take to get someone to sign up?
mez: people who are champions and are capable of doing it themselves
(or coax someone else) are going to get their recommendations through
first.
<Zakim> johnath, you wanted to point out that this is a helpful step
anyhow - particularly under a champion model
johnath: echoing that some implementation is going to be a healthy
thing. And having the champion get someone else to write it makes this
a collaboration of more than one person
serge: what's the point of doing conformance testing if we're only
using this to bolster the recommendation
mez: hold conformance testing for 15 minutes
tlr: I agree on the importance of implementation
... we should not create an environment in which each proposal becomes
one person's personal burden to implement/push.
... must do this as a group - advance recommendations which WE agree
should move forward
<johnath> staikos: conversation is already underway - backscroll should
give you a good enough idea as to whether you object horribly :)
<staikos> one thing I hope will be covered, if it hasn't, is html5. I'm
not sure how many of you have read the draft spec for this but it kind
of turns our work on its head
mez: is there someone who thinks we can get to consensus on a
recommendation without appropriate testing - please speak up.
<hal> dropping off to attend ws-sx tc call - back in 30 mins or less
<staikos> you know, with things like web pages able to open files,
sockets, register themselves as protocol handlers, etc
tyler: we have to be sensitive that there are limited developer
resources in the sky to work on this
<Audian> maybe not "invisible" but the lack of a favicon is hard to
test...its been there for a long time and now it isn't
tlr: there may be consensus on recommendations which should be looked
at further
... don't want to get bogged down on waiting for an implementation step
tyler: should we issue recommendations for things that have NOT been
tested?
tlr: recommendation is the final stage on the recommendation track ...
and this means it HAS been subjected to tests
... what we're working on right now is "drafts for recommendation" ...
which we can document now ... without testing having been done.
<Zakim> johnath, you wanted to say that my support for the champion
model didn't imply that implementation was *necessary*
johnath: if no one touches a recommendation and it drops on the floor
... then maybe that is OK. Likes the idea of champions for a
recommendation
serge: seems weird to come up with draft recommendations before testing
them out to see if they're useful.
<Zakim> tlr, you wanted to make an ontological point
tlr: perhaps we are again having some terminology conflicts
mez: we need a noun for what was discussed in Lightning Discussions
rachna: PROPOSALs is offered up
stephenf: not everyone is in the room - public draft is useful to get
wider input
mez: believes we can put out a first draft with only expert opinions
... and hopefully by June/July
... this last discussion has been quite good
... break for 30 minutes.
<johnath> bill-d: [40]http://www.boxtyhouse.ie/
<johnath> staikos: any recommendations for or against?
<staikos> well
<staikos> the best food I had in Dublin was at an italian cafe :)
starting up again
stephenF: 6PM - Temple Bar (10-15 minute walk)
functional and conformance testing
<stephenF> staikos: they're just getting what they asked for:-)
RobY: conformance testing is for telling testers how to test what has
been defined
... do we want to test compliance to this standard - or do we just want
to put it out? It is a significant amount of work to do all of this
testing.
... do we need conformance testing for the developers who implement
these recommendations?
... unless we put things in that say what a developer cannot do, Rob
doesn't see the need for adding in conformance testing
mez: the bulk of our recommendations will be towards user agent
developers ... though some recommendations will be pointed towards
content providers
... but conformance testing not required for user agents/user agent
developers?
RobY: the set of user agent developers is not a huge community
serge: agree with Rob that we should not focus on conformance testing
<tlr> thomas: testability of work product vs. broad-scale testing vs.
ability to test a limited population
<tlr> ... when limited population, then need some tests, but don't need
ability to automate that testing ...
<Mez_> [41]http://www.w3.org/2005/10/Process-20051014/process.html
<Mez_> "Part of a Working Group's activities is developing code and
test suites "
<Mez_> [42]http://www.w3.org/QA/WG/2005/01/test-faq
<Mez_> Two types of testing are particularly helpful:
<Mez_> Conformance testing
<Mez_> Focuses on testing only what is formally required in the
specification in order to verify whether an implementation conforms to
its specifications. Conformance testing does not focus on performance,
usability, the capability of an implementation to stand up under
stress, or interoperability; nor does it focus on any
implementation-specific details not formally required by the
specification.
<Zakim> stephenF, you wanted to ask can we have some examples of such
tests? (in a minute)
tyler: there are some well known test cases - how the browser renders
certain things in certain ways
... one thing we may want to specify is that key sequences used on
first authentication with a site should be different from a second or
subsequent authentication to the same site.
<serge> This seems to be a matter of charter, from 1.3 in the Process
Document: "The Working Group charter sets expectations about each
group's deliverables (e.g., technical reports, test suites, and
tutorials)."
<tlr> ... or not. ;-)
tlr: as far as conformance testing is concerned - we are not expected
to build an automated test suite
... we would be required to formulate a test suite that could be
followed to evaluate conformance. the test MAY consist of manual work
(like examining a user interface)
dan: conformance testing is likely not as big a deal as some of the
other parts of our recommendation
tlr: this is conformance testing work, but not as detailed or involved
as it has seemed to be implied so far today
... let's get to writing recommendations and examples of using these
(which should lead to conformance tests)
stephenF: there could be quite a bit of testing needed - lots of
configuration settings and such
RobY: as it becomes more and more defined, more and more
folks/companies will take interest
tlr: critical piece is to have tests and examples. More critical to
have an example and a test with it than to have an implementation.
... requirement -> example + testcase -> then implementation
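The ordering tlr describes (requirement, then example plus test case,
then implementation) could be recorded roughly as follows. This is only
a sketch: the class names, field names and the readiness rule are
assumptions made for illustration and do not come from any WSC draft.
The `manual` flag reflects the earlier point that a test may consist of
manual work such as examining a user interface.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ConformanceCheck:
    """One test procedure; hypothetical structure, not a W3C format."""
    procedure: str        # steps a tester follows
    manual: bool = True   # manual inspection is acceptable


@dataclass
class Requirement:
    """A testable statement with its examples and test cases."""
    identifier: str       # hypothetical label
    statement: str
    examples: List[str] = field(default_factory=list)
    checks: List[ConformanceCheck] = field(default_factory=list)

    def ready_for_implementation(self) -> bool:
        # Assumed rule: requirement -> example + test case -> implementation,
        # so both an example and a test must exist first.
        return bool(self.examples) and bool(self.checks)


if __name__ == "__main__":
    req = Requirement(
        "R-hypothetical",
        "Key sequence for first authentication differs from later ones",
    )
    req.examples.append("First visit to a site versus a return visit")
    req.checks.append(ConformanceCheck("Examine the user interface by hand"))
    print(req.ready_for_implementation())
```

A requirement with no example or no test attached reports that it is
not yet ready, which matches the priority tlr gives to examples and
tests over implementations.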
tyler: one place of potential problem - if something makes a
request/requirement of a third party on the authenticity of something
... it would be difficult to find conformant and non-conformant
examples for this.
tlr: two ways around that:
<stephenF> ways 1 & 2?
tlr: 1) if you speak about conformance, give a definition of "trusted".
... phrase like "There shall be a phrase or outside-managed list which
is consulted."
... 2) the other way is to declare that "trusted" is defined as follows
....
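The first approach, a definition of "trusted" that rests on an
outside-managed list being consulted, makes conformant and
non-conformant cases easy to construct, because a tester can supply the
list. A minimal sketch, assuming that trust is simple membership after
trimming and lower-casing; the function names and the normalization are
hypothetical and were not discussed in the meeting.

```python
from typing import Callable, Iterable


def make_trust_check(managed_list: Iterable[str]) -> Callable[[str], bool]:
    """Build a predicate that consults an outside-managed list."""
    # Assumption: names are compared case-insensitively, ignoring
    # surrounding whitespace.
    normalized = frozenset(entry.strip().lower() for entry in managed_list)

    def is_trusted(party: str) -> bool:
        return party.strip().lower() in normalized

    return is_trusted


if __name__ == "__main__":
    # Hypothetical list contents, supplied by the tester.
    check = make_trust_check(["listed-party.example"])
    print(check("listed-party.example"))    # on the list
    print(check("unlisted-party.example"))  # not on the list
```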
<johnath> ping
InScopeByCategory
<Mez_> [43]http://www.w3.org/2006/WSC/Group/track/actions/179
hal: should be able to just walk through the information in the wiki
<tlr> [44]http://www.w3.org/2006/WSC/wiki/InScopebyCategory
<johnath> rachna: don't forget your DVI->VGA donglything - tlr just
unplugged it
<Mez_> [45]http://www.w3.org/2006/WSC/drafts/note/#filters
<Mez_> 5.5 Content based detection
<Mez_> Techniques commonly used by intrusion detection systems, virus
scanners and spam filters to detect illegitimate requests based on
their content are out of scope for this Working Group. These techniques
include recognizing known attacks by analyzing the served URLs,
graphics or markup. The heuristics used in these tools are a moving
target and so not a suitable subject for standardization. The Working
Group will not recommend any checks on the content served by web si
<Mez_> 5.5 is part of out of scope
tyler: some of these seem to line up with what has been proposed in PII
Editor work
Bill-d: thought identity management systems are out of scope
stephenF: there are things for Semantic Approaches that could be
suggested which are out of scope (so they won't be suggested)
group: first two under semantic approaches are deemed IN scope
<hal> I am not able to hear most of the discussion
<Mez_> hold on a bit
rachna: even the third item (federated identity management) has
elements in-scope (as a form-filler extension seems in-scope)
<johnath> I won't be the one to recommend OpenID as a proposal. :)
mez: our intent was to look at these today to see if there were
concrete proposals which should be put forth
... hearing nothing from the group, it appears NO.
... onto the next category - What doesn't work
hal: this has had much discussion already, so let us skip
... move on to Education category
... it is unresolved whether users understand that they are making
"risk management" decisions
... next category General Principles
... some of these are conflicting
... next category New Indicators
tyler: has Firefox reserved any "drawing modes" for itself?
... such as transparency?
beltzner: only thing we've reserved is chrome
... one way is to have the element cross the information boundary
tlr: we have talked about existing robustness practices
... still needs to be pulled together from raw material in the wiki
... existing practices need to be written up
rachna: history and petnames still in?
tyler: petnames are still in
tlr: see here antipatterns for SSL certificate ... but not patterns
mez: some of the positive is wrapped up in Johnathan's proposal
johnath: both identity and what is a secure page are in the
recommendations
<tlr> +1 to skipping over process recommendations
hal: skip over process indicators
... final section - technical recommendations
... comprehensive architecture for web authentication is out of scope
... incorporate viable authentication techniques - should be covered
... next several are really "motherhood"
... extensibility so authentication can be continuously improved - not
sure how to write a recommendation
... specify infrastructure is out of scope
... metadata has already been discussed.
<johnath> hal - ping
tlr: if there are recommendations around trusted attention sequences -
then there might be a deployment recommendation that sites include
certain instructions
<Mez_> welcome back hal
<Mez_> taking a minute
hal: petnames is in play
... matching certificate contents is in play
... user controlled notation is in the same vein
... default blocking mode - is like safe browsing mode proposal that is
under discussion right now
<hal> hello
tyler: SSL can detect a suspected MITM attack - currently user agent
pops a dialog box. Should this be switched to just being an Error 404
not found?
yngve: opera indicates that potential "eavesdropping" may be underway -
so similar dialog
tlr: there is stuff in the wiki that needs to be pulled together
<tlr> ACTION-177 closed without doing
<tlr> ACTION: farrell to pick up on ACTION-177, complement with review
of TLS spec and exceptions given there; goal is to limit user
interaction when not needed - due 2007-06-19 [recorded in
[46]http://www.w3.org/2007/05/30-wsc-minutes.html#action16]
<trackbot> Created ACTION-240 - pick up on ACTION-177, complement with
review of TLS spec and exceptions given there; goal is to limit user
interaction when not needed [on Stephen Farrell - due 2007-06-19].
<tlr> ACTION-240 due 2007-06-26
<staikos> what time are you wrapping up?
<johnath> tlr: what's the urlhack to allow editing when I don't have
the "edit this action" link?
<tlr> append /edit
<beltzner> staikos: 12:30 EDT
<staikos> heh guess it's not worth calling now
<tlr> but if you don't have that link, it means you're looking at the
public version
hal: secure letterhead is something still in play
<johnath> tlr: so how do I log in to the action tracker?
hal: Service Security Requirement (SSR) record in DNS proposal - should
we work on it?
mez: appears to be no interest from the group.
hal: leverage new features (from workshop)
beltzner: xul:browsermessage - it's the markup which indicates what
comes up when a "pop-up" is blocked or if something should be installed
<Mez_> rachna
<Mez_> is talking
rachna: APIs for anti-phishing? This could be APIs for third party
services
mez: no comments or interest reflected by the group
day One wrap-up
mez: agenda for tomorrow - lead off on logistics for next face-to-face
... tyler on remaining Note issues that we have
... bulk of the day walking through the editor's draft
<beltzner> it doesn't appear that the xul:notificationbox can be used
in content
<beltzner> reference is here
[47]http://developer.mozilla.org/en/docs/XUL:notificationbox
adjourned for the day
<Mez_> [48]http://www.boxtyhouse.ie/
Summary of Action Items
ACTION-227 - Update template with material from discussion; notify
e-mail list [on Thomas Roessler - due 2007-06-06].
ACTION-228 - Share slides about usability testing from Dublin f2f [on
Rachna Dhamija - due 2007-06-06].
ACTION-229 - Share his slides on robustness testing from the Dublin f2f
[on Bill Doyle - due 2007-06-06].
ACTION-230 - Define robustness for WSC glossary [on Bill Doyle - due
2007-06-06].
ACTION-231 - Start a discussion about including descriptions of the
information divulged to websites by user-agents [on Bill Doyle - due
2007-06-06].
ACTION-232 - Share results from his study once he has them [on Serge
Egelman - due 2007-06-06].
ACTION-233 - Make sure Jagatic et al on social phishing is in
SharedBookmarks [on Rachna Dhamija - due 2007-06-06].
ACTION-234 - Add WWW2006 Jakobsson, Florencio & Hursley MSR paper to
our shared bookmarks list [on Rachna Dhamija - due 2007-06-06].
ACTION-235 - Update / create a user testing timeline with things like
IRB turnaround, setup, etc. [on Rachna Dhamija - due 2007-06-06].
ACTION-236 - Track donations of time and resources for usability
testing [on Rachna Dhamija - due 2007-06-06].
ACTION-237 - Drive process of tying recommendations to references in
SharedBookmarks [on Maritza Johnson - due 2007-06-06].
ACTION-238 - Create and document user testing plan (with links to
timeline, donations, prototypers, etc) [on Rachna Dhamija - due
2007-06-06].
ACTION-240 - pick up on ACTION-177, complement with review of TLS spec
and exceptions given there; goal is to limit user interaction when not
needed [on Stephen Farrell - due 2007-06-19].
[End of minutes]
__________________________________________________________________
Minutes formatted by David Booth's [49]scribe.perl version 1.127
([50]CVS log)
$Date: 2007/06/17 22:03:44 $
References
1. http://www.w3.org/
2. http://lists.w3.org/Archives/Public/public-wsc-wg/2007May/0158.html
3. http://www.w3.org/2007/05/30-wsc-irc
4. http://www.w3.org/2007/05/30-wsc-minutes.html#agenda
5. http://www.w3.org/2007/05/30-wsc-minutes.html#item01
6. http://www.w3.org/2007/05/30-wsc-minutes.html#Conformanc
7. http://www.w3.org/2007/05/30-wsc-minutes.html#Intermediat
8. http://www.w3.org/2007/05/30-wsc-minutes.html#Robustness
9. http://www.w3.org/2007/05/30-wsc-minutes.html#item02
10. http://www.w3.org/2007/05/30-wsc-minutes.html#item03
11. http://www.w3.org/2007/05/30-wsc-minutes.html#item04
12. http://www.w3.org/2007/05/30-wsc-minutes.html#item05
13. http://www.w3.org/2007/05/30-wsc-minutes.html#item06
14. http://www.w3.org/2007/05/30-wsc-minutes.html#ActionSummary
15. http://www.w3.org/2006/WSC/drafts/rec/#favicon-favicons-rec
16. http://beltzner.ca/webdav/forthomas.txt
17. http://www.w3.org/2002/09/wbs/39814/f2f3sched/results
18. http://www.w3.org/2006/WSC/wiki/Glossary
19. http://www.w3.org/2007/05/30-wsc-minutes.html#action01
20. http://www.w3.org/2002/09/wbs/39814/f2f3sched/
21. http://www.w3.org/2007/05/30-wsc-minutes.html#action02
22. http://www.w3.org/2007/05/30-wsc-minutes.html#action04
23. http://www.w3.org/2007/05/30-wsc-minutes.html#action05
24. http://www.ietf.org/html.charters/nea-charter.html
25. http://www.w3.org/2007/05/30-wsc-minutes.html#action06
26. http://www.w3.org/2006/WSC/wiki/SharedBookmarks
27. http://www.simson.net/ref/2006/CHI-security-toolbar-final.pdf
28. http://www.w3.org/2006/WSC/wiki/ThreatTrees
29. http://www.w3.org/2007/05/30-wsc-minutes.html#action07
30. http://www.w3.org/2007/05/30-wsc-minutes.html#action08
31. http://www.indiana.edu/~phishing/social-network-experiment/phishing-preprint.pdf
32. http://www.w3.org/2007/05/30-wsc-minutes.html#action09
33. http://www2006.org/programme/item.php?id=3533
34. http://www.w3.org/2007/05/30-wsc-minutes.html#action10
35. http://www.w3.org/2006/WSC/wiki/StatusQuoUserStudyResults
36. http://www.w3.org/2007/05/30-wsc-minutes.html#action12
37. http://www.w3.org/2007/05/30-wsc-minutes.html#action13
38. http://www.w3.org/2007/05/30-wsc-minutes.html#action14
39. http://www.w3.org/2007/05/30-wsc-minutes.html#action15
40. http://www.boxtyhouse.ie/
41. http://www.w3.org/2005/10/Process-20051014/process.html
42. http://www.w3.org/QA/WG/2005/01/test-faq
43. http://www.w3.org/2006/WSC/Group/track/actions/179
44. http://www.w3.org/2006/WSC/wiki/InScopebyCategory
45. http://www.w3.org/2006/WSC/drafts/note/#filters
46. http://www.w3.org/2007/05/30-wsc-minutes.html#action16
47. http://developer.mozilla.org/en/docs/XUL:notificationbox
48. http://www.boxtyhouse.ie/
49. http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm
50. http://dev.w3.org/cvsweb/2002/scribe/
Received on Sunday, 17 June 2007 22:11:51 UTC