implementors first impression notes on the spec from Peter Williams on 2011-11-17 (public-xg-webid@w3.org from November 2011)

From: Peter Williams <home_pw@msn.com>
Date: Wed, 16 Nov 2011 22:34:09 -0800
To: "public-xg-webid@w3.org" <public-xg-webid@w3.org>
Message-ID: <SNT143-W13EF0C01C11EB2796F56FB92C70@phx.gbl>

henry asked me to review the spec, by cmpalining that I have not done so for some considerable while. Find below an unfiltered review of the official "W3C Editor's Draft of Webld 1.0", downloaded from http://www.w3.org/2005/Incubator/webid/spec/. I note that it has the formal status now, within W3C process, of "W3C Editor's Draft". With such labelling, Im expecting higher quality in content, finishing and use of language. Some of the comments are quite critical. Do not take the critical tone too literally: as I have left them unfiltered, so one see an experts mind, at work. I am very expert in my security fields - as redcently indpendently tested, certified and re-examined (but whose topic list exludes cryptography, where I just have fun as a total amateur looking at ancient cryptosystems). Remember that all feedback was welcomed. Dont get too defensive, and dont email a response after drinking. I want webid to work and to take off (since every other use of client certs has failed, nad its great that W3C is having a go). On the evidence presented, the method and the spec itself are both quite some ways off readiness for wider adoption or wider review, even. This is cleary an incubator-grade deliverable (not that thre is anyting appropriate about that status). This is long and detailed, so print it out. If you dont agree, don't bother arguing with me. Just ignore the line. its just peer-peer feedback, issued in a friendly forum. You are getting an implementors view. Peter. ------------ "This URI should be dereference-able" - refering to the URI in the name field of a self-signed cert. To this implementor this phrase means IETF SHOULD, in the absence of any other definition. Perfectly good reasons exist therefore not to make that assumption, that is, and "complete" code should not assume that all certs have dereference-able URIs. If some thing else is meant, it should be stated. "Using a process made popular by OpenID, we show how one can tie a User Agent to a URI by proving that one has write access to the URI" I recommend this be changed to OpenID v1.0. The set of ideas tpo which the text is referring were found to be almost 100% absent in my own, recent production adoption - at national scale - of 2 infamous openID providers (using v2 and later vintages of the OpenID Auth protocol). Neither IDP showcased the URI heritage of the openID movement. Both showcased email identifiers, to be perfectly honest. At W3C level, we should be accurate, and tie the reference of OpenID 1.0. Concerning openid and webid, there the common history over URIs ends. "TODO: cover the case where there are more than one URI entry" is a topic is still pending, 12 months in. I'd expect this defined by now (since its a 3 year old topic). I dont know what to do as an implementor. Im going to take the text literally. That is: it is not an exception to encounter multiple URIs. I will take one at random from the set, since there is no further requirement. It is not stated what to do in case no URIs are present. I will raise a 500 HTTP code, in such case; since no specification is made. (NB text later seems to fill in some details, here, and a ref needs to be made to it). "A URI specified via the Subject Alternative Name extension" - Cert extensions dont specify. Standards groups specify. Choose a better verb. Im sure W3E Editor language is standardized on this kind of topic. The terminology section is sloppy. It uses identification credentials, defines identification certificates, and refers to webid credentials. These are all critical terms, and the language model seems adhoc, at best. "A widely distributed cryptographic key that can be used to verify digital signatures and encrypt data". This is sloppy security speak. A Public Key is very rarely used to encrypt data. A professional would say it encrypts key (key being very specifically typed/distinguished from untyped data). A competent review by W3C of commodity crypto in the web will show that there is almost zero use of public key encryption of data. Keeping the text as is will setup off alarm bells in crypto policy enforcement circles. "A structured document that contains identification credentials for the Identification Agent expressed" - the term "for" came across where I'd expect "of". As an implementor, Im taking the rest of the definition of WebID Profile to mean that its fine for a Validation Agent to just consume only 1 and 1 only, say the XML serialization of RDF. This means that interworking with my implementation will properly fail should the user not have XML form of the RDF identification credentials. And, to this implementor, this is as intended by the design, as stated. The implemention will be conforming, in this failure case. I feel it is inappropriate to be reading such subtle implementor-centric information in a definition, note. Move this to normative text paragraphs. One should not be reading between the lines on such critical matters. "Public Key" is missing capitalization. "The user agent will create a Identification Certificate with aSubject Alternative Name URI entry." Most User Agents do not "create" certificates. They create certificate signing requests, in a form that is not X.509 standardized, for most cases. Mixing "Create" with certificate in the context of user agent is sending confusing signals to implementors. User Agent should be capitalized. The number of SANs should be more clearly defined, at creation time. The text here implies there is only one. "This URI must be one..." contradicts earlier text, that says that a URI SHOULD (be de-referencable). Decide between MUST and SHOULD, and capitalize terms properly per IETF guidelines. Remove the sexist "he" from a W3C document. There is more appropriate, standard W3C phraseology. Remove the commentary sections, sometimes in yellow: this or that is under debate. As an implementor, it confuses me. I dont know what to do, and wonder about doing nothing till the debate settles down - since it was evidently SO vital to point it all out. Not good style for an Editors draft. The example certificate is strange. It has two critical extensions, that Verification Agents MUST enforce. One says its not a CA. the other says the user of the associated private key has a certiicate signing privilege. One who signs certificates is a CA, by definition. A not particularly literal reading of the cert handling standards would require a validation agent to reject that cert, since (a) it MUST enforce the extensions, since they are CRITICAL, and (b) there is a contradiction in what they say, technically. This is just confusing and unecessary. No reference is made to IETF PKIX, and its not clear what cert-related conformance standard MUST be applied, by webid protocol implementations, if any. As an implementor of a Validation Agent, do I or do I not enforce "O=FOAF+SSL, OU=The Community of Self Signers, CN=Not a Certification Authority" in the issuer field? All I have is an open question, and no stricture. Decide one way or the other. If its mandatory, specify the DN properly. As an implementor I dont know if the order or the DN components is left to right , or otherwise. I should not have to read or use openssl code to find out or use to see if my software is conforming. "The above certificate is no longer valid, as I took an valid certificate and change the time and WebID. As a result the Signatiure is now false. A completely valid certificate should be generated to avoid nit-pickers picking nits" Can we fix this, and remove stuff that looks like the spec is half baked? its giving a false impression of the quality level of the group (and W3C, I might add). It looks like there is no quality control, peer review, etc. After 12 months of editing, I expect better. Ok Im going to relax a bit, as its increasingly evident that the spec writers are not all native English speaker, and some of them are struggling to write formal, technical English. It needs some comprehensive editing by one used to very formal, technical writing. "The document can publish many more relations than are of interest to the WebID protocol,"... No it cannot, for the purposes of THIS spec, and this WebID protocol. Protocols are hard code, they dont have "interests" in perhaps doing something. Either I code it or I dont. Define what is out of scope, and perhaps imply that non-normative protocol "extensions" might use such information. However, these are not in scope of this document. "The WebID provider must publish the graph of relations in one of the well known formats,". No. First, WebID "provider" is undefined. The entity publising a WebID Profile ought to be defined, and that entities term be used consistently. Second, the publisher MUST produce XML and/or RDFa, and possibly others. Failure to produce one of RDF/XML or RDFa is non-conforming, according to earlier mandates. 2.3.1 goes on about Turtle representation. Remove it completely, since its not even one of the mandatory cases defined as such in the spec. 2.3.2 is incomplete, missing the doctype. I dont know whether a mislabelled doctype with apparent RDFa is conforming or not. This omission is not helping implementors. I dont know the if example HTML document elements in 2.3.2 verifies. My own efforts to use something similar from the mailing list, made MIcrosoft's visual studio's verification system for HTML complain, bitterly. But then, I was confused, not knowing which doc type and which validation procedure to follow. This seems inappropriate position for W3C to take. "If a WebID provider would rather prefer not to mark up his data in RDFa, but just provide a human readable format for users and have the RDF graph appear" phrasing makes it sound like RDFa is not intended to be machine-readable. This is probably unintended mis-direction. Step 5 in the picture of 3.1 makes it appear that WebID Profile document SHOULD be published on https endpoints. State clearly what the conformance rules are. being over-pedantic (like a code writer), the request in step 5 is issued by the protected resource, but the response to 5 is handled by the webserver hosting the page, in the pseudo-UMLish diagram notation. This is a signal to me. I dont know how the webserver is supposed to tie up the request to the response, since it was not the initiator. Label the diagram as NON_NORMATIVE. Step 3 is missing information - that an web request is issued, over https. Distinguish the SSL handshake (signaling for and delivering the cert) from the SSL channel over which an HTTP request MUST be sent, if I interpret 8 correctly. Again, its not clear if only information retrieval requests, post handshake MUST use https, or not, the same ciphersuite MUST be used as negotiated by the the cert-bearing handshake, or not, whether one can rehandshake or not... While a summary diagram cannot express all those things, I am confused about what is mandatory owing to the omission of a competent step 3. Steps 6 and 7 are poorly done. They imply, from the activity diagram that the protected resource will perform the identity and authorization queries (and not the webserver). Thats what it says (to an implementor). Remove the phrase "the global identity". This is far too grandoise. "The Identification Agent attempts to access a resource using HTTP over TLS [HTTP-TLS] via the Verification Agent." needs rewriting. It comes across as saying (with the via cosntruct) that the Validation agent is a web proxy to the Identification Agent (which is the UA, typically). Note typically, in windows, the kernel intercepts the TLS invocation, performs the SSL handshake, and does NOT know that a given resource is being requested, of a webserver. There are strong implementation biases in this formulation that are improper, for a W3C spec. Generalize. Im almost 18 years out of date on SSL spec, but I dont recall something called the " TLS client-certificate retrieval protocol." The actual protocols of the SSL state machine are well defined in the SSL3 and followup IETF specs. They have proper names. Be precise. Editor to help those for whom English is second language. Editor needs to do fact checking, and quality control. step 2 of 3.1 is confusing. If makes it sound like if an unsolicited certificate message is sent voluntarily by client, before a certificate is sought by the server, it MUST be rejected by the server. Is this what is meant? Must any unsolicited presentation be rejected? If so what MUST happen? is there an SSL close? If a TLS session is resumed, and a request received inducing the TLS entity to commence a complete SSL handshake and requesting a certificate, is it a WebID Protocol violation to use the client certificate from the earlier completed handshake? (this case is common.) MUST certificates be provided by the current handshake? If two handshakes on 1 TCP connection present diferent client certificates, what happens? The phrase "claimed WebID URIs." should be removed, unless some doctrine about "claimed'ness is exaplained". 4 says "must attempt to verify thepublic key information associated with at least one of the claimedWebID URIs. " it doesnt say HOW to verify the public key "information" (which I assume to mean "Public Key"). Is this an implemention specific method? It doest make much sense, in technical English, to say the public key ...associated with at least one of... How do I tell that this one is, and that one is not? I think the text is trying to say: pick one by one of the URIs from the cert, and see if they verify against the graph (not that we have pulled it yet, in the step sequence). one must verify. Much tighter grade specification language is required, focussed on implementors writing state machines. step 5 is clear, but makes wholly improper use of MUST. remove common fallacy "verifies that theIdentification Agent owns the private key". There is no method know to do this (establish ownership). One can establish "control over." by the method of proving posession of. Change" TLS mutual-authentication between the Verification Agent and theIdentification Agent." to TLS mutual-authentication between the Verification Agent and the User Agent" Identificiation Agents cannot perform TLS, let alone performs a mutually authentication service, let alone authenticate a server or a "Validation Agent". Browsers (ie. User Agents) can. Recall how the two were distinguished in the definition. "If the Verification Agent does not have access to the TLS layer, a digital signature challenge must be provided by theVerification Agent. " This is MUST, and is obviously critical. While there is a forward reference, I dont have a clue what this referring to. This is VERY WORRYING. I also dont know what "have access to the TLS layer" means. Does receiving client certs and information from the SSL server side socket constitute "access to". Very vague. Needs work. Never found the refernece, VERY CONFUSING INDEED. I probably would not hire a person saying this to me... note.; and would quickly wrap up the interview. The next section is titled "Authentication Sequence Details", but was referenced as "Additional algorithms" earlier. Dont misuse the term algorithm, or qualify with "additional". From the title, its just more detailed steps in a sequence description. 3.2.1 and 3.2.2 are blank, and are implementation critical. Their absence makes the spec almost unimplementable. its only by luck that the current 15 implementations do the same thing, here, if that is indeed the case. 3.2.3 A Verification Agent must be able to process documents " has an inappropriate use of MUST. one never uses MUST in a MUST "be able". Learn to use SHOULD, MUST properly. This is elementary spec writing. On the topic, the statement contradicts earlier mandates concerning either/or requirements for validation agents handling RDFa, and/or RDF/XML. Previously, a Validation Agent was NOT required to always "be able to" process RDFa (for example my implementation will not). The text makes it sound that a Validation Agent that fails in its HTTP request to advertise the willingness to accept text/HTML would be non-conforming, for example. On formalist grounds "Verifying the WebID is identified by that public key" does a public key even identify a WebID. a WebID is not a defined term. Only an WebID URI is defined. Since there are multiple public keys that associated with a WebID URI, what kind of multi-identification relation is being mandated here, to be then verified? I dont know. VERY VERY WORRYING is the following : "against the one provided by the WebID Profile or another trusted source, ". This text suggests that a Validation Agent implementation may opt out of using a WebID Profile document,as dereferenced using a Claimed URI, and use some other source instead. To be literal, the MUST steps are to be performed, but then an implementation may ignore the result and just use something else. Kind of makes me want to research additional options... if true. "9D ☮ 79 ☮ BF ☮ E2 ☮ F4 ☮ 98 ☮..." has strange symbols (in my rendering). I dont know if to take it literally, or not, speaking as an implementor. Publish in PDF, if the web cannot render literals. "3.2.6 Secure CommunicationThis section will explain how an Identification Agent and a Verification Agent may communicate securely using a set of verified identification credentials.If the Verification Agent has verified that theWebID Profile is owned by the Identification Agent, the Verification Agent should use the verifiedpublic key contained in the Identification Certificatefor all TLS-based communication with the Identification Agent." it "will" or it "does". First, one doesnt' "communicate security" using credentials, verified or otherwise. I have absolutely no idea what Im supposed to do for the folowing SHOULD. What is the scope of this SHOULD? The current web session? If the cert expires after 30m of activity on the wire, what am I supposed to do? keep using it? close the SSL session? we must use MUST SHOULD properly, and not make vacuous conformance statements. 3.3.1 goes on about topics not within the scope of the webid protocol (e.g. accounts, and foaf attributes). Remove, or label as non-normative. "The following propertiesshould be used when conveying cryptographic information inWebID Profile documents:"... Surely MUST. Surely, if someone choose the path of least resistance in a SHOULD clause, there will be NO interworking?

Received on Thursday, 17 November 2011 06:34:43 UTC