- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Thu, 18 Dec 2003 16:38:06 +0100
- To: <www-webont-wg@w3.org>
Evan and I had the relatively narrow task of looking at the QA-OPS guidelines ... I feel a desire to comment on the QA framework as a whole, and intend to do so as a personal comment to be sent after the WebOnt comment (assuming we agree to something like what we have). I thought it might be helpful to let WebOnt see my current thoughts ... we could, for example, decide that some points were so important that they needed to be made by the WG - however I think the review prepared by Evan and myself is sufficient.

(Unlikely ... I really am much too extreme in my opposition to this stuff - I see that a repeating theme is that they commit to AAA quality in their charter but seem remarkably short on fulfilment - and I get quite angry about this armchair quality work, which is itself too mediocre.)

I guess I will need to make it very clear that this is a personal comment, and not on behalf of anyone else (e.g. HP or a WG).

===

This is a review of the following documents:

http://www.w3.org/TR/2003/NOTE-qaframe-intro-20030912/ (WG Note)
http://www.w3.org/TR/2003/CR-qaframe-ops-20030922/ (CR)
http://www.w3.org/TR/2003/CR-qaframe-spec-20031110/ (CR)
http://www.w3.org/QA/WG/2003/10/TestGL-20031020.html (editors' draft)

also mentioning

http://www.w3.org/TR/2003/CR-qaframe-spec-20031110/qaframe-spec-ta
http://www.w3.org/QA/WG/2003/09/qaframe-spec-extech-20030912

I note that I have done the most work on the Test editors' draft - which may have been inappropriate timing, but I hope it helps anyway.

===

Comment 1: editorial - ToC
==========================

http://www.w3.org/TR/2003/NOTE-qaframe-intro-20030912/ does not have a ToC - please fix.

Comment 2: substantive - Goals
==============================

Goals etc.: http://www.w3.org/TR/2003/NOTE-qaframe-intro-20030912/ section 1.1

I believe the goal should be better quality recommendations. I believe test suites may contribute to this, but in terms of scoping the QA work, and in terms of setting its goals, this should be linked to the output of W3C, which is a set of documents. Thus test suites are only useful in as much as they help provide better quality recommendations.

This of course raises the question of what the quality metrics for a recommendation are - suggestions:

- precision
- distinction between normative and informative force
- consistency, with self, with other W3C publications
- implementability
- readability (i.e. a precise document such as OWL S&AS by itself fails, but in combination with other material may meet this goal)

The problem with setting conformance tests as the goal is that many WG members will not be committed to this goal.

In more detail:

[[
For those undertaking quality assurance projects, the QA Framework should give:

+ a step-by-step guide to QA process and operational setup, as well as to development;
+ and, access to resources that the QA Activity makes available to assist the WGs in all aspects of their quality practices.
]]

** Neither of these points says very much, since they depend on the definition of QA, which most readers will not share; suggest delete. **

[[
More specific goals include:

+ to encourage the employment of quality practices in the development of the specifications from their inception through their deployment;
]]

** Again, without a shared understanding of quality this statement is vacuous; no-one is opposed to 'quality', but some will be opposed to the QA WG's conceptualization of quality. **

[[
+ to encourage development and use of a common, shared set of tools and methods for building test materials; to foster a common look-and-feel for test suites, validation tools, harnesses, and results reporting.
]]

** This is the first point which it is possible to disagree with, and hence is the first substantive statement of goals - and it is inappropriate - the goal must be higher level than this. **

A problem with having conformance tests as a goal is that it is unrealistic to expect the whole of the WG to buy into it, whereas (nearly) all (active) WG members will accept that the quality of the documents being produced is a reasonable goal for the WG. Quality is not the responsibility of a specialist subgroup but a shared responsibility of the whole WG - obviously different members of the WG will have different beliefs and opinions as to the value of testing, and will only really support test work once it has begun to show real benefit on more general measures.

Comment 3 - AAA, eh?
====================

The QA WG is committed in its charter to AAA quality on all the metrics. It does not appear in your CR that you achieve this, and this alone is reason for the QA CR not to go forward. Examples where you fail on your own metrics follow.

Comment 4 - Synchronize
=======================

http://www.w3.org/TR/2003/CR-qaframe-ops-20030922/guidelines-chapter#Gd-sync-spec-TM-devt

[[
Checkpoint 3.1. Synchronize the publication of QA deliverables and the specification's drafts. [Priority 2]
]]

While I support the WebOnt WG's more general comment about your checkpoints being too strong, I also suggest that this one is too weak. It is hard to review specifications that are released in pieces, with references from one part at one level of completeness to another part at another level. I find your documents a very good example of how not to release LCs and CRs. It is too difficult for the reader to make any sort of consistent cut through what is in reality a single logical publication, but which has pieces at very different levels.

I also find that WG Notes for the informative sections work less well than using recommendations for the informative sections. Partly this is because you seem to have allowed yourself lower editorial standards in the Notes (e.g. the missing ToC), partly because a WG Note is not necessarily a consensus document, and partly because the lack of LC, CR and PR checkpoints for a Note makes it difficult for the reader to understand what sort of review is appropriate.

I find no evidence that you fulfilled this checkpoint in your own publications.
In particular, I have not found test material for QAF-OPS, and the test material I have found for QAF-SPEC is too incomplete for CR (the main content seems to be http://www.w3.org/TR/2003/CR-qaframe-spec-20031110/qaframe-spec-ta, which is test assertions rather than tests).

So I suggest that this checkpoint be reworked as:

1) During the latter review stages of the recommendation track (LC, CR, PR, Rec) it is important for WGs to appreciate the difficulty that non-synchronous publication of all relevant material causes to reviewers.
2) Informative content that the WG believes many readers will need in order to fully understand the normative content of a recommendation should be in recommendation track documents.
3) WG Notes should be used for additional optional material.
4) Test material should be published at the same time as the main documents.
5) When synchronized publication is not possible, then the earlier publications should indicate the intended date of publication of the later documents. The review period should extend for an appropriate period after the publication of the last documents published. In particular, no one part of a package of related documents can move more than half a step ahead of the other parts.

Comment 5 - appropriate linking to tests
========================================

I have no idea how to tell whether or not there are tests for your CR documents. I have to resort to Google.

RDF Core and WebOnt both decided to publish their tests as a rec track doc. It would be interesting to see the QA group's view of this. Here are some advantages of that decision:

- the level of synchronization (or not) between test work and other WG work is made clear
- it is more obvious where to find the tests
- test publication is announced using standard procedures
- test work is recognised as an important part of WG activity, with public credit given to test editors (although some test contributors are undervalued)
- test work is preserved for posterity using W3C's preexisting publishing process

Here are some disadvantages:

- it is more difficult/impossible to add/modify tests after Rec
- it is not clear what the best way to incorporate tests in a document is:
  - RDF Core just use a zip, which ends up as the normative copy
  - WebOnt include tests inline, so the test document is enormous (XXL)

In any case the QA documents should suggest that rec track documents have clear and straightforward links to the relevant test suites.

Comment 6: "scope"
==================

This is a banality. http://www.w3.org/TR/2003/CR-qaframe-spec-20031110/qaframe-spec-ta Checkpoint 1.1 says:

[[
The first section of the specification contains a clause entitled "scope" AND enumerates the subject matter of the specification.
]]

a) conventionally, W3C specs upper-case the first letter of a section title.
b) the document http://www.w3.org/TR/2003/CR-qaframe-spec-20031110/ does not satisfy this (wording of the) checkpoint
c) the document http://www.w3.org/TR/2003/CR-qaframe-ops-20030922/ does not satisfy this (wording of the) checkpoint

The sloppiness of the wording is indicative of the lack of quality in the family of documents. Problems with the sentence include:

a) "first" is too strong (cf. "1.2. Scope and goals"); the scope should be stated in the introductory material.
b) "section" is too strong (cf. "1.2. Scope and goals") (well, when looking at the word "entitled" - clauses do not have titles; sections and subsections do)
c) "specification" is incorrect (cf.
"QA Framework: Operational Guidelines W3C Candidate **Recommendation**" ) d) "clause" is undefined and is not generally used in the discussion of W3C recommendations e) "entitled" can only plausibly apply to certain xhtml elements, it is not clear that these are what you have in mind. f) "scope" is too narrow - surely what matters here is the intent not the actual words used. g) "AND" emphasis unnecessary h) "enumerates" numbering the parts is for the ToC For example, I find that http://www.w3.org/TR/2003/PR-owl-semantics-20031215/#1 adequately quickly and concisely describes what the document is about and why I might or might not read it, and what I might look at instead. Any reworking of this checkpoint should be liberal enough to permit the OWL Semantics PR document to pass it, since that document is of adequate quality on this metric. I am unconvinced that this family of documents have had adequate review by the QA WG, and the QA IG. I suggest that you should set yourselves higher goals before seeking wider review again. Comment 7: throughout s/Specification/Recommendation/g ========== W3C publishes recs not specs. Comment 8: ========== "the Working Group MUST identify a person to manage the WG's quality practices. " I cannot tell whether the QAWG have fulfilled this requirement or not. I suggest that the discussion should suggest that the QA moderator be listed on the WG home page. (I bet you haven't - aren't I a a cynic?) I am unclear as to the value of this. The problem is to do with rewards, motivations and power. Rewards and Motivations ======================= If I get appointed as QA moderator I get a nice new entry on my CV, but what real interest do I have in ensuring the WG produces quality documents. The editors get their names on the docs, not me. I note that RDF Core appointed Brian McBride as Series Editor - he did a huge amount of work which largely was driven by quality goals such as consistency across the docs, consistency between the tests and the docs etc. and he is justly rewarded by a fairly big splash of his name on the W3C recs (hopefully). If I were a QA moderator - I am not getting paid for this job, W3C work is voluntary and has to compete with other tasks my boss might think of as worthwhile, what is my recompense? If the job is little work then this is not a problem but the checkpoint is unmotivated. (I note that WG chairs have the same problem - basically their self-interest is to get to rec with the minimal amount of effort) Power ===== Lets suppose I am a consciencious QA moderator and the WG repeatedly makes decisions that undermine quality. My only initial power is to (threaten to) resign. As such this checkpoint needs to be integrated into the process document and for it to be clear how things escalate after a QA moderator as taken that ultimate step. Obviously the WG can/should appoint someone else, and this must be done in a timely way. How does one avoid a stooge? How does one avoid a lip-service to QA? It seems to me that quality work must clearly justify itself. i.e. that the work done by the QA moderator should be of such obvious value to the WG that he/she will gain a respect within the group that enables a certain power within the group. If this is the case then the title is unnecessary. I don't think this checkpoint is though through and I think the QA work would do well enough without it. A Wg has document deliverables and test deliverables: the owners of these deliverables need to own the quality process for them. Comment 9: ??? 
This is very confusing ... is this still part of your framework?

http://www.w3.org/QA/WG/2003/02/OpsET-qapd-20030217

Is there a later version? Is it dead?

Problems:

1) some docs in WG space are part of the recommendation and not merely editors' drafts, e.g. http://www.w3.org/QA/WG/2003/09/qaframe-ops-extech-20030912
2) other documents in the WG space are defunct or irrelevant in one way or another.
3) specifically, http://www.w3.org/QA/WG/2003/02/OpsET-qapd-20030217 claims to be an appendix to http://www.w3.org/QA/WG/2003/02/qaframe-ops-extech-20030217 which claims to be an earlier version of http://www.w3.org/QA/WG/2003/09/qaframe-ops-extech-20030912
4) however, http://www.w3.org/QA/WG/2003/02/qaframe-ops-extech-20030217 does not list http://www.w3.org/QA/WG/2003/02/OpsET-qapd-20030217 in the ToC. Ah, there it is ... in a paragraph underneath the ToC - umm, a very "high quality" ToC.

Comment 10
==========

http://www.w3.org/QA/WG/2003/09/OpsET-qapd-20030912

http://www.google.com/custom?hl=en&lr=&ie=ISO-8859-1&cof=AWFID%3A0b9847e42caf283e%3BL%3Ahttp%3A%2F%2Fwww.w3.org%2FIcons%2Fw3c_home%3BLH%3A48%3BLW%3A72%3BBGC%3Awhite%3BT%3Ablack%3BLC%3A%23000099%3BVLC%3A%23660066%3BALC%3A%23ff3300%3BAH%3Aleft%3B&domains=www.w3.org&q=%22QA+Test+Material+Process+Document+for+QA%22&btnG=Google+Search&sitesearch=www.w3.org

no hits

http://www.google.com/custom?hl=en&lr=&ie=ISO-8859-1&cof=AWFID%3A0b9847e42caf283e%3BL%3Ahttp%3A%2F%2Fwww.w3.org%2FIcons%2Fw3c_home%3BLH%3A48%3BLW%3A72%3BBGC%3Awhite%3BT%3Ablack%3BLC%3A%23000099%3BVLC%3A%23660066%3BALC%3A%23ff3300%3BAH%3Aleft%3B&domains=www.w3.org&q=%22QA+Test+Material+Process+Document+for+Quality+Assurance%22&sitesearch=www.w3.org

no hits

http://www.google.com/custom?hl=en&lr=&ie=ISO-8859-1&cof=AWFID%3A0b9847e42caf283e%3BL%3Ahttp%3A%2F%2Fwww.w3.org%2FIcons%2Fw3c_home%3BLH%3A48%3BLW%3A72%3BBGC%3Awhite%3BT%3Ablack%3BLC%3A%23000099%3BVLC%3A%23660066%3BALC%3A%23ff3300%3BAH%3Aleft%3B&domains=www.w3.org&q=%22QA+Test+Material+Process+Document+for+QAWG%22&btnG=Google+Search&sitesearch=www.w3.org

no hits

Where is your QA Test Material Process Document, AAA quality WG? In a way this whole thing looks like a sick joke, in which you invent unnecessary work for others which you are not prepared to do yourselves. More politely: you moved to Last Call (let alone CR) prematurely.

Comment 11
==========

http://www.w3.org/QA/WG/2003/10/TestGL-20031020.html

[[
Guideline 1. Perform a functional analysis of the specification and determine the testing strategy to be used.

In order to determine the testing strategy or strategies to be used, a high-level analysis of the structure of the specification (the subject of the test suite) must be performed. The better the initial analysis, the clearer the testing strategy will be.
]]

(Umm, perhaps your guidelines should have a letter prefix so it is clear which document they come from, e.g. Guideline T.1.)

Neither WebOnt nor RDF Core did this. It is hard, since the main purpose of the tests for these WGs was to help in the development of a quality recommendation, and one cannot do a final functional analysis of the rec until it is basically finished - which would have overly committed us to a waterfall model of development. In fact, that motivation indicates that the second sentence quoted is too strong: there is no "must be performed" here; suggest "may be helpful".
Having said that, it is clear that the coverage of the tests in both the SW WGs is weaker than it would have been if we had followed this guideline at some point; this then comes back to issues to do with synchronization and timelines etc. In WebOnt I am reasonably sure that most of the untested bits are from that part of the rec that is fairly easy to implement. Thus, since we do not have a conformance test suite, the many OWL implementations that pass all the tests may nevertheless have a variety of trivial errors that prevent interoperability. I don't see that as the responsibility of the WG - conformance tests come later, and at that point (or in bug reports to software developers) it will become clear what trivial errors in software need fixing. Of course, in a very few cases these trivial errors may point to minor errors in the spec where there is insufficient clarity - but I believe that issue-driven test development has covered almost all of these areas adequately.

[[
Checkpoint 1.3. Analyze the structure of the specification, partition it as appropriate, and determine and document the testing approach to be used for each partition. [Priority 1]
]]

Suggestions:

a) weaken this to have "may" force rather than "must" force.
b) use RFC 2119 keywords.

Comment 12
==========

[[
Checkpoint 2.1. Identify and list testable assertions [Priority 2]

Conformance requirements: Test assertions within or derived from the specification must be identified and documented.

Checkpoint 2.2. Tag assertions with essential metadata [Priority 1]

Rationale: It must be possible to uniquely identify assertions, and to map them to a particular location, or to particular text, within the specification.
]]

Wildly oversimplistic. Even the simplest OWL test relies on many parts of the recommendation. The idea that it is possible to tie a test to one or two parts of the recommendation is philosophically flawed (similar to the concept of causation, cf. a huge body of literature). I do not believe this is uniquely a property of OWL. Obviously one tries to structure the tests in such a way that, assuming a system passes some set of easier tests, this new test presents an interesting challenge, but ...

Of course this also amounts to the issue that you lot seem to believe that it is possible to test for conformance, whereas that is trivially incorrect. (Given any set of conformance tests for any system, where each test is characterised as one or more inputs resulting in one or more outputs, the piece of software that is defined to precisely pass the test suite, by giving the determined output for the determined input, and otherwise to fail horribly, is a non-conformant piece of software that passes the conformance tests.)
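To make that parenthetical construction concrete, here is a minimal sketch in Python; the inputs and outputs are hypothetical, and nothing below is drawn from any real W3C test suite:

    # The degenerate implementation described above: it is defined purely
    # by the published test suite, so it passes every conformance test
    # while behaving as badly as you like on every other input.
    # The (input -> expected output) pairs are made up for illustration.

    CONFORMANCE_SUITE = {
        "test-input-001": "expected-output-001",
        "test-input-002": "expected-output-002",
        # ... one entry per published test; necessarily finitely many ...
    }

    def degenerate_implementation(test_input: str) -> str:
        """Pass every published test; fail horribly on anything else."""
        if test_input in CONFORMANCE_SUITE:
            return CONFORMANCE_SUITE[test_input]
        raise RuntimeError("deliberately horrible non-conformant behaviour")

By construction this scores 100% on the suite while implementing none of the recommendation; no finite suite can rule it out.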
Suggest drop these requirements, and the related ones in SpecGL. Possibly weaken to: "It may be helpful to list the test assertions found within or derived from a recommendation."

Comment 13:
===========

[[
(Test 3.2) When the Working Group requests test submissions, it must also request that the appropriate metadata be supplied.
]]

I found it easier to completely own the test metadata in WebOnt (well, me and Jos, the co-editor). Unfortunately the metadata quality is crucial, and is best ensured by having a fairly small number of people responsible - sure, it's a lot of work. The *must* is too strong; suggest *may*.

The list of test metadata omits "the type of the test" and "the files associated with the test".
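For illustration only, here is a sketch of what a per-test metadata record carrying those two omitted fields might look like. Every field name below is my own invention; none is drawn from TestGL or from the OWL Test Cases schema.

    # Hypothetical per-test metadata record; field names are illustrative only.
    from dataclasses import dataclass, field

    @dataclass
    class TestMetadata:
        test_id: str         # unique identifier for the test
        spec_sections: list  # section(s) of the rec the test bears on
        status: str          # e.g. "proposed", "approved", "obsolete"
        test_type: str       # "the type of the test" (omitted above)
        associated_files: list = field(default_factory=list)  # "the files associated with the test" (omitted above)
        description: str = ""  # human-readable summary

Whatever the exact fields, keeping records like this consistent across hundreds of tests is the "lot of work" mentioned above.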
Comment 14
==========

[[
Conformance requirement: The test materials management process must provide coverage data. At a minimum, the percentage of assertions for which at least one test-case exists should be calculated and published.
]]

Makework - this statistic is useless. Why the **** do you want to waste other people's time calculating it? Any test suite tests 0% of any plausible language worth specifying, because the language is infinite and the test suite is finite. Any other number is simply a fib.

Suggest drop this requirement and any related requirement.

Comment 15: issue tracking is not a test issue
==============================================

[[
Checkpoint 3.4 Provide an issue-tracking system [Priority 2]

Conformance requirements: The test materials management process must include an issue-tracking system.

Rationale: If a high-quality test suite is to developed it is important to methodically record and track problems and issues that arise during test development, testing, and use. For example, the test review process may generate a variety of issues (whether the test is necessary, appropriate, or correct), while after publication users of the test suite may assert that a particular test is incorrect. Such issues must be tracked, and their resolution recorded.
]]

This is of course a quality issue, but it has nothing to do with test - suggest move to the Operational Guidelines. Every WG should have a means of issue tracking.

Comment 16: way too strong a must
=================================

[[
Checkpoint 3.5 Automate the test materials management process [Priority 2]

Conformance requirements: The test materials management process must be automated.

Rationale: Automation of the test materials management process, perhaps by providing a web-based interface to a database backend, will simplify the process of organizing, selecting, and filtering test materials.
]]

The rationale is true but does not justify a must; the QA group could collect a set of tools that have been used to help automate test material management, and help try and spread best practice, but a *must* here is ridiculous. This really should not be a checkpoint.

I note that the QAWG commits to AAA test conformance; please describe your automatic system for test material management. (Since SpecGL and OpsGL are in CR and TestGL is not, I would be happy with an answer that restricted itself to those two documents.)

Comment 17: not the WG's responsibility
=======================================

[[
Checkpoint 4.2. Automate the test execution process [Priority 2]

Conformance requirements: Test execution should be automated in a cross-platform manner. The automation system must support running a subset of tests based on various selection criteria.

Rationale: If feasible, automating the test execution process is the best way to ensure that it is repeatable and deterministic, as required by Checkpoint 4.1. If the test execution process is automated, this should be done in a cross-platform manner, so that all implementers may take advantage of the automation.
]]

WebOnt made it clear to its implementors that we expected test results to have been collected in an automated fashion, but it is not possible for a WG to provide such an execution environment for every conceivable spec. Once again, noting the QAWG's AAA commitments in its charter, I hope you will demonstrate the sense of this checkpoint before any of your documents proceed further along the recommendation track. I guess you need to solve some natural language research problems first.
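For what it is worth, the subset-selection half of the checkpoint is the easy part. A minimal sketch in Python, with a made-up test representation (none of these names come from TestGL):

    # Hypothetical harness fragment showing "running a subset of tests based
    # on various selection criteria" (Checkpoint 4.2). The Test shape is
    # invented for illustration.
    from typing import Callable, Iterable, NamedTuple

    class Test(NamedTuple):
        test_id: str
        test_type: str           # e.g. "entailment", "consistency"
        approved: bool
        run: Callable[[], bool]  # returns True on pass

    def run_subset(tests: Iterable[Test],
                   *criteria: Callable[[Test], bool]) -> dict:
        """Run only the tests satisfying every selection criterion."""
        return {t.test_id: t.run()
                for t in tests
                if all(c(t) for c in criteria)}

    # e.g. run only approved entailment tests:
    # results = run_subset(all_tests, lambda t: t.approved,
    #                      lambda t: t.test_type == "entailment")

The hard part - a repeatable, cross-platform execution environment for every conceivable spec - is exactly what cannot be conjured this cheaply, which is why the *must* is misplaced.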
Comment 18:
===========

[[
TestGL Checkpoint 5.1 Review the test materials [Priority 1]

Conformance requirements: The test materials must be reviewed to ensure that they meet the submission requirements. The status of the review must be recorded in the test materials management system, as discussed in Checkpoint 3.2 above.
]]

You cannot have a Priority 1 checkpoint depending on a Priority 2 one; I think the "management system" is the problem - replace it with "metadata".

In WebOnt we automated this part - every time the OWL Test Cases document is produced, all the test material is verified to conform with the "stylistic" guidelines in OWL Test. Hence we meet the spirit of this without meeting the letter. Once again, your desire to have strong wording is inappropriate. Weaker wording that would be acceptable would be:

[[
TestGL Checkpoint 5.1 Review the test materials [Priority 1]

Conformance requirements: The test materials should be reviewed to ensure that they meet the submission requirements. The status of the review may be recorded in the test materials metadata, as discussed in Checkpoint 3.2 above.
]]

I note that one test which I accepted, http://www.w3.org/TR/2003/PR-owl-test-20031215/byFunction#imports-014, had as its whole point that it did not conform to the stylistic preferences (using a superfluous suffix on a URI), and that this presented problems which were not exercised by the other tests. So it is important that there is adequate discretion in the process to accept tests that do not meet the submission requirements.

Comment 19:
===========

Test 6.1:

[[
Discussion: It is not necessary for tests to automatically report status for this checkpoint to be met. It would be sufficient, in the case of manually executed tests, for the test execution procedure to unambiguously define how the person executing the tests should determine the test execution status.
]]

Tell me again about the QA WG's tests (for OpsGL and SpecGL) that permit unambiguous determination of the test execution status; I seem to have missed that part of your document set.

Comment 20:
===========

[[
Checkpoint 6.2 Tests should report diagnostic information [Priority 2]

Conformance requirements: When tests fail, they must provide diagnostic information to assist the implementer in determining the source of the problem.
]]

No!! It is a huge amount of work for the WG to provide the implementors, free of charge, with a test suite. No way are the implementors entitled to a test suite with diagnostics. The cost is huge - developers get paid; they should put some sweat in, too. I look forward to seeing the QAWG's diagnostics in the test suite for OpsGL and SpecGL.

This requirement is mad, and should go.

Jeremy