Re: Combining the two.... Re: Do we like this better? - was way to move forward with plain language from lisa.seeman on 2017-05-23 (public-cognitive-a11y-tf@w3.org from May 2017)

From: lisa.seeman <lisa.seeman@zoho.com>
Date: Tue, 23 May 2017 18:09:46 +0300
To: John Foliot <john.foliot@deque.com>
Cc: "Detlev Fischer" <detlev.fischer@testkreis.de>, "WCAG" <w3c-wai-gl@w3.org>, "public-cognitive-a11y-tf" <public-cognitive-a11y-tf@w3.org>
Message-Id: <15c35d9821b.e17219bf240014.4829885091031532763@zoho.com>

#1: "Providing":
In the GitHub issue we have techniques that describe how. One way is simply to use it as the text. Another is to use the coga-easylang attribute.

#2: The List

In this proposal, this is a word frequency list. That is a defined term. There are examples of word frequency lists in the GitHub issue, and EA sent some core vocabularies. It can be any of them. There are also more examples in the content.

b.) Format: any format that is accessible including PDF

#3: Conformance Statement (?): you don't need to do this for WCAG, but if you need to prove accessibility conformance for your local law it is a good idea.

Will answer the rest when I have more time.

All the best

Lisa Seeman


---- On Tue, 23 May 2017 17:43:42 +0300 John Foliot <john.foliot@deque.com> wrote ----

Summary: There are a number of unanswered questions here that I have tried to re-state in a logical and clear fashion. Apologies if it seems like I am a broken record, but I have not heard answers that satisfy the questions.

*************

Hi Lisa,

I am concerned that this SC is being turned inside out to try and get it included into 2.1, yet the *reason* for the SC is being lost (if, in fact, the reason can be captured). From your Apr. 4th posting (which appears to be the most recent draft language - maybe?):

Common words: Use the most common 1500 words (including word roots in agglutinative languages) or phrases or, provide words, phrases or abbreviations that are the most-common form to refer to the concept in a public word frequency list for the context.

alternatively, the GitHub page links to the following and says:

Common words

Provide words or phrases from a public core vocabulary; or the most common 1500 words or phrases (including word roots); or word, phrases or abbreviations that are the most-common form to refer to the concept in a public word frequency list for the identified context.

In either instance, I am continuing to have issues. I have tried to isolate my key concerns below:

#1: "Providing":

The draft Success Criterion currently mandates that the content creator "provide" (words, phrases or abbreviations...), yet it is unclear *HOW* to do so in a measurable and testable fashion. We agreed that SC must be testable through automated or manual processes, but until such time as we define the "how" bit here, I don't know how to test whether or not the "list" has been provided. As well, the draft SC is stating that the list be "publicly" available - what is the impact on content behind firewalls, etc. if the page is restricted (and not "public")?

Furthermore, I'm struggling with this because early on we stated as a group that new SC "...describe the specific condition required to meet the criteria, not the method to address the criteria" - yet it seems that "...providing a (public) word list..." is *A* method (and then I ask, is it the *only* method? I don't know.)

#2: The List:

The "list" needs to be clearly and completely defined (content AND format), as it appears to be the mandated deliverable behind this SC.

a.) Content:

So, what kind of "list" is required? Leaving it undefined (or leaving it for the non-normative Techniques section) is introducing a significant amount of confusion.
Is it simply a word "frequency" list (this word appears twice, this other word appears 5 times, etc.)?

Is it a "glossary" list (first term: definition, second term: definition, etc.)?

Is it a "thesaurus" list (first term: alternative words, second term: alternative words, etc.)?
Is it another type of list ("the following 500 terms are the most frequently used terms on this site")?
Does the list need to be presented in alphabetical order? Most frequently used term to least used? Some other ranking format? Something else entirely?

Defining the requirement for "LIST" in terms of what and where cannot be left outside of the SC, as otherwise I could state that I provided a list to a public mailing list, and then what?...

b.) Format:

In one of the comments you posted in GitHub, you referenced: http://www.minspeak.com/CoreVocabulary.php#.WQ8EzuUrI2w, which I visited to investigate further.

That URL provided me a web page with a list of reference lists, including Balandin list of 347 core words used by adults - which turned out to be a link to a PDF document. Is PDF an acceptable format for a "vocabulary list"?

Does this satisfy the driver behind this proposed SC? How? In other words, armed with this PDF list of 347 core words, what does this do to benefit the end user? How do I, as a content creator, use this list to benefit a user?

#3: Conformance Statement (?):

You now reference a "conformance statement" in your email, yet while there are instructions on what and how a conformance statement should be addressed in WCAG 2.0 today, there is no requirement (and never has been) from WCAG 2.0 to actually furnish or provide such a conformance statement - whenever such a statement is mandated, it is done so by governments as part of their legislation around digital accessibility.

"Conformance claims are not required. Authors can conform to WCAG 2.0 without making a claim."
(source: https://www.w3.org/TR/WCAG20/#conformance-claims)

#4: Using the list:

Assuming that we more tightly define "LIST" (to mean, for example, a comma-separated list of terms and definitions), then what? That's a serious question.

If I provide a list of the common terms used on my website in the prescribed format, then what happens? The *purpose* of creating the list is now being lost (if in fact it was ever clearly identified, outside of stating that it benefits people with reading issues). This draft SC very much feels like it is a set-up to a second part of the delivery (which is)...

#5: Future Technologies:

"...this can be done via added coga semantics and personalization..."

Lisa, I have previously stated my support for the work on the Coga Semantics Recommendation, a TF delivery of two Working Groups I am a member of (ARIA WG and APA WG), and so while not directly involved in the effort, I am aware of the activity at a level beyond that of most content creators today.

The problem is, this is still a "future technology" - we cannot be scoping our SC so narrowly that the only way to achieve success is to use an emergent technology that is untried at scale, not yet formally accepted at the W3C, and one where there currently are no tools *at scale* to allow for this to be done.

We have previously stated that Success Criteria must:

"...have Techniques which demonstrate that each Success Criterion is implementable, using readily-available formats, user agents, and assistive technologies."

Due to the current status of the draft Coga Semantics proposal, it does not meet the "readily-available" requirement we've insisted upon previously, and so while this could be a potential technique going forward, it is not one today, and despite desires and assurances, we cannot categorically state that Coga Semantics will be a 'thing' by this time next year, never mind one that is readily available. I am optimistic for Coga Semantics, but it is still too early.

Lisa, you have previously suggested that this is one of the more important draft SC coming forward from the COGA TF, and I accept that statement as being true. If we are to ensure that we successfully meet the need that we are attempting to meet here, I will suggest that we need a lot more answers than we have at this time. As presented today however, I cannot support this draft SC because of the ambiguities I have outlined above.

Respectfully,

On Tue, May 23, 2017 at 6:11 AM, lisa.seeman <lisa.seeman@zoho.com> wrote:
That is why we had the word frequency list before. That was 100 percent testable and we had free tools already available.

Try 3... How about....

Provide words, phrases or abbreviations that are the most-common form to refer to the concept in a public word frequency list for the identified context.

Notes
- The SC is technology agnostic, so the "how" and "what format" etc. should not be discussed until we get to techniques, although clearly it needs to be accessible. An accessibility conformance statement would say what list was used.

- We have open-source scripts for building word frequency lists (see the comments in the GitHub issue). A script for testing words against a word list exists in other places (like the translation industry).

- We are not limiting the size of the word frequency list, so it can be as big as is needed.

- Also note this can be done via added coga semantics and personalization.

- The public word frequency list and identified context are defined terms; we can improve them if we feel the need, but let us first decide if this is the direction before zooming in on that.
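
The scripts referred to in the notes above are not reproduced in this thread. As a rough illustration only, building a word frequency list from a corpus and testing content against it might look something like the following Python sketch. The function names, the sample corpus and the English-only tokenizer are assumptions made for this example; they are not taken from the GitHub issue, and the sketch does not handle word roots, phrases or abbreviations.

```python
import re
from collections import Counter

# Assumption: a crude English-only tokenizer (lowercase letters and apostrophes).
TOKEN = re.compile(r"[a-z']+")

def build_frequency_list(corpus_text, size=1500):
    """Return up to `size` of the most common words in the corpus, most frequent first."""
    words = TOKEN.findall(corpus_text.lower())
    return [word for word, _count in Counter(words).most_common(size)]

def uncommon_words(content, frequency_list):
    """Return words in `content` that are not on the list, in order of first appearance."""
    allowed = set(frequency_list)
    flagged = []
    for word in TOKEN.findall(content.lower()):
        if word not in allowed and word not in flagged:
            flagged.append(word)
    return flagged

# Hypothetical corpus standing in for text from the "identified context".
corpus = "the cat sat on the mat and the dog sat on the rug"
freq = build_frequency_list(corpus)
flagged = uncommon_words("The cat sat on the ottoman", freq)
print(flagged)
```

Here `flagged` holds the words a tester would need to review or replace. Which corpus counts as the "identified context" is left open, as it is in the thread.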

Perhaps we could also change the scope to critical features as identified in issue 6

All the best

Lisa Seeman


---- On Mon, 22 May 2017 17:22:27 +0300 Detlev Fischer <detlev.fischer@testkreis.de> wrote ----

lisa.seeman wrote on 22.05.2017 15:55:

> It looks like we are more comfortable with this direction - but we would need some testing tools before CR
> So far as I know the IBM tool is not free, and the Microsoft tool requires a subscription.
> A way to move forward is put it in the next version of WCAG 2.1 and reach out to the companies for a free version of the tool.

In my view, any automatic tool checking the commonality of words by applying some generic algorithm will be bound to produce incorrect results in all cases where you have a site covering a specific domain with specific terms (i.e., very often). Synonyms where you can replace one term with another without also introducing a shift of meaning are the exception, not the rule. Then you have the homonym problem (same term meaning different things in different contexts / domains). A tool that offers a meaningful analysis would have to be capable of inferring the respective domain and its vocabulary and adapting its algorithm accordingly.
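
The domain-vocabulary concern above can be illustrated with a small Python sketch. Both word lists and the sample sentence are invented for this example; they are not drawn from any real frequency list or from the tools mentioned in the thread.

```python
import re

def flag_words(content, word_list):
    """Return the sorted set of words in `content` that are not on `word_list`."""
    allowed = {word.lower() for word in word_list}
    tokens = re.findall(r"[a-z]+", content.lower())
    return sorted({token for token in tokens if token not in allowed})

# Hypothetical generic list and a hypothetical list for a cardiology site.
generic_list = ["the", "a", "to", "your", "is", "in", "take", "add"]
cardiology_list = generic_list + ["stent", "artery"]

sentence = "The stent is in your artery"
generic_flags = flag_words(sentence, generic_list)
domain_flags = flag_words(sentence, cardiology_list)
print(generic_flags, domain_flags)
```

Under these assumptions the generic list flags the domain terms while the domain list accepts them, so the result depends entirely on which list is chosen. A plain membership check like this one also cannot tell homonyms apart, since it looks only at the spelling of a word and not at its sense.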

--
John Foliot

Principal Accessibility Strategist

Deque Systems Inc.

john.foliot@deque.com

Advancing the mission of digital accessibility and inclusion

Received on Tuesday, 23 May 2017 15:10:26 UTC