Re: Combining the two.... Re: Do we like this better? - was way to move forward with plain language from John Foliot on 2017-05-23 (w3c-wai-gl@w3.org from April to June 2017)

From: John Foliot <john.foliot@deque.com>
Date: Tue, 23 May 2017 09:43:42 -0500
To: "lisa.seeman" <lisa.seeman@zoho.com>
Cc: Detlev Fischer <detlev.fischer@testkreis.de>, WCAG <w3c-wai-gl@w3.org>, public-cognitive-a11y-tf <public-cognitive-a11y-tf@w3.org>
Message-ID: <CAKdCpxy4UzrZr0_ScjV+rCEvQTxcxZtki6S7mS-s61ROcpgy2w@mail.gmail.com>
Summary: There are a number of unanswered questions here that I have tried
to re-state in a logical and clear fashion. Apologies if it seems like I am
a broken record, but I have not heard answers that satisfy the questions.

*************

Hi Lisa,

I am concerned that this SC is being turned inside out to try and get it
included into 2.1, yet the *reason* for the SC is being lost (if, in fact,
the reason can be captured). From your Apr. 4th posting (which appears to
be the most recent draft language - maybe?):

Common words: Use the most common 1500 words (including word roots in
agglutinative languages) or phrases or, provide words, phrases or
abbreviations that are the most-common form to refer to the concept in a
public word frequency list for the context.

Alternatively, the GitHub page links to the following
<https://rawgit.com/w3c/wcag21/plain-language-minimum_ISSUE-30/guidelines/sc/21/plain-language-minimum.html>
and says:

Common words

Provide words or phrases from a public core vocabulary; or the most common
1500 words or phrases (including word roots); or word, phrases or
abbreviations that are the most-common form to refer to the concept in a
public word frequency list for the identified context.


In either instance, I am continuing to have issues. I have tried to isolate
my key concerns below:


*#1: "Providing":*

The draft Success Criterion currently mandates that the content creator
"provide" (words, phrase or abbreviations...), yet it is unclear *HOW* to
do so in a measurable and testable fashion. We agreed that SC must be *testable
through automated or manual processes*, but until such time as we define
the "how" bit here, I don't know how to test whether or not the "list" has
been provided. As well, the draft SC is stating that the list be "publicly"
available - what is the impact on content behind firewalls, etc. if the
page is restricted (and not "public")?

Furthermore, I'm struggling with this because early on we stated as a group
that new SC "...*describe the specific condition required to meet the
criteria, not the method to address the criteria*" - yet it seems that
"...providing a (public) word list..." is *A* method (and then I ask, is it
the *only* method? I don't know.)
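
To illustrate what "testable" would have to mean here, the following is a
minimal sketch of one possible reading of the "most common 1500 words"
clause. Everything in it is an assumption made for illustration and none of
it comes from the draft: the function name, the tiny stand-in word list, the
tokenization rule, and the idea that the list is ranked most common first.

```python
import re


def uncommon_words(text, frequency_ranked_words, limit=1500):
    """Return the words in `text` that are not among the first `limit`
    entries of a frequency-ranked word list (most common first).

    Hypothetical helper: the draft SC does not define the list format,
    the tokenization, or how word roots are handled.
    """
    allowed = {word.lower() for word in frequency_ranked_words[:limit]}
    # Naive tokenization: runs of ASCII letters and apostrophes only.
    tokens = re.findall(r"[a-z']+", text.lower())
    flagged = []
    for token in tokens:
        if token not in allowed and token not in flagged:
            flagged.append(token)
    return flagged


# Hypothetical, tiny stand-in for a public word frequency list.
sample_list = ["the", "to", "and", "your", "use", "form", "send", "pay", "bill"]

print(uncommon_words("Use the form to remit your bill", sample_list))
# prints ['remit']
```

Even this toy version has to pick a list, a ranking, and a tokenizer before
it can return an answer, and those are the pieces the draft leaves open.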


*#2: The List:*

The "list" needs to be clearly and completely defined (content AND format),
as it appears to be the mandated deliverable behind this SC.

*     a.) Content:*

So, what kind of "list" is required? Leaving it undefined (or leaving it
for the non-normative Techniques section) is introducing a significant
amount of confusion.

   - Is it simply a word "frequency" list (this word appears twice, this
   other word appears 5 times, etc.)?
   - Is it a "glossary" list (first term: definition, second term:
   definition, etc.)?
   - Is it a "thesaurus" list (first term: alternative words, second term:
   alternative words, etc.)?
   - Is it another type of list ("the following 500 terms are the most
   frequently used terms on this site")?
   - Does the list need to be presented in alphabetical order? Most
   frequently used term to least used? Some other ranking format? Something
   else entirely?

Defining the requirement for "LIST" in terms of what and where *cannot be
left outside of the SC*, as otherwise I could state that I provided a list
to a public mailing list, and then what?...
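
As a rough illustration of why the distinction matters, the candidate
readings of "list" above imply quite different data. The sketch below is
only an assumption about how each reading might look in practice; the sample
text, the definitions, the alternative words, and the variable names are all
invented for the example and are not taken from the draft SC.

```python
import re
from collections import Counter

# Hypothetical page content.
text = "Pay the bill. Pay the fee."
tokens = re.findall(r"[a-z]+", text.lower())

# Reading 1: a word "frequency" list (term -> number of occurrences).
frequency_list = Counter(tokens)

# Reading 2: a "glossary" list (term -> definition). Definition is invented.
glossary = {"fee": "money you pay for a service"}

# Reading 3: a "thesaurus" list (term -> alternative words). Also invented.
thesaurus = {"fee": ["charge", "cost"]}

# Reading 4: the terms used on the site, ranked by how often they occur.
ranked_terms = [term for term, _count in frequency_list.most_common()]

print(frequency_list["pay"])  # prints 2
print(glossary["fee"])
print(thesaurus["fee"])
print(ranked_terms[0])        # prints pay
```

A tool or tester expecting one of these shapes cannot do anything useful
with another, which is why the content and format need to be pinned down.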

*     b.) Format: *

In one of the comments you posted in GitHub, you referenced:
http://www.minspeak.com/CoreVocabulary.php#.WQ8EzuUrI2w, which I visited to
investigate further.

That URL provided me a web page with a list of reference lists, including
"Balandin list of 347 core words used by adults"
<http://www.minspeak.com/documents/3-BaladinList.pdf> - which turned out to
be a link to a PDF document. Is PDF an acceptable format for a "vocabulary
list"?

Does this satisfy the driver behind this proposed SC? How? In other words,
armed with this PDF list of 347 core words, what does this do to benefit
the end user? How do I, as a content creator, use this list to benefit a
user?


*#3: Conformance Statement (?):*

You now reference a "conformance statement" in your email. WCAG 2.0 today
does give instructions on what a conformance statement should contain and
how it should be made <https://www.w3.org/TR/WCAG20/#conformance-claims>,
but there is no requirement (and never has been) from WCAG 2.0 to actually
furnish or provide such a statement - whenever one is mandated, it is
mandated by governments as part of their legislation around digital
accessibility.

"Conformance claims are *not required*. Authors can conform to WCAG 2.0
without making a claim."

(source: https://www.w3.org/TR/WCAG20/#conformance-claims)


*#4: Using the list:*

Assuming that we more tightly define "LIST" (to mean, for example, a
comma-separated list of terms and definitions), then what? That's a serious
question.

If I provide a list of the common terms used on my website in the
prescribed format, then what happens? The *purpose* of creating the list is
now being lost (if in fact it was ever clearly identified, outside of
stating that it benefits people with reading issues). This draft SC very
much feels like it is a set-up to a second part of the delivery (which
is)...


*#5: Future Technologies:*

"...this can be done via added coga semantics and personalization..."


Lisa, I have previously stated my support for the work on the Coga Semantics
Recommendation, a TF delivery of two Working Groups I am a member of (ARIA
WG and APA WG), and so while not directly involved in the effort, I am
aware of the activity at a level beyond that of most content creators today.

The problem is, this is still a "future technology" - we cannot be scoping
our SC so narrowly that the only way to achieve success is to use an
emergent technology that is untried at scale, not yet formally accepted at
the W3C, and one where there currently are no tools *at scale* to allow
for this to be done.

We have previously stated that Success Criteria must:

"...have Techniques which demonstrate that each Success Criterion is
implementable, using readily-available formats, user agents, and assistive
technologies."


Due to the current status of the draft Coga Semantics proposal, it does not
meet the "readily-available" requirement we've insisted upon previously, and
so while this could be a potential technique going forward, *it is not one
today*, and despite desires and assurances, we cannot categorically state
that Coga Semantics will be a 'thing' by this time next year, never mind
one that is readily available. I am optimistic for Coga Semantics, but it
is still too early.


Lisa, you have previously suggested that this is one of the more important
draft SC coming forward from the COGA TF, and I accept that statement as
being true. If we are to ensure that we successfully meet the need that we
are attempting to meet here, I will suggest that we need a lot more answers
than we have at this time. As presented today, however, I cannot support
this draft SC because of the ambiguities I have outlined above.

Respectfully,

JF

On Tue, May 23, 2017 at 6:11 AM, lisa.seeman <lisa.seeman@zoho.com> wrote:

> That is why we had the word frequency list before. That was 100 percent
> testable and we had free tools already available.
>
> Try 3...  How about....
>
> *Provide words, phrases  or abbreviations that are the most-common form to
> refer to the concept in a public word frequency list for the identified
> context*.
>
>
> Notes
> - the SC is technology agnostic, so the "how" and "what format" etc. should
> not be discussed until we get to techniques. Although clearly it needs to
> be accessible. An accessibility conformance statement would say what list
> was used.
>
> - we have open source scripts for building word frequency lists (see the
> comments in the GitHub issue). A script for testing words against a word
> list exists in other places (like the translation industry)
>
> - we are not limiting the size of the word frequency list, so they can be
> as big as is needed
>
> - also note this can be done via added coga semantics and personalization
>
> - The *public word frequency list and identified context* are defined
> terms; we can improve them if we feel the need - but let us first decide if
> this is the direction before zooming in on that.
>
>
> Perhaps we could also change the scope to critical features as identified
> in issue 6
>
>
> All the best
>
> Lisa Seeman
>
> LinkedIn <http://il.linkedin.com/in/lisaseeman/>, Twitter
> <https://twitter.com/SeemanLisa>
>
>
>
>
> ---- On Mon, 22 May 2017 17:22:27 +0300 *Detlev Fischer
> <detlev.fischer@testkreis.de>* wrote ----
>
> lisa.seeman wrote on 22.05.2017 15:55:
>
> > It looks like we are more comfortable with this direction - but we would
> > need some testing tools before CR.
> > So far as I know the IBM tool is not free, and the Microsoft tool
> > requires a subscription.
> > A way to move forward is to put it in the next version of WCAG 2.1 and
> > reach out to the companies for a free version of the tool.
>
> In my view, any automatic tool checking the commonality of words by
> applying some generic algorithm will be bound to produce incorrect results
> in all cases where you have a site covering a specific domain with specific
> terms (i.e., very often). Synonyms where you can replace one term with
> another without also introducing a shift of meaning are the exception, not
> the rule. Then you have the homonym problem (same term meaning different
> things in different contexts / domains). A tool that offers a meaningful
> analysis would have to be capable of inferring the respective domain and
> its vocabulary and adapting its algorithm accordingly.
>
>
>
>


-- 
John Foliot
Principal Accessibility Strategist
Deque Systems Inc.
john.foliot@deque.com

Advancing the mission of digital accessibility and inclusion
Received on Tuesday, 23 May 2017 14:44:19 UTC