Re: [w3ctag/design-reviews] Handwriting Recognition API (#591) from Jiewei Qian on 2021-01-28 (public-webapps-github@w3.org from January 2021)

From: Jiewei Qian <notifications@github.com>
Date: Wed, 27 Jan 2021 20:03:49 -0800
To: w3ctag/design-reviews <design-reviews@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <w3ctag/design-reviews/issues/591/768784558@github.com>

> Vertical writing.

Here I assume you mean a language that can be written both horizontally and vertically. 

Google's recognizer generally returns characters in the order they were written (for languages of the above type), so it works in any writing direction (e.g. RTL, LTR, top-to-bottom). Our metrics show that vertical writing isn't commonly used by our users, so this feature hasn't received much attention recently.

We aren't sure how other recognizers work. Some may only work with one direction (and not work at all for vertical writing). Some may ignore the order in which characters were written.

What do you think about adding a hint for the writing direction, in case some recognizers need this information? Note that some recognizers may disregard this hint altogether.
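To make the idea concrete, here is a minimal sketch of what an optional direction hint could look like. All names (`WritingDirection`, `RecognitionHints`, `RecognizerCapabilities`, `effectiveHints`) are hypothetical and not part of the explainer; the sketch only illustrates that the hint is optional and that a recognizer may drop it.

```typescript
// Hypothetical shapes for illustration only; none of these names come from the explainer.
type WritingDirection = "ltr" | "rtl" | "ttb";

interface RecognitionHints {
  // Optional: the application may omit it, and a recognizer may ignore it.
  writingDirection?: WritingDirection;
}

interface RecognizerCapabilities {
  // Whether this (hypothetical) recognizer makes any use of a direction hint.
  usesDirectionHint: boolean;
}

// Returns the hints that would actually be forwarded to the recognizer.
// The direction is dropped when the recognizer disregards it.
function effectiveHints(
  hints: RecognitionHints,
  caps: RecognizerCapabilities
): RecognitionHints {
  const forwarded: RecognitionHints = { ...hints };
  if (!caps.usesDirectionHint) {
    delete forwarded.writingDirection;
  }
  return forwarded;
}
```

With this shape, an application can always pass the hint without knowing whether the underlying recognizer uses it.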

> RTL writing

For RTL languages, the recognizer already knows it should process text from right to left.

For LTR languages where the characters are written from right to left (e.g. "hello" written in "olleh" order), this is a rare scenario, and I'm not sure what the correct interpretation is. The user perhaps wants the text to be interpreted as "hello", but it's really up to the recognizer to decide what it outputs. Either output can be considered valid IMO.

---

> Mixed scripts.

The recognizer could determine the writing direction by looking at when each character was written and at the spatial relations between characters. The same applies to detecting context switches.

For example,
- Unidirectional text "ABC". The writing direction can be learned by looking at the order of each character (A->B->C or C->B->A).
- Mixed: "AB cba CD" (upper-case / lower-case are two different scripts), "A->B->C->D->**a->b**", or, "A->B->C->D->**b->a**".
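The unidirectional case above can be sketched as follows. This is only an illustration under assumed inputs: `WrittenCharacter`, its fields, and `inferDirection` are hypothetical names, and real recognizers may use entirely different signals. It handles horizontal directions only and does not attempt the mixed-script case.

```typescript
// Hypothetical input: one entry per written character.
interface WrittenCharacter {
  text: string;
  startTime: number; // timestamp (ms) of the character's first stroke point
  centerX: number;   // horizontal center of the character's bounding box
}

type HorizontalDirection = "ltr" | "rtl" | "unknown";

// Sorts characters by the time they were written, then counts whether
// consecutive characters mostly move rightward or leftward.
function inferDirection(chars: WrittenCharacter[]): HorizontalDirection {
  const byTime = [...chars].sort((a, b) => a.startTime - b.startTime);
  let rightward = 0;
  let leftward = 0;
  for (let i = 1; i < byTime.length; i++) {
    const dx = byTime[i].centerX - byTime[i - 1].centerX;
    if (dx > 0) rightward++;
    else if (dx < 0) leftward++;
  }
  if (rightward === leftward) return "unknown";
  return rightward > leftward ? "ltr" : "rtl";
}

// "ABC" laid out left to right, written A->B->C.
const writtenLtr: WrittenCharacter[] = [
  { text: "A", startTime: 10, centerX: 0 },
  { text: "B", startTime: 20, centerX: 10 },
  { text: "C", startTime: 30, centerX: 20 },
];

// Same layout, but written C->B->A.
const writtenRtl: WrittenCharacter[] = [
  { text: "A", startTime: 30, centerX: 0 },
  { text: "B", startTime: 20, centerX: 10 },
  { text: "C", startTime: 10, centerX: 20 },
];
```

Here `inferDirection(writtenLtr)` yields `"ltr"` and `inferDirection(writtenRtl)` yields `"rtl"`, even though both drawings look identical once finished.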

That being said, existing recognizers (those available on the market) don't support mixed scripts (e.g. English + Arabic). They will recognize the text as if it were written in a single script (e.g. recognize Arabic characters as English characters, and give less than ideal results).

I don't think we should try to solve the mixed-script problem if the underlying implementations haven't solved it. Our solution may not work for them. Or, if the implementation is advanced, it may not care whether we provide this information / hint.

---

> Why navigator object

We chose navigator because it's preferred over the alternatives (e.g. window, global constructor):

We expect the handwriting recognizer to interact with platform-specific APIs and to support different features on different platforms. Navigator seems a natural fit given these feature differences.

We don't have a particular preference on where the methods live. Are you suggesting we put the methods behind an attribute (e.g. `navigator.handwritingService.doSomething()`)?

---

> What's the metric of the cartesians in the explainer

The explainer examples use logical pixels.

The recognizer doesn't particularly care about the measurement unit, as long as all provided coordinates are measured in the same way (i.e. don't mix logical pixels and device pixels).

The recognizer implementation normalizes the coordinates and performs recognition in relative terms (e.g. relative to the smallest character / block in the drawing).
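As a rough illustration of why the unit doesn't matter, here is a minimal normalization sketch. It is an assumption for illustration, not how any particular recognizer works: it scales relative to the bounding box of the whole drawing (a simplification of the "smallest character / block" idea), and the names `Point`, `Stroke`, and `normalizeStrokes` are hypothetical.

```typescript
interface Point { x: number; y: number; }
type Stroke = Point[];

// Translates the drawing so its bounding box starts at (0, 0) and scales it
// uniformly so the longer side of the bounding box has length 1.
function normalizeStrokes(strokes: Stroke[]): Stroke[] {
  let minX = Infinity, minY = Infinity;
  let maxX = -Infinity, maxY = -Infinity;
  for (const stroke of strokes) {
    for (const p of stroke) {
      minX = Math.min(minX, p.x);
      minY = Math.min(minY, p.y);
      maxX = Math.max(maxX, p.x);
      maxY = Math.max(maxY, p.y);
    }
  }
  // No points at all: nothing to normalize.
  if (minX === Infinity) return strokes.map(() => []);
  // Fall back to 1 to avoid dividing by zero when the drawing is a single point.
  const scale = Math.max(maxX - minX, maxY - minY) || 1;
  return strokes.map(stroke =>
    stroke.map(p => ({ x: (p.x - minX) / scale, y: (p.y - minY) / scale }))
  );
}

// The same drawing measured in logical pixels and in device pixels (2x).
const logicalPx: Stroke[] = [[{ x: 10, y: 20 }, { x: 30, y: 50 }]];
const devicePx: Stroke[] = [[{ x: 20, y: 40 }, { x: 60, y: 100 }]];
```

Both `normalizeStrokes(logicalPx)` and `normalizeStrokes(devicePx)` produce the same normalized strokes, which is why only consistency of the unit within one drawing matters.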

https://github.com/w3ctag/design-reviews/issues/591#issuecomment-768784558

Received on Thursday, 28 January 2021 04:04:01 UTC