- From: Duff Johnson <duff@duff-johnson.com>
- Date: Tue, 8 Dec 2020 17:14:15 -0500
- To: Wayne Dick <wayneedick@gmail.com>
- Cc: "Charles 'chaals' (McCathie) Nevile" <chaals@yandex.ru>, W3C WAI ig <w3c-wai-ig@w3.org>
- Message-Id: <6A41808E-4CD3-46C1-9B11-6C1E2E9BD22D@duff-johnson.com>
Some of this stuff is already in the field.

See, for example, Adobe Reader for mobile devices. Its new “liquid mode” feature - intended for reuse rather than accessibility applications - is a hint of the AI-assisted future of live analysis and repurposing of unstructured PDF content.

More info: https://www.pdfa.org/adobe-announces-liquid-mode-for-acrobat-mobile/

Duff.

> On Dec 8, 2020, at 15:31, Wayne Dick <wayneedick@gmail.com> wrote:
>
> Thanks Charles,
> Right around 2000 is when I abandoned pattern recognition as a means to rationalize visual pages. At that time, a drop in AI funding and some hardware limits really signaled a dead end. However, in the last 10 years graphics hardware has improved dramatically, and we have a lot of very regular data... a lot, like a web's worth.
>
> When we approached this from the 90s to about 2000, we were dealing with digitized pages from articles and books. Just finding the angle of the lines on a page was a process. There was a lot of noise to filter.
>
> With electronic-based literature we have a completely different state of input. Skewed lines are usually decorative, a category of their own. There is a matching between screen regions and runtime code elements.
>
> I see at least three approaches: analysis based on generated code, image analysis of displayed text, and hybrid analysis based on our knowledge of the content. This could lead to a couple of accommodations that would exceed the impact of anything we have today.
>
> 1) Very smart screen magnification. This could apply to professional documents in PDF or another printer-oriented language. This would make journal articles accessible.
> 2) Very smart analysis of content based on the image. That is, rendering a markup equivalent that could be read and personalized given the user's needs configuration.
> 3) Hybrid analysis resulting in the output of 1) or 2).
>
> In the 90s we got bogged down with Post Office examples where the goal was to extract and process address and zip code information from very messy input.
>
> It just seems like it may be time to look at this again. We do have dramatically better tools in 2020.
>
> Thanks all for your feedback,
> Wayne
>
>
> On Tue, Dec 8, 2020 at 2:11 AM Charles 'chaals' (McCathie) Nevile <chaals@yandex.ru> wrote:
> On Tue, 08 Dec 2020 07:20:48 +1100, Wayne Dick <wayneedick@gmail.com>
> wrote:
>
> > I am interested in any research in this direction. Anybody know about
> > anything like this in progress?
>
> Hello Wayne, all.
>
> I went to a presentation in New Zealand in the early 2000s, at the
> invitation of Graham Oliver, on a project that had been running for quite
> some years (if I recall correctly, since the early 90s) to do this.
>
> I no longer recall enough to easily find it (and I have looked for it
> before without success).
>
> The basic idea was to use machine learning systems to look at the
> interface of a user's computer, and provide a personalised approach to
> understanding the components. Initially the system used a very expensive
> high-powered computer to read the interface of a standard desktop PC, but
> as increasing power became available, it was slowly morphing toward
> software running directly on the machine.
>
> I also recall that a large part of the explanation about automatic visual
> recognition used jet fighter planes as the example object to follow.
>
> In my mind the project may have been associated with Stanford University,
> and it may have been called Eureka, although that is widely used as a
> name, so not a very helpful search term :(
>
> If this rings a bell with anyone I would love to find more pointers to the
> work.
>
> Cheers
>
> Chaals
>
> --
> Using Opera's mail client: http://www.opera.com/mail/
Received on Tuesday, 8 December 2020 22:14:31 UTC