
Re: Converter Agent

From: Charles McCathieNevile <charles@w3.org>
Date: Sun, 27 Aug 2000 21:16:04 -0400 (EDT)
To: Pei Ling <mizuno@pacific.net.sg>
cc: w3c-wai-ig@w3.org
Message-ID: <Pine.LNX.4.21.0008272111260.30972-100000@tux.w3.org>
Pei Ling,

There are several tools that produce XHTML (HTML that is written in
XML) either as authoring tools or as conversion tools.

They cannot recognise the headings if they are not specified, but if they are
already marked up in HTML then it is trivial to ensure that they remain
marked up in XML.
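
As a minimal sketch of that "trivial" case, the following uses only Python's standard library to re-serialise HTML as well-formed XML, so that existing heading elements come through unchanged. The names XhtmlWriter and html_to_xhtml and the VOID set are illustrative assumptions, not part of any tool mentioned here; comments, doctypes and namespace declarations are ignored for brevity.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that have no end tag in HTML (a partial list, for illustration only).
VOID = {"br", "hr", "img", "input", "meta", "link"}


class XhtmlWriter(HTMLParser):
    """Re-emit parsed HTML as well-formed XML (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        # Valueless attributes (e.g. "disabled") get their own name as value.
        attr_text = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        if tag in VOID:
            self.out.append(f"<{tag}{attr_text} />")
        else:
            self.out.append(f"<{tag}{attr_text}>")
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray end tags and end tags for void elements.
        if tag in VOID or tag not in self.open_tags:
            return
        # Close any elements left open inside the one being closed.
        while self.open_tags:
            top = self.open_tags.pop()
            self.out.append(f"</{top}>")
            if top == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def close(self):
        super().close()  # flushes any buffered text
        while self.open_tags:
            self.out.append(f"</{self.open_tags.pop()}>")


def html_to_xhtml(html):
    writer = XhtmlWriter()
    writer.feed(html)
    writer.close()
    return "".join(writer.out)


print(html_to_xhtml("<H1>Title</H1><P>Text<BR>more"))
```

A heading written as H1 in the input is still an h1 element in the output; nothing is inferred, only made well-formed.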

A tool that would be interesting is one which looked at formatting properties
used in a document and tried to identify headings and lists (and perhaps
other structures). Normally people who use visual formatting instead of
semantic structure in documents (be they HTML, word processor documents, or
even handwritten ones) do so in a fairly regular way, and I believe that in
most cases it would not be terribly difficult to specify an algorithm for
guessing at the main structures. I actually tried to do this once, and if you
are interested I will try to find that work and make it available (it was only
very rough notes, but I don't think it is an impossible project at all, and
believe it would be valuable). On a related note, I read in a press release
on several WAI lists today that Adobe are trying to make a version of Acrobat
software that can do some of these things.
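
To make the idea of guessing structure from formatting concrete, here is a rough sketch of one possible heuristic. It is not the set of notes mentioned above, and every name in it (Block, guess_heading_levels, the 80-character limit) is an assumption made for illustration. It treats the font size that covers the most text as body text, ranks larger sizes as heading levels, and treats short bold lines at body size as the lowest heading level.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Block:
    """One visually formatted paragraph (hypothetical input format)."""
    text: str
    font_size: float
    bold: bool = False


def guess_heading_levels(blocks, max_bold_heading_chars=80):
    """Return a guessed heading level (1 = top) or None for each block."""
    if not blocks:
        return []

    # Assume the font size covering the most characters is the body text.
    chars_per_size = Counter()
    for block in blocks:
        chars_per_size[block.font_size] += len(block.text)
    body_size = chars_per_size.most_common(1)[0][0]

    # Every distinct size larger than body text becomes a heading level,
    # largest first.
    larger_sizes = sorted(
        {b.font_size for b in blocks if b.font_size > body_size},
        reverse=True,
    )
    level_of_size = {size: i + 1 for i, size in enumerate(larger_sizes)}

    levels = []
    for block in blocks:
        if block.font_size in level_of_size:
            levels.append(level_of_size[block.font_size])
        elif (block.bold and block.font_size == body_size
              and len(block.text) < max_bold_heading_chars):
            # Short bold line at body size: guess the lowest heading level.
            levels.append(len(larger_sizes) + 1)
        else:
            levels.append(None)
    return levels


sample = [
    Block("Annual Report", 24),
    Block("Introduction", 18),
    Block("This paragraph is ordinary body text. " * 5, 12),
    Block("Note", 12, bold=True),
    Block("More ordinary body text follows here. " * 5, 12),
]
print(guess_heading_levels(sample))
```

Lists could be guessed in a similar way, for example from repeated leading bullets or numbers, but that is left out of this sketch.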

cheers

Charles McCN

On Mon, 28 Aug 2000, Pei Ling wrote:

  
  
  Are there any existing agents on the market that can function as a converter from HTML to XML documents? Is it possible for the agent to identify the headings or sub-headings in the content and translate the HTML code into XML accordingly?
  
  Any reply/comment would be very much appreciated.
  
  ----------------------------------------------------------
  
  Pei Ling, Tan
  Final Year Project Student at NYP
  mailto: mizuno@pacific.net.sg
  
  

-- 
Charles McCathieNevile    mailto:charles@w3.org    phone: +61 (0) 409 134 136
W3C Web Accessibility Initiative                      http://www.w3.org/WAI
Location: I-cubed, 110 Victoria Street, Carlton VIC 3053
Postal: GPO Box 2476V, Melbourne 3001,  Australia 
Received on Sunday, 27 August 2000 21:16:08 GMT
