Re: Backus Naur Form (BNF) to XML

BNF describes the grammar for a language. So if you have a BNF, and an
example of something written in that language, you can hunt out each thing
and say what kind of object it is, and away you go with your conversion...

(more or less)

This seems like a very cool thing to do.
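
A minimal sketch of that idea in Python, not a real tool. The toy grammar, the dictionary encoding of it, and the names GRAMMAR, parse and to_xml are all assumptions made up for illustration. Each BNF rule that matches a piece of the input becomes an XML element named after the rule. Alternatives are tried in order and the first match wins, which is simpler than full BNF, and left-recursive rules would loop forever.

```python
from xml.sax.saxutils import escape

# Hypothetical toy grammar, shown here as BNF:
#   <sum>    ::= <number> "+" <sum> | <number>
#   <number> ::= <digit> <number> | <digit>
#   <digit>  ::= "0" | "1" | ... | "9"
# Each rule maps to a list of alternatives; each alternative is a list of
# symbols. A symbol that is not a key in GRAMMAR is a literal terminal.
GRAMMAR = {
    "sum": [["number", "+", "sum"], ["number"]],
    "number": [["digit", "number"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}


def parse(symbol, text, pos=0):
    """Try to match `symbol` at text[pos:].

    Returns (xml_fragment, new_position) on success, or None on failure.
    """
    if symbol not in GRAMMAR:
        # Terminal: the literal text must appear at this position.
        if text.startswith(symbol, pos):
            return escape(symbol), pos + len(symbol)
        return None
    for alternative in GRAMMAR[symbol]:
        parts, cursor = [], pos
        for item in alternative:
            result = parse(item, text, cursor)
            if result is None:
                break  # this alternative failed, try the next one
            fragment, cursor = result
            parts.append(fragment)
        else:
            # Every item matched: wrap the pieces in an element named
            # after the rule, which "says what kind of object it is".
            return f"<{symbol}>{''.join(parts)}</{symbol}>", cursor
    return None


def to_xml(text, start="sum"):
    """Convert `text` to XML, or raise ValueError if it does not match."""
    result = parse(start, text)
    if result is None or result[1] != len(text):
        raise ValueError(f"input does not match <{start}>")
    return result[0]
```

Calling to_xml("12+3") labels every digit, number and sum in the input, while to_xml("1+") raises ValueError because the input does not fit the grammar.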

For converting bad HTML to XML I still suggest that we call Tidy our
algorithm, and then point out that it is already implemented. The problem is
that most programmers get things more or less into the right syntax, whereas
most HTML coders (including those using tools to do it for them) more or less
don't. So bad HTML won't match a BNF description in the first place.
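
To illustrate the mismatch, here is a small sketch using the XML parser in Python's standard library as a stand-in for "matching the grammar". It only checks well-formedness, and the helper name is_well_formed and the two sample strings are made up for the example. It does not call Tidy.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if `markup` parses as well-formed XML, else False."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the first closes every element, the second is the
# kind of tag soup browsers accept but a strict grammar rejects.
clean_markup = "<ul><li>one</li><li>two</li></ul>"
tag_soup = "<ul><li>one<li>two</ul>"

clean_ok = is_well_formed(clean_markup)
soup_ok = is_well_formed(tag_soup)
```

The strict parser accepts the first string and rejects the second, which is why a repair step such as Tidy has to run before any grammar-driven conversion.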


On Sun, 17 Dec 2000, Sean B. Palmer wrote:

  [Forgive me, I'm having one of those overly productive days]
  > If anyone knows of something like this, please post it,

  1. CSS2 is expressed in EBNF:

  2. So is XML (thanks to Jelks):

  3. Lots of programming languages can be expressed in BNF:

  4. Not sure about HTML: "EBNF is also used in many other standards, such as
  definitions of protocol formats, data formats and markup languages such as
  XML and SGML. (HTML is not defined with a grammar, instead it is defined
  with an SGML DTD, which is sort of a higher-level grammar.)" - But
  HTML can be converted into XML (XHTML) anyway, so who cares?

  5. I think EBNF/BNF describe the languages themselves rather than outputs
  of those languages, but I could be wrong (I only ever looked at EBNF once

  6. I don't think this HTML parse tree is in (E)BNF, but it looks
  interesting all the same:

  That's the lot. Hoping as ever that this is useful rather than

  Kindest Regards,
  Sean B. Palmer
  "Perhaps, but let's not get bogged down in semantics."
     - Homer J. Simpson, BABF07.

Charles McCathieNevile    phone: +61 (0) 409 134 136
W3C Web Accessibility Initiative            
Location: I-cubed, 110 Victoria Street, Carlton VIC 3053, Australia
September - November 2000:
W3C INRIA, 2004 Route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex, France

Received on Sunday, 17 December 2000 15:32:32 UTC