Re: Using virtual machine for compliant markdown interpreter? Viable? from Benjamin Chung on 2014-07-06 (public-markdown@w3.org from July 2014)

From: Benjamin Chung <bwchung@andrew.cmu.edu>
Date: Sun, 6 Jul 2014 18:10:54 -0400
To: mofo syne <mofosyne@gmail.com>
Cc: public-markdown@w3.org
Message-ID: <CA+-9mkD6J05oxLeMsB+P_8xN2z5G_o8tib7KJtYQ7wF8s_M=tA@mail.gmail.com>

The issue inherent to this approach is that the interface to the adapted
parser is never going to be quite as good as it would be if the parser were
developed within the host language itself. Furthermore, a good reference
implementation would produce an intermediate AST that is then consumed by a
reference code generator, and designing a format for that structure that
works across every host language once compiled is a non-trivial task.
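
To make the AST point concrete, here is a minimal sketch in Python of a
language-native AST and a code generator that consumes it. All of the node
names (Text, Emphasis, Heading, Paragraph, Document) and the render_html
function are hypothetical and chosen for illustration; they do not come from
any Markdown specification or existing implementation, and they cover only a
tiny fragment of the language.

```python
from dataclasses import dataclass
from html import escape
from typing import List, Union


# Inline nodes (hypothetical names, illustration only).
@dataclass
class Text:
    value: str


@dataclass
class Emphasis:
    children: List["Inline"]


Inline = Union[Text, Emphasis]


# Block nodes (hypothetical names, illustration only).
@dataclass
class Heading:
    level: int
    children: List[Inline]


@dataclass
class Paragraph:
    children: List[Inline]


@dataclass
class Document:
    blocks: List[Union[Heading, Paragraph]]


def render_inline(node: Inline) -> str:
    """Turn one inline node into HTML, escaping literal text."""
    if isinstance(node, Text):
        return escape(node.value)
    if isinstance(node, Emphasis):
        return "<em>" + "".join(render_inline(c) for c in node.children) + "</em>"
    raise TypeError(f"unknown inline node: {node!r}")


def render_html(doc: Document) -> str:
    """The 'code generator': it only ever sees the AST, never the source text."""
    out = []
    for block in doc.blocks:
        body = "".join(render_inline(c) for c in block.children)
        if isinstance(block, Heading):
            out.append(f"<h{block.level}>{body}</h{block.level}>")
        elif isinstance(block, Paragraph):
            out.append(f"<p>{body}</p>")
        else:
            raise TypeError(f"unknown block node: {block!r}")
    return "\n".join(out)
```

In a host language these types feel natural to work with. The hard part
described above is getting the same structure out of a parser running inside
a virtual machine, where the AST would first have to be serialized into some
language-neutral format.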

The way that I would recommend doing it involves adding another layer of
abstraction. Markdown would have a formally defined grammar that specifies
exactly one language. Then, a parser generator in the style of ANTLR (note
that ANTLR itself does not, as far as I know, provide enough flexibility to
parse Markdown) can compile this formal grammar into a parser that is both
language native and accepts exactly the language the grammar describes.
You can even go one step further, and use a tool like Coq to prove that
your parser generator's output parser accepts exactly the language that
your input grammar describes, but I'm not sure if anyone has done this yet.
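
As a rough illustration of "one grammar, many native parsers", the sketch
below stores a toy grammar as plain data and runs it through a small generic
matcher written in Python. Note the assumptions: this is a grammar
interpreter with PEG-style ordered choice, not a real parser generator; it
only recognizes input rather than building an AST; and the toy grammar
(headings and single-line paragraphs) and all names in it are invented for
the example and are nowhere near real Markdown.

```python
# Toy grammar as data (hypothetical, not real Markdown).
# Expression forms: ("lit", s), ("not_char", chars), ("ref", rule),
# ("seq", e1, ...), ("alt", e1, ...), ("star", e).
GRAMMAR = {
    "doc": ("star", ("ref", "line")),
    "line": ("alt", ("ref", "heading"), ("ref", "paragraph")),
    "heading": ("seq", ("lit", "#"), ("star", ("lit", "#")),
                ("lit", " "), ("ref", "text"), ("lit", "\n")),
    "paragraph": ("seq", ("ref", "text"), ("lit", "\n")),
    "text": ("seq", ("not_char", "\n"), ("star", ("not_char", "\n"))),
}


def match(grammar, expr, text, pos):
    """Return the position after matching expr at pos, or None on failure."""
    kind = expr[0]
    if kind == "lit":
        return pos + len(expr[1]) if text.startswith(expr[1], pos) else None
    if kind == "not_char":
        # Any single character that is not in the given set.
        if pos < len(text) and text[pos] not in expr[1]:
            return pos + 1
        return None
    if kind == "ref":
        return match(grammar, grammar[expr[1]], text, pos)
    if kind == "seq":
        for sub in expr[1:]:
            pos = match(grammar, sub, text, pos)
            if pos is None:
                return None
        return pos
    if kind == "alt":
        # Ordered choice: the first alternative that matches wins.
        for sub in expr[1:]:
            end = match(grammar, sub, text, pos)
            if end is not None:
                return end
        return None
    if kind == "star":
        # Greedy repetition; stop on failure or on a zero-length match.
        while True:
            end = match(grammar, expr[1], text, pos)
            if end is None or end == pos:
                return pos
            pos = end
    raise ValueError(f"unknown expression kind: {kind!r}")


def accepts(grammar, start, text):
    """True if the start rule consumes the entire input."""
    return match(grammar, grammar[start], text, 0) == len(text)
```

For example, accepts(GRAMMAR, "doc", "# Title\nbody\n") is True, while input
without a trailing newline is rejected. The same GRAMMAR data could be fed to
an equivalent matcher in any other host language, which is the property the
formal-grammar approach is after; a real tool would generate code and build
an AST instead of interpreting the grammar at run time.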

This approach has a number of advantages - it allows one to have a
language-native parser and AST intermediate form, while maintaining a
common, consistent language. However, it comes at the expense of requiring
a complex parser generator tool, and the parsers these tools generate
traditionally produce terrible error messages.

On Sat, Jul 5, 2014 at 12:24 AM, mofo syne <mofosyne@gmail.com> wrote:

> Back in the Infocom days, text-based games were often compiled into
> bytecode that was run in a virtual machine. This allowed multiple
> machines of different architectures to support an interactive fiction
> as long as each ran the same interpreter.
>
> I wonder if the same approach could be used to promote a
> standardized markdown.
>
> Say, write the 'gold standard' markdown in C, and compile it into a
> distributable file that can be run via an interpreter in multiple
> different languages like python, javascript, ruby, php, go, etc...
>
> The benefit to this approach is that updating the markdown engine is
> just a case of switching the bytecodes.
>
> A virtual machine is more suitable for 'gold standard' markdown
> than for other applications because a markdown program is
> essentially stdin --> stdout (markdown text in --> html out).
>
> Plus, if you want to 'extend markdown' but have it available on all
> platforms, you could just replace the bytecode as well. Heck, it may not
> even have to be markdown; it could be asciidoc or something else.
>
> Is this a viable approach? And if so, is there an existing minimal
> interpreter that can efficiently deal with textual processing with
> minimal overhead?
>
> -----
> Brian Khuu
>
>
>

Received on Sunday, 6 July 2014 22:25:22 UTC