Composing language descriptions: tree automata and language design?

Ever since all this XML namespace stuff started, I have
been saying

	Context-Free Grammars are Very Well Studied,
	and basing standards like XML on them is a safe bet.
	(Similarly for RDF based on AI frame/slot/value)

	But you can't just take a CFG from Joe and another
	from Bob and slap them together. At least I haven't
	seen any well-studied way to do it.

Then Murata Makoto sent a pointer to this little paper:

% Forest-regular Languages and Tree-regular Languages
% MURATA Makoto
% May 26, 1995
% http://www.geocities.com/ResearchTriangle/Lab/6259/prelim1.pdf

He also posted a small example, but since it went to xml-wg,
I won't forward it. A larger example is available in:

http://www.fxis.co.jp/DMS/sgml/xml/sgmlxml97.html

I'm starting to get a "where have you been all my life?" feeling.

Am I the only person who's been studying language design for
the last few years who hasn't studied tree autonoma?
I surfed around in comp.lang.* last night just to sanity-check.
Nope. No sign of it.

It looks really cool. It looks like a good model for
adding the equivalent of Modula-3 generics[3] or maybe even ML
modules (no citation--I haven't studied these closely) to SGML.
But I'm not sure. I don't quite have my head wrapped around
it yet. I could use some help.

[3] http://www.research.digital.com/SRC/m3defn/html/generics.html
(but to do it justice, you need to read chapter 8,
"How the Language Got its Spots" from
System Programming with Modula-3 Edited by Greg Nelson Prentice Hall
Series in Innovative Technology ISBN 0-13-590464-1 L.C. QA76.66.S87 1991 

I can't find a copy on the web, but it's widely cited; e.g.
http://cs.fit.edu/~ryan/study/bibliography.html)

The classicist in me had just about given up hope of finding
a good model for composing DTDs before I saw this. I thought
we were going to have to just swim around in loosely defined
XML-based languages (where the instances were well-formed
but couldn't be machine-checked beyond that) for some time.

Rohit... have I piqued your interest? Can you spare
the time to study this forest autonoma stuff to see
if it fits into the ecology of knowledge representation
in the Web?

Anybody else?

-- 
Dan Connolly
http://www.w3.org/People/Connolly/

Forwarded message 1

  • From: Dan Connolly <connolly@w3.org>
  • Date: Fri, 27 Feb 1998 02:22:50 -0600
  • Subject: ZF Replacement schema in larch?
  • CC: connolly@w3.org
  • Message-ID: <34F677DA.9AB@w3.org>
I'm having trouble tranlating the idiom
	{ x: P(x) }
into larch. Any clues? I checked the larch FAQ
http://www.cs.iastate.edu/~leavens/larch-faq.html#SEC39
with no luck.

The actual example I'm after is more like:

  Dom(u) = {1} \U {1.v1.v2...vk : v1.v2...vk \in Dom(v)}


This seems to provide a clue:

===============
http://www.realtime-info.be/encyc/techno/terms/26/102.htm

		the
                     Replacement axiom
                     schema: 

                     If F(x,y) is a formula
                     such that for any x,
                     there is a unique y
                     making F true, and X is
                     a set, then 

                             {F x : x in X}

                     is a set. In other words,
                     if you do something to
                     each element of a set,
                     the result is a set. 
===============

But I can't figure out how to use that clue.

I might be avoiding the use of \E for no good reason.
Maybe that solves the problem.

I'm trying to absorb a paper about Forest autonoma
http://www.geocities.com/ResearchTriangle/Lab/6259/prelim1.pdf
by transcribing it into larch. I made it through
the basic definitions, but I'm having trouble with
some of the more complex ones. I've attached
my work so far. Any clues?

Also... is there any hope of running the larch prover
on a linux box?

-- 
Dan Connolly
http://www.w3.org/People/Connolly/
forest(E, C): trait
%
% Transcribed by Dan Connolly <connolly@w3.org> from:
% Forest-regular Languages and Tree-regular Languages
% MURATA Makoto
% May 26, 1995
% http://www.geocities.com/ResearchTriangle/Lab/6259/prelim1.pdf
%
% $Id$

includes
  Integer,
  List(Int, IntList),
  Set(IntList, IntListSet)

introduces

  empty: -> C
  __{__} : E, C -> C
  __ || __ : C, C -> C

  tree: C -> Bool

  width : C -> Int

  Dom: C -> IntListSet
  Domaux: C -> IntListSet
  Domaux2: C, C -> IntListSet

  forest: C, IntList -> E

  __ / __ : C, IntList -> C

asserts
  C generated by empty, __{__}, || % 2.1 Forest

  forall a: E, u,v,w: C, i, w1, u1: Int, v1k, w2k, u1k, u2k, d: IntList

% 2.2 tree
  tree(a{u}) == true; % 2.2. Tree
  tree(empty) == false;
  tree(u||v) == false;

% 2.3 forest width
  width(empty) == 0;
  width(a{u}) == 1;
  width(u||v) == width(u) + width(v);

% 2.4 forest domain
  Dom(empty) = {};

  Dom(a{v}) == {{1}} \U Domaux(a{v});
  (1 \precat v1k) \in Domaux(a{v}) == v1k \in Dom(v);
  % @@ and nothing else is in Domaux(a{v})

  Dom(v || w) == Dom(v) \U Domaux2(v,w);
  ((w1 + width(v)) \precat w2k) \in Domaux2(v,w) ==  w2k \in Dom(w);
  % @@ and nothing else

% 2.6 forest function
  forest(a{v}, {1}) == a;
  forest(a{v}, 1 \precat v1k) == forest(v, v1k);
  u1k \in Dom(v) => forest(v || w, u1k) = forest(v, u1k);
  (\not ((u1 \precat u2k) \in Dom(v))) =>
	 forest(v || w, u1 \precat u2k) =
		forest(w, (u1 + width(v)) \precat u2k)

% 2.8 subtree
%  Dom(u/d) == { 1 \precat v1k

Received on Friday, 27 February 1998 23:32:22 UTC