Re-expressing our formalisation of Language

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Tue, 05 Sep 2006 19:48:01 +0100
To: www-tag@w3.org
Message-ID: <f5bd5aaduam.fsf@erasmus.inf.ed.ac.uk>

Wrt the minutes (forthcoming) of today's call, and in particular the
diagram at [1] (from [Editorial Draft] Extending and Versioning
Languages Part 1 [2]), here's my take on where we might have ended up:

A Language is an n-tuple, consisting of
 TextSet, a set of strings
 InformationSet, a set of infons (intentionally vague)
 Interpret, a functional mapping from TextSet to InformationSet,
   i.e. a subset of TextSet X InformationSet such that if (a,b) and
   (c,d) are in Interpret, then a==c implies b==d
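
As a minimal sketch of the functionality condition on Interpret, the
following Python treats Interpret as a set of (text, infon) pairs and
checks that no text is paired with two different infons. The toy
numeral language and the names is_functional, text_set,
information_set and interpret are assumptions made up for
illustration, not part of the formalisation itself.

```python
from typing import Hashable, Iterable, Tuple


def is_functional(pairs: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """Return True if a==c implies b==d for all (a,b), (c,d) in pairs."""
    seen = {}
    for text, infon in pairs:
        if text in seen and seen[text] != infon:
            return False  # same text mapped to two different infons
        seen[text] = infon
    return True


# Hypothetical toy language: decimal numerals interpreted as integers.
text_set = {"1", "2", "10"}
information_set = {1, 2, 10}
interpret = {("1", 1), ("2", 2), ("10", 10)}

# Interpret must be a subset of TextSet X InformationSet ...
is_subset = all(t in text_set and i in information_set for t, i in interpret)
# ... and must be functional.
is_mapping = is_functional(interpret)
```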

If Function is a class with three properties, namely Domain, Range and
Mapping, then Language<Function, with TextSet<Domain,
InformationSet<Range and Interpret<Mapping.
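
One hedged reading of the "<" relations above, sketched in Python:
Language is a subclass of Function, and TextSet, InformationSet and
Interpret are more specific names for Domain, Range and Mapping. The
class layout and the toy instance are assumptions for illustration
only.

```python
from dataclasses import dataclass
from typing import Any, FrozenSet, Tuple


@dataclass(frozen=True)
class Function:
    domain: FrozenSet[Any]
    range_: FrozenSet[Any]  # trailing underscore only avoids the builtin name
    mapping: FrozenSet[Tuple[Any, Any]]


class Language(Function):
    """A Language is a Function whose properties get more specific names."""

    @property
    def text_set(self) -> FrozenSet[Any]:
        return self.domain

    @property
    def information_set(self) -> FrozenSet[Any]:
        return self.range_

    @property
    def interpret(self) -> FrozenSet[Tuple[Any, Any]]:
        return self.mapping


# Hypothetical toy instance: two numerals and the integers they denote.
numerals = Language(
    domain=frozenset({"1", "2"}),
    range_=frozenset({1, 2}),
    mapping=frozenset({("1", 1), ("2", 2)}),
)
```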

I think it's useful to _also_ say that a Language has zero or more
Grammars, which are, informally, expressions of characteristic
functions for the TextSet, using e.g. regexps, BNFs, schemas, . . .

And that there are zero or more Interpreters, which are, informally,
effective computations from members of TextSet to members of
InformationSet.

Likewise, finally, zero or more Models, which are, informally,
expressions of characteristic functions for the InformationSet.

Note that the 'expressions of characteristic functions' may be formal,
or informal, or a mixture of the two (e.g. "[1-9][0-9]*" plus "the
corresponding number per the standard decimal numeral interpretation
is prime").
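
The mixed example above can be made fully formal. The sketch below is
one possible rendering in Python of that characteristic function: the
regexp handles the formal part, and a trial-division primality test
stands in for the informal part. The names NUMERAL, is_prime and
in_text_set are hypothetical.

```python
import re

# Formal part: decimal numerals with no leading zero.
NUMERAL = re.compile(r"[1-9][0-9]*")


def is_prime(n: int) -> bool:
    """Trial division; adequate for a small illustration."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def in_text_set(s: str) -> bool:
    """Characteristic function: True iff s is a numeral denoting a prime."""
    return NUMERAL.fullmatch(s) is not None and is_prime(int(s))
```

Here int() plays the role of an Interpreter for the numerals, i.e. an
effective computation from a member of TextSet to the number it
denotes.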

I'm not sure how to do this in UML, i.e. whether it changes the diagram
beyond relabelling Syntax as Grammar (1 to many), Semantics to Model
(1 to many) and ActOfInterpretation as Interpret, as well as adding
Interpreters (1 to many) with input and output relations to TextSet
and InformationSet respectively.


[1] http://www.w3.org/2001/tag/doc/ext-vers-generic-uml-v4.png
[2] http://www.w3.org/2001/tag/doc/versioning#terminology
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Tuesday, 5 September 2006 18:48:17 UTC