Re: Re-expressing our formalisation of Language

>Dan Connolly wrote:
>>  On Wed, 2006-09-06 at 19:01 -0400, wrote:
>>  > Dan Connolly writes:
>>  >
>>  > > I'd be happy to go with the conventions. I find the wikipedia
>>  > > article pretty nice to start from
>  > >

That is rather high in fancy-math-notation. Very 
little of this is really necessary, and it 
creates a major expositional barrier for many 
readers. Quine's book 'Philosophy of Logic' 
is very readable and sets out the main ideas very 
nicely; but it is, to be fair, an entire book. On 
the other hand, for me at any rate, reading Quine 
is a pleasure in itself.

>  > >
>>  > Gee, I'm really torn about that.  On the one hand, as one
>>  who's not expert
>>  > in those areas, I'm very excited to discover that these
>>  formalisms have
>>  > been so carefully developed.  Not reinventing the wheel seems like the
>>  > right approach.
>>  >
>>  > Having said that, David Orchard was on the call making the
>>  case that even
>>  > my relatively simple efforts to present set theoretic approaches
>>  > separately from programmatic descriptions like XML Schema
>>  were a step away
>>  > from the sort of approachable commonsense explanations that
>>  our readers
>>  > will be looking for.

The trouble with this view is that 'common-sense 
explanations' of semantics are almost all plain 
flat wrong, and in many cases are incoherent when 
subjected to careful analysis. (For example, any 
account which says that texts get their meaning 
by being mapped to *things* called 'meanings', is 
either wrong or kind of trivial.) So basing 
anything on them is likely to be a very bad idea, 
no matter how readable they might seem to be. I 
concede that there is an expositional barrier to 
be overcome, but that is not a good argument for 
putting simplified fictions into what claims to 
be a technical guide.

>  Honestly, I find that wikipedia article
>>  tough going,
>>  yes... it took me about 18 months, somewhere between 1998 and 2003, to
>>  get it.
>That's a relief.  I'm feeling less stupid already!
>>  On the other hand, 18 months is not all that long compared to the
>>  lifetime of the TAG versioning issue
>Agreed.  My concern is that, compared to the time most readers want to
>spend with a TAG finding, it's a bit on the long side.  Seriously, I'd
>love to find a way to use whichever existing formalisms as the
>underpinning for something good, but only if we can manage to explain it
>in commonsense terms that would be of value to the typical finding reader
>who's looking for straightforward (if deeply reasoned) advice on
>versioning their Web languages.

Well, there are some reasonably accessible 
accounts. I tried to briefly convey the general 
idea of 'set-theoretic' (horrible and misleading 
terminology) semantics, aka model theory, aka 
Tarskian semantics, using non-mathematical 
language, in the RDF Semantics document and 
the linked glossary entries.

But the best way to relate this kind of semantics 
to operational issues, I suggest, might be 
through talking about entailment, a *relationship 
between* texts which is very close to being an 
operational idea already. A entails B if B is true 
whenever A is. Thought of operationally, A 
entails B when the meanings of the expressions 
used in the texts A and B are enough to sanction 
the operation of deriving B from A: if you have 
A, you can legitimately infer B; the 'rule' <from 
A, infer B> is valid, is semantically correct. 
This seems easier to grok than the rather 
rebarbative notion of being true in all 
interpretations. And it allows generalizations to 
other kinds of inter-text relationships, perhaps 
defined by transformations of various kinds (it 
applies directly to 'rule languages' and things 
like logic and functional programming), and one 
can indeed see an interpreter as being a direct 
implementation of this relationship.
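
To make the two readings concrete, here is a minimal Python
sketch. It assumes a toy language of my own invention for the
purpose: a text is a set of atoms read as a conjunction, and an
interpretation is the set of atoms taken to be true. The names
ATOMS, entails_semantic and entails_operational are hypothetical,
not from any spec. The first function is the 'true in all
interpretations' definition; the second is the 'rule' view.

```python
from itertools import combinations

ATOMS = ("p", "q", "r")  # hypothetical vocabulary for this toy language


def interpretations(atoms=ATOMS):
    """Yield every interpretation: a set of atoms taken to be true."""
    for k in range(len(atoms) + 1):
        for chosen in combinations(atoms, k):
            yield frozenset(chosen)


def satisfies(interp, text):
    """A text (a set of atoms, read as a conjunction) is true in an
    interpretation when all of its atoms are true there."""
    return text <= interp


def entails_semantic(a, b):
    """A entails B when B is true in every interpretation where A is."""
    return all(satisfies(i, b) for i in interpretations() if satisfies(i, a))


def entails_operational(a, b):
    """The 'rule' view: from A, infer B, when B asserts nothing new."""
    return b <= a
```

For this very simple language the two functions agree on every
pair of texts, which is the sense in which the rule <from A,
infer B> is semantically correct.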

So, here are a few of the kinds of relationship 
between languages and texts that can be defined. 
Say that a language is a set of texts with an 
associated semantics, which for our purposes we 
can simply *define* as an entailment relationship 
En between texts of a language Ln. Say that L1 
*syntactically extends* L2 if every text of L2 is 
also a text of L1, *semantically extends* it if 
in addition E1 is a superproperty of E2, i.e. if 
A E2 B then A E1 B. Expressed more operationally, 
this means that you can run an L2 interpreter on 
the L2-syntactic subset of L1, and it will still 
be *correct* in L1, if possibly weaker than an L1 
interpreter. A logical example would be RDFS or 
OWL extending RDF; a programming example might be 
L1 being a functional language and L2 being L1 
without recursion; and a 'markup' example might 
be L1 and L2 both being XML entity vocabularies 
but L1 being larger than L2. In all these cases 
one can see the idea, that L1 extends L2 by 
allowing an interpreter to draw more conclusions 
than E2 is allowed to sanction. And in all these 
cases, the entailment relation can be directly 
related to a model-theory style semantic theory 
in which the extension amounts to having a richer 
notion of what counts as an interpretation, so 
that one gets E1 interpretations by imposing 
extra constraints on E2 interpretations and maybe 
by also adding some more structure to them. As 
the language can say more, the worlds it can 
describe get more complicated, but you can also 
rule out more of them by saying more stuff.
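
The two definitions above can be written down almost verbatim.
This is only a sketch, assuming finite languages whose
entailment relation is given explicitly as a set of (A, B)
pairs; the Language class and the example languages small, big
and odd are hypothetical illustrations, not anything from an
existing spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Language:
    texts: frozenset    # the legal texts of the language
    entails: frozenset  # pairs (A, B) meaning "A entails B"


def syntactically_extends(l1, l2):
    """Every text of l2 is also a text of l1."""
    return l2.texts <= l1.texts


def semantically_extends(l1, l2):
    """Syntactic extension, and whenever A E2 B then also A E1 B."""
    return syntactically_extends(l1, l2) and l2.entails <= l1.entails


# Hypothetical example languages over three opaque texts.
small = Language(
    texts=frozenset({"a", "b"}),
    entails=frozenset({("a", "a"), ("b", "b"), ("a", "b")}),
)
# 'big' adds a text and some entailments, keeping all of small's.
big = Language(
    texts=small.texts | {"c"},
    entails=small.entails | {("c", "c"), ("a", "c")},
)
# 'odd' has the larger syntax but drops the entailment ("a", "b").
odd = Language(
    texts=big.texts,
    entails=frozenset({("a", "a"), ("b", "b"), ("c", "c")}),
)
```

Here big semantically extends small, while odd extends small
only syntactically: an interpreter for small would draw a
conclusion (b from a) that odd does not sanction.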

There are all kinds of quite neat notions that 
are now very easy to define, eg a 'monotonic' 
language has the property that if A entails B 
then A+C must entail B also, for any text C, 
where + indicates some basic kind of 'legal 
conjoining' operation on texts; and then it's easy 
to see why monotonic languages have the nice tidy 
properties they have, and also why people often 
want to have non-monotonic languages in practice.
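
A small Python sketch of the monotonicity test, under stated
assumptions: texts are sets of statements, + is set union, and
the two entailment relations (classical and with_default) are
toy examples invented here. The second encodes the usual
'birds fly unless they are penguins' default, which is exactly
the kind of thing people want in practice.

```python
from itertools import combinations

STATEMENTS = ("bird", "penguin", "flies")  # hypothetical vocabulary


def all_texts(statements=STATEMENTS):
    """Every text: any subset of the statements, read as a conjunction."""
    return [
        frozenset(c)
        for k in range(len(statements) + 1)
        for c in combinations(statements, k)
    ]


def is_monotonic(entails, texts):
    """If A entails B then A+C must entail B, for every text C.
    Here + is set union."""
    return all(
        entails(a | c, b)
        for a in texts
        for b in texts
        if entails(a, b)
        for c in texts
    )


def classical(a, b):
    """B follows from A when B asserts nothing that A does not."""
    return b <= a


def with_default(a, b):
    """As classical, plus a default: a bird flies unless the text
    also says it is a penguin."""
    derived = set(a)
    if "bird" in a and "penguin" not in a:
        derived.add("flies")
    return b <= derived
```

With these definitions is_monotonic(classical, all_texts()) holds
and is_monotonic(with_default, all_texts()) does not: {bird}
entails {flies}, but {bird} + {penguin} no longer does.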

Anyway, just a sketch/suggestion for how to make 
a start on this stuff without creating tsunamis 
of horror in the semiotic universe. :-)

Pat Hayes

>   Early returns suggest that I may have as
>much as 17.5 months to go before I'm competent to have an intuition as to
>whether that's practical using these formalisms.  Still, an interesting
>Noah Mendelsohn
>IBM Corporation
>One Rogers Street
>Cambridge, MA 02142

IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell

Received on Monday, 11 September 2006 20:50:45 UTC