
Re: Re-expressing our formalisation of Language

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 11 Sep 2006 12:47:31 -0500
Message-Id: <p06230901c1275b3d209e@[]>
To: Harry Halpin <hhalpin@ibiblio.org>
Cc: "Henry S. Thompson" <ht@inf.ed.ac.uk>, www-tag@w3.org

>My attempt at reconciling your vocabulary is:
>1. Henry says "textset" = Pat says "expression"

Ok, although my sense implies that the string is parsed (the 
'expression' is more like the parse tree than the string), so the 
grammar isn't just an ancillary option.

>2. Henry says "interpretation" = Pat says "denotation (mapping)"
>3. Henry says "informationSet" = Pat says "interpretation (structure)"

I'd like to have some more input on whether this is correct, in fact. 
I am suspicious, since if it were, the idea of the mapping being 
functional would never have even been considered, it being so 
extremely inconsistent with the ideas of model theory. (Also the idea 
that an interpreter 'implements' the semantic mapping, which is also 
a *very* strange notion when applied to the model theory sense, since 
the most interesting semantic mappings are not computable.)

>It's the ambiguity between 2 and 3 that is causing lots of problems.

I think it runs much deeper than this.

>Now for obscure philosophy :)
>>  Is an 'infon' something like a chunk of information (about something?
>>  About what?) or is it something more like a part of a world or a
>>  possible interpretation? Or could it be something like an topic, or a
>>  thing that some information is about?
>     I believe Henry is probably unconsciously resurrecting the term
>"infon" from "situation semantics" ala "Situations and Attitudes" by
>Barwise and Perry [1] and the "What is Information?" paper by Israel and

I was afraid of that. I do not find this line of thinking in the 
least persuasive, and am not impressed by its very limited success so 
far as a foundational framework (If there has been some important 
breakthrough in the past decade or so which I missed, please point me 
at it. I've read all of Devlin.) Frankly, 'situation semantics' sucks, 
as a foundational theory. It is full of conceptual holes (the way 
they try to get rid of quantifiers by introducing 'possibilities' is 
ludicrous, and had in any case been comprehensively pre-trashed by 
Quine in 'Word and Object') and most of the terminology which it has 
introduced isn't supported by enough hard mathematics. I call it a 
promissory note for a theory rather than a theory itself. Until all 
this verbiage of "infons", "soas" and so on is given mathematical 
flesh, it's not really a theory at all, just a lot of un(der)defined 

>  Israel and Perry formalize an "infon" as an abstraction over a
>"state of affairs." A "state of affairs" can be thought of a collection
>of "real-world atomic facts" like (a,b,c,...) where  the infon is a
>parameter in an abstraction over the state of affairs << ...,_c_,....>>
>where 'a'  is some atom.

I read things like this and it seems to me that they do not make 
sense. OK, regard this claim as a challenge, and prove me wrong. 
Start with 'real-world atomic facts'. What ARE those? I need a clear, 
sharp answer to this before I can even begin to try to understand 
what it means to abstract over them. I have never read any account of 
'situation theory' which gives a clear answer to questions like this. 
(Problems to address. (1) What exactly *is* a 'real-world fact'? In 
ordinary English usage, a "fact" seems to be a *sentence* that is 
true, but I'm presuming y'all don't mean that here. (2) What makes a 
fact "atomic"? Again, this kind of terminology is usually used to 
refer to syntax, so that a *sentence* is atomic when it has no 
sentential syntactic constituents; but also again, I'm guessing that 
you don't mean to be referring to syntax here. So, if a 'fact' isn't 
syntactic in nature, what makes one fact more atomic than another? 
There are obvious Nelson-Goodmanesque grue/bleen objections to any 
possible answer, seems to me. (3) How does one individuate "facts", 
which is a prerequisite to giving them names? For example, if facts 
can be listed ("a, b, c, ...") then it must make sense to count them. 
So take some simple 'state of affairs', say the inside of my office, 
or of yours when you read this: *how many* atomic facts does it 
contain? Order of magnitude will do. Do the facts involve 
common-sense objects? Mereological pieces of stuff? Molecules? Quarks 
and leptons? Hmmm. Does the answer depend on how the office is 
described, ie on the concepts being used to say what is in the 
office? If so - and I fail to see how any other answer can make sense 
- then in what way is this more "real-world" than a Tarskian 
conception of a possible world as a relational system?)

>So, the atomic facts of a state of affairs
>could be s_1=(a='chair', b='table', c='rock'...) the infon
>i_1=<...,_c_='rock'...> is then any state of affairs that has a rock in
>the right parameter

Wait. I know what rocks are. Now, what does it mean to "have a rock 
in a parameter"? Rocks are *real*, they have mass and are made of 
mineral stuff, etc. They hurt if they fall on your head. Parameters 
are mathematical abstractions, or perhaps syntactic elements in a 
formal notation. This seems, therefore, to NOT MAKE ANY SENSE. Right 
at the very start, this way of talking embodies a conceptual 
confusion. It seems to be based on a category error. (If you start 
talking about information, it makes even less sense, as we then seem 
to have a 3-way category confusion.)

>, so s_2=(a='Tim', b='Dan', c='rock') and s_1 would
>satisfy the infon i_1.  In normal language, an infon is a "partially
>defined state of the real world"

That also does not make sense. States of the REAL world are never 
*partially* defined, since they are REAL. In fact, they can't be 
defined, partially or otherwise. They just *are*. Do you mean, infons 
are partial *specifications* of states of the real world? That might 
make sense, but then what distinguishes these 'infons' from pieces of 
syntax? (I once went right through S&A, translating it all back into 
something coherent using this strategy, and it then simply reduces to 
a re-statement of a slightly unconventional relational logic - one 
that allows interpretations to be embedded inside one another, since 
they are 'partial' - stated using an alien terminology. But at least 
the treatment of the quantifiers then made sense.)

>or "constraint of the real world" -> so
>maybe an XML Schema can be thought of as  kind of "infon" that can
>constrain a set of XML documents.

Which is a *syntactic* entity - an XML Schema document or assertion - 
which *describes*, and hence constrains, a set of things which it is 
*about*, i.e. refers to, i.e. denotes. Exactly: pure Tarski.

>After all, they all exist in the "real
>world" given XML's lack of a formal semantics :)
>The main difference between situation semantics and other formal (i.e.
>Tarski/Montague) sorts of semantics is, in a nutshell, that (using Pat's
>vocabulary) the interpretations in situation semantics are supposed to
>be *real world things* and they are usually abstracted to some formalism
>like sets in Montague semantics.

It is often claimed that because Montague/Tarskian model theory is 
couched in set theory, it must be only about sets, and hence not 
about real things. This is just an error. Set theory is a branch of 
mathematics, and like all of mathematics can be used to talk about 
anything, including (parts of) the real world. Model theory says that 
the universe of an interpretation is a nonempty set, but it does not 
say it is a set *of sets*. It can be a set of rocks, or people, or 
mereologically understood chunks of reality, or indeed anything else 
at all.

>  When someone in situation semantics
>land is talking about the denotation being a member of a mathematical
>set that we happen to name using the string 'rock', he means a big,
>lumpy grey piece of matter in all its metaphysical glory.

So does Tarski. His canonical example was the sentence 'snow is 
white' (in German), and he really did mean snow, in all its white 
fluffy metaphysical glory.

>  However, Montague semantics just points out we can't really talk about
>the fuzzy real world using logic in a sensible way

Nonsense! Do you think we can't talk about the real world using 
mathematics? If so, how do the engineers manage to calculate stresses 
in bridge girders, or design a complete aircraft on the computer so 
well that it flies at the first attempt? But if you ask a pure 
mathematician about the foundations of mathematics, he will start 
talking about sets right away (or maybe about morphisms and 
categories, but the same point applies: he won't be talking about 
bridge girders or airflows.) You should not assume that because a 
theory is couched in mathematical language, it must therefore be 
'about' abstract Platonic mathematical ideals, and therefore not 
about real things. We use mathematics to talk about real things every 
time we make change, or check that we have packed enough underwear 
for the next trip.

>, so let's formalize
>our interpretation structure as sets or something else  mathematically
>defined. And that has proven to be an immensely productive move.
>Situation semantics never got off the ground because they couldn't find
>a way to talk about the informal  world

The world isn't either formal or informal. Those categories apply to 
*symbolic* systems, not to the reality that those systems describe.

>without using formalism. There
>is no "real world state of affairs" that can be objectively described as
>a "set of facts" because as soon as you say that, you're in set theory!

The issue is, if you want to be precise, you need to use precise 
formal language (in your metatheory) and that means mathematics; and 
the language of sets is the bread and butter of mathematics. But, to 
repeat, set theory need not be only about sets of sets (that is, need 
not be 'pure' set theory).

>All the formalities of situation semantics can be folded into the
>Tarski-Montague story. And I think Barwise and  Perry did eventually
>formalize all of it using the KPU set theory.
>  >  Interpret, a functional mapping from TextSet to InformationSet,
>>     i.e. a subset of TextSet X InformationSet such that if a,b and c,d
>>     are in Interpret, then a==c implies b==d
>>  > Why do you call this 'interpret'? Is this supposed to imply
>>  something to the effect that 'infons' are interpretations?
>>  > Main question: Why is this *functional* ??
>Well, the word "functional" should probably be deleted because it's not
>functional really because a TextSet could map to so many "real world"
>state of affairs.

Not 'could map': MUST map to many such states. In fact (barring a few 
corner cases of only theoretical interest) must map to *infinitely 
many* such states. If, that is, we really are talking about 
interpretations in the model-theory sense here. And if we are not, 
what *are* we talking about?
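
To make the point concrete, here is a small sketch, assuming a toy 
language with a single unary predicate P and the one sentence 
"exists x. P(x)". All names in it are illustrative. It enumerates the 
interpretations over small finite universes and counts the ones that 
satisfy the sentence; it then applies the quoted test for 
functionality (a==c implies b==d) to the resulting pairs. Only finite 
universes are enumerated here, but nothing bounds the size of the 
universe, which is why the full collection of satisfying 
interpretations is infinite.

```python
from itertools import combinations

def interpretations(universe):
    """Yield every interpretation of one unary predicate P over the
    given universe, as a pair (universe, extension of P)."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for extension in combinations(items, size):
            yield (frozenset(items), frozenset(extension))

def satisfies_exists_p(interp):
    """'exists x. P(x)' is true exactly when the extension of P is nonempty."""
    _, extension = interp
    return len(extension) > 0

def is_functional(pairs):
    """The quoted condition: if (a, b) and (c, d) are both in the
    relation, then a == c implies b == d."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False
        seen[a] = b
    return True

def model_counts(max_size):
    """Number of satisfying interpretations for universes of size 1..max_size."""
    return [
        sum(1 for i in interpretations(range(n)) if satisfies_exists_p(i))
        for n in range(1, max_size + 1)
    ]

sentence = "exists x. P(x)"
pairs = [
    (sentence, interp)
    for interp in interpretations(range(2))
    if satisfies_exists_p(interp)
]

print(model_counts(4))       # [1, 3, 7, 15]
print(is_functional(pairs))  # False
```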

>So giving my TextSet that is a XML document to Firefox
>may return different behavior than if I give it to Mozilla.

True, but why does this observation have anything at all to do with 
what we were talking about in the previous sentence?

>the same behavior of Firefox might be produced by multiple text sets.
>However, if there is some  sort of abstract notion of standard
>compliance (like all text <h1> gotta be bigger than all text marked up
><h2> no matter if you're running firefox or mozilla), then that could
>make it functional since no matter what, your textsets in that language
>(HTML) better *all* make <h1> bigger than <h2>. I think that's what
>Henry is trying to get at - using Pat terminology, the expressions all
>denote some set that obeys some constraint in the interpretation structure.

?? The central point of model theory is that there isn't ONE 
interpretation structure, but infinitely many. So I never know what 
phrases like "THE interpretation structure" mean. (*Which* 
interpretation structure?)

>So, to make this mathematically respectable we should probably normalize
>the Henry/Pat vocabulary, and then think if we can find a good way of
>saying "constraints over (Pat) interpretations"

Well, they are (Tarski) interpretations, and constraints over them 
are exactly what is meant by the standard term 'satisfaction 
(relation)'. My point was that we don't need to do any creative work 
here. Others have already done it. All we need to do is open a 

>without using the word
>"infon" which is kinda weird and confusing. And maybe even find a way to
>give all of this madness a formal semantics

It has had one for about 50 years, that is my point. Why not just use 
the textbook one?
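
For readers who want the textbook definition in executable form, the 
following is a sketch of Tarskian satisfaction for a tiny first-order 
language with unary predicates only. The tuple encoding of formulas, 
the predicate names Grey and Heavy, and the two sample interpretations 
are assumptions made for illustration. The universe is an ordinary 
collection of named things, not a set of sets, and a "constraint over 
interpretations" is just a sentence together with this relation.

```python
def satisfies(interp, formula, env=None):
    """Return True if the interpretation satisfies the formula under
    the variable assignment env (the standard recursive definition)."""
    universe, extensions = interp
    env = env or {}
    op = formula[0]
    if op == "pred":
        _, name, var = formula
        return env[var] in extensions[name]
    if op == "not":
        return not satisfies(interp, formula[1], env)
    if op == "and":
        return (satisfies(interp, formula[1], env)
                and satisfies(interp, formula[2], env))
    if op == "exists":
        _, var, body = formula
        return any(satisfies(interp, body, {**env, var: d}) for d in universe)
    if op == "forall":
        _, var, body = formula
        return all(satisfies(interp, body, {**env, var: d}) for d in universe)
    raise ValueError(f"unknown operator: {op!r}")

# "Every grey thing is heavy": forall x. not (Grey(x) and not Heavy(x))
every_grey_is_heavy = (
    "forall", "x",
    ("not", ("and", ("pred", "Grey", "x"),
                    ("not", ("pred", "Heavy", "x")))),
)

things = {"chair", "table", "rock"}
office_1 = (things, {"Grey": {"rock"}, "Heavy": {"rock", "table"}})
office_2 = (things, {"Grey": {"chair"}, "Heavy": {"rock"}})

print(satisfies(office_1, every_grey_is_heavy))  # True
print(satisfies(office_2, every_grey_is_heavy))  # False
```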

>  - I do think they're valid
>intuitions here, but I do agree that unless we're careful it does all
>sort of sound wacky and like speaking in tongues.

My problem is that I have no way to reconstruct the intuitions from 
the language. It all sounds kind of philosophical and reassuring, but 
it's impossible to locate the intended meaning: and meanwhile, the 
language will likely be (mis)used to impose bad design decisions. I 
can easily imagine someone wanting to know how the W3C proposes to 
rewrite the OWL specs to conform to the new TAG findings, for 
example, which at the very least would waste a lot of time and effort.


PS. BTW, the language used in [2] seems to be very confused, even 
using its own definitions.
"A language is a set of text " is not grammatical English: I presume 
they mean 'texts'.
"a set of information" is both ungrammatical and meaningless (or at 
best woefully underdefined. What *are* 'informations', that we can 
talk of sets of them?).
Even if we ignore the conceptual and grammatical errors sprinkled 
through the document, it's not internally consistent. In the 
discussion of example 1, we are told this:
"The set of information in a language almost always has semantics. In 
the Name Language, given and family have the semantics of given and 
family names of people."
I presume the authors mean to quote 'given' and 'family', since 
these are used in the language itself as XML markup. So these strings 
"have semantics"; so, they are supposed to be in "the set of 
information", right? So, they are not in the actual "language", but 
are mapped to from it? But they *are* in the language: there they 
are, right in your face in example 1.

Actually, it is possible to make sense of all this, with the 
following mappings. The authors are talking about marked-up text (in 
fact, about XML), and so they are distinguishing between 'the 
language' by which they mean the text that is being marked up, i.e. 
the text without markup, and what they call 'the information', by 
which they mean the mark-up itself, ie the collection of XML entity 
names; the actual occurrence of markup in the text is what defines 
what they describe as the mapping from texts to information (which is 
why it is natural to think of it as functional.)
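
A short sketch of this reading, assuming a small XML fragment in the 
spirit of the Name Language. The sample document and the function 
name text_to_markup are hypothetical and are not copied from example 
1. Each marked-up piece of text is paired with the element name that 
encloses it, which is the sense in which the mapping from texts to 
'information' comes out functional: the markup occurring in the 
document fixes it.

```python
import xml.etree.ElementTree as ET

def text_to_markup(document):
    """Pair each non-blank piece of text with the name of the element
    that directly encloses it, in document order."""
    root = ET.fromstring(document)
    return [
        (element.text.strip(), element.tag)
        for element in root.iter()
        if element.text and element.text.strip()
    ]

# Hypothetical sample document, not taken from the TAG draft.
sample = "<name><given>Pat</given><family>Hayes</family></name>"

print(text_to_markup(sample))
# [('Pat', 'given'), ('Hayes', 'family')]
```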

If this reading is more or less correct - and it is strongly 
suggested by the examples - this entire document has almost nothing 
to do with what is called semantics in linguistics or logic, 
including model theory and situation theory: it is simply a theory of 
mark-up; which may well be a valuable thing to have, but here it is 
being couched in wildly inappropriate language. I would strongly 
suggest that the TAG consider rewriting the document in more 
conventional language, to clarify its scope and intended meaning, 
before issuing it. In its present form it seems to display a radical 
ignorance of an entire field, by systematically misusing its 
technical terminology for an alien purpose.


>[2]http://www-csli.stanford.edu/~john/PHILPAPERS/whatisinfo.pdf -
>>  ht
>  > [1] http://www.w3.org/2001/tag/doc/ext-vers-generic-uml-v4.png
>  > [2] http://www.w3.org/2001/tag/doc/versioning#terminology
>>  --
>>   Henry S. Thompson, HCRC Language Technology Group, University of
>>  Edinburgh
>>                       Half-time member of W3C Team
>>      2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
>>              Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
>>                     URL: http://www.ltg.ed.ac.uk/~ht/
>>  [mail really from me _always_ has this .sig -- mail without it is
>>  forged spam]
>>  --
>>          -harry
>>  Harry Halpin,  University of Edinburgh
>>  http://www.ibiblio.org/hhalpin 6B522426

IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Monday, 11 September 2006 17:47:56 GMT
