W3C

TAG Weekly

5 Sep 2006

See also: IRC log

Attendees

Present
Raman, Ed_Rice, Ht, Vincent, DanC, Norm, noah, DOrchard, TimBL
Regrets
Chair
Vincent
Scribe
raman


Regrets for next week?

Dave for most/all of it.

Henry: can't make it either

<DanC> regrets DanC for 12 Sep

Dan is also out next week

Ed not here either.

If anyone else is dropping out -- please say so in the next few days

Minute taker for next week if there is a call: Noah

<DanC> agenda points to http://www.w3.org/2006/08/29-tagmem-minutes.html

<Norm> http://www.w3.org/2001/tag/2006/08/29-minutes.html

<DanC> +1 approve http://www.w3.org/2001/tag/2006/08/29-minutes.html

<DanC> (modulo stylesheet)

<EdR> +1 approve

<Vincent> Minutes of last week approved

Versioning Document:

<noah> Noah's comments on the versioning document are at: http://lists.w3.org/Archives/Public/www-tag/2006Aug/att-0111/versioning26July2006withNoahComments.html#noahComments

ToDos: Address Noah's comments, as well as additional comments from others on the mailing list

DO: Agrees with some of Noah's comments, has varying levels of discomfort with others.

<DanC> (FYI, the GRDDL WG is starting to get into the GRDDL/RDDL/namespaceDocument-8 stuff. http://lists.w3.org/Archives/Public/public-grddl-wg/2006Sep/0038.html NDW, the records suggest the ball is with you. I wonder if it's worth a short agendum to get a status update.)

DO: focus on individual comment areas

NM: Would like to take a few minutes to give high-level overview of each comment area
... Let's list the six areas and then have substantive discussion on each of them

See link from NM:

<Norm> (I have no update on namespaceDocument-8, but anxiety about my lack of progress and the relatively few months left in my term is quickly pushing it up my todo list)

Above link from NM gives highlights of his comments:

<DanC> (er... a "set of strings" and "syntactic constraints" are the same thing, no?)

<timbl> I thought so

<DanC> (all this green text in http://lists.w3.org/Archives/Public/www-tag/2006Aug/att-0111/versioning26July2006withNoahComments.html#noahComments ; is any of it suggested replacement text? it seems to be all "the finding should..." commentary.)

<DanC> ("there is a story we can tell" is much more interesting with a proof by example ;-)

Concrete example: I send <person>...</person>; define what happens at each layer of interpretation

NM: Concludes high-level overview of his issues

<DanC> (I'm listening for changes to the UML diagram. ... I wasn't hearing any... but Dave just mentioned collapsing Syntax and Text Set...)

<DanC> (round and round we go... I can remember 4 discussions of whether to use "language" to mean TextSet or (TextSet+mapping to info). We ended up with the latter the first 3 times.)

<dorchard> (

TimBL: 1.1 terminology: if by syntactic constraints we mean things like schemas --- then that defines production rules for the language --- mapping from that text to information --- fully defines things

<dorchard> (Dan, I think NM is proposing removing syntax and semantics from the UML)

NM: syntax defines text-set e.g. regexp patterns
... Q: is that set of constraints part of the language definition, or is it just a means to describe the language.

TimBL: that's splitting hairs, should be made two sides of the same thing.

DO: they're linked now

NM: But as written they are separable: regexp, grammar etc. are intensional ways of doing it, but they should be equivalent

<DanC> (What Noah is asking for is already there when I read it. Maybe I'm just reading it too fast.)

TimBL asking to remove piece that says "syntactic constraints ..."

<timbl> A language consists of a syntax (a set of possible texts in the language, typically defined by a grammar document and/or other constraints)

<ht> HST notes it is perfectly sensible to ask "do these two grammars define the same language?"

<timbl> A language consists of a syntax (a set of possible texts in the language, typically defined by a grammar document and/or other constraints) and a mapping from texts into information.

<ht> "A language is a set of <text,interpretation> pairs. The set of texts is typically defined by appeal to a syntax. The set of interpretations may be independently specified, or only specified by means of a construction from the texts"

<Zakim> ht, you wanted to remind us of the utility of the notion of characteristic function

<DanC> I can accept HT's proposal. It's not quite clear to me that what's in the 26 July isn't ok too.

<noah> Among the several reasons I want to separate our discussion of intensional descriptions is that in practice many of our schemas (e.g. XML Schemas) are pretty loose bounds on the language we really want to exchange (XML Schema can get you to integers, or maybe with patterns to odd numbers; primes is a constraint typically imposed at a higher level).

<DanC> I don't see "intensional descriptions" in "any syntactic constraints on the text".

<Zakim> noah, you wanted to talk about information having substructure

<ht> HST mis-wrote above, should have 'information' for 'interpretation'

<DanC> (the 26 July draft says "A Language consists of a set of text, any syntactic constraints on the text, a set of information, any semantic constraints on the information, and the mapping between texts and information." That seems to have some redundancy, but nothing unacceptably harmful, to me.)

<ht> "A language is a set of <text,information> pairs. The set of informations may be independently specified, or only specified by means of a construction from the texts"

<ht> HST meant the _relation_ between text and information may be unstructured, i.e. nothing better than a lookup table. I absolutely agree that most information domains are structured, as are most interpretation functions

NM: feels we need to be able to tell some story about how we can compare information sets

<timbl> The syntax box contains any grammar and also any other constraints which define the set of allowable texts

<timbl> ?

<timbl> embedded in that

<timbl> in 1.1

<ht> Right -- so from that diagram, per Noah's request, Language -> Language Definition and Text Set -> Language

<DanC> (remind me of your question, dave? I'm still trying to figure out what folks find unacceptable with the definition of "language" in your 26 July draft)

<dorchard> (what does the diagram look like)

<DanC> <ht> "A language is a set of <text,information> pairs. The set of informations may be independently specified, or only specified by means of a construction from the texts"

<ht> That bit missed a bit: "The set of texts is typically defined by appeal to a syntax."

<noah> Rough proposal:

<DanC> (I like discussing the diagram. I'm reasonably happy with the current diagram. http://www.w3.org/2001/tag/doc/ext-vers-generic-uml-v4.png )

<noah> A language is a set of texts and their interpretation as information. Note that in practice, the set of texts is typically conveyed by means of constraint languages such as XML Schema, regular expressions, etc.

<dorchard> So, you'd have language -> set of texts -> syntax constraints

<noah> The above is a proposal I'd be pretty happy with as a starting point. I suspect the "interpretation as information" may yet need a bit of tuning.

<DanC> (indeed, I'm quite happy that it would be very difficult to talk about ambiguous languages.)

<noah> Dave: what is the -> operator

<Zakim> ht, you wanted to answer DO's question

<ht> OK, then add "and the interpretation via a function from texts to information"

<Zakim> DanC, you wanted to observe that none of the proposed texts (including the 26 July draft from Dave) changes the diagram, so they're all OK by me.

<DanC> (the arc from Syntax to Text Set should be many-to-one. oversight?)

<ht> Meaning there's in principle more than one syntax for a given text set? Yes.

At least more than one means of expressing the syntax for a text-set

<DanC> (raman, have you ever found it worthwhile to get your machine to read a UML diagram to you?)

<DanC> (aha! a substantive point comes up! does a different schema mean a different language? or can you have the same language specified with 2 different grammars?)

<timbl> diagram has arc Language stringSet textSet.

<DanC> (Noah's point is well-made in that the arc from Language to Syntax shouldn't be 1-1 (which it's not.))

<ht> DanC, see way above -- absolutely yes, per formal language theory, a language can have a range of grammars
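The point that one language can have several grammars can be illustrated with a small sketch. It assumes two hypothetical regular expressions (`GRAMMAR_A`, `GRAMMAR_B`) standing in for two schemas, and it only compares them on strings up to a bounded length over a small alphabet, so it is a spot check rather than a proof of equivalence.

```python
import re
from itertools import product

# Two hypothetical grammars written differently; both are meant to
# describe the same text set: non-empty strings of ASCII digits.
GRAMMAR_A = re.compile(r"[0-9]+")
GRAMMAR_B = re.compile(r"\d(\d)*", re.ASCII)


def texts_up_to(alphabet, max_len):
    """Yield every string over `alphabet` with length 0..max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def same_text_set(g1, g2, alphabet, max_len):
    """Bounded check: do g1 and g2 accept exactly the same strings?"""
    return all(
        bool(g1.fullmatch(t)) == bool(g2.fullmatch(t))
        for t in texts_up_to(alphabet, max_len)
    )


# The alphabet mixes digits with a non-digit so rejections are exercised too.
agree = same_text_set(GRAMMAR_A, GRAMMAR_B, "01a", 4)
```

If the two grammars agree on every text, they specify the same text set, which is why a one-to-one arc between language and syntax would be an unconventional reading.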

<timbl> The syntax is a specification of the syntax.

<DanC> (yes, I agree, ht; I just mean that this question is observable in the diagram. If you read a 1-1 label on the arc from language to syntax, then you get an unconventional definition of language.)

<noah> Proposal: in diagram, relabel "Syntax" to "Syntax Specification" or some such

<ht> Strictly speaking, given that ambiguity is off the table, _all_ we need/want at the heart of things is a (set-theoretic) function from text to information

<noah> The key point for me, that I don't think Dave agrees with, is to erase the horizontal line from Language to Syntax (or Syntax Specification)

<ht> There are then subsidiary questions about whether there are independent specifications of the domain and the range, and whether there is some effective computation which implements the function

Noah, re: what you typed above, I believe this is why TimBL/HT suggested you'd be happy with "language definition" instead of language

<timbl> There is a defined way of interpreting a text. There is no defined algo for expressing given information in a text, in fact.

<timbl> This is OK, because anyone producing stuff knows that what they are sending will send the right information.

<ht> If TextSet is denumerably infinite, then you can always compute the expression for a bit of information _if_ it's expressible via synthesis by analysis, i.e. generate all possible texts, check their interpretation to see if it's the information you want, when it is, go for it
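A minimal sketch of the generate-and-check procedure described here, under stated assumptions: the language is a hypothetical one in which digit strings without leading zeros map to integers, and the names `all_texts`, `interpret`, `express` and the `limit` cut-off are invented for illustration. The cut-off is needed because the search never terminates when the information is not expressible.

```python
from itertools import count, product


def all_texts(alphabet):
    """Enumerate every finite string over `alphabet`, shortest first."""
    for n in count(0):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def interpret(text):
    """Hypothetical language: canonical digit strings map to integers.

    Returns None for texts outside the language.
    """
    if text.isdigit() and (text == "0" or text[0] != "0"):
        return int(text)
    return None


def express(information, alphabet="0123456789", limit=100_000):
    """Find a text whose interpretation is `information`, or give up.

    Generates candidate texts in order and checks each interpretation;
    stops after `limit` candidates so the search always terminates.
    """
    for i, text in enumerate(all_texts(alphabet)):
        if i >= limit:
            return None
        result = interpret(text)
        if result is not None and result == information:
            return text
    return None
```

The producer here uses only the consumer's interpretation function, which matches the observation that no separate algorithm for going from information to text needs to be defined.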

<noah> A question arose as to what the story is at the producer.

<noah> I believe (and I think Tim agreed to this) that the same function that allows a consumer to extract information is what a producer appeals to in deciding what (s)he has encoded in the first place.

<DanC> [Definition: A Language consists of a set of text, any syntactic constraints on the text, a set of information, any semantic constraints on the information, and the mapping between texts and information. ][Definition: Text is a specific, discrete sequence of characters]

<noah> [Definition: A Language consists of a set of text, any syntactic constraints on the text, a set of information, any semantic constraints on the information, and the mapping between texts and information. ]

<noah> Dave's definition of L2 would likely be:

<noah> * Loose bound on set of texts: character strings

<noah> * Syntactic constraint: digits only

<noah> * Set of information: an integer resulting from the mapping given below

<noah> * Semantic constraints: 2<x<8; (x mod 2) == 1

<noah> * Mapping: atoi()

<noah> Text set: 1, 3, 5 ....
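The layered definition listed above can be sketched as follows. This is only an illustration: the constraints are applied exactly as typed, Python's `int()` stands in for `atoi()`, and the names `DIGITS_ONLY`, `mapping`, `semantic_ok`, `interpret` and the sample `candidates` are hypothetical.

```python
import re

# Syntactic constraint: digits only (applied to arbitrary character strings).
DIGITS_ONLY = re.compile(r"[0-9]+")


def mapping(text):
    """Mapping from text to information; int() stands in for atoi()."""
    return int(text)


def semantic_ok(x):
    """Semantic constraints as typed: 2 < x < 8 and x is odd."""
    return 2 < x < 8 and x % 2 == 1


def interpret(text):
    """Return the information for `text`, or None if it is not in the language."""
    if not DIGITS_ONLY.fullmatch(text):
        return None  # fails the syntactic constraint
    x = mapping(text)
    return x if semantic_ok(x) else None  # semantic constraints filter further


# A few sample strings, chosen arbitrarily for illustration.
candidates = ["1", "3", "5", "7", "9", "4", "abc", "05"]
text_set = [t for t in candidates if interpret(t) is not None]
```

In this sketch the membership of a string in the text set depends on the mapping and the semantic constraints, not on the syntactic constraint alone, and two different texts such as "5" and "05" can carry the same information.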

<Zakim> timbl, you wanted to Proposed modification: A language is a mapping between texts and information.

<DanC> +0

<timbl> Definition: the Text Set of a language is its domain, i.e. the set from which it maps

<timbl> Definition: the Information Set of a language is the range, i.e. the set of possible informations mapped from the texts in the syntax

<DanC> (+0 as in http://www.apache.org/foundation/voting.html 'I don't feel strongly about it, but I'm okay with this.' )

<noah> I am very happy with Tim's formulation. We will also need more approachable informal explanations, but as a formal underpinning it's just the sort of thing I was looking for.

<noah> Tim, do we also have "Definition: A language is a function from texts to information."? (Or some such) The fact that language is a function seems implicit in your two definitions above.
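The formulation of a language as a function from texts to information, with the text set as its domain and the information set as its range, can be made concrete with a small sketch. The language below is a hypothetical lookup table, and the names `LANGUAGE`, `text_set`, `information_set` and `interpret` are assumptions made for illustration.

```python
# A tiny hypothetical language, given as an explicit lookup table
# from texts (strings) to information (here, booleans).
LANGUAGE = {
    "yes": True,
    "y": True,
    "no": False,
    "n": False,
}


def text_set(language):
    """The domain of the function: every text the language maps from."""
    return set(language)


def information_set(language):
    """The range of the function: every piece of information some text maps to."""
    return set(language.values())


def interpret(language, text):
    """Apply the function; raises KeyError for texts outside the text set."""
    return language[text]
```

Because each text has exactly one entry, the table is a function and ambiguity cannot be expressed, while several texts may still map to the same information.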

<Norm> Au revoir.

<DanC> Scribe: raman

Summary of Action Items

[End of minutes]