Re: Syntax and semantics

At 08:43 PM 5/16/00 -0400, Tim Berners-Lee wrote:

>In a distributed system, the semantics must be carried by the message.


>The semantics of HTML tags are not defined in a mathematical
>way but the semantics of a bank transfer are. In the future, we
>will be able to define the semantics of a new language by relating
>it to things like quicken input files, and also by specifying
>mathematical properties of the protocols - such as the relationship
>between a check and a bank statement.  In the meantime,
>we still use English in specifications.  But the crucial thing is to
>recognize that the namespace identifier identifies the language
>of the message and so indirectly its meaning. The namespace
>identifier has to be the hook onto which I can hang semantic

In the above discussion, it is important to note that XML itself does not 
know anything about bank transactions, nor does it provide a mechanism to 
define "withdrawal" and "deposit", explaining things like (1) when you 
withdraw X from an account, you subtract X from the balance of the account, 
and when you deposit X to an account, you add X to the balance of the account; 
(2) when you withdraw X from an account, you should carefully record where 
that money goes *to*, or regulators get upset; (3) what I meant by 
"subtraction" in (1) was this... These things are beyond the scope of XML, 
and XML does not need to get involved in these kinds of questions to 
determine what a name is. Still, XML is very useful for defining the 
grammar used for a financial transaction. Tools that build on XML are 
needed if we want to define the semantics.
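
To make that division of labor concrete, here is a minimal sketch in
Python. The namespace URI, the element names and the apply_transaction
function are all hypothetical, invented for illustration. The XML parser
only hands back names and text; the rule that a withdrawal subtracts and
a deposit adds lives entirely in the code that builds on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, invented for illustration.
BANK_NS = "http://example.org/bank"

MESSAGE = f"""
<txn:withdrawal xmlns:txn="{BANK_NS}" account="12345">
  <txn:amount>40</txn:amount>
</txn:withdrawal>
""".strip()


def apply_transaction(balance, xml_text):
    """Return the new balance after applying one transaction message.

    The parser yields only qualified names and text. The arithmetic
    below is where the meaning of each name is actually supplied.
    """
    root = ET.fromstring(xml_text)
    amount = int(root.find(f"{{{BANK_NS}}}amount").text)
    if root.tag == f"{{{BANK_NS}}}withdrawal":
        return balance - amount
    if root.tag == f"{{{BANK_NS}}}deposit":
        return balance + amount
    # A well-formed message can still use a name we have no meaning for.
    raise ValueError(f"no semantics defined for {root.tag}")


new_balance = apply_transaction(100, MESSAGE)
```

Nothing in the message itself says what "withdrawal" does to a balance;
swap the two branches and the XML is still perfectly well-formed.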

Also, I think it is important to recognize that the semantics of a 
transaction may include a number of things not defined or even explicitly 
known by the parties at the time, such as the governing law, including 
legal precedent. I think that distributed systems will need to be able to 
use globally unique, persistent identifiers in order to reference all the 
information needed to understand even something as simple as a business 
transaction. I doubt that it will be practical to embed all associated 
semantics in every transaction.

Every transaction implicitly invokes a body of law. In the long run, this 
law may be available on the Internet in some form, but I doubt that it will 
all be explicitly present in the message per se or even in the schema. For 
instance, if you agree to sell me a notebook with my name on it, it is 
quite likely that nothing in the transaction specifies that this notebook 
has become a non-fungible good because it has my name on it, and is legally 
different than a notebook without my name on it, which means you can force 
me to buy it if I order it and decide I don't want it after you print it. 
We have to look to legal precedent for that. When you sell me a notebook, 
that transaction involves many legally binding issues regarding what may be 
called paper, what may be called leather or a particular kind of leather, 
what collection procedures you may legally use in the event of non-payment, 
etc. In all likelihood, neither you nor I know a fraction of the laws and 
regulations that govern the sale of that notebook, but each of these 
defines semantics relevant to our exchange.

Making this information explicit and accessible is a difficult problem. 
Fortunately, I don't think we need to solve it in order to determine the 
name of an element or an attribute. I *do* think that we need to ensure 
that our naming method supports globally unique, persistent identifiers so 
that many systems can use the same names without ambiguity or clash.
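
As a small illustration of names that do not clash, here is a sketch
in Python. Both namespace URIs and all element names are hypothetical.
Two vocabularies each define an element called "note", and qualifying
each name with its namespace identifier keeps them distinct without
saying anything about what either one means.

```python
import xml.etree.ElementTree as ET

# Both namespace URIs are hypothetical, chosen only for this example.
BANK_NS = "http://example.org/bank"
STATIONERY_NS = "http://example.org/stationery"

DOC = f"""
<order xmlns:bank="{BANK_NS}" xmlns:stationery="{STATIONERY_NS}">
  <bank:note>20</bank:note>
  <stationery:note>leather, monogrammed</stationery:note>
</order>
""".strip()

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace-uri}local-name",
# so the two "note" elements end up with different tags.
tags = [child.tag for child in root]


def find_text(parent, namespace, local_name):
    """Look up a child by its fully qualified name and return its text."""
    return parent.find(f"{{{namespace}}}{local_name}").text


bank_note = find_text(root, BANK_NS, "note")
paper_note = find_text(root, STATIONERY_NS, "note")
```

The prefixes "bank" and "stationery" are local shorthand; it is the
identifier behind each prefix that makes the name unambiguous.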


Received on Wednesday, 17 May 2000 09:53:42 UTC