RE: Including Semantics from Assaf Arkin on 2003-02-14 (www-ws-arch@w3.org from February 2003)

From: Assaf Arkin <arkin@intalio.com>
Date: Fri, 14 Feb 2003 11:28:03 -0800
To: "Burdett, David" <david.burdett@commerceone.com>, "'Duane Nickull'" <duane@xmlglobal.com>
Cc: <www-ws-arch@w3.org>
Message-ID: <IGEJLEPAJBPHKACOOKHNKEDBDDAA.arkin@intalio.com>
Including Semantics
  -----Original Message-----
  From: Burdett, David [mailto:david.burdett@commerceone.com]
  Sent: Friday, February 14, 2003 10:36 AM
  To: 'Assaf Arkin'; Burdett, David; 'Duane Nickull'
  Cc: www-ws-arch@w3.org
  Subject: RE: Including Semantics


  Assaf

  There are interesting ideas in your email but I don't think you've
answered my original question which is how all this relates to the Semantic
Web activity and RDF ... see more detailed comments below.

  David
    -----Original Message-----
    From: Assaf Arkin [mailto:arkin@intalio.com]
    Sent: Thursday, February 13, 2003 6:33 PM
    To: Burdett, David; 'Duane Nickull'
    Cc: www-ws-arch@w3.org
    Subject: RE: Including Semantics


    What you want to have are different semantic languages and a framework
that associates all that information together. For example, XSDL would
define some of the semantics of a message. It can tell me that a purchase
order contains one or more line items, a billing address and a shipping
address.
    [David Burdett] True, but XSDL does not tell you what a shipping address
>>means<<. It might be pretty obvious based on our common experience and
therefore does not need any explanation. But this is not the case for much
of the information transported in business documents. XSDL only gives you a
structure and method of identifying individual pieces of information - it's
not enough.

    This can easily turn into a philosophical discussion ;-) I'm going to
try to think of some creative analogy, but basically semantics is always a
matter of scope and context. Some semantic S may be sufficient in some
context/scope but not in another.

    I am going to start with you by agreeing to one simple principle. The
fact that I know what purchase order "means" because I've used it in many
applications doesn't imply that XSDL is giving me any of that knowledge.
XSDL only says what it says, and everything else comes from some knowledge
repository in my mind. There's a lot more information there and none of that
information is captured or derived from the XSDL. It's fair to say that 90%
of what we collectively know about a purchase order is NOT captured in the
XSDL. I'm also guessing that you know twice as much as I do about purchase
orders.

    Which brings me to the second topic. Let's say we captured all the
meaning based on our collective knowledge, so all the information I ever
needed was written in a document of some sort. I would deem that semantic
sufficient. But since you have more experience there is additional knowledge
not captured there. You can always argue there's more meaning and we only
captured part of it. At least in my experience, a framework for expressing
semantics should be open-ended, i.e. it should allow you to specify any
number of semantics relating to the same entity and incrementally add more
and more semantics without limit. RDF is simply such a framework. It doesn't
describe any such semantics, but it allows any number of different semantics
to be added over time.
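
As a rough illustration of the open-ended idea above, here is a minimal sketch in which plain Python tuples stand in for RDF triples. It is not RDF itself, and every URI and predicate name in it is invented for the example. The point is only that new kinds of statements about the same entity can be added later without touching the earlier ones.

```python
# Plain tuples stand in for RDF triples: (subject, predicate, object).
# The URI and all predicate names below are hypothetical.
PO = "http://example.com/schema#PurchaseOrder"

graph = set()


def add_statement(subject, predicate, obj):
    """Add one statement; existing statements are never modified."""
    graph.add((subject, predicate, obj))


def describe(subject):
    """Return every (predicate, object) pair known about a subject, sorted."""
    return sorted((p, o) for s, p, o in graph if s == subject)


# A first vocabulary only says something structural.
add_statement(PO, "structure:contains", "LineItem")
add_statement(PO, "structure:contains", "ShippingAddress")
first_pass = describe(PO)

# Later, someone with more experience adds a different kind of semantics.
add_statement(PO, "practice:rationale", "Billing and shipping may differ")
second_pass = describe(PO)
```

Everything in `first_pass` is still present in `second_pass`; the second vocabulary only adds to what is known.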

    For example, let's say we described all the semantics of a purchase
order so we can understand why, where, when and how to use it to get a
purchase order done. In my work I would like to chronicle the evolution of
the system, so I would also like to know why certain decisions were made
(e.g. why billing address and shipping address), and what are the best
practices (e.g. when to use a billing address different from the shipping
address). So I want to capture additional semantics. In your work you may
want to understand the relationship between a purchase order and the bottom
line. Does a purchase order increase or decrease revenue? And so forth.

    XSDL offers a very limited set of semantics that only guides you in
determining how to construct an XML document. That's it. And WSDL offers
some additional semantics that tell you whether to construct or read that
XML document (whether it's input to your service or output from your
service). And WSCI offers some additional semantics on where in that process
that information captured in the XSDL is shared between the two services.
But that's a very very limited set of semantics because the scope is very
simple.

    It doesn't tell me what line item means in any other context, only that
a line item is a property of a purchase order with cardinality {1,*} and a
purchase order is the input property of some operation and that this
operation is a property of some state synchronization in the choreography.
To tell me what a "line item" means in any other context I need to define
the context, a language for expressing that information, and then express
that information.


     In a different language, e.g. WSDL, I could say that a purchase order
is required as the input for an operation and that the operation does not
result in an immediate response.
    [David Burdett] Again I think you are making assumptions. For example
what do you mean by a "response"? Does it mean, for example, a) "I got the
message but have done nothing with it", or b) "I've got the message and its
structure looks OK, i.e. I haven't checked that codes (e.g. productids) are
valid, or stock availability", or c) "I've checked it and here's information
on the extent to which I can satisfy your order". This is all semantic
information that I doubt would go in a WSDL definition.

    Let's say that in some way I defined cause and effect. A purchase order
causes an accept or reject response. An accept response causes a shipping
notice. A shipping notice causes an invoice. An invoice causes payment. So
all I can say is that the cause is conveyed in one operation and the effect
in another. And if I have a time constraint in the choreography I can deduce
that some upper bound exists on the time between conveyed cause and effect.
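
The deduction described above can be sketched as follows. The cause/effect pairs come from the paragraph; the per-step time limits (in hours) are purely hypothetical values of the kind a choreography might declare, and the function name is made up for this example.

```python
# Cause -> possible effects, as listed in the paragraph above.
CAUSES = {
    "purchase_order": ["accept", "reject"],
    "accept": ["shipping_notice"],
    "shipping_notice": ["invoice"],
    "invoice": ["payment"],
}

# Hypothetical time limits, in hours, for each cause/effect step.
MAX_HOURS = {
    ("purchase_order", "accept"): 24,
    ("purchase_order", "reject"): 24,
    ("accept", "shipping_notice"): 48,
    ("shipping_notice", "invoice"): 24,
    ("invoice", "payment"): 720,
}


def upper_bound(cause, effect):
    """Largest total delay between cause and effect, or None if unreachable."""
    if cause == effect:
        return 0
    best = None
    for nxt in CAUSES.get(cause, []):
        rest = upper_bound(nxt, effect)
        if rest is not None:
            total = MAX_HOURS[(cause, nxt)] + rest
            best = total if best is None else max(best, total)
    return best


po_to_invoice = upper_bound("purchase_order", "invoice")
reject_to_payment = upper_bound("reject", "payment")
```

With these assumed limits, an invoice follows a purchase order within 24 + 48 + 24 hours, while a reject never leads to payment, so no bound exists for that pair.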

    In another language I define commitment. I would say that a purchase
order captures all information about a particular commitment, the accept
message indicates intent to fulfill, the reject message indicates no
fulfillment, the shipping notice indicates fulfillment and the invoice indicates
a commitment implied in the first commitment (X gets products -> Y gets
payment). So in the choreography definition I capture the causal order of
actions that are part of these commitments, but strictly speaking I capture
only that information. Indeed the choreography language says nothing
interesting about the commitment.

    You can introduce other languages that say interesting things about that
operation. For example, a cost language would introduce a cost property and
a way to express the cost calculated from a purchase order message. So you can
say there's a property called 'cost' and determine the value of that
property given a purchase order message.
    [David Burdett] I think I get this, but if you did have such a language,
who or what would use it? It's not clear to me.

    For example, to determine the costs of the transaction (notice
plurality). Let's say the purchase order cost is $49.99 + $5.99 s/h. What
does that tell me?

    If I believe I would be using the product then it tells me that its
cost is $55.98. If I am not sure whether I will be using the product and may
want to return it, it tells me that the cost would be $55.98 or $5.99 (I
never get my money back for s/h). So if I am not sure of the quality of the
product and 10% of orders get returned, I can calculate the total cost which
factors all returns. If I buy this on my business account it tells me I
don't need any approval to make the purchase (below approval limit). I may
use a different process without approval, or I may conclude that the process
will complete faster because approval takes up to a day and the rest of the
order can be completed in four days.
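
The arithmetic in this example can be worked through in a few lines. The prices, the 10% return rate and the one-day/four-day durations are taken from the text; the approval limit of $100 is an assumed figure, and reading "total cost which factors all returns" as an expected cost per order is one possible interpretation.

```python
from decimal import Decimal

PRICE = Decimal("49.99")
SHIPPING = Decimal("5.99")       # never refunded, per the message
RETURN_RATE = Decimal("0.10")    # 10% of orders get returned
APPROVAL_LIMIT = Decimal("100")  # hypothetical business-account limit

cost_if_kept = PRICE + SHIPPING  # product plus s/h
cost_if_returned = SHIPPING      # only s/h is lost on a return

# Expected cost per order once returns are factored in.
expected_cost = (1 - RETURN_RATE) * cost_if_kept + RETURN_RATE * cost_if_returned

# Approval takes up to a day; the rest of the order takes four days.
needs_approval = cost_if_kept > APPROVAL_LIMIT
max_days = 4 + (1 if needs_approval else 0)
```

Under these assumptions the expected cost per order is 0.90 x 55.98 + 0.10 x 5.99 = 50.981, and since the order is below the limit the process completes in at most four days.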

    Another language could define an object called delivery with multiple
properties, reference the purchase order message as indicating the product
property, an accept response as indicating the agent promising to deliver,
and a delivery notice as indicating truth of delivery property. That
'delivery' object does not exist, but if you participate in the business
choreography you can draw a lot of conclusions about the delivery status by
observing how its virtual properties are modified during different states of
the process.
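
A minimal sketch of such a logical 'delivery' object, assuming messages arrive as simple dictionaries. The message field names (`type`, `product`, `sender`) and the class itself are invented for illustration; no message ever carries the object, it is only inferred from what is observed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Delivery:
    """A logical object; no single message contains it."""
    product: Optional[str] = None  # indicated by the purchase order
    agent: Optional[str] = None    # indicated by the accept response
    delivered: bool = False        # indicated by the delivery notice


def observe(delivery, message):
    """Update the virtual properties from one observed message."""
    kind = message["type"]
    if kind == "purchase_order":
        delivery.product = message["product"]
    elif kind == "accept":
        delivery.agent = message["sender"]
    elif kind == "delivery_notice":
        delivery.delivered = True
    return delivery


status = Delivery()
observe(status, {"type": "purchase_order", "product": "widget"})
observe(status, {"type": "accept", "sender": "supplier"})
after_accept = (status.product, status.agent, status.delivered)
observe(status, {"type": "delivery_notice"})
```

After the accept response the object knows what is to be delivered and by whom, but `delivered` only becomes true once the delivery notice is seen.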


    On a conceptual level this is very interesting since it allows the
development of even smarter applications based on what is already there.
That logical delivery object can be defined in terms of existing purchase
order scenarios, even if you're running a COBOL application written thirty
years ago.
    [David Burdett] I agree that being able to abstract existing
applications is important.

    On a practical level, it will take a few years before we have the
understanding of how to define such semantics on a larger scale and actual
products that operate on that semantic. So right now it doesn't solve any
problem.
    [David Burdett] Who do you think would be the right organization to
develop these semantics and how to define them?

    The Semantic Web people are building the foundation for the framework in
which semantics can be expressed, ontologies can be listed, etc. We then
need a bunch of semantic languages, which I would say each organization
should define based on its expertise. For example, the W3C and OASIS would
be a good place for those semantics related specifically to Web services and
technical usage of Web services. The ebXML track is dealing with semantics
more specific to B2B but across verticals, e.g. UBL. There are organizations
that deal with more specific semantics, e.g. supply chain management,
balanced score cards, TQM. So semantics would be invented all over the
place based on expertise in a specific domain. In our search to describe the
full meaning of a purchase order we would draw on the interactions (data
structures, choreography), the generic business meaning (line item,
address), the specific business meaning (supplier, buyer), the fulfillment
of business goals (score cards, quality, etc).

    Ontologies have existed for a long while, but I think this will be the
first large scale fully distributed project into creating new ontologies and
specifying semantics, so over the course of a few years we will develop best
practices for defining semantic languages and cross linking semantics. Since
the W3C is taking the lead on this effort, I would guess that they would be
the natural candidate for building the generic frameworks that can be used
in all these domains.

    arkin

    But if you look at a combination like WSCI + WSDL + XSDL you can see
that the semantic of WSCI express the context in which a WSDL operation is
used and the semantic of the WSDL operation expresses what the WSDL type is
used for. So we're already doing some limited semantic work on a step by
step basis. And just like the logical delivery object above, the process
that occurs between the services doesn't really exist, it's only inferred
from how they operate together, and the operation doesn't really exist, it's
only an understanding of the meaning of sending some input and receiving
some output.


    arkin

      -----Original Message-----
      From: Burdett, David [mailto:david.burdett@commerceone.com]
      Sent: Thursday, February 13, 2003 12:15 PM
      To: 'Assaf Arkin'; Burdett, David; 'Duane Nickull'
      Cc: www-ws-arch@w3.org
      Subject: RE: Including Semantics


      Assaf

      I agree with all of your email, especially the need for descriptions
at the particle level, apart from the assertion "For computer processing RDF
gives you a good framework". Perhaps it does, but for the problem in hand, I
don't see how it is directly usable now. How would you, for example,
actually use an RDF description of a business document when designing,
building or operating a computer system that wants to generate or process
XML based business documents?

      David
        -----Original Message-----
        From: Assaf Arkin [mailto:arkin@intalio.com]
        Sent: Thursday, February 13, 2003 11:00 AM
        To: Burdett, David; 'Duane Nickull'
        Cc: www-ws-arch@w3.org
        Subject: RE: Including Semantics




          I think it really boils down to how the information is going to be
>used<. Most information in business documents ends up either being printed
or displayed for human consumption, or mapped to some internal format to
populate information in an ERP system say. In both these cases you need a
very clear definition of the meaning of the data that either a human can
understand as help when viewing a document or can be used by another human
to do a good map between external and internal formats. I don't see how RDF
would help with this and I can't imagine a software tool that could make
good use of it in this context.

          For computer processing RDF gives you a good framework and it can
also contain information for human consumption (e.g. HTML formatted text).
But practically speaking, we're still at the point where people do all that
work, so what we need is a way to annotate the information and present some
textual information to the user.

          XSDL, WSDL and most other recent specifications have ways of
annotating definitions. Ideally you should be able to annotate any
definition, not just a top-level one, e.g. a particle in the XSDL content,
an operation from a port type, etc.

          The namespace by itself is insufficient because you can have
multiple definitions in the same namespace. But often some of the semantics
is captured by the namespace on its own. For example,
http://example.com/trading/futures may indicate that all related definitions
deal with trading in futures. It won't tell you what a specific data type
means, or what a particular operation does. But when you browse a repository
of type/service/process definitions, it lets you easily determine what
context you are looking at.

          arkin


          I accept I may be completely missing something - can anyone
clarify?

          David
            -----Original Message-----
            From: Assaf Arkin [mailto:arkin@intalio.com]
            Sent: Wednesday, February 12, 2003 9:49 PM
            To: Burdett, David; 'Duane Nickull'
            Cc: www-ws-arch@w3.org
            Subject: RE: Including Semantics



              -----Original Message-----
              From: www-ws-arch-request@w3.org
[mailto:www-ws-arch-request@w3.org]On Behalf Of Burdett, David
              Sent: Wednesday, February 12, 2003 4:30 PM
              To: 'Duane Nickull'
              Cc: www-ws-arch@w3.org
              Subject: Including Semantics


              Duane asked ...

              >>>One missing component I would like to see is semantics.
David - do you
              think there is a way to leverage the semantics of UBL, CCTS
for the WSAG?<<<

              Semantics is a whole big topic on its own, but here's my take
of the semantic information that you might need to define. Note I'm looking
at this from a "business use" perspective:

              1. Document Semantics. At the highest level a namespace
identifies a document as consisting of a set of fields. Within this there
are two additional levels to consider:

                a) Individual fields. Each field needs to be defined, e.g.
what does "CustomerId" mean, e.g. is it the ID by which the Customer
identifies themselves or the id which the supplier uses to identify the
customer?

                b) Fields within a document, e.g. The Customer ID
can appear in multiple places in the document - how does its meaning
vary depending on where it exists?

              2. Context Dependent Semantics. The content of a message can
also depend on the context in which it is being used, for example an Invoice
in Europe is different from an Invoice in the US as it contains different
fields. Similarly an Invoice used in the travel industry contains additional
line item information (e.g flight segments) that other industries (e.g. the
chemical industry) don't need.

              3. Message Semantics. Messages >can< consist of multiple parts
where you could describe each "part" as a document. You then need to, in the
context of the message, define what each document means, for example you
might want to attach a supplier generated delivery note when requesting a
"return materials advice" for some faulty goods. In this case the delivery
note is evidence that delivery occurred. This is different from its first use
when the delivery note informs the buyer of what the supplier has shipped,
but not yet delivered.

              4. Transaction Semantics. The same message with the same
structure and same semantics can be treated differently depending on where
it is being sent and the context in which it is being used. For example
sending an Order Message to an off-site archival service for archiving would
have different meaning than sending the "identical" message to a supplier.

              So yes I think you could leverage the semantics of UBL etc,
but that is just the start and my best >guess< is that you could use header
information in a SOAP message to codify the semantics of the message ...
although this sounds very non-RESTafarian ;)

              Also ... this is a trout hole ... how does the W3C work on the
Semantic Web fit in with all of this ;)

              Just looking at the perspective of Semantic Web, could we not
use RDF to create maps of semantic information?

              For example, I can describe the semantics of a type using RDF
(customerID) by referencing the type definition, but also the semantics of
the content of a type (order/billing/address vs. order/shipping/address) if
I can reference an XSD particle. And I can have both semantics, one that
applies to address in isolation, and one that extends that semantics when
address is used in some context.

              I would guess that the same is possible for transactions. For
example, the address of the invoice that is sent by activity X of
transaction Y. All I need is a way to reference a resource that can be part
of a larger resource in the RDF description and then provide that semantic
in the RDF.
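
The idea of layering context-specific semantics on top of type-level semantics might look roughly like this. It is a sketch with plain dictionaries rather than real RDF, the descriptive strings are invented, and the slash-separated paths simply reuse the order/billing/address notation from the paragraph above as a stand-in for a reference to an XSD particle.

```python
# Semantics that apply to a type in isolation (hypothetical wording).
TYPE_SEMANTICS = {
    "address": ["A postal location"],
}

# Semantics that extend the type when it is used in a given context.
PARTICLE_SEMANTICS = {
    "order/billing/address": ["Where the invoice is sent"],
    "order/shipping/address": ["Where the goods are delivered"],
}


def semantics_for(path):
    """Combine type-level semantics with any context-specific additions."""
    type_name = path.rsplit("/", 1)[-1]
    return TYPE_SEMANTICS.get(type_name, []) + PARTICLE_SEMANTICS.get(path, [])


in_isolation = semantics_for("address")
as_billing = semantics_for("order/billing/address")
as_shipping = semantics_for("order/shipping/address")
```

Both contextual uses keep the meaning of address in isolation and each adds its own statement on top of it.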

              arkin







              David



              -----Original Message-----
              From: Duane Nickull [mailto:duane@xmlglobal.com]
              Sent: Wednesday, February 12, 2003 4:00 PM
              To: Burdett, David
              Cc: www-ws-arch@w3.org
              Subject: Re: Layers in the WSA (was RE: [Fwd: UN/CEFACT TMG
Releases
              e-Business Architecture Technical Specification for Public
Review])

              <SNIP/>
Received on Friday, 14 February 2003 14:29:48 UTC