Re: WSA Discovery from David Booth on 2003-10-15 (www-ws-arch@w3.org from October 2003)

From: David Booth <dbooth@w3.org>
Date: Wed, 15 Oct 2003 19:02:44 -0400
To: Paul Denning <pauld@mitre.org>, www-ws-arch@w3.org
Message-Id: <5.1.0.14.2.20031015190100.0217e490@localhost>
Paul,

Interesting thoughts.  I am in the process of re-writing the Discovery 
section now.  Is it okay if I steal parts of your message?

At 05:30 PM 10/15/2003 -0400, Paul Denning wrote:

>This message provides some thoughts that may be relevant to the discussion 
>of "discovery" in the Web Service Architecture (WSA) document.
>
>In August, MITRE held a Technical Exchange Meeting (TEM) called Web 
>Technology Convergence Symposium (WTCS).  A TEM, in general, is one way 
>that MITRE coordinates activities within MITRE and with our 
>customers.  The WTCS looked at convergence of web services (WS), semantic 
>web (SW), grid computing (GC), and agent technology (AG).
>
>Timbl gave a keynote talk [1], and I asked some questions about Discovery 
>in the context of his statement [2] that "Discovery should all be SW-based".
>
>Some follow-up discussion on this point is worth repeating.
>
>Timbl did not use the term "registry", but he did use the term "index".
>
>Paraphrasing Timbl, he feels that exposing descriptions on the web is a 
>better model than publishing to a registry (like UDDI).  When descriptions 
>are exposed, they can be harvested using spiders and "indexed".  Multiple 
>organizations may have such indexers, and free-market forces will 
>determine which "index" people use to discover what they are looking for.
>
>Note, there is a concept that discovery is done by a query to an 
>index.  Although possible, it is not likely that individual web-service 
>consumers would spider the web themselves.  I asked Timbl if he felt a 
>"standard" API would be required to query the index.  Timbl responded that 
>efforts to define a standard query language are in progress [8], 
>suggesting that the "API" to query the index would somehow involve this 
>query language.
>
>Personally, I don't see much difference between an "index" and a 
>"registry" from the inquirer's point of view.  For example, a spider could 
>harvest descriptions from the web and store the results in a private UDDI 
>registry; the implementation of the "index" could be UDDI.  The spider 
>could create a tModel for every WSDL file it finds; see [3].  The spider 
>could do some sort of automatic classification [13] and insert appropriate 
>categories in a UDDI categoryBag.
>
>Then there is the issue of metadata.  UDDI tModels (and other structures) 
>have categoryBags where information is stored to categorize the 
>entry.  For example, the categoryBag for a tModel that refers to a WSDL 
>file could be adorned with a keyValue="wsdlSpec" from the uddi-org:types 
>taxonomy.  The UDDI inquiry API (SOAP) lets you search for such tModels; 
>web service development tools often use it to help developers 
>locate WSDL files.
>
>Likewise, a spider could locate OWL-S files, and create a tModel for each 
>one it finds.  A pattern is emerging where the overviewURL in a UDDI 
>tModel points outside of the registry to some document, and the 
>categoryBag of the tModel indicates what the link points to.  It indicates 
>this using a taxonomy.  Another example of this pattern is a UDDI 
>Technical Note that has been drafted for ebXML [6].  For example, the tModel 
>overviewURL can point to an ebXML CPPA document, and the tModel can be 
>classified as keyValue="ebXML:CPPA" using their taxonomy [6].
>
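The overviewURL-plus-categoryBag pattern described above can be sketched as a small data structure.  This is only an illustrative Python model, not the UDDI API: the class names, the suffix-based classification, the ".cppa.xml" suffix, the ebXML taxonomy label and the example.org URLs are all assumptions made for the sketch.  Only the keyValue strings "wsdlSpec" and "ebXML:CPPA" and the uddi-org:types taxonomy name come from the message.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KeyedReference:
    taxonomy: str   # name of the taxonomy the value is drawn from
    key_value: str  # e.g. "wsdlSpec"


@dataclass
class TModel:
    name: str
    overview_url: str  # points outside the registry to the description
    category_bag: List[KeyedReference] = field(default_factory=list)


# Hypothetical mapping from URL suffix to (taxonomy, keyValue).
# A real spider would classify on document content, not on the suffix.
SUFFIX_TO_CATEGORY = {
    ".wsdl": ("uddi-org:types", "wsdlSpec"),
    ".cppa.xml": ("ebXML (taxonomy name assumed)", "ebXML:CPPA"),
}


def make_tmodel(url: str) -> Optional[TModel]:
    """Build a tModel-like record for a harvested URL, or None if unclassified."""
    for suffix, (taxonomy, key_value) in SUFFIX_TO_CATEGORY.items():
        if url.lower().endswith(suffix):
            return TModel(
                name=url.rsplit("/", 1)[-1],
                overview_url=url,
                category_bag=[KeyedReference(taxonomy, key_value)],
            )
    return None


def find_by_category(tmodels: List[TModel], key_value: str) -> List[TModel]:
    """Return the records whose categoryBag contains the given keyValue."""
    return [t for t in tmodels
            if any(ref.key_value == key_value for ref in t.category_bag)]
```

A spider would call make_tmodel() on each URL it harvests, and an inquirer looking for WSDL files would call find_by_category(records, "wsdlSpec").
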
>UDDI provides a common framework for a wide variety of metadata.  If each 
>"index" were to store the results of its spider/harvest in a different 
>format, then the query strings used for discovery would be different for 
>each index.
>
>So what is the difference between "publishing to a registry" and "exposing 
>a description"?  UDDI is not a "repository" for the 
>storage of and (direct) access to description files like WSDL.  UDDI 
>simply points to the URI of a WSDL file (or other description).  UDDI 
>stores a limited amount of metadata in the form of a categoryBag.  So with 
>UDDI, you need to first expose WSDL, then publish the tModel.  But if the 
>WSDL is already exposed, it presumably is available for some other 
>index/spider/harvester to find.  (It may not be easy to find, unless 
>you somehow know where to start looking, for example via WSIL, RDDL within a 
>homepage, or some variant (like [5]) of RSS Autodiscovery [4].)
>
>
>I am wondering what the web service architecture should say about this stuff.
>
>I suggest it should have components like "index" as the entity (agent) 
>that requesters query to locate or discover web services and/or their 
>metadata.  "Registry" may be overloaded and may suggest to some an overly 
>centralized architecture.
>
>Given multiple indexes, a "discovery proxy agent (DPA)" would help in 
>federation of discovery.  It would need to know how to locate indexes, 
>query multiple indexes simultaneously, and consolidate the results.  It 
>may need to translate the results into a common format; perhaps this is a 
>separate architectural element, a "discovery translation agent (DTA)" or 
>discovery translation service (DTS).  The DPA may cache results.
>
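The DPA behaviour described above (query several indexes, translate results into a common format, consolidate, cache) can be sketched roughly as follows.  Everything here is an assumption for illustration: the class and function names, the dictionary record format, and the two in-memory stand-in indexes.  For brevity the sketch queries the indexes one after another rather than simultaneously, and it does not cover how the DPA locates indexes.

```python
from typing import Callable, Dict, List

# An index is modelled as any callable that takes a query string and
# returns raw result records in its own format.
Index = Callable[[str], List[dict]]
# A translator plays the DTA/DTS role: raw record -> common format.
Translator = Callable[[dict], dict]


class DiscoveryProxyAgent:
    def __init__(self, indexes: Dict[str, Index],
                 translators: Dict[str, Translator]):
        self.indexes = indexes
        self.translators = translators
        self._cache: Dict[str, List[dict]] = {}

    def query(self, q: str) -> List[dict]:
        if q in self._cache:              # serve repeated queries from cache
            return self._cache[q]
        merged: Dict[str, dict] = {}
        for name, index in self.indexes.items():
            translate = self.translators.get(name, lambda r: r)
            for raw in index(q):
                rec = translate(raw)
                # Consolidate: keep one record per description URL.
                merged.setdefault(rec["url"], rec)
        results = list(merged.values())
        self._cache[q] = results
        return results


# Two hypothetical indexes that store harvested results in different formats.
def index_a(query: str) -> List[dict]:
    data = [{"url": "http://example.org/stock.wsdl", "kind": "wsdlSpec"},
            {"url": "http://example.org/weather.wsdl", "kind": "wsdlSpec"}]
    return [d for d in data if query in d["url"]]


def index_b(query: str) -> List[dict]:
    data = [{"overviewURL": "http://example.org/stock.wsdl",
             "keyValue": "wsdlSpec"},
            {"overviewURL": "http://example.net/stock-quotes.wsdl",
             "keyValue": "wsdlSpec"}]
    return [d for d in data if query in d["overviewURL"]]


def translate_b(raw: dict) -> dict:
    """Map index_b's record format onto the common format used by index_a."""
    return {"url": raw["overviewURL"], "kind": raw["keyValue"]}
```

A requester would construct DiscoveryProxyAgent({"a": index_a, "b": index_b}, {"b": translate_b}) and call query("stock"); the description that both indexes know about appears only once in the consolidated result.
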
>The DPA "has-a" query interface, which could be UDDI, or some other API 
>(also described using WSDL).
>The query interface "uses" a query language.  This query is likely to be 
>something that is not easily expressed in a URI, which implies that the 
>query is put into a message (pushed in a request) or file (pulled by the 
>query processor in response to being given the URI of the query file).
>
>A lot of work has been done on query technology, and I am not an expert on 
>it.  Some links: [9][10]
>
>I like the idea of a DPA because it is similar to (is-a ?) "proxy" in the 
>web architecture (or REST architectural style), which is an example of a 
>REST "component" [7].
>
>So to summarize the proposed additions to the WSA document:
>
>Revise the Resource Oriented Model [12]
>
>1a.  Add "Index" as a noun (Concept or Feature per 2.2.3)
>1b.  A Discovery Service has-an Index
>1c.  An Index has-a Query interface
>1d.  A Query interface has-a Query Language
>1e.  A Query Language is-identified-by a namespace URI
>1f.  Add Discovery Proxy Agent (DPA)
>1g.  DPA is-an agent
>1h.  Add Discovery Translation Agent (DTA) (and/or Discovery Translation 
>Service - DTS)
>1i.  A DPA queries an Index
>1j.  A DPA may use a DTA/DTS to normalize results from queries to multiple 
>indexes
>1k.  A DPA discovers Indexes
>1l.  An Index stores service metadata (which may include links to other 
>metadata stored outside the Index)
>1m.  A DPA has-a Query interface
>1n.  An Index may harvest service metadata
>1o.  An Index may provide a Publish interface
>
>2a.  An agent subscribes-to a discovery service
>2b.  A discovery service notifies an agent
>2c.  An agent has-a notification interface
>2d.  A discovery service has-a subscription interface (asynch query interface)
>
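Items 2a through 2d, together with 1l and 1o, can be read as a simple publish/subscribe arrangement.  The following is a minimal in-process sketch under that reading.  The class name, the method names and the use of plain callables as the "notification interface" are assumptions; a real discovery service would expose these interfaces over the network.

```python
from typing import Callable, List, Tuple

Predicate = Callable[[dict], bool]   # stands in for a query
Notify = Callable[[dict], None]      # an agent's notification interface


class DiscoveryService:
    def __init__(self) -> None:
        self._index: List[dict] = []                      # stored service metadata
        self._subscriptions: List[Tuple[Predicate, Notify]] = []

    def subscribe(self, predicate: Predicate, notify: Notify) -> None:
        """Subscription interface: register a standing (asynchronous) query."""
        self._subscriptions.append((predicate, notify))

    def publish(self, metadata: dict) -> None:
        """Publish interface: store metadata and notify matching subscribers."""
        self._index.append(metadata)
        for predicate, notify in self._subscriptions:
            if predicate(metadata):
                notify(metadata)

    def query(self, predicate: Predicate) -> List[dict]:
        """Synchronous query interface over the same index."""
        return [m for m in self._index if predicate(m)]
```

An agent interested in new WSDL descriptions would pass a predicate such as lambda m: m.get("kind") == "wsdlSpec" to subscribe(), along with the callable that should receive matching metadata.
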
>
>[1] http://www.w3.org/2003/Talks/08-mitre-tbl/Overview.html
>[2] http://www.w3.org/2003/Talks/08-mitre-tbl/slide35-0.html
>[3] http://www.oasis-open.org/committees/uddi-spec/doc/bps.htm
>[4] http://diveintomark.org/archives/2002/05/31/more_on_rss_autodiscovery
>[5] http://lists.oasis-open.org/archives/uddi-spec/200305/msg00056.html
>[6] http://www.oasis-open.org/apps/org/workgroup/uddi-spec/document.php?document_id=3589
>[7] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_3
>[8] http://www.w3.org/TR/xquery/
>[9] http://swordfish.rdfweb.org/rdfquery/
>[10] http://139.91.183.30:9090/RDF/publications/tr308.pdf
>[11] http://lists.oasis-open.org/archives/uddi-spec/200304/msg00021.html
>[12] http://dev.w3.org/cvsweb/~checkout~/2002/ws/arch/wsa/wd-wsa-arch-review2.html#resource_oriented_model
>[13] http://moguntia.ucd.ie/publications/
>
>Paul

-- 
David Booth
W3C Fellow / Hewlett-Packard
Telephone: +1.617.253.1273
Received on Wednesday, 15 October 2003 19:02:49 UTC