- From: Paul Denning <pauld@mitre.org>
- Date: Wed, 15 Oct 2003 17:30:02 -0400
- To: www-ws-arch@w3.org
This message provides some thoughts that may be relevant to the discussion of "discovery" in the Web Service Architecture (WSA) document.

In August, MITRE held a Technical Exchange Meeting (TEM) called the Web Technology Convergence Symposium (WTCS). A TEM, in general, is one way that MITRE coordinates activities within MITRE and with our customers. The WTCS looked at the convergence of web services (WS), semantic web (SW), grid computing (GC), and agent technology (AG).

Timbl gave a keynote talk [1], and I asked some questions about discovery in the context of his statement [2] that "Discovery should all be SW-based". Some follow-up discussion on this point is worth repeating.

Timbl did not use the term "registry", but he did use the term "index". Paraphrasing Timbl, he feels that exposing descriptions on the web is a better model than publishing to a registry (like UDDI). When descriptions are exposed, they can be harvested using spiders and "indexed". Multiple organizations may have such indexers, and free-market forces will determine which "index" people use to discover what they are looking for.

Note the underlying concept: discovery is done by querying an index. Although it is possible, individual web-service consumers are unlikely to spider the web themselves.

I asked Timbl if he felt a "standard" API would be required to query the index. Timbl responded that efforts to define a standard query language are in progress [8], suggesting that the "API" to query the index would somehow involve this query language.

Personally, I don't see much difference between an "index" and a "registry" from the inquirer's point of view. For example, a spider could harvest descriptions from the web and store the results in a private UDDI registry; the implementation of the "index" could be UDDI. The spider could create a tModel for every WSDL file it finds; see [3]. The spider could do some sort of automatic classification [13] and insert appropriate categories in a UDDI categoryBag.

Then there is the issue of metadata. UDDI tModels (and other structures) have categoryBags where information is stored to categorize the entry. For example, the categoryBag for a tModel that refers to a WSDL file could be adorned with a keyValue="wsdlSpec" from the uddi-org:types taxonomy. The UDDI inquiry API (SOAP) lets you search for such tModels, and web service development tools often use it to help developers locate WSDL files. Likewise, a spider could locate OWL-S files and create a tModel for each one it finds.

A pattern is emerging where the overviewURL in a UDDI tModel points outside of the registry to some document, and the categoryBag of the tModel uses a taxonomy to indicate what the link points to. Another example of this pattern is the UDDI Technical Note that has been drafted for ebXML [6]: the tModel overviewURL can point to an ebXML CPPA document, and the tModel can be classified as keyValue="ebXML:CPPA" using their taxonomy [6].

UDDI provides a common framework for a wide variety of metadata. If each "index" were to store the results of its spider/harvest in a different format, then the query strings used for discovery would be different for each index.

So what is the difference between "publishing to a registry" and "exposing a description"? UDDI is not a "repository" for the storage of, and (direct) access to, description files like WSDL. UDDI simply points to the URI of a WSDL file (or other description) and stores a limited amount of metadata in the form of a categoryBag. So with UDDI, you need to first expose the WSDL, then publish the tModel. But if the WSDL is already exposed, it presumably is available for some other index/spider/harvester to find.
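
To make the overviewURL-plus-categoryBag pattern concrete, here is a minimal Python sketch of what a harvesting spider might record. It is not the UDDI API: the classes TModel and KeyedReference, the functions tmodel_for and find_by_category, the suffix-based classification, and the example.org URLs are all hypothetical and chosen only for illustration. Only the names tModel, overviewURL, categoryBag, uddi-org:types, and wsdlSpec come from the discussion above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class KeyedReference:
    """One categoryBag entry: a taxonomy plus a value from it (hypothetical model)."""
    taxonomy: str   # e.g. "uddi-org:types"
    key_value: str  # e.g. "wsdlSpec"


@dataclass
class TModel:
    """Simplified stand-in for a UDDI tModel; not the real UDDI data structure."""
    name: str
    overview_url: str                      # points outside the registry
    category_bag: list = field(default_factory=list)


# Assumption: the spider classifies a document purely by its URL suffix.
CATEGORY_BY_SUFFIX = {
    ".wsdl": KeyedReference("uddi-org:types", "wsdlSpec"),
}


def tmodel_for(url):
    """Return a TModel for a harvested URL, or None if the spider cannot classify it."""
    for suffix, ref in CATEGORY_BY_SUFFIX.items():
        if url.lower().endswith(suffix):
            name = url.rsplit("/", 1)[-1]
            return TModel(name=name, overview_url=url, category_bag=[ref])
    return None


def find_by_category(tmodels, taxonomy, key_value):
    """Mimic an inquiry that selects tModels by a categoryBag entry."""
    wanted = KeyedReference(taxonomy, key_value)
    return [t for t in tmodels if wanted in t.category_bag]


# Hypothetical harvest results: one WSDL file and one unrelated page.
harvested = ["http://example.org/stock.wsdl", "http://example.org/index.html"]
registry = [t for t in map(tmodel_for, harvested) if t is not None]
matches = find_by_category(registry, "uddi-org:types", "wsdlSpec")
print([t.overview_url for t in matches])
```

The point of the sketch is that the index stores only a pointer and a classification; the description itself stays where it was exposed.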
(An exposed WSDL file may not be easy to find unless you somehow know where to start looking, for example through WSIL, RDDL within a homepage, or some variant (like [5]) of RSS Autodiscovery [4].)

I am wondering what the web service architecture should say about this stuff. I suggest it should have components like "index" as the entity (agent) that requesters query to locate or discover web services and/or their metadata. "Registry" may be overloaded and may suggest to some an overly centralized architecture.

Given multiple indexes, a "discovery proxy agent (DPA)" would help in federation of discovery. It would need to know how to locate indexes, query multiple indexes simultaneously, and consolidate the results. It may need to translate the results into a common format; perhaps this is a separate architectural element, a "discovery translation agent (DTA)" or "discovery translation service (DTS)". The DPA may cache results.

The DPA "has-a" query interface, which could be UDDI or some other API (also described using WSDL). The query interface "uses" a query language. The query is likely to be something that is not easily expressed in a URI, which implies that the query is put into a message (pushed in a request) or a file (pulled by the query processor in response to being given the URI of the query file). A lot of work has been done on query technology, and I am not an expert on it. Some links: [9], [10].

I like the idea of a DPA because it is similar to (is-a ?) "proxy" in the web architecture (or REST architectural style), which is an example of a REST "component" [7].

So to summarize the proposed additions to the WSA document:

Revise the Resource Oriented Model [12]:

1a. Add "Index" as a noun (Concept or Feature per 2.2.3)
1b. A Discovery Service has-an Index
1c. An Index has-a Query interface
1d. A Query interface has-a Query Language
1e. A Query Language is-identified-by a namespace URI
1f. Add Discovery Proxy Agent (DPA)
1g. DPA is-an agent
1h. Add Discovery Translation Agent (DTA) (and/or Discovery Translation Service - DTS)
1i. A DPA queries an Index
1j. A DPA may use a DTA/DTS to normalize results from queries to multiple indexes
1k. A DPA discovers Indexes
1l. An Index stores service metadata (which may include links to other metadata stored outside the Index)
1m. A DPA has-a Query interface
1n. An Index may harvest service metadata
1o. An Index may provide a Publish interface

2a. An agent subscribes-to a discovery service
2b. A discovery service notifies an agent
2c. An agent has-a notification interface
2d. A discovery service has-a subscription interface (asynch query interface)

[1] http://www.w3.org/2003/Talks/08-mitre-tbl/Overview.html
[2] http://www.w3.org/2003/Talks/08-mitre-tbl/slide35-0.html
[3] http://www.oasis-open.org/committees/uddi-spec/doc/bps.htm
[4] http://diveintomark.org/archives/2002/05/31/more_on_rss_autodiscovery
[5] http://lists.oasis-open.org/archives/uddi-spec/200305/msg00056.html
[6] http://www.oasis-open.org/apps/org/workgroup/uddi-spec/document.php?document_id=3589
[7] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_3
[8] http://www.w3.org/TR/xquery/
[9] http://swordfish.rdfweb.org/rdfquery/
[10] http://139.91.183.30:9090/RDF/publications/tr308.pdf
[11] http://lists.oasis-open.org/archives/uddi-spec/200304/msg00021.html
[12] http://dev.w3.org/cvsweb/~checkout~/2002/ws/arch/wsa/wd-wsa-arch-review2.html#resource_oriented_model
[13] http://moguntia.ucd.ie/publications/

Paul
Received on Wednesday, 15 October 2003 17:33:26 UTC