Re: WSA Discovery

P.S.  I probably should have been clearer.  I've been approaching Discovery 
from a somewhat higher level, and although I don't necessarily agree with 
all of your suggestions, I think some of the explanations you have would be 
good to include.  Is that okay with you?

At 07:02 PM 10/15/2003 -0400, David Booth wrote:

>Paul,
>
>Interesting thoughts.  I am in the process of re-writing the Discovery 
>section now.  Is it okay if I steal parts of your message?
>
>At 05:30 PM 10/15/2003 -0400, Paul Denning wrote:
>
>>This message provides some thoughts that may be relevant to the 
>>discussion of "discovery" in the Web Service Architecture (WSA) document.
>>
>>In August, MITRE held a Technical Exchange Meeting (TEM) called Web 
>>Technology Convergence Symposium (WTCS).  A TEM, in general, is one way 
>>that MITRE coordinates activities within MITRE and with our 
>>customers.  The WTCS looked at convergence of web services (WS), semantic 
>>web (SW), grid computing (GC), and agent technology (AG).
>>
>>Timbl gave a keynote talk [1], and I asked some questions about Discovery 
>>in the context of his statement [2] that "Discovery should all be SW-based".
>>
>>Some follow-up discussion on this point is worth repeating.
>>
>>Timbl did not use the term "registry", but he did use the term "index".
>>
>>Paraphrasing Timbl, he feels that exposing descriptions on the web is a 
>>better model than publishing to a registry (like UDDI).  When 
>>descriptions are exposed, they can be harvested using spiders and 
>>"indexed".  Multiple organizations may have such indexers, and 
>>free-market forces will determine which "index" people use to discover 
>>what they are looking for.
>>
>>Note the assumption here that discovery is done by querying an 
>>index.  Although possible, it is unlikely that individual web-service 
>>consumers would spider the web themselves.  I asked Timbl if he felt a 
>>"standard" API would be required to query the index.  Timbl responded 
>>that efforts to define a standard query language are in progress [8], 
>>suggesting that the "API" to query the index would somehow involve this 
>>query language.
>>
>>Personally, I don't see much difference between an "index" and a 
>>"registry" from the inquirer's point of view.  For example, a spider 
>>could harvest descriptions from the web and store the results in a 
>>private UDDI registry; the implementation of the "index" could be 
>>UDDI.  The spider could create a tModel for every WSDL file it finds; see 
>>[3].  The spider could do some sort of automatic classification [13] and 
>>insert appropriate categories in a UDDI categoryBag.
>>
>>Then there is the issue of metadata.  UDDI tModels (and other structures) 
>>have categoryBags where information is stored to categorize the 
>>entry.  For example, the categoryBag for a tModel that refers to a WSDL 
>>file could be adorned with a keyValue="wsdlSpec" from the uddi-org:types 
>>taxonomy.  The UDDI inquiry API (SOAP) lets you search for such tModels, 
>>a capability often used by web service development tools to help 
>>developers locate WSDL files.
>>
>>Likewise, a spider could locate OWL-S files, and create a tModel for each 
>>one it finds.  A pattern is emerging where the overviewURL in a UDDI 
>>tModel points outside of the registry to some document, and the 
>>categoryBag of the tModel indicates what the link points to.  It 
>>indicates this using a taxonomy.  Another example of this pattern is a 
>>UDDI Technical Note that has been drafted for ebXML [6].  For example, the 
>>tModel overviewURL can point to an ebXML CPPA document, and the tModel 
>>is classified as keyValue="ebXML:CPPA" using their taxonomy [6].
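A minimal sketch of the pattern described above, assuming a spider that has already collected a list of URLs: each description file becomes a tModel-like record whose overview URL points outside the registry and whose categoryBag says what the link points to. The class names, the suffix-based classifier, and the "example:types"/"owl-s" pair are hypothetical and chosen only for illustration; this is not the UDDI API or schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class KeyedReference:
    taxonomy: str   # taxonomy name, e.g. "uddi-org:types"
    key_value: str  # value within that taxonomy, e.g. "wsdlSpec"


@dataclass
class TModel:
    name: str
    overview_url: str  # points outside the registry to the description
    category_bag: List[KeyedReference] = field(default_factory=list)


# Hypothetical classifier: file suffix -> category.
# Only the wsdlSpec entry comes from the message; the OWL-S entry is invented.
CLASSIFIERS = {
    ".wsdl": KeyedReference("uddi-org:types", "wsdlSpec"),
    ".owl": KeyedReference("example:types", "owl-s"),
}


def tmodel_for(url: str) -> Optional[TModel]:
    """Build a tModel-like record for a harvested URL, or None if unrecognized."""
    for suffix, ref in CLASSIFIERS.items():
        if url.lower().endswith(suffix):
            name = url.rsplit("/", 1)[-1]
            return TModel(name=name, overview_url=url, category_bag=[ref])
    return None


def find_by_category(tmodels: List[TModel], taxonomy: str, key_value: str) -> List[TModel]:
    """Inquiry step: return records whose categoryBag holds the given category."""
    return [
        t for t in tmodels
        if any(r.taxonomy == taxonomy and r.key_value == key_value
               for r in t.category_bag)
    ]


harvested = [
    "http://example.org/services/stock.wsdl",
    "http://example.org/services/stock.owl",
    "http://example.org/index.html",
]
index = [t for t in map(tmodel_for, harvested) if t is not None]
wsdl_entries = find_by_category(index, "uddi-org:types", "wsdlSpec")
```

The record stores only a pointer and a category, so the description file itself stays exposed on the web and remains available to any other harvester.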
>>
>>UDDI provides a common framework for a wide variety of metadata.  If each 
>>"index" were to store the results of their spider/harvest in a different 
>>format, then the query strings used for discovery would be different for 
>>each index.
>>
>>So what is the difference between "publishing to a registry" and 
>>"exposing a description"?  In the case of UDDI, it is not a "repository" 
>>for the storage of and (direct) access to description files like 
>>WSDL.  UDDI simply points to the URI of a WSDL file (or other 
>>description).  UDDI stores a limited amount of metadata in the form of a 
>>categoryBag.  So with UDDI, you need to first expose WSDL, then publish 
>>the tModel.  But if the WSDL is already exposed, it presumably is 
>>available for some other index/spider/harvester to find.  (It may not be 
>>easy to find unless you somehow know where to start looking, for example 
>>via WSIL, RDDL within a homepage, or some variant (like [5]) of RSS 
>>Autodiscovery [4].)
>>
>>
>>I am wondering what the web service architecture should say about this stuff.
>>
>>I suggest it should have components like "index" as the entity (agent) 
>>that requesters query to locate or discover web services and/or their 
>>metadata.  "Registry" may be overloaded and may suggest to some an overly 
>>centralized architecture.
>>
>>Given multiple indexes, a "discovery proxy agent (DPA)" would help in 
>>federation of discovery.  It would need to know how to locate indexes, 
>>query multiple indexes simultaneously, and consolidate the results.  It 
>>may need to translate the results into a common format; perhaps this is a 
>>separate architectural element, a "discovery translation agent (DTA)" or 
>>discovery translation service (DTS).  The DPA may cache results.
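A rough sketch of how the DPA described above might fit together, assuming each index is simply a callable that takes a query string and returns raw results in its own format, and each translator (the DTA role) maps one raw result to a common record with "url" and "name" keys. All names here are hypothetical. The indexes are queried one after another for simplicity, whereas the text suggests querying them simultaneously.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]                # common format: {"url": ..., "name": ...}
Index = Callable[[str], List[dict]]    # query string -> raw, index-specific results
Translator = Callable[[dict], Record]  # DTA role: raw result -> common Record


class DiscoveryProxyAgent:
    def __init__(self, indexes: Dict[str, Tuple[Index, Translator]]):
        self.indexes = indexes
        self.cache: Dict[str, List[Record]] = {}

    def query(self, q: str) -> List[Record]:
        # Serve repeated queries from the cache.
        if q in self.cache:
            return self.cache[q]
        merged: Dict[str, Record] = {}
        for index, translate in self.indexes.values():
            for raw in index(q):
                record = translate(raw)
                # Consolidate: keep the first record seen for each URL.
                merged.setdefault(record["url"], record)
        results = list(merged.values())
        self.cache[q] = results
        return results


# Two toy indexes that store the same kind of information differently.
def index_a(q: str) -> List[dict]:
    return [{"overviewURL": "http://example.org/stock.wsdl", "name": "Stock"}]


def index_b(q: str) -> List[dict]:
    return [
        {"link": "http://example.org/stock.wsdl", "title": "Stock quotes"},
        {"link": "http://example.org/weather.wsdl", "title": "Weather"},
    ]


dpa = DiscoveryProxyAgent({
    "a": (index_a, lambda r: {"url": r["overviewURL"], "name": r["name"]}),
    "b": (index_b, lambda r: {"url": r["link"], "name": r["title"]}),
})
results = dpa.query("wsdl")
```

Keeping translation in a separate callable mirrors the idea that the DTA/DTS could be its own architectural element rather than part of the DPA.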
>>
>>The DPA "has-a" query interface, which could be UDDI, or some other API 
>>(also described using WSDL).
>>The query interface "uses" a query language.  This query is likely to be 
>>something that is not easily expressed in a URI, which implies that the 
>>query is put into a message (pushed in a request) or file (pulled by the 
>>query processor in response to being given the URI of the query file).
>>
>>A lot of work has been done on query technology, and I am not an expert 
>>on it.  Some links: [9][10]
>>
>>I like the idea of a DPA because it is similar to (is-a ?) "proxy" in the 
>>web architecture (or REST architectural style), which is an example of a 
>>REST "component" [7].
>>
>>So to summarize the proposed additions to the WSA document:
>>
>>Revise the Resource Oriented Model [12]
>>
>>1a.  Add "Index" as a noun (Concept or Feature per 2.2.3)
>>1b.  A Discovery Service has-an Index
>>1c.  An Index has-a Query interface
>>1d.  A Query interface has-a Query Language
>>1e.  A Query Language is-identified-by a namespace URI
>>1f.  Add Discovery Proxy Agent (DPA)
>>1g.  DPA is-an agent
>>1h.  Add Discovery Translation Agent (DTA) (and/or Discovery Translation 
>>Service - DTS)
>>1i.  A DPA queries an Index
>>1j.  A DPA may use a DTA/DTS to normalize results from queries to 
>>multiple indexes
>>1k.  A DPA discovers Indexes
>>1l.  An Index stores service metadata (which may include links to other 
>>metadata stored outside the Index)
>>1m.  A DPA has-a Query interface
>>1n.  An Index may harvest service metadata
>>1o.  An Index may provide a Publish interface
>>
>>2a.  An agent subscribes-to a discovery service
>>2b.  A discovery service notifies an agent
>>2c.  An agent has-a notification interface
>>2d.  A discovery service has-a subscription interface (asynch query 
>>interface)
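Items 2a-2d can be sketched as a small publish/subscribe arrangement, assuming the subscription is expressed as a predicate over entries and the agent's notification interface is a plain callback. The class and method names are hypothetical and say nothing about how a real discovery service would expose these interfaces.

```python
from typing import Callable, Dict, List, Tuple

Entry = Dict[str, str]


class DiscoveryService:
    def __init__(self) -> None:
        self._entries: List[Entry] = []
        self._subscriptions: List[Tuple[Callable[[Entry], bool],
                                        Callable[[Entry], None]]] = []

    def subscribe(self, matches: Callable[[Entry], bool],
                  notify: Callable[[Entry], None]) -> None:
        """Subscription interface: a standing (asynchronous) query."""
        self._subscriptions.append((matches, notify))

    def publish(self, entry: Entry) -> None:
        """Store new metadata and notify every subscriber whose query matches."""
        self._entries.append(entry)
        for matches, notify in self._subscriptions:
            if matches(entry):
                notify(entry)


class Agent:
    def __init__(self) -> None:
        self.received: List[Entry] = []

    def on_notify(self, entry: Entry) -> None:
        """Notification interface: called by the discovery service."""
        self.received.append(entry)


service = DiscoveryService()
agent = Agent()
service.subscribe(lambda e: e.get("type") == "wsdlSpec", agent.on_notify)
service.publish({"url": "http://example.org/stock.wsdl", "type": "wsdlSpec"})
service.publish({"url": "http://example.org/cppa.xml", "type": "ebXML:CPPA"})
```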
>>
>>
>>[1] http://www.w3.org/2003/Talks/08-mitre-tbl/Overview.html
>>[2] http://www.w3.org/2003/Talks/08-mitre-tbl/slide35-0.html
>>[3] http://www.oasis-open.org/committees/uddi-spec/doc/bps.htm
>>[4] http://diveintomark.org/archives/2002/05/31/more_on_rss_autodiscovery
>>[5] http://lists.oasis-open.org/archives/uddi-spec/200305/msg00056.html
>>[6] 
>>http://www.oasis-open.org/apps/org/workgroup/uddi-spec/document.php?document_id=3589
>>[7] 
>>http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm#sec_5_2_3
>>[8] http://www.w3.org/TR/xquery/
>>[9] http://swordfish.rdfweb.org/rdfquery/
>>[10] http://139.91.183.30:9090/RDF/publications/tr308.pdf
>>[11] http://lists.oasis-open.org/archives/uddi-spec/200304/msg00021.html
>>[12] 
>>http://dev.w3.org/cvsweb/~checkout~/2002/ws/arch/wsa/wd-wsa-arch-review2.html#resource_oriented_model
>>[13] http://moguntia.ucd.ie/publications/
>>
>>Paul
>
>--
>David Booth
>W3C Fellow / Hewlett-Packard
>Telephone: +1.617.253.1273

-- 
David Booth
W3C Fellow / Hewlett-Packard
Telephone: +1.617.253.1273

Received on Wednesday, 15 October 2003 19:29:33 UTC