11.6 Finding Services (@@ new section?)

I-026: Searching for Web Services

Searching for Web services depending on language or culture.

For this scenario, it is assumed that the web services user or developer will use UDDI to discover and describe web services, either as a provider or requester.  With respect to internationalization, there are four primary scenarios that will be discussed below:

  1. How do I search for services using my language?
  2. How do I search for and find services that are specific to my region?
  3. How do I search for and find services that can handle my locale or language preferences?
  4. How do I describe a service that handles multiple locales?

I-026.1 Searching for Service Descriptions using my language

This capability is in the OASIS UDDI specification today.

The Introduction to Internationalization in the UDDI Version 3.0.1 specification states:

1.8.4 Use of Multiple Languages and Multiple Scripts

Multinational businesses or businesses involved in international trading at times require the use of possibly several languages or multiple scripts of the same language for describing their business. The UDDI specification supports this requirement through two means, first by specifying the use of XML with its underlying Unicode representation, and second by permitting the use of the xml:lang attribute for various items such as names, addresses, and document descriptions to designate the language in which they are expressed.

Using xml:lang and multiple entries, a Service Provider can publish text information about their service in multiple languages.  The name, description, address, and personName UDDI elements MAY be adorned with the xml:lang attribute to indicate the language in which their content is expressed.  The policyDescription element contains a description of the effect of the policy implementation.  This element can be adorned with the xml:lang attribute and can appear multiple times to allow for localized versions of the policy description.  Providers are encouraged to do this for target language markets that their service may support.   

Ideally, the entity names in UDDI should also provide an alternate name in the RFC 2277 default language (i-default), readable in English.  This provides a fall-back mechanism that allows a search to identify services even if the named contents are in a script that the entity doing the search cannot read.

The scenario would be as follows:

@@(From the UDDI Spec - is this a candidate for GEO?)

Here are some examples from the UDDI Version 3.0.1 specification.

The following shows an example of romanization where the primary name of the business (a Chinese flower shop) is in Chinese, and its alternative name is a romanization:

<businessEntity . . . >
  ........
  <name xml:lang="zh">黄河花店</name>
  <name xml:lang="en">Huang He Hwa Dian</name>
  .....
</businessEntity>

 

The following shows an example of transliteration where the primary name of the business is in Chinese, and is a transliteration of its alternative English name:

<businessEntity . . . >
  ........
  <name xml:lang="zh">康柏電腦股份有限公司</name>
  <name xml:lang="en">Compaq Computer Taiwan Limited</name>
  .....
</businessEntity>

The following shows an example of use of multiple name elements to support a multi-script language and also the use of an acronym.  In the example, the first <name> element is the primary name of the business (a Japanese flower shop) in Japanese Kanji.  The second <name> element is the business' name transliterated into Japanese Katakana.  The third <name> element gives the business' full English name, and the fourth <name> element gives its English acronym:

<businessEntity . . . >
  ........
  <name xml:lang="ja">日本生花店</name>
  <name xml:lang="ja">ニッポンセイカテン</name>
  <name xml:lang="en">NIPPON FLOWERS</name>
  <name xml:lang="en">NF</name>
  .....
</businessEntity>

Where multiple name elements are published, the first name element is treated as the primary name, which is the name by which a business would be searched and sorted in the case of multiply-named businesses.  Client applications may use this knowledge to assist in optional rendering of a publisher's primary name or all alternative names.

Because UDDI specifies that searching and sorting are performed on the first name element, developers need to provide a mechanism to override this default behavior when the requester asks for results in a language other than that of the primary entries.  The requested language is indicated by any xml:lang settings in the query structures sent to the UDDI subscription API.
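A client-side override of this kind could be sketched as follows.  This is a minimal illustration, not part of the UDDI API: the function name pick_name and the (language, text) tuple representation of <name> elements are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# Each entry is (xml:lang value, name text), in the order published.
Name = Tuple[str, str]


def pick_name(names: List[Name], requested_lang: Optional[str]) -> str:
    """Return the name to display or sort by for the requester's language.

    Falls back to the first (primary) name, which is the UDDI default.
    """
    if not names:
        raise ValueError("at least one name is required")
    if requested_lang:
        wanted = requested_lang.lower()
        for lang, text in names:
            tag = lang.lower()
            # Accept an exact tag match or a more specific tag (en matches en-US).
            if tag == wanted or tag.startswith(wanted + "-"):
                return text
    return names[0][1]


NAMES = [
    ("ja", "日本生花店"),
    ("ja", "ニッポンセイカテン"),
    ("en", "NIPPON FLOWERS"),
    ("en", "NF"),
]
```

With the Japanese flower shop above, a requester asking for "en" would be shown "NIPPON FLOWERS", while a requester with no stated language, or one whose language is not published, would still see the primary name.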

Although not directly related to the machine-to-machine aspects of web services communication, it is instructive to take a look at how UDDI handles postal addresses in more than one language.  The <address> element, contained in the businessEntity structure, contains a simple list of <addressLine> elements.

The following example XML fragment shows an address in two languages where the sequence of the address lines differs according to the language used.  With the use of keyName/keyValue pairs together with the codes assigned in the ubr-uddi-org:postalAddress tModel, it is possible to determine the address semantics programmatically in spite of the difference in address sequence.  Correct locale-specific ordering of address elements should be accomplished through the end user application or via XSLT, rather than as a requirement of the sequence of UDDI elements.

<address useType="Sales office" xml:lang="en" tModelKey="uddi:ubr.uddi.org:postalAddress">
   <addressLine keyName="Floor" keyValue="100">7 F</addressLine>
   <addressLine keyName="House Number" keyValue="70">No. 245</addressLine>
   <addressLine keyName="District" keyValue="50">Sec. 1</addressLine>
   <addressLine keyName="Street" keyValue="60">Tunhua South Road</addressLine>
   <addressLine keyName="City" keyValue="40">Taipei</addressLine>
</address>
<address useType="Sales office" xml:lang="zh" tModelKey="uddi:ubr.uddi.org:postalAddress">
   <addressLine keyName="City" keyValue="40">台北市</addressLine>
   <addressLine keyName="Street" keyValue="60">敦化南路</addressLine>
   <addressLine keyName="District" keyValue="50">一段</addressLine>
   <addressLine keyName="House Number" keyValue="70">245</addressLine>
   <addressLine keyName="Floor" keyValue="100">7</addressLine>
   ...
</address>

As there is a large variation in address sub-elements of different countries, the defined canonical address structure does not attempt to include all possible address sub-elements of all countries.  Freeform address lines are therefore supported in the <address> element.

The usage of the canonical address structure is optional, but recommended, for both publishers of business entities and developers of GUIs of UDDI publishing services.
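The application-side ordering of address lines mentioned above could look roughly like the following sketch.  The DISPLAY_ORDER table is an assumption derived only from the two example addresses (keyValue codes 40, 50, 60, 70 and 100); a real application would need ordering rules for every locale it supports.

```python
from typing import Dict, List, Tuple

# (keyValue code, text) for each <addressLine>, in published order.
AddressLine = Tuple[str, str]

# Hypothetical display orders per language, using the codes from the examples:
# 100 Floor, 70 House Number, 50 District, 60 Street, 40 City.
DISPLAY_ORDER: Dict[str, List[str]] = {
    "en": ["100", "70", "50", "60", "40"],
    "zh": ["40", "60", "50", "70", "100"],
}


def order_address(lines: List[AddressLine], lang: str) -> List[str]:
    """Return address line texts in the display order for the given language."""
    order = DISPLAY_ORDER.get(lang.split("-")[0].lower())
    if order is None:
        # Unknown language: keep the published sequence.
        return [text for _, text in lines]
    rank = {code: i for i, code in enumerate(order)}
    # Codes missing from the table sort last; sorted() is stable, so they
    # keep their published relative order.
    ordered = sorted(lines, key=lambda line: rank.get(line[0], len(order)))
    return [text for _, text in ordered]


PUBLISHED = [
    ("40", "Taipei"),
    ("60", "Tunhua South Road"),
    ("50", "Sec. 1"),
    ("70", "No. 245"),
    ("100", "7 F"),
]
```

Because the ordering depends only on the keyValue codes, the same function works regardless of the sequence in which the publisher listed the address lines.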

When searching for services, one can search names and descriptions by using the UDDI API to pass an optional collection of string values, each potentially qualified with an xml:lang attribute.  Since "exactMatch" is the default behavior, the value supplied for the name argument must be an exact match.  If the "approximateMatch" findQualifier is used together with an appropriate wildcard character in the name, then any businessService data contained in the specified businessEntity (or across all businesses if the businessKey is omitted or specified as empty) with a matching name value will be returned.  Matching occurs using wildcard matching rules.

Each name MAY be marked with an xml:lang adornment.  If a language markup is specified, the search results report a match only on those entries that match both the name value and the language criteria.  The match on language is a leftmost case-insensitive comparison of the characters supplied.  This allows one to find, for example, all services whose name begins with an "A" and is expressed in any dialect of French.  Values passed in the language criteria adornment MUST obey the rules governing the xml:lang data type.
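The combined name and language matching described above can be sketched as follows.  This is an illustration of the matching rules only, not a UDDI client: the function names are hypothetical, the wildcard characters are assumed to be "%" (any run of characters) and "_" (exactly one character), escape characters are not handled, and name matching is assumed to be case-sensitive.

```python
import re
from typing import List, Optional, Tuple


def wildcard_to_regex(pattern: str):
    """Translate a wildcard pattern ('%' and '_' assumed) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def lang_matches(query_lang: str, entry_lang: str) -> bool:
    """Leftmost, case-insensitive comparison of the characters supplied."""
    return entry_lang.lower().startswith(query_lang.lower())


def find_names(entries: List[Tuple[str, str]], name_pattern: str,
               lang: Optional[str] = None) -> List[str]:
    """Return names matching the pattern and, if given, the language prefix."""
    regex = wildcard_to_regex(name_pattern)
    return [
        text
        for entry_lang, text in entries
        if regex.fullmatch(text) and (lang is None or lang_matches(lang, entry_lang))
    ]


ENTRIES = [
    ("fr-CA", "Assurance Nord"),
    ("fr", "Aide Juridique"),
    ("en", "Acme Tools"),
    ("fr-FR", "Boulangerie Martin"),
]
```

Here find_names(ENTRIES, "A%", "fr") selects the two French entries whose names begin with "A".  Note that a purely leftmost comparison would also let "fr" match any other language tag that happens to begin with those two letters, so an implementation may want to compare whole subtags instead.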

UDDI does not specify the use of variant find scenarios to allow alternatives such as accent-insensitive matching.  To aid in search retrieval, developers creating a service discovery engine under UDDI may consider alternative match mechanisms.
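One such alternative match mechanism, accent-insensitive comparison, could be approximated as in the sketch below.  It is an assumption about how a discovery engine might do this, not something UDDI prescribes here: it decomposes each string with Unicode normalization form NFD and drops combining marks.  This is a blunt instrument, since in some languages an accented letter is a distinct letter rather than a variant.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_equal(a: str, b: str) -> bool:
    """Compare two names while ignoring accents; case is still significant."""
    return strip_accents(a) == strip_accents(b)
```

A search for "Electricite" would then also find a service published as "Électricité".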

 

I-026.2 How do I search for and find services that are specific to my region?

The UDDI Version 3.0.1 specification states in its introduction to internationalization that UDDI provides features that enable Web service providers to describe the location of different aspects of a business or service, e.g. where it offers its products and services, where it is located, or even where it has stores, warehouses, or other branches.  This is done through categoryBags and keyedReferences. 

(from UDDI Version 3.0.1)
The optional categoryBag element allows businessEntity structures to be categorized according to published categorization systems. For example, a businessEntity might contain UNSPSC product and service categorizations that describe its product and service offering and ISO 3166 geographical regions that describe the geographical area where these products and services are offered.

As within an identifierBag, a keyedReference contains the three attributes tModelKey, keyName and keyValue. The required tModelKey refers to the tModel that represents the categorization system, and the required keyValue contains the actual categorization within this system. The optional keyName can be used to provide a descriptive name of the categorization. Omitted keyNames are treated as empty keyNames. A keyName MUST be provided in a keyedReference if its tModelKey refers to the general_keywords category system.

For example, in order to categorize a businessEntity as offering goods and services in California, USA, using the corresponding ISO 3166 tModelKey within the UDDI Business Registry, one would add the following keyedReference to the businessEntity's categoryBag:

<keyedReference
    tModelKey="uddi:ubr.uddi.org:categorization:geo3166-2"
    keyName="California, USA"
    keyValue="US-CA" />

The use of geographic categorization for services is useful for taxes, import, export, and acknowledgment of available location-specific physical services such as shipping, export, manufacturing, labor, etc.
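A requester that has already retrieved a businessEntity could check its geographic categorization along these lines.  This is a simplified sketch: the function name offered_in is hypothetical, XML namespaces are omitted for brevity, and matching a country code against its subdivisions ("US" matching "US-CA") is an assumption of the example rather than a rule stated by UDDI.

```python
import xml.etree.ElementTree as ET

# tModelKey taken from the keyedReference example above.
GEO_TMODEL = "uddi:ubr.uddi.org:categorization:geo3166-2"


def offered_in(entity_xml: str, region: str) -> bool:
    """Return True if the entity is categorized under the given ISO 3166 code."""
    root = ET.fromstring(entity_xml)
    for ref in root.iter("keyedReference"):
        if ref.get("tModelKey") != GEO_TMODEL:
            continue
        value = ref.get("keyValue", "")
        # Exact match, or the requested code is the country part of a subdivision.
        if value == region or value.startswith(region + "-"):
            return True
    return False


SAMPLE = """<businessEntity>
  <categoryBag>
    <keyedReference
        tModelKey="uddi:ubr.uddi.org:categorization:geo3166-2"
        keyName="California, USA"
        keyValue="US-CA" />
  </categoryBag>
</businessEntity>"""
```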

I-026.3 Searching for services that can handle my locale or language preference

UDDI offers little direct help when you search for services that are enabled to handle specific languages or locales.  It already defines a geo3166 categorization scheme (uddi-org:iso-ch:3166:1999) to restrict services to geographic regions, but it defines no categorization based on RFC 3066 or ISO 639 to specify the locale or language capabilities of a service.  However, UDDI does provide a mechanism to publish services categorized according to any chosen predetermined taxonomy.

For suggestions on implementation methods to support language or locale categorization, see the following in the UDDI V3 Spec.
Another option is using WSDL as part of the UDDI description and gleaning locale metadata from the WSDL description.

A new categorization is therefore needed to communicate the locale and language capabilities of services registered with UDDI.

I-026.4 Searching for services with multiple locale-specific result sets

A service that is described in multiple languages cannot necessarily operate in those languages.

Repositories and searchable metadata about Web services need to provide support to search for services that support multiple language searches. Transport layer issues do not allow XML data structures to be used for resolution, except by reference (e.g. the receiver must download the WSDL asynchronously to "decode" the preference). Tags at the search layer may be necessary to allow this functionality.

As a continuation of I-026.3, UDDI, or other service repository interfaces, can be set up to provide a service locale category with multiple entries.  A query against such a discovery service could then provide a list or array of supported locales.  UDDI already provides a category based on a registered taxonomy using the ISO 3166 Geographic Code System.  This could be used as a model to add either a taxonomy based on ISO 639 Language Codes, to be used in conjunction with ISO 3166, or one based on RFC 3066 (Tags for the Identification of Languages).
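To make the idea of a locale category with multiple entries concrete, the sketch below reads such entries from a service description.  No such taxonomy exists in UDDI today, so the tModelKey "uddi:example.org:categorization:rfc3066" is purely hypothetical, as is the function name; namespaces are again omitted.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical tModelKey for an RFC 3066 based locale taxonomy.
LOCALE_TMODEL = "uddi:example.org:categorization:rfc3066"


def supported_locales(service_xml: str) -> List[str]:
    """Return the locale tags listed under the (hypothetical) locale category."""
    root = ET.fromstring(service_xml)
    return [
        ref.get("keyValue", "")
        for ref in root.iter("keyedReference")
        if ref.get("tModelKey") == LOCALE_TMODEL
    ]


SERVICE = """<businessService>
  <categoryBag>
    <keyedReference tModelKey="uddi:example.org:categorization:rfc3066"
        keyName="English (United States)" keyValue="en-US" />
    <keyedReference tModelKey="uddi:example.org:categorization:rfc3066"
        keyName="Spanish (Spain)" keyValue="es-ES" />
    <keyedReference tModelKey="uddi:example.org:categorization:rfc3066"
        keyName="French (Canada)" keyValue="fr-CA" />
  </categoryBag>
</businessService>"""
```

One keyedReference per supported locale keeps the description searchable with the existing categoryBag mechanism, in the same way the ISO 3166 category is used for regions.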

I-027: Fall-Back for Internationalized Web Services

When discovering new services through an interface like UDDI, there is currently no mechanism to determine the most appropriate service match when the user's exact request cannot be met.  For instance, if the requester is looking for a "foo" service that operates in U.S. English, Spanish as used in Spain, and Canadian French, but cannot find Canadian French, how can the requester be pointed to the "next best" service, such as French as used in France?

If the requester specified "ja-JP" and "ja-JP" is not available, what should the fall-back scenario be?  HTTP servers are designed to walk through the list of preferred languages that the browser sends to the server in the Accept-Language header.  However, no such mechanism exists for UDDI or SOAP.  Therefore, it is recommended to add the following to enable Web services to work in a robust manner in international implementations.

@@suggested fall-back mechanism
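The mechanism itself is still to be specified.  Purely as an illustration of what a truncation-based fall-back could look like, and not as the recommendation of this document, the sketch below drops subtags from the requested language tag one at a time and, at each step, accepts either an exact match or a more specific sibling.  The function names and the preference for exact matches over siblings are assumptions of the example.

```python
from typing import List, Optional


def fallback_chain(tag: str) -> List[str]:
    """Return the tag followed by its progressively shorter prefixes."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def best_match(requested: str, available: List[str]) -> Optional[str]:
    """Pick the closest available language tag, or None if nothing is related."""
    by_lower = {tag.lower(): tag for tag in available}
    for candidate in fallback_chain(requested.lower()):
        # Prefer an exact match on the (possibly truncated) tag.
        if candidate in by_lower:
            return by_lower[candidate]
        # Otherwise accept a sibling that shares this prefix (fr matches fr-FR).
        for tag in available:
            if tag.lower().startswith(candidate + "-"):
                return tag
    return None
```

With this sketch, a request for "fr-CA" against a service offering "en-US", "es-ES" and "fr-FR" would be pointed to "fr-FR", and a request for "ja-JP" against a service offering only "ja" would be pointed to "ja".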