<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <link rel="stylesheet" type="text/css" href=
  "http://www.w3.org/StyleSheets/TR/W3C-WD.css" />
  <meta name="generator" content=
  "HTML Tidy for Linux/x86 (vers 1st March 2004), see www.w3.org" />

  <title>Data Access Working Group Use Cases: WORKING DRAFT</title>
</head>

<body>
  <h1>Data Access Working Group Use Cases: WORKING DRAFT</h1>

  <h1>Table of Contents</h1>
  <!-- i need some xslt to generate this automagically -->

  <ol>
    <li><a href="#status">Document Status</a></li>

    <li><a href="#intro">Introduction</a></li>

    <li><a href="#motive">Motivation</a></li>

    <li><a href="#prodesc">Problem Description</a></li>

    <li><a href="#uc">Use Cases</a></li>

    <li><a href="#canreq">Candidate Technical Requirements</a></li>

    <li><a href="#relts">Related Technologies and Standards</a></li>
  </ol>

  <h1><a name="status" id="status">1. Document Status</a></h1>

  <p>This document is an informal artifact of the W3C Data Access Working Group
  and, as such, has no formal status. The most recent version of this document
  is available at <a href=
  "http://www.w3.org/2001/sw/DataAccess/UseCases">http://www.w3.org/2001/sw/DataAccess/UseCases</a>.
  The current working version is marked:</p>
  <pre>
$Id: UseCases.html,v 1.24 2004/04/16 21:44:17 kclark Exp $.
</pre>

  <h1><a name="intro" id="intro">2. Introduction</a></h1>

  <p><a href="http://www.w3.org/2001/sw/DataAccess">DAWG</a> members are using
  this document to structure and organize discussion about use cases related to
  RDF query language and data access standardization efforts.</p>

  <h1><a name="motive" id="motive">3. Motivation</a></h1>

  <p>The <a href="http://www.w3.org/2001/sw">Semantic Web</a> effort is mature
  enough that the existing implementations of RDF data storage servers require
  a standardized query language and data access protocol in order to achieve
  widespread data interoperability. A standard RDF query language might
  coalesce the technology intended for querying RDF data in much the same way
  that SQL did for RDBMS data. A standard way to access remote RDF storage
  servers might accomplish for the Semantic Web and data interoperability much
  of what HTTP did for the Web itself.</p>

  <h1><a name="prodesc" id="prodesc">4. Problem Description</a></h1>

  <h2>4.1. Query</h2>

  <p>Because there are no formal standards in these areas, developers in
  industry and in open source projects have created <a href=
  "http://www.w3.org/2001/11/13-RDF-Query-Rules/">many query languages for RDF
  data</a>. These languages lack both a common syntax and a common semantics.
  In fact, the extant query languages implement a significant semantic range:
  from declarative, SQL-like languages, to path languages, to rule or
  production-like systems. The existing languages also exhibit a range of
  extensibility features and builtin capabilities, including inferencing,
  distributed query, and domain-specific semantics.</p>

  <h2>4.2. Data Access</h2>

  <p>There are as many different methods of accessing remote RDF storage
  servers as there are distinct RDF storage server projects. Even where the
  basic access protocol is a standard&mdash;HTTP, SOAP, or XML-RPC&mdash;there
  isn't much ground upon which generic client support to access a wide variety
  of such servers might be developed.</p>

  <h1><a name="uc" id="uc">5. Use Cases</a></h1>

  <p>Use cases are used to determine and publicize the scope of the working
  group's technical work. Each use case describes a concrete application of the
  future DAWG recommendation, setting a user-oriented context in which the
  query language or protocol or both are used to solve a real problem. In this
  way the Working Group describes the principal benefits of DAWG, while at the
  same time creating a map of the problem space.</p>

  <h2>5.1. Personal Information Management</h2>

  <h3>5.1.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0187.html">Finding
  an email address</a></h3>

  <h4>Description</h4>

  <p>George wants to send an email message to John Smith. His personal address
  book, which includes John Smith's contact information, is stored in RDF using
  the <a href="http://xmlns.com/foaf/0.1/">FOAF Vocabulary Specification</a>.
  George's email client queries his local address book service and, if there is
  only one match, sets the query result as the value of the "To:" field;
  otherwise it prompts George to choose the best match.</p>
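
  <p>The lookup described above might be sketched as follows. This is only an
  illustration in Python over an in-memory list of triples, not a proposal for
  DAWG-QL syntax; the sample address book data and the function names
  <code>find_mailboxes</code> and <code>choose_recipient</code> are assumptions
  made for the example. Only <code>foaf:name</code> and <code>foaf:mbox</code>
  come from the FOAF vocabulary.</p>
  <pre>
```python
# Hypothetical address book, stored as (subject, predicate, object) triples.
FOAF = "http://xmlns.com/foaf/0.1/"

ADDRESS_BOOK = [
    ("_:p1", FOAF + "name", "John Smith"),
    ("_:p1", FOAF + "mbox", "mailto:john.smith@example.org"),
    ("_:p2", FOAF + "name", "Jane Doe"),
    ("_:p2", FOAF + "mbox", "mailto:jane.doe@example.org"),
]


def find_mailboxes(triples, name):
    """Return every foaf:mbox value of resources whose foaf:name equals name."""
    people = {s for (s, p, o) in triples if p == FOAF + "name" and o == name}
    return sorted(o for (s, p, o) in triples
                  if p == FOAF + "mbox" and s in people)


def choose_recipient(matches):
    """Return the single match, or None when the user must be asked to choose."""
    return matches[0] if len(matches) == 1 else None
```
  </pre>

  <p>With exactly one match, <code>choose_recipient</code> returns the value
  for the "To:" field; with zero or several matches it returns
  <code>None</code>, which stands for the case where the client prompts
  George.</p>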

  <h4>Benefits</h4>

  <ul>
    <li><strong>Efficiency</strong>. First, it's more efficient for the
    programmer who develops George's email client to use a query language than
    to write custom code against a low-level RDF storage interface. Second,
    execution of the query may be more efficient because query language
    implementations are often able to achieve more aggressive
    optimizations.</li>

    <li><strong>Interoperability</strong>. Applications that use a standard
    query language and data access protocol can submit queries to a local
    address book service or to a remote RDF-aware directory service with no
    change other than pointing to a different directory service resource.</li>
  </ul>

  <h3>5.1.2. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0072.html">Regularly
  executing a query</a></h3>

  <h4>Description</h4>

  <p>Marshall needs to update some personal financial information every day; he
  programs an off-the-shelf web agent program to execute a query every morning
  before he gets to work and every evening before he goes to bed. Marshall uses
  a wizard to formulate the query, which the web agent constructs as an HTTP
  URI. In order to fulfill Marshall's information gathering requirement, his
  web agent simply resolves the query URI.</p>
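
  <p>One way a web agent might construct such a query URI is sketched below in
  Python. The endpoint address, the parameter name <code>query</code>, and the
  query text are all hypothetical; no DAWG protocol or query syntax exists
  yet, so the sketch shows only the general idea of carrying a query in the
  query component of an HTTP URI.</p>
  <pre>
```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_uri(endpoint, query_text):
    """Encode query_text into the query component of an HTTP URI.

    The parameter name "query" is an assumption made for this sketch.
    """
    return endpoint + "?" + urlencode({"query": query_text})


def extract_query(uri):
    """Recover the query text on the server side (inverse of build_query_uri)."""
    return parse_qs(urlsplit(uri).query)["query"][0]


# Hypothetical endpoint and placeholder query text.
QUERY_URI = build_query_uri("http://example.org/rdf", "balance of account 42")
```
  </pre>

  <p>Because the whole request is a single URI, Marshall's agent can run it on
  a schedule with any ordinary HTTP client.</p>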

  <h4>Benefits</h4>

  <ul>
    <li><strong>Preserving Investment</strong>. Expressing DAWG queries as HTTP
    URIs, and returning the results of the query as the representation
    retrieved by dereferencing the URI, preserves the existing investment in
    web infrastructure.</li>
  </ul>

  <h3>5.1.3. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0041.html">Monitoring
  news events</a></h3>

  <h4>Description</h4>

  <p>Kate wants to be notified whenever there is a news item about her favorite
  television show. She is accustomed to visiting web sites every day to search
  for and read about news items that match her interests.</p>

  <p>...</p>

  <h2>5.2. Web Publishing</h2>

  <h3>5.2.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0083.html">Saying
  things about web resources</a></h3>

  <h4>Description</h4>

  <p>Frannie and Zoe live in different countries and keep in daily contact via
  IRC. Zoe wrote an IRC bot that they use to keep track of things they say
  about web pages. Frannie wants to be able to republish some of the things she
  says in IRC on her weblog. So Zoe tells her about a server that accepts and
  agrees to host documents that describe what they say about web pages, and
  their IRC robot sends those documents periodically to the server.</p>

  <p>Frannie programs her weblog software to query the server that hosts their
  annotation documents. The server returns all the assertions Frannie and Zoe
  have made about web pages that Frannie writes about in her weblog; Frannie's
  weblog software then publishes the things they've said as comments.</p>

  <h4>Benefits</h4>

  <ul>
    <li><strong>Interoperability</strong>. Using RDF, the DAWG query language
    and data access protocol, Frannie and Zoe are able to build several
    different kinds of software systems by passing RDF documents as
    messages.</li>

    <li><strong>Content Reuse</strong></li>
  </ul>

  <h3>5.2.2. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0094.html">Discovering
  what people say about news stories</a></h3>

  <h4>Description</h4>

  <p>Abelard, an independent publisher of web publications, often needs to
  query an arbitrary list of RDF storage servers for assertions about a set of
  URIs he cares about; the URIs identify Abelard's web publications. The RDF
  storage servers are RSS feed aggregators. Abelard wants to use RDF to keep
  track of the things people say in weblogs about his publications.</p>

  <p>Abelard's client software includes support for three different query
  languages. Abelard's client software connects to each RDF storage server and
  determines whether it supports one of the three query languages it knows
  about. Abelard's client software chooses, based on priorities set by Abelard,
  to send different queries to different servers.</p>

  <p>Heloise, an aggregator of RSS feeds, publishes RDF (extracted from RSS
  feeds) on the Web using an RDF storage server. Heloise's server supports
  several RDF query languages.</p>

  <p>Heloise's server publishes the query languages it supports in a
  machine-readable form. It negotiates with clients in order to choose the most
  appropriate query language that they have in common.</p>
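
  <p>The selection step of such a negotiation could be as simple as the
  following Python sketch. The language labels (<code>ql-a</code>,
  <code>ql-b</code>, <code>ql-c</code>) are placeholders rather than real query
  languages, and the rule used here, where the client's priority order wins,
  is an assumption; a real protocol might weigh server preferences as
  well.</p>
  <pre>
```python
def negotiate(client_preferences, server_supported):
    """Return the client's highest-priority language the server supports.

    client_preferences is ordered from most to least preferred.
    Returns None when the two sides have no language in common.
    """
    supported = set(server_supported)
    for language in client_preferences:
        if language in supported:
            return language
    return None


# Placeholder language labels, not real query languages.
ABELARD_PREFERENCES = ["ql-a", "ql-b", "ql-c"]
HELOISE_SUPPORTED = ["ql-c", "ql-b"]
CHOSEN = negotiate(ABELARD_PREFERENCES, HELOISE_SUPPORTED)
```
  </pre>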

  <h4>Benefits</h4>

  <ul>
    <li><strong>Automated Resource Annotation Discovery</strong>. Abelard can
    use software to automate the process of tracking the things people say on
    the Web about his publications.</li>

    <li><strong>Preserve existing human investment</strong>. Abelard can
    formulate queries in a variety of query languages, which preserves his
    existing investment. Abelard's software vendor and Heloise's software
    vendor can develop and sell extensible, relatively generic systems. In
    addition to client-server coordination about query languages, they can also
    negotiate other provisioning or service capabilities, including RDF
    serialization formats, query context support, query inferencing support,
    access control models, etc.</li>

    <li><strong>Frictionless Information Exchange</strong>. Abelard and Heloise
    are able to exchange third party data in an interoperable fashion without
    requiring out-of-band, human negotiation about capabilities.</li>
  </ul>

  <h2>5.3. Financial Services</h2>

  <h3>5.3.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0184.html">Tracking
  accounts and customers</a></h3>

  <h4>Description</h4>

  <p>Bartleby manages an accounting firm that has aggressively adopted Web and
  Semantic Web technology. The firm stores information about its customers,
  other companies, and competing accounting firms in an RDF storage server; and
  it relates these entities via predicates <code>acct:accountsFor</code> and
  <code>acct:hasCustomer</code>. Bartleby wants to retrieve the names of all
  the firms which either maintain accounts for military suppliers or maintain
  accounts for the military itself.</p>
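
  <p>Bartleby's question is a disjunction of two patterns, which might be
  sketched in Python as the union of two matches over a list of triples. The
  sample data, the <code>ex:</code> identifiers, and the reading of the two
  predicates (a firm <code>acct:accountsFor</code> a client; a supplier
  <code>acct:hasCustomer</code> its customer) are assumptions made for the
  example. The sketch returns identifiers rather than names.</p>
  <pre>
```python
ACCOUNTS_FOR = "acct:accountsFor"
HAS_CUSTOMER = "acct:hasCustomer"
MILITARY = "ex:Military"  # hypothetical identifier for the military

# Hypothetical sample data.
TRIPLES = [
    ("ex:FirmA", ACCOUNTS_FOR, "ex:TankCo"),
    ("ex:TankCo", HAS_CUSTOMER, MILITARY),
    ("ex:FirmB", ACCOUNTS_FOR, MILITARY),
    ("ex:FirmC", ACCOUNTS_FOR, "ex:Bakery"),
    ("ex:Bakery", HAS_CUSTOMER, "ex:School"),
]


def firms_serving_military(triples):
    """Firms that account for a military supplier OR for the military itself."""
    suppliers = {s for (s, p, o) in triples
                 if p == HAS_CUSTOMER and o == MILITARY}
    via_supplier = {s for (s, p, o) in triples
                    if p == ACCOUNTS_FOR and o in suppliers}
    direct = {s for (s, p, o) in triples
              if p == ACCOUNTS_FOR and o == MILITARY}
    # The set union is the disjunction of the two query patterns.
    return sorted(via_supplier | direct)
```
  </pre>

  <p>Without disjunction in the query language, Bartleby would have to run the
  two patterns as separate queries and merge the results himself.</p>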

  <h4>Benefits</h4>

  <ul>
    <li><strong>Real world queries</strong>. The ability to use disjunction or
    union is key for real world problems. Requiring users to join their own
    results guarantees that a DAWG-QL will be confined to programmatic
    interfaces as parts of large systems.</li>
  </ul>

  <h2>5.4. Urban Planning</h2>

  <h3>5.4.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0059.html">Exploring
  my neighborhood</a></h3>

  <h4>Description</h4>

  <p>Jose learns that the U.S. Census Bureau provides some very interesting
  geographic data in its public domain Tiger database. Jose moves to a new home
  in the Thomas Circle neighborhood of Washington, DC. Jose wants to find out
  the latitude, longitude, name, and type of everything within 50 miles of his
  new home.</p>

  <p>Rather than downloading all the Tiger database files, unzipping them,
  reading the docs, writing some software, and so on, Jose sends a DAWG-QL
  query to the Census Bureau's new RDF storage server and requests that the
  results be passed to an XSLT transformation service so that he can print the
  resulting XHTML.</p>
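
  <p>The distance constraint in Jose's query might be evaluated along the
  lines of the Python sketch below, which filters records by great-circle
  distance using the haversine formula. The feature records and the coordinates
  given for Jose's home are approximate, illustrative values and are not taken
  from the Tiger database; the XSLT step is not shown.</p>
  <pre>
```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, by the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def features_within(features, home, radius_miles):
    """Keep the features whose distance from home is at most radius_miles."""
    home_lat, home_lon = home
    return [f for f in features
            if radius_miles >= distance_miles(home_lat, home_lon,
                                              f["lat"], f["lon"])]


# Approximate, illustrative values (assumptions, not Tiger data).
HOME = (38.9057, -77.0323)
FEATURES = [
    {"name": "Baltimore", "type": "city", "lat": 39.2904, "lon": -76.6122},
    {"name": "Richmond", "type": "city", "lat": 37.5407, "lon": -77.4360},
]
NEARBY = features_within(FEATURES, HOME, 50)
```
  </pre>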

  <h4>Benefits</h4>

  <p>...</p>

  <h2>5.5. Intelligence</h2>

  <h3>5.5.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0190.html">Finding
  unknown human persons</a></h3>

  <h4>Description</h4>

  <p>Smiley works for a governmental intelligence agency. As part of his job as
  an analyst of raw human intelligence, he needs to be notified whenever the
  knowledge base contains information about people matching various properties:
  last known location, often visited web sites, and political associations.</p>

  <p>Smiley uses his web browser to set up a regular query over several
  knowledge bases by filling out a web form. Whenever there are new matches for
  Smiley's query in the knowledge base, Smiley receives an email with URIs to
  resources about the new matches; and Smiley's personal RSS feed is also
  updated with the new matches, since he uses an RSS aggregator to gather news
  every day.</p>

  <p>Since Smiley's query will operate over knowledge bases structured by
  several different ontologies, Karla, the staff programmer for Smiley's group,
  builds Smiley's query to look for <code>rdfs:subPropertyOf foaf:Person</code>
  (expecting to find properties like
  <code>terror:RegisteredForeignAgent</code>,
  <code>terror:TerroristSuspect</code>, and
  <code>humint:UnidentifiedPerson</code>). Smiley's staff programmer uses the
  DAWG-QL and the <code>foaf:Person</code> predicate, as well as several
  others, to formulate Smiley's query.</p>

  <h4>Benefits</h4>

  <ul>
    <li><strong>Integration</strong>. Since the system that Smiley and Karla
    have access to sits in front of a constantly evolving, heterogeneous
    collection of knowledge bases, they don't want to have to update Smiley's
    query each time a new KB is available. They rely on DAWG-QL's support for
    <code>rdfs:subPropertyOf</code> to find knowledge rooted at
    <code>foaf:Person</code>, which government agencies have agreed to use as a
    common parent property to represent natural persons.</li>
  </ul>

  <h2>5.6. Supply Chain Support</h2>

  <h3>5.6.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0200.html">Finding
  information about motorcycle parts</a></h3>

  <h4>Description</h4>

  <p>Endeavour, a Triumph Motorcycle dealer, maintains a database that
  describes spare and replacement parts, including their properties and
  relationships, needed to repair its motorcycles. Ev, a Triumph repair person,
  is working on a motorcycle and a diagnostic tool produces a report that
  identifies a faulty part.</p>

  <p>Ev goes to a query interface to the vendor's parts database and asks "tell
  me about this part". In response, Ev receives a human-readable description
  of the part, which provides sufficient information to determine how to obtain
  the part and whether any other dependent parts must also be replaced at the
  same time.</p>

  <h4>Benefits</h4>

  <ul>
    <li><strong>Manage complex class-property relations</strong>.</li>
  </ul>

  <h2>5.7. Software Development</h2>
  <!-- Add OWL: http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0098.html -->

  <h3>5.7.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0032.html">Finding
  input and output documents for test cases</a></h3>

  <h4>Description</h4>

  <p>Nada, a Semantic Web developer, has had a bug report from a valued user
  that indicates that a software tool is failing to handle the N3
  representation of some of the RDF core test cases correctly. Nada wants to
  create a list of input and output documents for each of the approved test
  cases from the RDF core test suite. The list of tests resides in a single
  file.</p>

  <h4>Benefits</h4>

  <p>The value lies in systematically processing the RDF core manifest file to
  produce one line per input/output pair, so that a script can easily be
  written for the next stage: reading the input document, writing it, and
  checking it. Writing a query and feeding it to a query processor is much
  quicker than writing a custom program to do the same.</p>

  <h3>5.7.2. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0045.html">Describing
  software configurations</a></h3>

  <h4>Description</h4>

  <p>Grace, an open source developer, is developing a new email client, and the
  system uses a lot of configuration settings and data, some of which may be
  relevant to any email client and some of which is specific to this particular
  one. Grace would like to record all configuration data as RDF. She
  understands the basic RDF data model and knows precisely the structure of the
  information she's interested in: <code>local-username email:hasAccount
  account</code>, <code>account email:hasServer server</code>, and <code>server
  email:usesProtocol protocol</code>. But she has no expertise in programming
  graph algorithms or manipulating RDF programmatically. She would like to
  retrieve those aspects of the configuration files in which she is interested,
  i.e. the server and its protocol for a particular account or username.</p>
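
  <p>Grace's request is a path of three edges, which a query processor might
  evaluate roughly as in the Python sketch below. The three predicates are the
  ones named above; the sample configuration data and the function names are
  assumptions made for the example.</p>
  <pre>
```python
# Hypothetical configuration data as (subject, predicate, object) triples.
CONFIG = [
    ("grace", "email:hasAccount", "acct-work"),
    ("acct-work", "email:hasServer", "mail.example.org"),
    ("mail.example.org", "email:usesProtocol", "IMAP"),
    ("grace", "email:hasAccount", "acct-home"),
    ("acct-home", "email:hasServer", "pop.example.net"),
    ("pop.example.net", "email:usesProtocol", "POP3"),
]


def objects(triples, subject, predicate):
    """Follow one edge: every object of (subject, predicate, ?)."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]


def servers_for_user(triples, username):
    """Follow username -> account -> server -> protocol, one edge at a time."""
    results = []
    for account in objects(triples, username, "email:hasAccount"):
        for server in objects(triples, account, "email:hasServer"):
            for protocol in objects(triples, server, "email:usesProtocol"):
                results.append((account, server, protocol))
    return results
```
  </pre>

  <p>The point of the use case is that Grace should be able to state the three
  patterns declaratively and leave this kind of traversal to the query
  processor.</p>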

  <h4>Benefits</h4>

  <p>Demonstrates the value of programmatic access to local RDF
  repositories.</p><!-- Protocol use case: ask the same question of several sources:
http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0224.html

I think we've got this covered already.

This covers the same protocol issue but from the "tell me about..."
class of queries.  One of Patrick's probably is an example too.
http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0022.html
 -->

  <h2>5.8. Transportation</h2>

  <h3>5.8.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0031.html">Avoiding
  traffic jams</a></h3>

  <h4>Description</h4>

  <p>Niel wants to drive, during heavy rush hour traffic in Atlanta, GA, from
  his home to his office. His new car has both Bluetooth and wireless internet
  access. His car makes three queries to public RDF storage servers on the Web:
  the first for an up-to-date description of Atlanta road conditions and
  construction projects; the second for an updated description of traffic jams;
  the third for an updated description of Atlanta roads suffering inclement
  weather.</p>

  <p>Based on this information, Niel's car suggests a different route to work,
  cutting his commute time by 12%.</p>

  <h4>Benefits</h4>

  <p>...</p>

  <h3>5.8.2. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJan/0009.html">Finding
  the cheapest flight from Boston to Chicago</a></h3>

  <h4>Description</h4>

  <p>...</p>

  <h4>Benefits</h4>

  <h2>5.9. Health Care</h2>

  <h3>5.9.1. <a href=
  "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0003.html">Ordering
  an x-ray</a></h3>

  <h4>Description</h4>

  <p>Amy, an oncologist, enters an order for a chest x-ray. She works in a
  large, multi-campus hospital with multiple radiology departments. The
  hospital complex uses RDF to describe the properties of its departments and
  the relations between them. For example,</p>

  <ul>
    <li>Campus A is a children's hospital</li>

    <li>Radiology department B is part of hospital A</li>

    <li>Radiology department C specializes in examinations of type D</li>

    <li>Urgent requests should be handled on the same campus, given
    specialization constraints</li>

    <li>Requesting department E has as its first collaboration choice
    department F</li>
  </ul>

  <p>Amy doesn't know or care about all of these relations or rules. She only
  wants to place an order and then learn where it will be executed.</p>

  <h4>Benefits</h4>

  <h4>Issues</h4>

  <p>This use case shows the need to add constraints in the query and shows the
  need for not only querying for information but also for a resolution of a
  problem: given constraints, imposing a sort order on some criteria. The
  difficulty is that a criterion is not necessarily a data element in the
  RDF document but is implied by the rules within the document.</p>

  <h2>5.10. Human Resources</h2>

  <h3>5.10.1. Find employees by group</h3>

  <h4>Description</h4>

  <p>A company classifies employees into three groups: management, support, and
  engineering. Each employee is assigned to exactly one of these groups. There
  exists an RDF store which encodes information about employees. This
  information also includes the office in which the employee works. For
  example, <code>#David hasGroup #Engineering</code> and <code>#David
  worksInOffice #Carlsbad</code>.</p>

  <p>The information is not complete; e.g. there may be employees whose group
  is not explicitly stated in the RDF store. The company also builds an OWL
  ontology to supplement their RDF data with semantic information. Among other
  things, this ontology contains the assertion that certain corporate locations
  contain no engineers, only management and support. A user wishes to query the
  RDF store to find all employees who are in either the management or support
  groups (and print out their names).</p>

  <p>While inferencing and OWL may be beyond the scope of this working group,
  this use case demonstrates the continuity from RDF queries to OWL queries.
  The user's question can be answered fairly well by a simple RDF store (with
  no OWL), but precisely the same query (in terms of the user's desire for
  information) can retrieve even better information if OWL data is
  available.</p>

  <p>Note that this use case is <em>not</em> subsumed by any other we are considering.
  The use of disjunction makes the OWL ontology incompatible with the naive
  "inferred triples" model.</p>

  <h1><a name="canreq" id="canreq">6. Candidate Technical Requirements</a></h1>

  <p>DAWG use cases are a pool from which to extract technical requirements.
  These requirements frame the technical scope of the working group's
  activities, including the delivery of a strawman query language at the end of
  the first phase of the working group. The working group will use the strawman
  query language as the starting point for design work in its second phase.</p>
  <!-- Rework these using the terms here:
  http://www.w3.org/2001/11/13-RDF-Query-Rules/terms -->

  <h2>6.1. General Requirements</h2>

  <h2>6.2. Query Language Requirements</h2>

  <ul>
    <li><a name="cr-01" id="cr-01">Queries with optional triples</a> (Related
    discussion: <a href=
    "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0213.html">1</a>
    and <a href=
    "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004JanMar/0185.html">2</a>.)
    [<strong>CR-01</strong>]</li>

    <li><a href=
    "http://www.w3.org/2001/11/13-RDF-Query-Rules/terms#goalChar_or">Disjunction</a>
    <a name="cr-02" id="cr-02">[<strong>CR-02</strong>]</a></li>

    <li>Queries with paths of two or more edges <a name="cr-03" id=
    "cr-03">[<strong>CR-03</strong>]</a></li>

    <li>Query results as graph entailment or treating the graph as a fixed
    object. <a name="cr-04" id="cr-04">[<strong>CR-04</strong>]</a></li>

    <li>Queries expressing arbitrary RDF data types. <!-- It's thought
    that there will be some limit to the supported datatypes, while
    others, like XSD types, will have widespread support. For
    datatypes not supported, it could fail or could send back an
    error. Others suggested that there might be a subset of minimum
    datatypes a query language should support. --> <a name="cr-05" id="cr-05">
    [<strong>CR-05</strong>]</a>.</li>

    <li>Queries expressible as URLs <a name="cr-06" id=
    "cr-06">[<strong>CR-06</strong>]</a></li>

    <li>Query results in user-selectable Internet Media Types <a name="cr-07"
    id="cr-07">[<strong>CR-07</strong>]</a>.</li>

    <li>Query results in RDF (i.e., closure) <a name="cr-08" id=
    "cr-08">[<strong>CR-08</strong>]</a></li>

    <li>Queries written in <a href=
    "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0012.html">RDF</a>
    <a name="cr-09" id="cr-09">[<strong>CR-09</strong>]</a></li>

    <li><a href=
    "http://www.w3.org/2001/11/13-RDF-Query-Rules/terms#goalChar_negFailure">Negation
    as failure</a>. (Queries for the <a href=
    "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0008.html">non-existence</a>
    of one or more triples in a graph.) <a name="cr-10" id=
    "cr-10">[<strong>CR-10</strong>]</a></li>

    <li>Queries expressible in a syntax that is easily read and written by
    people. <a name="cr-12" id="cr-12">[<strong>CR-12</strong>]</a></li>

    <li>Queries should be executable against a local RDF storage service
    without network support (i.e., queries independent of any network
    protocol). [<strong>CR-13</strong>]</li>

    <li>Queries returning aggregate reports. [<strong>CR-14</strong>]</li>
  </ul>
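
  <p>Two of the candidate requirements above, optional triples (CR-01) and
  negation as failure (CR-10), are easy to confuse, so a small Python sketch
  over a list of triples may help. The sample data and function names are
  assumptions made for the example, and the sketch keeps only the first
  optional value it finds; it is not a definition of either requirement.</p>
  <pre>
```python
# Hypothetical sample data.
PEOPLE = [
    ("p1", "foaf:name", "Ann"),
    ("p1", "foaf:mbox", "mailto:ann@example.org"),
    ("p2", "foaf:name", "Bob"),
]


def match_with_optional(triples, required_pred, optional_pred):
    """Optional triples: keep every required match, filling in None
    where the optional triple is missing."""
    rows = []
    for (s, p, o) in triples:
        if p == required_pred:
            extra = [o2 for (s2, p2, o2) in triples
                     if s2 == s and p2 == optional_pred]
            rows.append((s, o, extra[0] if extra else None))
    return rows


def subjects_lacking(triples, required_pred, absent_pred):
    """Negation as failure: subjects with required_pred for which
    no absent_pred triple can be found in the graph."""
    have = {s for (s, p, o) in triples if p == absent_pred}
    return sorted({s for (s, p, o) in triples
                   if p == required_pred and s not in have})
```
  </pre>

  <p>An optional match still returns Bob, with no mailbox, whereas the negated
  query returns only Bob, because no mailbox triple for him can be found.</p>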

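  <h2>6.3. Protocol Requirements</h2>

  <ul>
    <li>Some kind of extensibility</li>

    <li>Bandwidth efficiency: <a href=
    "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0095.html">http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0095.html</a></li>

    <li>Limit and ordering use case: <a href=
    "http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0096.html">http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0096.html</a></li>
  </ul>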

  <h1><a name="relts" id="relts">7. Related Technologies and Standards</a></h1>

  <p>See the survey of existing RDF query language implementations: <a href=
  "http://www.w3.org/2001/11/13-RDF-Query-Rules/">RDF Query and Rules
  Framework</a>.</p>

  <ul>
    <li>RDF Core</li>

    <li>RDF Query languages

      <ul>
        <li>SQL-like</li>

        <li>Rule-like</li>

        <li>Path</li>
      </ul>
    </li>

    <li>SQL</li>

    <li>XQuery</li>

    <li>XPointer</li>

    <li>SOAP/XMLP and REST</li>
  </ul>

  <p class="contact">If you have questions about specific problems or issues in
  this document, contact <a href="mailto:kendall@monkeyfist.com">Kendall Grant
  Clark</a>.</p>
</body>
</html>

