- From: Kevin Smathers <kevin.smathers@hp.com>
- Date: Tue, 17 Jun 2003 11:33:24 -0700
- To: "Butler, Mark" <Mark_Butler@hplb.hpl.hp.com>
- Cc: www-rdf-dspace <www-rdf-dspace@w3.org>
- Message-ID: <3EEF5EF4.3060706@hp.com>
More changes for technologies.tex and simile.bib are attached.

Cheers,
-kls

Butler, Mark wrote:

>Hi Kevin
>
>I have incorporated your changes, except for this line:
>
>
>
>
>>-----Original Message-----
>>From: Kevin Smathers [mailto:kevin.smathers@hp.com]
>>Sent: 17 June 2003 00:03
>>To: www-rdf-dspace
>>Subject: Comments on Motivating problems
>>
>>
>>Hi all,
>>
>>More to follow, but here are my first set of comments from a quick
>>review of the current document status.
>>
>>
>>--
>>========================================================
>> Kevin Smathers kevin.smathers@hp.com
>> Hewlett-Packard kevin@ank.com
>> Palo Alto Research Lab
>> 1501 Page Mill Rd. 650-857-4477 work
>> M/S 1135 650-852-8186 fax
>> Palo Alto, CA 94304 510-247-1031 home
>>========================================================
>>use "Standard::Disclaimer";
>>carp("This message was printed on 100% recycled bits.");
>>
>>
>>
>>

--
========================================================
Kevin Smathers kevin.smathers@hp.com
Hewlett-Packard kevin@ank.com
Palo Alto Research Lab
1501 Page Mill Rd. 650-857-4477 work
M/S 1135 650-852-8186 fax
Palo Alto, CA 94304 510-247-1031 home
========================================================
use "Standard::Disclaimer";
carp("This message was printed on 100% recycled bits.");
? patch2.txt
? relevantTechnologies/foo
Index: simile.bib
===================================================================
RCS file: /cvs/simile/docs/simile.bib,v
retrieving revision 1.26
diff -u -r1.26 simile.bib
--- simile.bib 17 Jun 2003 13:32:55 -0000 1.26
+++ simile.bib 17 Jun 2003 18:31:34 -0000
@@ -796,3 +796,13 @@
  organization="MIT",
  howpublished="\url{http://ocw.mit.edu}"}
 
+@MISC{shibboleth,
+ TITLE="{Shibboleth Initiative}",
+ organization="Internet2",
+ howpublished="\url{http://shibboleth.internet2.edu/shib-intro.html}"}
+
+@MISC{Damianou,
+ TITLE="{A Policy Framework for Management of Distributed Systems}",
+ author="Nicodemos Damianou",
+ organization="Imperial College's Policy Research Group",
+ howpublished="\url{http://www-dse.doc.ic.ac.uk/Research/policies/ponder/thesis-ncd.pdf}"}
Index: relevantTechnologies/technologies.tex
===================================================================
RCS file: /cvs/simile/docs/relevantTechnologies/technologies.tex,v
retrieving revision 1.3
diff -u -r1.3 technologies.tex
--- relevantTechnologies/technologies.tex 17 Jun 2003 13:33:09 -0000 1.3
+++ relevantTechnologies/technologies.tex 17 Jun 2003 18:31:34 -0000
@@ -496,7 +496,8 @@
 The other naming problem is when we are using URLs to describe documents
 and their subcomponents i.e. identifying resources smaller than the atomic
 document.
-Doing this with a URL is arguably convenient, in that it permanently
+Doing this within a URL (according to some specification of the URL
+semantics) is arguably convenient, in that it permanently
 binds the smaller object to its containing object, giving you the
 semantics that if you are looking for the smaller object it is a good
 subgoal to look for the containing object.
@@ -515,53 +516,42 @@
 that depict violence, and remove them during playback of the movie.
 
 Obviously the metadata read by the DVD player will have to include
-data that identifies the parts of the overal movie that represent the
-selected content. Using a URL to represent the content is insufficient
--- we can't create new URL's for every possible subregion of a movie,
+data that identifies the parts of the overall movie that represent the
+selected content. Using an opaque URL to represent the content is
+insufficient -- we can't create new URL's for every possible subregion
+of a movie,
 and even if we did so, such an approach wouldn't help in finding an
 playing back parts of the movie that do not correspond to that URL.
-
-Naming, as is being described in section 3.2.7, has nothing to do with
-the URL for the asset. The purpose of naming is to create a linkage
+The purpose of naming in this context is to create a linkage
 between the metadata and the movie subregion.
 
-Stepping out of our example, the purpose of Naming in this document
-is to represent other assets in ways that URLs cannot. Such linkages
-are neccessarily specific to the type of data being indexed so they
-cannot be generalized to a single technology, but that doesn't mean
-that we can't create a pattern around them.
+Stepping out of our example, the purpose of Naming
+is to represent other assets in ways that opaque URLs cannot. Such linkages
+are semantically tied to the type of data that they index, so
+so they cannot be generalized to a single technology, but that doesn't mean
+that we can't create a usage pattern around them.
 
 While using URLs with semantics is one option, an alternative way to
-specify a particular subpart of the movie is with a blob of RDF eg,
-there is a resource foo (no semantics) and assertions
-"foo fragment-of the-lord-of-the-rings", "foo start-offset 300", and
-"foo end-offset 500". Whatever semantics I intend to place in the URL, I
-can instead, without any loss of expressive power, place in a blob of
-RDF statements. This leaves me with URLs containing no semantics at
-all, which has a consistency I like.
-
-There are many different ways to represent the subgraph in question. You
-have broken it up into three statements (and an implied statement of the
-schema type), another implementation might use more statements or fewer.
-In addition there are many other types of documents that could be named,
-in whole or in part.
-
-The point of the Naming discussion is to map those statements to their
-meaning, where the meaning is a subindex into a document. This makes
-Name a specialization of Class.
-
-The issue here is not so much whether or not URNs are appropriate for
-each of the names, but rather:
-
-by what mechanism are the names generated and assigned?
-which of the URNs are URI's, and which are URL's? How can I tell?
-\begin{itemize}
-\item Here "A" could be a URL, but if I wish it to be location-independent
-I may assign a URN and use some mapping service (PURLs, Handles).
- "A" is a URL - do an http:get
-\item "B" is probably a URN, not useful to attempt to resolve it.
- I must map to some query on the contents of the graph represented by contents of "A".
-\item "C" could be either a URN or a URL. How do I find the schema? Not sure
+specify a particular subpart of the movie is with an RDF subgraph. Suppose
+there is a resource some-uri (no semantics) and assertions
+"some-uri fragment-of the-lord-of-the-rings", "some-uri start-offset 300", and
+"some-uri end-offset 500". Whatever semantics I intend to place in the URL, I
+can instead, without any loss of expressive power, place in an
+RDF subgraph. This leaves the URL free of semantics and thereby confers the
+benefit of restricting semantic data to RDF as the solitary format.
+
+Another issue here is by what mechanism are the names generated and assigned.
+Which of the URNs are URI's, and which are URL's, and how can I tell?
+\begin{itemize}
+\item Here some-uri could be a URL, but if I wish it to be location-independent
+I may assign a URN and use some mapping service (PURLs, CNRI Handles).
+ If some-uri is a URL do an http GET to retrieve the metadata needed to
+ locate the subregion of film.
+\item The predicates fragment-of, start-offset, and end-offset are
+probably URN's, it is not useful to attempt to resolve them.
+ Collectively these predicates map to some query on the contents of the
+ graph represented by contents of some-uri.
+\item The object the-lord-of-the-rings could be either a URN or a URL. Either way there must be some means of discovering the schema or type by which the object together with the semantic subgraph should be interpreted.
 \end{itemize}
 
 \subsection{Processing Models}
@@ -613,7 +603,7 @@
 support automated discovery.
 \end{itemize}
 
-It is possible to higlight this with some other processing models:
+It is possible to highlight this with some other processing models:
 
 \subsubsection{Resource directory discovery via namespace processing model}
 
@@ -780,7 +770,16 @@
 
 \subsection{Classification}
 
-One important issue is classification, but it has several different axes:
+One important issue is classification, that is the specification of
+the type of a referenced resource, and also of the role of that resource in
+the current context. In RDF graphs the classification role of a
+resource is obtained from the predicate that links that resource
+to the current graph. When the linked resource is itself an RDF
+graph then the type can be inferred from RDFS statements in the
+referenced graph. When the linked resoruce is not an RDF graph there
+must be some other mechanism for describing type. However, even
+when type is clearly specified, the classification of a resource
+has several different axes:
 
 \begin{description}
 \item [Metadata versus original versus abstract object] Classification
@@ -809,6 +808,12 @@
 \caption{\label{dissemninationdiagram}Dissemination}
 \end{figure}
 
+Dissemination implies a requirement for transformational repurposing
+of the data and metadata stored within the system. A graph might be
+represented to a web browser using a directory metaphor, or it might
+be transformed into embedded HTML to start a multimedia player to turn
+MP3 data into sound.
+
 \subsection{To Humans}
 
 Current thinking is to have an ontology describing how metadata is
@@ -818,7 +823,63 @@
 
 \subsection{To software / agents}
 
-Policy-compliant dissemination
+The most successful strategy for defining application to application
+communication formats, permissions, and policies, has been application
+specific. If a web site wants to share a service with another web site
+then the authors of the respective sites agree on a set of ad-hoc
+interfaces. Web services standards such as SOAP and XML-RPC are starting
+to make a dent in these custom interfaces, but the policy and security
+implementations have remained the custom work of each service provider.
+
+\subsubsection{Security and Policy}
+
+One approach to a universal view of distributed security is to express
+permissions or policies in terms of split capabilities (see Alan Karp's
+tech report). By communicating in terms of these capabilities cooperating
+systems can enforce access controls that have been compiled from policies
+which they don't themselves understand.
+
+A more pressing issue is not how to implement access control in general,
+but rather, how to express and enforce dynamic access control policies over
+RDF graphs.
+
+Security must be tightly coupled with Simile's access policy and event
+management mechanisms. That is, access policies, event mechanisms,
+and security must be interdependent. In particular Simile must define
+how one can use policies to restrict access to (RDF) subgraphs. For
+example, given a set of statement identifiers (e.g., reified statements),
+how could one use a domain expression to specify a subset of the set
+to be the target of a security policy?
+
+By extension, Simile may need to express and enforce ``implied''
+policies. For example, a new statement
+being added to a store could cause a policy to now apply to existing
+statements.
+
+Note that the question of how to express access policies
+for RDF is different from (and more interesting than) the question of
+how to express access control policies in RDF. The latter is simply
+a presentation issue, and is common to all uses of RDF, while the former
+strikes at the heart of what security means as applied to RDF.
+
+Simile has not yet committed to a model for its security, or for that matter for its policy mechanisms. In particular, the following issues need to be explored in the context of Simile:
+
+\begin{itemize}
+\item How should we express policies over RDF subgraphs? (This seems a particularly rich area.)
+
+\item How can we enforce policies over RDF subgraphs? (This seems a particularly rich area.)
+\item What should be the granularity of access policies? (e.g., resource, subgraph, model?)
+\item What should be the scope of policies ?
+\end{itemize}
+
+We all agreed that from Genesis's perspective, the granularity could be model-based. That is, that if Genesis provided a means of controlling access to a given model, then Simile could use it to implement its own finer-grained mechanisms.
+
+There are to two specific pieces of related work that are interesting:
+
+\begin{itemize}
+\item The first is A Policy Framework for Management of Distributed Systems \cit{Damianou}, the thesis of Nicodemos Damianou, whose advisor at Imperial College, University of London, is Morris Sloman (and who has also collaborated with Dr. Emil Lupu). Daminanou is a member of the Imperial College's Policy Research Group, which has proposed Ponder, an object-oriented, declarative, programming language for specifying distributed system management and security policies. The Policy Research Group also provides associated tools for editing, compiling and managing policies in a distributed system. (Note that Francisco Garcia from Agilent is a program co-chair for the upcoming IEEE workshop on Policies for Distributed Systems and Networks, where last summer's SEED Lalana Kagal has a paper.)
+\item Also of interest is the Internet2 Shibboleth initiative \cite{shibboleth}. The goal of Shibboleth is to develop an open, standards-based solution for organizations to exchange information about their users in a secure and privacy-preserving manner.
+\end{itemize}
 
 \section{Distributed Resources}
 
@@ -991,12 +1052,15 @@
 For example it may hide the syntax used to express the schema, such as
 RDFS or OWL from the user and instead present the schema graphically.
 
+Codified best practices, whether in the form of processes or wizards,
+will go a long way toward making the schema system easy to use.
+
 \subsection{Simplify}
 
 Applying complex classification schemes on resources could
 negatively impact users' ability to search for resources. It
-is important to hide unnecessary detailS until userS need it. This
-may be done in several ways:
+is important to hide unnecessary details, and then to introduce complex
+operations gradually as users need it. This may be done in several ways:
 
 \begin{itemize}
 \item By providing default behaviors that allow users to carry out typical
@@ -1016,7 +1080,7 @@
 for a user to repeat the same query multiple times on different systems.
 \item It may be desirable to provide mechanism to allow users to update
 several records simultaneously as a set rather than individually when
-performing instance versioning.
+performing schema versioning.
 \item If possible, tasks like merging of records or mapping between schemas
 and vocabularies should be automatic and only require user intervention
 when absolutely necessary.
@@ -1039,17 +1103,29 @@
 other uses and then making recommendations based on the items other
 users have searched for.
 
-There are some limitations with the current versions of such systems.
-Most notably they have no way for a user to denote the context for
-their search: therefore on Amazon a user may search for very different
+Such systems do have limitations. Most notably it can be difficult to
+infer the context for a search: therefore on Amazon a user may search
+for very different
 items if they are purchasing an item for a relative compared to when
-they are purchasing items for themselves. Therefore making recommendations
+they are purchasing items for themselves. Making recommendations
 based on the entire users history may not be as effective as making
-recommendations based on recent search terms from the user. Also there
-are potential privacy issues that need to be addressed when recording
-user behavior, whether it is occurring with or without their knowledge.
+recommendations based on recent search terms from the user.
+
+Also when combining user behaviors into groups it is important for
+many applications to protect privacy. When recording
+user behavior, whether it is occurring with or without their knowledge,
+the system must be able to remove any personally identifying information
+from the collective behavior predictors.
+
+Other factors include whether to have users self-categorize, or to
+infer categorization from a best fit of behavior. There must also be
+mechanisms for defining and refining these categories.
 
 \subsection{Policy Expression}
+
+In user interfaces policy expressions need to be translated into human
+readable text that simply and concisely informs the users of their
+rights and limitations with respect to accessed data.
 
 \subsection{Misc}
 
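
The naming hunk in the technologies.tex patch describes a movie subregion as an RDF subgraph: an opaque resource some-uri carrying the assertions fragment-of, start-offset, and end-offset. As a rough illustration only, the sketch below stores those three statements as plain Python tuples and maps them back to a subregion. The `Fragment` type and the `resolve_fragment` helper are hypothetical names introduced here, not part of Simile or any RDF library, and a real implementation would presumably use an RDF store and full URIs.

```python
from typing import NamedTuple, Optional

# Stand-in for an RDF graph: (subject, predicate, object) tuples.
# The names mirror the example in the patch; the integer offsets are
# an assumption about how the literals would be typed.
TRIPLES = [
    ("some-uri", "fragment-of", "the-lord-of-the-rings"),
    ("some-uri", "start-offset", 300),
    ("some-uri", "end-offset", 500),
]


class Fragment(NamedTuple):
    """Hypothetical result type: a subregion of a containing asset."""
    container: str
    start: int
    end: int


def resolve_fragment(triples, name) -> Optional[Fragment]:
    """Collect the statements about `name` and interpret them as a subregion.

    Returns None if any of the three expected predicates is missing.
    """
    props = {pred: obj for subj, pred, obj in triples if subj == name}
    try:
        return Fragment(
            props["fragment-of"], props["start-offset"], props["end-offset"]
        )
    except KeyError:
        return None


if __name__ == "__main__":
    print(resolve_fragment(TRIPLES, "some-uri"))
```

The point of the sketch is that some-uri itself carries no semantics: everything needed to locate the subregion lives in the statements, which is the property the patch argues for.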
Received on Tuesday, 17 June 2003 14:34:19 UTC