Re: Question on the boundaries of content negotiation in the context of the Web of Data

From: ashok malhotra <ashok.malhotra@oracle.com>
Date: Wed, 18 Feb 2009 13:37:52 -0800
Message-ID: <499C7FB0.7070803@oracle.com>
To: Martin Nally <nally@us.ibm.com>
CC: www-tag@w3.org, www-tag-request@w3.org
Hi Martin:
I still live in New York but I'm in Redwood Shores periodically at 
Oracle HQ.
In fact we are hosting the TAG f2f there March 3-5.

It looks like we are going to be discussing the metadata/CN business in
the coming weeks, so go ahead and send the note.  Might as well have all
the use cases on the table.

All the best, Ashok


Martin Nally wrote:
>
> Thanks, Ashok,
>
> Although I still work closely with people in RTP, I no longer live 
> there - I moved back to Southern California (Orange County) in 2002
> for family reasons (my wife is from this area). I would be very 
> interested in finding other opportunities to meet and discuss though. 
> Where in the world are you?
>
> This is the week of coincidences. The TAG discussions on content 
> negotiation are at the heart of one of our thorny problems trying to 
> build products on web architectures and technologies - your topic 
> below is at the heart of another of our thorny problems. I'm already 
> feeling a little sheepish about burdening the TAG with a long note on 
> one topic - I'm not sure if I should risk diving into a second one - 
> what do you think? If there is interest, I can describe the approach 
> we took (which is different from yours), which rocks we ran aground 
> on, and how we are attempting to get off the rocks.
>
> Best regards, Martin
>
> Martin Nally, IBM Fellow
> CTO, IBM Rational
> tel: (949)544-4691
>
> ashok malhotra <ashok.malhotra@oracle.com> wrote on 02/18/2009 03:40 PM:
>
> Hi Martin:
> Good to hear from you!
>
> I am interested in this thread from a somewhat different viewpoint,
> which is as follows:
> Suppose you have a URI (which may or may not point to a document) that
> has additional information about the resource associated with it. (I
> call this additional information metadata.) How do you find and access
> individual pieces of metadata?
>
> Mark Nottingham and Eran Hammer-Lahav have published 3 IETF drafts on
> this subject.
> Their methods, though, require 2 round-trips, one to get the URIs for
> all the metadata and the second to get at the specific metadata you
> want.  My thinking is that if you know the type of the metadata you
> want, you can use content negotiation to specify that and you can access
> the metadata in a single round trip.
>
> When I suggested this some time ago, my hand was slapped and I was told
> that this was not a good use of CN.
> Now, it seems that some folks at least are thinking of broadening the
> use of CN and that may sanctify your design and other similar use cases.
>
> Are you at IBM RTP?  I ask because I will be there for a meeting March
> 10-12.
> This is the week after the TAG f2f and we may well have some interesting
> 'progress' to discuss.
>
> All the best, Ashok
>
>
> Martin Nally wrote:
> >
> > Hi, Ashok,
> >
> > I apologize in advance for the length of this email - this is
> > especially rude since I'm new to this forum. You know me personally,
> > of course (it's been a long time; I hope you are well), and so does
> > Noah Mendelsohn, but for those who don't, I work in IBM as the CTO of
> > the Rational software brand which develops products and services to
> > support our customers' software development needs.
> >
> > By coincidence I have been corresponding with Noah privately on the
> > same (or a closely related) question. In our example, we have used
> > HTML and RDF as the content types, rather than images and RDF. This is
> > more than a detail of the example for us - the HTML in question is our
> > product's AJAX web UI implementation. Below is a summary of the
> > discussion between me and Noah. You won't find any brilliant insights
> > from me to answer the question, but you will find an explanation of
> > why this question seems really, really important to us right now. You
> > will also find an appeal for help and advice. You may be amused or
> > horrified at my attempt at the bottom of the email to justify the
> > answer we want to hear despite the obvious objections.
> >
> > We are implementing products where the underlying data are exposed as
> > a web of resources accessed via HTTP. Our clients are implemented
> > using HTML and JavaScript in an AJAX style. Since both our data and
> > our UI are now on the web, we have the problem of how to relate the
> > two. We are aware that others have written about this topic – for
> > example we are aware of this document:
> > http://www.w3.org/TR/2008/NOTE-cooluris-20080331/. Unfortunately, the
> > guidance we are finding is not proving entirely satisfactory or
> > relevant to our situation, which is described below using examples
> > from GMail and YouTube. We're not implementing email or video-sharing
> > products at IBM Rational, but the parallel to our own products is
> > close enough to illustrate the point.
> >
> > The base URL for Gmail is http://mail.google.com/mail/ which appears
> > to redirect to http://mail.google.com/mail/#inbox. Within your inbox,
> > you can click on an email - if you do, Gmail will open your email and
> > the browser address bar will change to something like this:
> > http://mail.google.com/mail/#inbox/11f804dfae358bd9. An improbable
> > number of POSTs and GETs go on under the covers before this URL
> > appears and none of them would make you expect that this URL would
> > appear, but somehow it does - GMail is not simple. Security will
> > hopefully stop you from following this link to this email, but I can
> > do it. So GMail provides me with URLs for each of my emails of the
> > form http://mail.google.com/mail/#inbox/11f804dfae358bd9, and it makes
> > those URLs appear in the address field, which is where users would
> > expect they would appear. That is fine if I'm a human that wants to
> > interact with GMail, but what if I'm a client that wants to get at the
> > email itself, not the GMail UI for the email? The products we are
> > working on must support both scenarios. One option that GMail could
> > implement is to offer a "link" button like the one in Google Maps that
> > Noah brought to my attention, but instead of putting the "UI url" in
> > there, it could put the "data url". In fact, YouTube does something
> > close to this - look at the content of the "embed" field on a YouTube
> > page - it includes the URL of the video separate from the URL of the
> > YouTube page that embeds the video.
> >
> > Just for the sake of an example, let's assume we, and GMail, did as
> > YouTube does, and assume the matching "data url" for the email above
> > is http://mail.google.com/11f804dfae358bd9. Am I now in good shape?
> > From one point of view, it's not bad, because I have both URLs for my
> > email, one for a UI for humans using a browser and a second one for
> > other purposes. If I can remember which URL is for which purpose,
> > always use the right one at the right time, always email both of them
> > to others, so they can do the same and so on, then it works. Not only
> > is this a pain, but uncaught mistakes will have negative consequences,
> > like defeating searches if the wrong URL is stored in data. Much
> > simpler would be to have a single URL that just always did the right
> > thing. This is why the pattern documented here is attractive:
> > http://www.w3.org/TR/2008/NOTE-cooluris-20080331/#r303gendocument. If
> > we took this approach, we would only need the “data URL” -
> > http://mail.google.com/11f804dfae358bd9 in the GMail example. If I
> > pasted that into my browser, content negotiation could get back the
> > same HTML that is returned by the real GMail URL. On the other hand, if I
> > gave the URL to some other sort of program that wanted an RDF
> > representation or an XML representation, content negotiation would
> > again give the right thing. This is a huge improvement in usability of
> > my solution.
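
A minimal sketch of the single-URL pattern described above, assuming a server that can offer the same email either as text/html or as application/rdf+xml. The function names (parse_accept, quality, choose_representation) and the list of available types are illustrative assumptions, not anything GMail or IBM Rational is known to implement. The q-value handling is a simplified reading of the HTTP/1.1 Accept rules: media type parameters other than q are ignored, and ties go to the server's preferred order.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    if not header.strip():
        # No Accept header: treat every media type as acceptable.
        return [("*/*", 1.0)]
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        ranges.append((media_range, q))
    return ranges


def quality(media_type, ranges):
    """Return the q-value of the most specific range matching media_type."""
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2
        elif media_range == "*/*":
            specificity = 0
        elif media_range.endswith("/*") and media_type.startswith(media_range[:-1]):
            specificity = 1
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def choose_representation(accept_header, available):
    """Pick the available media type the client prefers, or None."""
    ranges = parse_accept(accept_header)
    scored = [(quality(media_type, ranges), media_type) for media_type in available]
    best_q = max((q for q, _ in scored), default=0.0)
    if best_q <= 0:
        return None  # a real server would answer 406 Not Acceptable
    # Ties are broken by the server's own order in `available`.
    return next(media_type for q, media_type in scored if q == best_q)
```

With a browser-style header such as "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" this picks text/html, while a client sending "application/rdf+xml" gets the RDF variant from the same URL.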
> >
> > So why don't we just implement this design? The objection, pointed out
> > by several of our developers (and me), is that it's a distortion to say
> > that the GMail HTML returned by
> > http://mail.google.com/mail/#inbox/11f804dfae358bd9 is a
> > representation of the email. It's more reasonable to think of it as a
> > JavaScript program that turns around and does a bunch of further GETs
> > and POSTs in whose responses are somewhere buried a representation of
> > the email. I'm guessing this is why the authors of the paper cited
> > above advised against using content negotiation for this case - it
> > seems like a hack that is not in the spirit of the web architecture.
> >
> > The solution we are considering – and that we’d like some feedback on
> > – is to use content-negotiation despite the objections. This design
> > has by far the best characteristics from a user perspective. If we had
> > less delicate design sensitivities, we’d probably just implement this
> > and not worry about justifying it - perhaps we are blind to problems
> > this will cause later. Rather than pick a different design with worse
> > user characteristics in order to fit the classic model, we choose
> > instead to invent a justification for why it’s ok, as follows.
> >
> > “HTML started life as a language for representations of web documents.
> > Browsers were user agents that took HTML representations of web
> > documents, displayed them to users and allowed them to navigate the
> > web. This is still the basis of much of the web. Over time, with the
> > addition of forms, JavaScript and AJAX, HTML acquired the capabilities
> > of a full programming language and the browsers acquired the
> > characteristics of a programmable run-time environment. Many modern
> > HTML response documents are no longer representations of anything that
> > is meaningful to users. Instead of being representations of resources
> > that are interpreted by the browser acting as a user agent, these HTML
> > documents are implementations of specialized user agents that execute
> > in the browser as a run-time platform. Given that HTML and the browser
> > now have two distinct meanings and roles – 1) document
> > representations/user agents and 2) implementations of specialized user
> > agents/run-time platforms – we permit our servers to take a more
> > liberal view of the meaning of an HTTP GET when the accept header
> > includes text/html. Our server may either return an HTML
> > representation of the requested document, or it may return the
> > implementation of a specialized user agent implemented in HTML for
> > that resource.”
> >
> > Please advise us. Is there another technical approach that we should
> > consider that has attractive characteristics for users? Is there a
> > better way of rationalizing the design choice that appears to work
> > best operationally?
> >
> > Best regards, Martin
> >
> > Martin Nally, IBM Fellow
> > CTO, IBM Rational
> > tel: (949)544-4691
> >
> >
> > www-tag-request@w3.org wrote on 02/18/2009 09:48:00 AM:
> >
> > > Jonathan, you said
> > >
> > > "I would think that CN is used (and intended to be used) not just for
> > > choosing between semantically equivalent entities, but also for
> > > semantic subsetting, such as abbreviated representations for mobile
> > > devices, low-resolution displays, audio vs. written, etc. Subsetting
> > > is certainly *not* equivalence."
> > >
> > > So, not equivalence but derived from?  I'm wondering how far we can
> > > push this.
> > > Can CN be used to select, say, between a picture of a house and a text
> > > description?
> > > I was told NO, but perhaps we are rethinking this.
> > >
> > > All the best, Ashok
> > >
> > >
> > > Jonathan Rees wrote:
> > > > I started to turn this into a request for TAG telecon agendum, and
> > > > got stuck on the word "equivalent".
> > > >
> > > > Just to make sure I understand you - by "equivalent" are you
> > > > referring to HTTP 2616 section 13.3.3:
> > > >
> > > >    Entity tags are normally "strong validators," but the protocol
> > > >    provides a mechanism to tag an entity tag as "weak." One can
> > > >    think of a strong validator as one that changes whenever the
> > > >    bits of an entity changes, while a weak value changes whenever
> > > >    the meaning of an entity changes. Alternatively, one can think
> > > >    of a strong validator as part of an identifier for a specific
> > > >    entity, while a weak validator is part of an identifier for a
> > > >    set of semantically equivalent entities.
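
The strong/weak distinction quoted above can be sketched as two comparison functions, in the spirit of the strong and weak comparison functions that RFC 2616 section 13.3.3 describes. The helper names (parse_etag, strong_match, weak_match) are assumptions made for this illustration, and the two sample tags are the ETag values from the curl output further down the thread.

```python
def parse_etag(value):
    """Split an ETag header value into (is_weak, opaque_tag)."""
    value = value.strip()
    is_weak = value.startswith("W/")
    if is_weak:
        value = value[2:]
    return is_weak, value


def strong_match(a, b):
    """Strong comparison: tags identical and neither one marked weak."""
    weak_a, tag_a = parse_etag(a)
    weak_b, tag_b = parse_etag(b)
    return not weak_a and not weak_b and tag_a == tag_b


def weak_match(a, b):
    """Weak comparison: opaque tags identical, weak markers ignored."""
    return parse_etag(a)[1] == parse_etag(b)[1]


# ETag values taken from the PNG and Turtle responses in this thread.
PNG_ETAG = '"5c0fd-2deb-462b760a7f5c0;462b77ce8a040"'
TTL_ETAG = '"5c0fc-173-462b76098b380;462b77ce8a040"'
```

Under either comparison the PNG and Turtle tags above do not match, so in that exchange the server is not using entity tags to assert that the two variants are equivalent.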
> > > >
> > > > and are you specifically asking about the use of entity tags? Or
> > > > were you really asking the broader question about the use of CN
> > > > that people like me were eager to answer? Because I think these
> > > > are two different questions.
> > > >
> > > > If you're asking for advice on "good practice" around the use of
> > > > entity tags, the only example given in RFC 2616 is that of hit
> > > > counters, which seems quite a long way from "semantic equivalence"
> > > > of an image and some RDF. I'd be surprised if anyone would argue in
> > > > favor of allowing a cached PNG to be returned when RDF was available
> > > > and preferred. On the other hand, the question of under which
> > > > circumstances (if any) you are advised to use CN to choose between
> > > > PNG and RDF has a very different character. Perhaps some server
> > > > software has chosen to assume co-representations are equivalent for
> > > > caching purposes, but if this is allowed by RFC 2616 I'd be very
> > > > interested to hear the argument.
> > > >
> > > > I would think that CN is used (and intended to be used) not just for
> > > > choosing between semantically equivalent entities, but also for
> > > > semantic subsetting, such as abbreviated representations for mobile
> > > > devices, low-resolution displays, audio vs. written, etc. Subsetting
> > > > is certainly *not* equivalence.
> > > >
> > > > Obviously there is appeal to a slippery adjective "semantic", which
> > > > you're never going to pin down in a manner that is both rigorous and
> > > > general, but you could legitimately ask someone to list some
> > > > positive and negative examples and situations where differences
> > > > between representations might or might not matter to users and/or
> > > > applications.
> > > >
> > > > Jonathan
> > > >
> > > > On Thu, Feb 12, 2009 at 7:29 AM, Michael Hausenblas
> > > > <michael.hausenblas@deri.org> wrote:
> > > >  
> > > >> Dear TAG members, dear subscribers,
> > > >>
> > > > >> I would like to ask you about your opinion on the following
> > > > >> scenario. Please note that (1) though I'm a member of the W3C
> > > > >> Media Fragments WG I speak only for myself, and (2) that all URIs
> > > > >> used in the following are dereferenceable and made out of 100%
> > > > >> recycled electrons.
> > > >>
> > > >> Given three URIs, namely,
> > > >>
> > > >> <http://sw-app.org/sandbox/house>
> > > >>
> > > >> <http://sw-app.org/sandbox/house.png>
> > > >>
> > > >> <http://sw-app.org/sandbox/house.ttl>
> > > >>
> > > > >> is it 'allowed' (that is, does it break the Web architecture) if
> > > > >> one does the following:
> > > > >>
> > > > >> $ curl -I -H "Accept: image/png" http://sw-app.org/sandbox/house
> > > >> HTTP/1.1 200 OK
> > > >> Date: Thu, 12 Feb 2009 12:12:39 GMT
> > > >> Server: Apache/2.2.3 (CentOS)
> > > >> Content-Location: house.png
> > > >> Vary: negotiate,accept
> > > >> TCN: choice
> > > >> Last-Modified: Thu, 12 Feb 2009 11:54:07 GMT
> > > >> ETag: "5c0fd-2deb-462b760a7f5c0;462b77ce8a040"
> > > >> Accept-Ranges: bytes
> > > >> Content-Length: 11755
> > > >> Connection: close
> > > >> Content-Type: image/png
> > > >>
> > > >> $ curl -I -H "Accept: text/turtle" http://sw-app.org/sandbox/house
> > > >> HTTP/1.1 200 OK
> > > >> Date: Thu, 12 Feb 2009 12:13:01 GMT
> > > >> Server: Apache/2.2.3 (CentOS)
> > > >> Content-Location: house.ttl
> > > >> Vary: negotiate,accept
> > > >> TCN: choice
> > > >> Last-Modified: Thu, 12 Feb 2009 11:54:06 GMT
> > > >> ETag: "5c0fc-173-462b76098b380;462b77ce8a040"
> > > >> Accept-Ranges: bytes
> > > >> Content-Length: 371
> > > >> Connection: close
> > > >> Content-Type: text/turtle
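
For readers who want to replay the exchange above without the original host, here is a rough local stand-in using only the Python standard library. It is a sketch under stated assumptions: the handler, the VARIANTS table and the head_with_accept helper are invented for this illustration, the matching is exact-string only (no q-values or wildcards), and it sends a plain "Vary: Accept" rather than the "Vary: negotiate,accept" and "TCN: choice" headers that Apache returned.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical variant table mirroring house.png and house.ttl above.
VARIANTS = {
    "image/png": "house.png",
    "text/turtle": "house.ttl",
}


class HouseHandler(BaseHTTPRequestHandler):
    def do_HEAD(self):
        if self.path != "/sandbox/house":
            self.send_response(404)
            self.end_headers()
            return
        accept = self.headers.get("Accept", "")
        requested = [part.split(";")[0].strip() for part in accept.split(",")]
        chosen = next((t for t in requested if t in VARIANTS), None)
        if chosen is None:
            self.send_response(406)  # nothing acceptable to offer
            self.send_header("Vary", "Accept")
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", chosen)
        self.send_header("Content-Location", VARIANTS[chosen])
        self.send_header("Vary", "Accept")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def head_with_accept(accept):
    """Start a throwaway local server, send one HEAD request, return
    (status, Content-Type, Content-Location)."""
    server = HTTPServer(("127.0.0.1", 0), HouseHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request("HEAD", "/sandbox/house", headers={"Accept": accept})
        response = conn.getresponse()
        response.read()
        result = (
            response.status,
            response.getheader("Content-Type"),
            response.getheader("Content-Location"),
        )
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

Calling head_with_accept("image/png") and head_with_accept("text/turtle") against this stand-in yields different Content-Location values for the same request URI, which is the behaviour the question is about.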
> > > >>
> > > > >> Please note that I don't ask if this works. It does. Obviously. The
> > > > >> question, to put it in other words, is: is the PNG *representation*
> > > > >> derived via conneg from the generic resource
> > > > >> <http://sw-app.org/sandbox/house> equivalent to the RDF in Turtle?
> > > > >>
> > > > >> If not, why not? If it is, can you please point me to a finding,
> > > > >> note, a specification, etc. that 'normatively' defines what
> > > > >> 'equivalency' really is?
> > > >>
> > > >> Cheers,
> > > >>      Michael
> > > >>
> > > >> --
> > > >> Dr. Michael Hausenblas
> > > >> DERI - Digital Enterprise Research Institute
> > > >> National University of Ireland, Lower Dangan,
> > > >> Galway, Ireland, Europe
> > > >> Tel. +353 91 495730
> > > >> http://sw-app.org/about.html
> > > >>    
> > > >
> > > >  
> > >
> >
>
>
Received on Wednesday, 18 February 2009 21:39:48 GMT
