Re: See Other from Hugh Glaser on 2012-03-28 (public-lod@w3.org from March 2012)

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Wed, 28 Mar 2012 21:30:12 +0000
To: Jesse Weaver <weavej3@rpi.edu>
CC: public-lod community <public-lod@w3.org>
Message-ID: <EMEW3|36db5bd9c4be6e9b60df544146067e36o2RMUI02hg|ecs.soton.ac.uk|329CE59E-26C6->

Many thanks, Jesse,
Very helpful to see the blow by blow of the process and some history.
I hope you also saw my subsequent message where I emphasised I was not knocking Facebook - there are similar experiences for any site that conforms to the current standards/best practices.

Just a few comments.
So the graph.facebook.com was a decision of Facebook, before the Linked Data provisions - yes, I now recall that.

But if FB is not interested in users being able to find their FB LD IDs, since they are only for developers, then I guess none of this matters.
My experience was as a FB user who wanted to put a nice ID in my signature, not as a developer.
And if FB has made a policy decision not to return anything other than HTML from www.facebook.com, even with Conneg, then so be it.
But I'm not sure what Facebook's story is for what I should do when I want to get a FB LD ID.
Exactly how do I explain to people how to tell me what to put in sameAs.org for them, for example?
I can tell you that a well-known SemWeb person sent me the wrong thing earlier today!
I had to hack the text of the URI (URIs are meant to be opaque, of course), and do some other stuff, to get to the LD ID.
I still think that he (oops, that narrows the identity a bit) should have been able to email me the thing in the address bar, and I should have simply been able to use that.

In response to your interest in "how Facebook could have done things better":
The main thing I had in mind was something like 
curl -i -L -H Accept:text/turtle http://www.facebook.com/501730978
would return 200 and the turtle.
And I guess that
curl -i -L -H Accept:text/turtle http://www.facebook.com/danbri
would 30x to 
http://www.facebook.com/501730978
and thence to
the same turtle.
(It could redirect the other way, of course, as the web pages do.)
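To make that concrete, here is a minimal sketch in Python of the decision I have in mind. It is only an illustration, not anything Facebook runs: the lookup table VANITY_TO_ID, the function names respond and follow, and the choice of 302 for the "30x" are all assumptions of mine.

```python
# Hypothetical sketch of the behaviour described above; nothing here is
# Facebook's actual code. The table below is an assumed stand-in for
# whatever maps vanity names to numeric IDs.
VANITY_TO_ID = {"danbri": "501730978"}
KNOWN_IDS = set(VANITY_TO_ID.values())


def respond(path, accept):
    """Return (status, headers) for a GET on a www-style path."""
    name = path.strip("/")
    if name not in VANITY_TO_ID and name not in KNOWN_IDS:
        return 404, {}
    # Media types listed in the Accept header, ignoring any ;q= parameters.
    offered = [part.split(";")[0].strip() for part in accept.split(",")]
    if "text/turtle" not in offered:
        # Browsers and other HTML clients keep getting the normal page.
        return 200, {"Content-Type": "text/html"}
    if name in VANITY_TO_ID:
        # Vanity name: redirect to the numeric form (302 is an arbitrary 30x).
        return 302, {"Location": "/" + VANITY_TO_ID[name]}
    # Numeric ID: serve the Turtle directly.
    return 200, {"Content-Type": "text/turtle"}


def follow(path, accept, max_hops=5):
    """Follow redirects the way curl -L would, up to max_hops."""
    for _ in range(max_hops):
        status, headers = respond(path, accept)
        if status not in (301, 302, 303, 307):
            return status, headers
        path = headers["Location"]
    raise RuntimeError("too many redirects")
```

With this, follow("/danbri", "text/turtle") ends at the same Turtle response as respond("/501730978", "text/turtle"), so the thing in the address bar would work as given.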

I love the idea that you delivered RDF as "just another format", plugging in to the structure already there with a format conversion shim.
That is a real triumph.
I often tell people they can do that, and I will use that as a real example.

Best
Hugh

On 28 Mar 2012, at 21:25, Jesse Weaver wrote:

> Hi Hugh.
> 
> I have avoided participating in these httpRange-14 debates, but since you have brought the Facebook Linked Data into the discussion, I feel compelled to respond.  The goal (or my goal) regarding Facebook's Linked Data provided through its Graph API was to allow for sensible Linked Data RDF to be published in a way that did not interfere with maintenance of existing code and in a way that would require very little maintenance in the future.  Please see my inline comments below, and also some comments at the end.
> 
> On Mar 28, 2012, at 6:44 AM, Hugh Glaser wrote:
> 
>> Executive summary:
>> TAG, please don't come back with something that does not allow, or even encourage, sites like Facebook to offer RDF back in return for:
>> curl -L -H Accept:application/rdf+xml https://www.facebook.com/hugh.glaser
>> 
>> Challenge: Try telling me what to put in sameAs.org for the LD URI for you on Facebook.
>> 
>> Detail:
>> I support Jeni et al.'s Proposal, because it is an improvement, and seems to have some chance of success.
>> Actually, I am pretty sure I align with Giovanni and his ilk.
>> My preference is to lose the whole thing (and these discussions!) - but there is no point, I think, in proposing that because it has no chance of success.
>> 
>> When people talk about "users", they seem to mean developers.
> 
> With regard to Facebook's Graph API, it is indeed targeted toward developers (Linked Data or otherwise).
> 
>> The users I think of are the eyeballs that look at and manipulate the stuff on their screens, usually in a browser.
>> Also, when a posting on this list has:
>> "Well, if I wanted to do this, " or "Imagine…"
>> my own eyeballs sort of glaze over.
>> Well, there have been 6 years to do it or for someone else to actually feel the need to do it - if it hasn't blazed a trail in the huge range of Linked Data-enabled applications (irony intended) being used by users out there, then it probably isn't a very important use case.
>> 
>> My slightly shorter story (thanks Dan, that was great, and I read the whole thing!) involves Facebook as a LD site.
>> In fact, I think this story is complementary to Dan's, as it gives some view of the experience that Bob's users will get after Alice's consultation and the subsequent implementation.
>> This actually happened to me last night.
>> Recalling that I now have a LD ID on Facebook, I go to Facebook and get my ID (well, I think of it as my ID, and it's what I give anyone if they ask for a link to "me").
>> https://www.facebook.com/hugh.glaser
>> (I could stop there, as we all know I already have a problem, but …)
>> Being a brave little chap, before putting it in my signature as one of my LD IDs, I decide to check that this is OK, by pasting it into something that wants a LD ID, such as the W3C validator (in this case I use curl -H Accept:application/rdf+xml).
>> It actually gave a 200, so it must be OK, right?
>> Of course, this doesn't validate because the URI actually does 302 -> 200 and returns text/html in response to my curl.
>> 506 would have been possibly less helpful, by the way.
>> So I am done - nothing I can do now.
>> 
>> However, being not only brave, but also intrepid, I start googling for support.
>> I eventually (it wasn't easy), find that I should be using graph instead of www.
>> With excitement, I try
>> curl -i -L -H Accept:application/rdf+xml https://graph.facebook.com/hugh.glaser
>> Close, but no cigar.
>> I get text/javascript back.
>> More digging (I'll spare you the details)...
>> curl -i -L -H Accept:text/turtle https://graph.facebook.com/hugh.glaser
>> I cannot contain my excitement; I have some RDF at last!
>> So I can use https://graph.facebook.com/hugh.glaser as my Facebook LD ID.
>> Er, not quite.
>> The turtle this returns is
>> </720591128#>
>> 	user:id "720591128" ;
>> Ah yes, I knew I had a numeric ID, 720591128 - so it being late I guess my LD ID is https://graph.facebook.com/720591128
>> Of course, er no, not quite again.
>> I suddenly notice a little # lurking in the turtle.
>> So I finally decide that the URI I should put in my signature is
>> https://graph.facebook.com/720591128#
>> Of course, this is sufficiently ugly, compared with https://www.facebook.com/hugh.glaser
>> that I don't bother, and go to bed.
> 
> I'm surprised that perceived ugliness of a URI (although it is not so ugly to me; beauty is in the eye of the beholder) would deter someone from taking advantage of the Linked Data.  The only differences --- as you have pointed out --- are that graph should be used instead of www, the FBID 720591128 is used instead of hugh.glaser, and the Linked Data URI has (what I call) an empty fragment.  Here are the reasons for these differences:
> 1.  I think (without certainty) that it is Facebook's intention that everything at www.facebook.com be for human eyeballs.  Admittedly, there could be some RDFa, and for some pages, there is RDFa containing Open Graph Protocol markup (do not conflate the Open Graph Protocol and the Graph API).  "Raw" data is made available --- targeting developers --- via the Graph API at http://graph.facebook.com (if you click that link without adding a path, it will redirect to documentation).
> 2. The FBID is used instead of the relative "vanity URL" (e.g., /hugh.glaser) because not every user has a vanity URL, and even if each user did, not every *thing* has a vanity URL.  The Graph API provides more than just data about users, and to quote Facebook's documentation ( https://developers.facebook.com/docs/reference/api/ ): "Every object in the social graph has a unique ID."
> 3. The use of the empty fragment is the easiest way to take advantage of how the Graph API works.  Prior to serving up text/turtle, the Graph API served up only JSON at, e.g., http://graph.facebook.com/720591128 .  That is the place to find data about you.  With little interference to existing code, when text/turtle is requested, the JSON is merely translated into text/turtle, making use of the internal system to provide meaningful semantics.  One of the problems is that a URI needs to be minted for instances (e.g., a user), and given httpRange-14, I have the choice of using a hash URI and returning 200 OK or using a slash URI and 303'ing to somewhere else.  Using the empty fragment seemed like the most acceptable option.  (See dialogue at the end of this email.)
> 
>> 
>> Now I'm not saying that the TAG is going to solve all these issues.
>> And there are lots of issues about 303 and # and RDFa …
>> 
>> But I think this is a real Use Case for a user, which should mean that the developer who provides this system (Facebook) is a Use Case for the TAG.
> 
> The developer of the Linked Data would be me.  I worked on this while interning at Facebook during the summer of 2011.  I have since returned to RPI to continue working toward my Ph.D.
> 
>> I could have gone through a very similar process with almost any Linked Data site, such as ePrints, myExperiment and DBpedia (including my own, such as RKBExplorer) - it just happened I wanted Facebook last night.
>> And Linked Data people go around saying how exciting it is that Facebook is offering Linked Data - I can't possibly use this as an example to a customer, such as Dan's Bob.
>> 
>> This whole experience is just crap.
> 
> Perhaps that experience was unpleasant.  Here's a marginally better one:
> 1. When you log into Facebook and go to your timeline (your own page), the path of the URL in the browser looks like either, e.g., /hugh.glaser or /profile.php?id=720591128 .  In the latter case, you have already found your FBID.
> 2. If you have a vanity URL, like /hugh.glaser , simply do an HTTP GET for http://graph.facebook.com/hugh.glaser , and the response contains your FBID.
> 3. The URI representing you is http://graph.facebook.com/FBID# , where FBID should be the FBID number.
> 
> Yes, there is the HTTPS discrepancy, and yes, this probably isn't ideal in terms of discovering the URI that identifies a user.
> 
>> If I had trouble with this, exactly what does Facebook expect a normal user to do?
>> I'm sure we can point out ways in which Facebook might have done things better, but that is not the point.
> 
> Although I no longer work at Facebook, I would be interested in such "ways in which Facebook might have done things better."  That discussion would be more appropriate in another thread.
> 
>> Can they actually make it easy for users using the current or proposed standards?
>> 
>> TAG, please don't come back with something that does not allow, or even encourage, sites like Facebook to offer RDF back in return for:
>> curl -H Accept:application/rdf+xml https://www.facebook.com/hugh.glaser
>> 
>> Best
>> Hugh
>> PS
>> I left the https in, because that is actually what cut and paste gave me.
>> I'm guessing that would have been a whole new thread.
>> 
> 
> http works, too, unless you're trying to access permissions-protected data, in which case you need to use https and provide a security token.  I'm not sure what the implications are regarding http/https URIs in Linked Data.  Indeed, that would be a whole new thread.
> 
>> PPS
>> If you read through to here, or even if you just skipped to here, then if you really do send me your Facebook LD URI (along with one or more other ones to pair it with), I will drop everything and put them in sameAs.org :-)
>> 
>> -- 
>> Hugh Glaser,
>>            Web and Internet Science
>>            Electronics and Computer Science,
>>            University of Southampton,
>>            Southampton SO17 1BJ
>> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
>> Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
>> http://www.ecs.soton.ac.uk/~hg/
>> 
>> 
>> 
> 
> Finally, I would like to respond to an earlier comment made by Tom Heath (sorry for the incomplete-looking cut-and-paste): "a rigorous assessment of how difficult people *really* find it to understand distinctions such as 'things vs documents about things'. I've heard many people claim that they've failed to explain this (or similar) successfully to developers/adopters; my personal experience is that everyone gets it, it's no big deal (and IRs/NIRs would probably never enter into the discussion)."  My experience at Facebook agrees with Tom Heath's experience.  The distinction between "things" and "documents about things" was easily understood.  The main source of contention was around its pragmatism and necessity.  One developer said to me (paraphrase): "I would conflate documents and things if I could."  It is a strange statement to me, but nevertheless, the distinction was understood.
> 
> In the fashion of Dan Brickley, I would like to present another _hypothetical_ dialogue, one between a proponent of Linked Data and a typical web developer (although perhaps not quite as clever and thorough as Dan's).
> 
> BEGIN DIALOGUE
> 
> Proponent: "I found a way to meaningfully publish our already-published data as Linked Data, and I've implemented a prototype."
> 
> Developer: "Since you've already done it, let's take a look."
> 
> Proponent: "Okay, go to [link]."
> 
> Developer: "Hmmmm... [skip discussion about Turtle vs. RDF/XML].  Everything looks okay, except I notice these URIs have #me at the end.  Why?  Can't we just lose the fragment?"
> 
> Proponent: "Well, URIs are used to identify things both on and off the web.  For example, no HTTP GET will ever squeeze you over a cable and pop you up in my browser."
> 
> Developer: "Sure.  So what?"
> 
> Proponent: "... so we need a way to mint URIs for both things on and off the web that makes sense with how the web already works."
> 
> Developer: "Okay, but why the fragment?"
> 
> Proponent: "I'm getting to that.  The current standard (which shall not be named) is based on the notion that any URI for which an HTTP GET returns 200 OK (these are URIs without fragments) represents the document that is retrieved, that is, something *on* the web."
> 
> Developer: "Okay... seems logical."
> 
> Proponent: "So some conventions have been made for how to identify things *off* the web.  One is to simply add a fragment (understatement meant to avoid confusion at this point), and that can identify something *off* the web."
> 
> Developer: "So I have to have a fragment?  It seems unnecessary and ugly."
> 
> Proponent: "There is an alternative.  You can use a URI without a fragment, but then doing an HTTP GET on the URI must return a 303 which redirects to a document about the thing the URI represents."
> 
> Developer: "303?  What is that?"
> 
> Proponent: "See Other."
> 
> Developer: "Never heard of that.  I don't want to have to create another service just to 303 redirect to already-available data.  Seems superfluous.  Is there any other way?"
> 
> Proponent: "Well, we could actually let the URIs 404.  It's not ideal, but it's legal."
> 
> Developer: "No, I don't want anything to 404.  Never mind then.  What about this #me?  Why 'me'?"
> 
> Proponent: "Well, that's just a common convention for saying that [URL] returns information about [URL]#me.  #this is another common one."
> 
> Developer: "Hmmm... I don't know about that."
> 
> Proponent: "Well, if we don't want to 404, and we don't want to support 303, we'll need some kind of fragment to conform with the current standard.  We could just have an empty fragment so that the changes are minimal, both in terms of effort and appearance."
> 
> Developer: "Okay... I guess... let's go with that, then."
> 
> END DIALOGUE
> 
> Glean from the dialogue what you will.  How would I describe httpRange-14?  Minimally sufficient.
> 
> Jesse Weaver
> Ph.D. Student, Patroon Fellow
> Tetherless World Constellation
> Rensselaer Polytechnic Institute
> http://www.cs.rpi.edu/~weavej3/index.xhtml

-- 
Hugh Glaser,  
             Web and Internet Science
             Electronics and Computer Science,
             University of Southampton,
             Southampton SO17 1BJ
Work: +44 23 8059 3670, Fax: +44 23 8059 3045
Mobile: +44 75 9533 4155 , Home: +44 23 8061 5652
http://www.ecs.soton.ac.uk/~hg/
Received on Wednesday, 28 March 2012 21:30:53 UTC