- From: Melvin Carvalho <melvincarvalho@gmail.com>
- Date: Fri, 10 Nov 2023 22:10:09 +0100
- To: Martynas Jusevičius <martynas@atomgraph.com>
- Cc: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>, Kingsley Idehen <kidehen@openlinksw.com>, public-webid@w3.org
- Message-ID: <CAKaEYhJR_95=fsVHdDzF5OyuTqYng7PMs4H4475vQQdaA-rtzg@mail.gmail.com>
On Fri, 10 Nov 2023 at 17:52, Martynas Jusevičius <martynas@atomgraph.com> wrote:

> https://dbpedia.org/resource/Berlin#Turtle-Doc
>
> This makes no sense. Document URIs are inherently hash-less because
> servers do not see fragment identifiers.

Good read on hash vs slash:
https://lists.w3.org/Archives/Public/www-tag/2002Mar/0121.html

It's more nuanced than people think. Also consider all those years (10+?)
that curl sent # to the server before realising it was a bug.

> On Fri, 10 Nov 2023 at 16.20, Sebastian Hellmann <
> hellmann@informatik.uni-leipzig.de> wrote:
>
>> Hi Kingsley,
>>
>> On 11/10/23 14:33, Kingsley Idehen wrote:
>>
>> I have amendments:
>>
>> 1. We really should go for HTTPS URLs here. We can add a note that HTTP
>> URIs are the more general case; however, these are not meant here in a
>> goal-oriented manner. Ultimately, we cannot securely authenticate a WebID
>> using HTTP, plus I cannot think of a case where it would be useful to
>> have a URI that is not a URL.
>>
>> We SHOULD encourage the use of HTTPS, but not force it on users. Most
>> WebIDs generated by way of SSEO are HTTPS based anyway, since Google has
>> signaled their HTTPS preference to the SEO community etc.
>>
>> Today, only older WebIDs are HTTP based.
>>
>> Whenever you want to authenticate with WebID you MUST NOT use HTTP and
>> you MUST use URLs. As I said, we can add a note and say the "older WebIDs"
>> were HTTP based and that it's conceptually fine.
>>
>> 2. I wouldn't be strict about the # and the Agent (for legacy reasons,
>> i.e. LD published as '/'). I think it can be either:
>>
>> "#" usage is just an option that carries low costs, that's all.
>> Fundamentally, it's "#" if you want to leverage resolution by way of
>> implicit indirection, or "/" if you want to use explicit indirection via
>> content negotiation. Disambiguation is always the core objective.
>>
>> a) example.org/agent5 a Agent .
>>    example.org/agent5#doc a ProfileDoc
>>
>> b) example.org/agent5#agent a Agent .
>>    example.org/agent5 a ProfileDoc
>>
>> c) example.org/agent5#agent a Agent .
>>    example.org/agent5#doc a ProfileDoc
>>
>> b and c would be clearer.
>>
>> 3. Non-information resources can resolve directly with 200 using #
>> entities. This would integrate well in REST APIs. I can see cases where
>> you would want 303, so it should be acceptable to do content negotiation.
>>
>> It is so much easier to speak about these matters in terms of entities
>> and entity description documents. Entities are uniquely identifiable
>> things that comprise perceived structure represented in machine-computable
>> form using an entity relationship graph.
>>
>> These fundamental concepts date back to the beginning of computing, i.e.,
>> we can't compute without this kind of baseline clarity.
>>
>> If you name something using a "#" based HTTP URI, the
>> denotation->connotation indirection just happens without any work. If
>> circumstances lead to using "/", then content negotiation is part of the
>> cost inherited re denotation->connotation indirection. There are no ways
>> around these fundamental matters when it comes to unambiguous entity
>> naming.
>>
>> Another analogy I used to use years ago is as follows:
>>
>> The projector provides a surface for perceiving what's projected. If that
>> distinction doesn't exist, how do we perceive anything bar the projector
>> itself?
>>
>> Well, I am thinking more of a tablet than a projector. I am also a big
>> fan of layered architectures, and my opinion is that we should push
>> semantics to the uppermost layers. I think it is a misconception that
>> machines can know semantics. At the end of the day they work best
>> translating strings into other strings. I think we might fare better with
>> having a graph transport layer (ISO-OSI style).
>> So URLs can be used to get more graph resources via HTTP; then, when you
>> have the graph, you can treat URIs as entities. I would consider this way
>> more practical.
>>
>> 4. I am getting more and more skeptical about the "URI as names for
>> things". Was this really the best way of realizing the GGG? Would it make
>> a significant difference to say "URLs as a tool to retrieve graph nodes
>> and graphs that describe entities"? It would be more in line with the Web,
>> which also delivers docs about entities. Semantically, most people think
>> about data retrieval first and then interpret the data as entities later.
>>
>> You can have a collection of documents comprising entities named using
>> indefinite pronouns (blank nodes), but the onus of disambiguation is then
>> pushed to apps, thereby handing everything off to silo vectors etc.
>>
>> Not saying blank nodes here. Just saying that you use URIs to resolve to
>> more graph data, then interpret the URIs in the retrieved graph as
>> entities. The result is the same, but you can skip the content
>> negotiation.
>>
>> TimBL thought a lot of this through eons ago, but getting it through en
>> masse has clearly been a big challenge.
>>
>> Maybe if we solve some things like HR-14 and the semantic web stack.
>>
>> My main question here is: what part of the web architecture breaks if we
>> implement conneg-free /# mixed URIs? I have asked this of a lot of people
>> in different ways, but nobody can tell me.
>>
>> For this example, let's say DBpedia URIs were native https.
>>
>> If I do `curl -H "Accept: text/turtle" https://dbpedia.org/resource/Berlin`
>> and get a 200 OK with Content-type: text/turtle, I don't see any need to
>> disambiguate anything. The graph says that
>> https://dbpedia.org/resource/Berlin is a dbo:City. So what would actually
>> break?
>> We can add a node "https://dbpedia.org/resource/Berlin#Turtle-Doc" if we
>> want to talk about the data payload itself, if necessary.
>>
>> -- Sebastian
>>
>> 5. Using
>> https://www.openlinksw.com/data/pdf/Semantic_Web_and_LLM-based_Chat_Bot_Symbiosis.pdf#page=26
>> it would be possible to make a CSV/TSV subset spec.
>>
>> 6. Might be good to suggest some default strings to use after #, just as
>> a no-brainer suggestion for implementation, so people don't struggle
>> choosing between #me, #i, #this, etc. #organisation, #person, #agent,
>> #website.
>>
>> That's a great point! The challenge is getting the right audience to
>> understand the story being told. In my experience, I've found that the
>> story and the audience are typically out of sync. For instance, developers
>> just want to parse stuff and implement algorithms, while architects, on
>> the other hand, typically think more conceptually, lending themselves to
>> matters of abstraction.
>>
>> Kingsley
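
A minimal sketch of the fragment point discussed above, assuming Python's standard urllib.parse and using the example.org/agent5 URIs from options a) to c) with an https scheme added for illustration. It only shows how a client splits a hash URI before sending a request; it makes no claim about how any particular server or WebID implementation behaves. The helper name request_target is hypothetical.

```python
from urllib.parse import urldefrag


def request_target(uri: str) -> str:
    """Return the URL a client would dereference: the URI minus its fragment."""
    return urldefrag(uri).url


# Hypothetical identifiers modelled on options a) to c) in the quoted message.
agent_slash = "https://example.org/agent5"
agent_hash = "https://example.org/agent5#agent"
doc_hash = "https://example.org/agent5#doc"

for uri in (agent_slash, agent_hash, doc_hash):
    # The fragment stays on the client side; only the part before "#" is requested.
    print(uri, "->", request_target(uri))
```

All three identifiers dereference to the same URL, so any distinction between "#agent" and "#doc" has to be made by the client against the graph it gets back, not by the server.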
Received on Friday, 10 November 2023 21:10:27 UTC