Re: What is a WebID? from Sebastian Hellmann on 2023-11-10 (public-webid@w3.org from November 2023)

From: Sebastian Hellmann <hellmann@informatik.uni-leipzig.de>
Date: Fri, 10 Nov 2023 16:20:25 +0100
To: Kingsley Idehen <kidehen@openlinksw.com>, public-webid@w3.org
Message-ID: <d8b1ea15-5f54-49c3-add7-6d5bbd5c63cc@informatik.uni-leipzig.de>
Hi Kingsley,

On 11/10/23 14:33, Kingsley Idehen wrote:
>
>>
>> I have amendments:
>>
>> 1. we really should go for HTTPS URLs here. We can add a note that 
>> HTTP URIs are the more general case, however, these are not meant 
>> here in a goal-oriented manner. Ultimately, we can not securely 
>> authenticate a WebID using HTTP, plus I can not think of a case where 
>> it would be useful to have a URI that is not a URL.
>
>
> We SHOULD encourage the use of HTTPS, but not force it on users. Most 
> WebIDs generated by way of SSEO are HTTPS based anyway, since Google 
> has signaled their HTTPS preference to the SEO community etc.
>
> Today, only older WebIDs are HTTP based.

Whenever you want to authenticate with WebID you MUST NOT use HTTP and 
you MUST use URLs. As I said, we can add a note and say the "older 
WebIDs" were HTTP based and that it's conceptually fine.
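
A minimal sketch of that rule in Python, assuming a purely syntactic check 
is enough for illustration. The function name is hypothetical and is not 
taken from any WebID spec text:

```python
from urllib.parse import urlsplit


def usable_for_authentication(webid: str) -> bool:
    """Return True only for HTTPS URLs that name a host (hypothetical helper).

    Plain HTTP WebIDs and URIs that are not URLs (e.g. URNs) are rejected.
    """
    parts = urlsplit(webid)
    return parts.scheme == "https" and bool(parts.netloc)
```

With this check, https://example.org/agent5#agent passes, while the http:// 
variant and a non-URL URI such as urn:example:agent5 are rejected.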


>
>
>>
>> 2. I wouldn't be strict about the # and the Agent (for legacy 
>> reasons, i.e. LD published as '/'). I think, it can be either:
>
>
> "#" usage is just an option that carries low costs, that's all. 
> Fundamentally, it's "#" if you want to leverage resolution by way of 
> implicit indirection, or "/" if you want to use explicit indirection 
> via content negotiation. Disambiguation is always the core objective.
>
>
>>
>> a) example.org/agent5 a Agent . example.org/agent5#doc a ProfileDoc
>>
>> b) example.org/agent5#agent a Agent . example.org/agent5 a ProfileDoc
>>
>> c) example.org/agent5#agent a Agent . example.org/agent5#doc a 
>> ProfileDoc
>>
>> b and c would be clearer.
>>
>> 3. Non-information resources can resolve directly with 200 using # 
>> entities. This would integrate well in REST APIs.  I can see cases 
>> where you would want 303, so it should be acceptable to do content 
>> negotiation.
>
>
> It is so much easier to speak about these matters in terms of entities 
> and entity description documents. Entities are uniquely identifiable 
> things that comprise perceived structure represented in 
> machine-computable form using an entity relationship graph.
>
> These fundamental concepts date back to the beginning of computing 
> i.e., we can't compute without this kind of baseline clarity.
>
> If you name something using a "#" based HTTP URI the 
> denotation->connotation indirection just happens without any work. If 
> circumstances lead to using "/" then content negotiation is part of 
> the cost inherited re denotation->connotation indirection. There are 
> no ways around these fundamental matters -- when it comes to the 
> matter of unambiguous entity naming.
>
> Another analogy I used to use years ago is as follows:
>
> The projector provides a surface for perceiving what's projected. If 
> that distinction doesn't exist, how do we perceive anything bar the 
> projector itself?

Well, I am thinking more of a tablet than a projector.  I am also a big 
fan of layered architectures and my opinion is that we should push 
semantics to the uppermost layers.  I think it is a misconception that 
machines can know semantics. At the end of the day they work best 
translating strings into other strings.  I think we might fare better 
with having a graph transport layer (ISO-OSI style). So URLs can be used 
to get more graph resources via HTTP, then when you have the graph, you 
can treat URIs as entities. I would consider this way more practical.


>
>
>>
>> 4. I am getting more and more skeptical about the "URI as names for 
>> things". Was this really the best way of realizing the GGG? Would it 
>> make a significant difference to say that "URLs as a tool to retrieve 
>> graph nodes and graphs that describe entities"?  It would be more in 
>> line with the Web, that also delivers docs about entities. 
>> Semantically, most people think about data retrieval first and then 
>> interpret them as entities later.
>
>
> You can have a collection of documents comprising entities named using 
> indefinite pronouns (blank nodes), but the onus of disambiguation is 
> then pushed to apps, thereby handing everything off to silo vectors etc.
>
Not saying blank nodes here.  Just saying that you use URIs to resolve 
to more graph data, then interpret the URIs in the retrieved graph as 
entities. The result is the same, but you can skip the content negotiation.
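
Roughly, as a Python sketch. Everything here is an assumption for 
illustration: FAKE_WEB stands in for real HTTP retrieval, the helper names 
are hypothetical, and the triples follow pattern (a) from above with https 
assumed:

```python
from urllib.parse import urldefrag

Triple = tuple[str, str, str]

# Hypothetical stand-in for the transport layer: retrieval URL -> triples.
FAKE_WEB: dict[str, list[Triple]] = {
    "https://example.org/agent5": [
        ("https://example.org/agent5", "rdf:type", "Agent"),
        ("https://example.org/agent5#doc", "rdf:type", "ProfileDoc"),
    ],
}


def fetch_graph(uri: str) -> list[Triple]:
    """Transport step: drop the fragment and retrieve the graph (200, no conneg)."""
    return FAKE_WEB.get(urldefrag(uri).url, [])


def describe(uri: str) -> list[tuple[str, str]]:
    """Interpretation step: treat the URI as an entity inside the retrieved graph."""
    return [(p, o) for s, p, o in fetch_graph(uri) if s == uri]
```

Here describe("https://example.org/agent5") yields the Agent statement and 
describe("https://example.org/agent5#doc") yields the ProfileDoc statement, 
both from the same retrieved graph.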

>
> TimBL thought a lot of this through eons ago, but getting it through en 
> masse has clearly been a big challenge.

Maybe if we solve some things like HR-14 and the semantic web stack.

My main question here is: What part of the web architecture breaks if 
we implement conneg-free, mixed "/" and "#" URIs? I have asked a lot of 
people this in different ways, but nobody could tell me.

For this example, let's say DBpedia URIs were natively HTTPS.

If I do `curl -H "Accept: text/turtle" 
https://dbpedia.org/resource/Berlin` and get a 200 OK with Content-Type: 
text/turtle, I don't see any need to disambiguate anything. The graph 
says that https://dbpedia.org/resource/Berlin is a dbo:City. So what 
would actually break?  We can add a node 
"https://dbpedia.org/resource/Berlin#Turtle-Doc" if we want to talk about 
the data payload itself, if necessary.
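
To make the naming concrete, here is a small Python sketch, assuming HTTPS 
DBpedia URIs as in the example above. The helper names are hypothetical. It 
relies only on the fact that clients strip the fragment before sending the 
HTTP request, so the entity and the payload node share one retrieval URL:

```python
from urllib.parse import urldefrag


def retrieval_url(uri: str) -> str:
    """URL a client actually requests: the fragment is never sent over HTTP."""
    return urldefrag(uri).url


def payload_node(entity_uri: str, label: str = "Turtle-Doc") -> str:
    """Mint a '#' node for the data payload served at the entity's URL.

    The default label mirrors the "#Turtle-Doc" suggestion; it is not a standard.
    """
    return f"{retrieval_url(entity_uri)}#{label}"
```

So payload_node("https://dbpedia.org/resource/Berlin") gives the 
"#Turtle-Doc" node, and retrieval_url() of that node points back to the 
same URL that returned the graph.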

-- Sebastian


>
>
>>
>> 5. Using 
>> https://www.openlinksw.com/data/pdf/Semantic_Web_and_LLM-based_Chat_Bot_Symbiosis.pdf#page=26 
>> it would be possible to make a CSV/TSV subset spec.
>>
>> 6. Might be good to suggest some default strings to use after # , 
>> just as a no-brainer suggestion for implementation, so people don't 
>> struggle choosing between #me, #i, #this, etc. #organisation, 
>> #person, #agent, #website.
>
>
> That's a great point! The challenge is getting the right audience to 
> understand the story being told. In my experience, I've found that the 
> story and the audience are typically out of sync. For instance, 
> developers just want to parse stuff and implement algorithms, while 
> architects, on the other hand, typically think more conceptually, 
> lending themselves to matters of abstraction.
>
> Kingsley
Received on Friday, 10 November 2023 15:20:38 UTC