RE: W3C position on URIs http:// vs. https://

Appreciate your insightful post, Chris. You perfectly describe the challenge as I see it (I am an ontologist, not an ISec expert). I will certainly go through your discussion record and try to learn from it for our situation.

From: Chris Mungall <cjmungall@lbl.gov>
Date: Tuesday, 13 June 2023 at 19:01
To: Melvin Carvalho <melvincarvalho@gmail.com>
Cc: Hubauer, Thomas (T DAI SMR-DE) <thomas.hubauer@siemens.com>, semantic-web@w3.org <semantic-web@w3.org>
Subject: Re: W3C position on URIs http:// vs. https://
I think it's important for the semantic web community to communicate clearly, simply, unambiguously, and non-dogmatically when it comes to this issue.

While I agree with many points in the TimBL article, that ship has long sailed. I can't show that article to web developers who are asking me why we don't change our PURLs to https, because Chrome refuses to allow downloads of them when linked from an https site. They don't understand why we are reluctant to change, because frankly using URLs for identifiers was a pretty odd thing to do in the first place, mixing two separate concerns (semantic identity and network protocols). Browsers and HTTP libraries can happily treat http and https as equivalent, but RDF tools compare identifiers as exact strings, so the two variants count as different identifiers. This is obviously a massive problem for semantic web interoperability.
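
To make the mismatch concrete, here is a minimal Python sketch. It is only an illustration: the GO term IRI is an example chosen for this note, and the helper `same_apart_from_scheme` is a hypothetical name, not part of any library.

```python
from urllib.parse import urlsplit

# Example IRIs chosen for illustration; they differ only in the scheme.
http_iri = "http://purl.obolibrary.org/obo/GO_0008150"
https_iri = "https://purl.obolibrary.org/obo/GO_0008150"


def same_apart_from_scheme(a: str, b: str) -> bool:
    """Return True when two IRIs agree on everything except the scheme."""
    # urlsplit gives (scheme, netloc, path, query, fragment); skip the scheme.
    return urlsplit(a)[1:] == urlsplit(b)[1:]


# RDF tooling compares identifiers as exact strings, so these are two
# different identifiers even though they name the same resource.
identical_as_identifiers = http_iri == https_iri

# A browser-style view that ignores the scheme treats them as the same thing.
equivalent_for_a_browser = same_apart_from_scheme(http_iri, https_iri)

print(identical_as_identifiers, equivalent_for_a_browser)
```

The point is only that string identity and "same location" are different tests; any real tool would have to decide explicitly which one it applies.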

The lack of guidance has led to confusion. For example, it looks like schema.org is in some superposition state where either http or https may be considered canonical for semantic identifiers.

https://github.com/solid/solid-namespace/issues/21

https://github.com/linkeddata/rdflib.js/issues/550


We face this problem in the OBO community: we adopted http PURLs for both OWL classes and OWL ontologies around 15 years ago, rejecting URN-based LSIDs. Things are now breaking as various pieces of web infrastructure start making life difficult for http.

We tried reading
https://www.w3.org/blog/2016/05/https-and-the-semantic-weblinked-data/

But the advice about URI and HSTS is hard to follow for a bunch of ontologists. We just want to make useful ontologies, and not be forced to be network engineers.

Our discussion and eventual decisions are recorded here, if it's useful (and comments welcome if we are doing things incorrectly):

https://github.com/OBOFoundry/purl.obolibrary.org/issues/705


Summary:

1. Our infrastructure supports both https and http URLs, for both terms and ontologies. Both variants 302 redirect to the relevant browser or download (using Cloudflare)
2. We encourage web sites that need to link to an ontology download to use the https URLs in HTML, but to make it clear that the PURL is the http URI, and the http PURL *must* be used in RDF documents
3. Even though we support https variants of http PURLs for OWL classes, with both 302 redirecting to the same location, we strongly discourage their use in any context, because this can lead to confusion about the canonical URL to use in RDF/OWL documents. We don't want to end up in the schema.org situation. We are building lots of tooling that will check for cases where https is used accidentally in a linked data context, as we expect this to happen a lot.
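
As a rough idea of what such a check might look like, here is a minimal Python sketch. It is not our actual tooling: the function names `find_https_purls` and `to_canonical` are hypothetical, the sample triples are made up for illustration, and it assumes term PURLs live under `purl.obolibrary.org/obo/`.

```python
import re

# Assumption for this sketch: term PURLs live under this prefix.
HTTPS_PURL = re.compile(r"https://purl\.obolibrary\.org/obo/[^\s<>\"']+")


def find_https_purls(text):
    """Return (line_number, iri) pairs for every https PURL in the text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HTTPS_PURL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def to_canonical(iri):
    """Rewrite an https PURL to the http form; leave other IRIs untouched."""
    if iri.startswith("https://purl.obolibrary.org/"):
        return "http://" + iri[len("https://"):]
    return iri


# Illustrative N-Triples-style input; the second line uses https by mistake.
sample = """<http://purl.obolibrary.org/obo/GO_0008150> a <http://www.w3.org/2002/07/owl#Class> .
<https://purl.obolibrary.org/obo/GO_0003674> a <http://www.w3.org/2002/07/owl#Class> ."""

hits = find_https_purls(sample)
for lineno, iri in hits:
    print(f"line {lineno}: {iri} should be {to_canonical(iri)}")
```

A plain text scan like this will also flag https PURLs in comments or literals, which may or may not be wanted; a real check would more likely parse the RDF and inspect only the IRIs.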

This has been sufficient to placate frustrated web developers, but it feels like we are delaying the inevitable and that there will one day be pressure to deprecate our http PURLs and switch to https. This would have a massive cost in terms of rewiring massive distributed troves of RDF data and OWL documents, database tables, and a highly painful, long, and confusing transition period. But we are hoping that this day never comes or we can delay it as long as possible, or LLMs will make the whole thing irrelevant.

On Tue, Jun 13, 2023 at 8:48 AM Melvin Carvalho <melvincarvalho@gmail.com> wrote:


On Tue, 13 Jun 2023 at 17:37, Hubauer, Thomas <thomas.hubauer@siemens.com> wrote:
Hi SemWeb community,

One of my projects is considering making some of our ontologies accessible to customers. As part of these considerations, we have been discussing how to resolve ontology references (e.g. for imports), which led us to some lengthy arguments about http:// vs. https:// as the protocol part of our URIs (primarily ontology URIs, potentially element URIs as well).

I am aware of a 2016 post (https://www.w3.org/blog/2016/05/https-and-the-semantic-weblinked-data/) stating that W3C currently considers http and https to be “equivalent” for w3c.org. However, the security guys I am working with are not too happy with this, as using an http URI for downloading imported ontologies is vulnerable to a man-in-the-middle attack.

I was unable to find any more recent statement by the W3C on the use of http vs. https. Specifically, I’d be interested to understand whether this community (and the W3C) intend to stick with http for the foreseeable future, or whether there are any plans to migrate some or all URIs (e.g. ontology URIs but not element URIs) to https. It would be nice for us to understand what “the outer world” plans, so we can maybe take this as a blueprint for our own guidance on URIs.

I'm with TimBL on this:

"HTTPS Everywhere" considered harmful

https://www.w3.org/DesignIssues/Security-NotTheS.html


The Semantic Web has been around for a couple of decades.  Is there any documented instance of an MITM attack on an ontology ever causing an issue?


Best regards,
Thomas

Received on Tuesday, 13 June 2023 17:22:59 UTC