Re: JSON-LD responses for core ontologies don't contain CORS headers; limits browser clients

Interestingly, at the moment the Turtle responses have the
access-control-allow-origin header duplicated. I guess this may break
clients that combine the two values and see the header as effectively "*, *"?

$ http -h https://www.w3.org/2000/01/rdf-schema
HTTP/1.1 200 OK
accept-ranges: bytes
access-control-allow-credentials: true
access-control-allow-headers: Link, Location, Content-Type, Accept, Vary
access-control-allow-methods: GET, HEAD, OPTIONS
access-control-allow-origin: *
access-control-allow-origin: *
access-control-expose-headers: Location, Link, Vary, Last-Modified, ETag, Allow, Content-Length, Accept
cache-control: max-age=21600
content-length: 3812
content-location: rdf-schema.ttl
content-security-policy: upgrade-insecure-requests
content-type: text/turtle; charset=utf-8
date: Tue, 27 Oct 2020 11:26:33 GMT
etag: "ee4-4f33230d4a800;586929c7c4f1d"
expires: Tue, 27 Oct 2020 17:26:33 GMT
last-modified: Tue, 25 Feb 2014 02:53:20 GMT
strict-transport-security: max-age=15552000; includeSubdomains; preload
tcn: choice
vary: negotiate,accept,accept-charset
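
To illustrate the concern, here is a minimal Python sketch of how a browser-style client might treat the repeated header. It assumes the behaviour described in the Fetch Standard, where repeated header values are joined with ", " before the CORS check, and it models only requests without credentials. The function names and the https://app.example origin are made up for the example.

```python
from typing import List, Optional, Tuple

Headers = List[Tuple[str, str]]


def combined_header(headers: Headers, name: str) -> Optional[str]:
    """Join every value of a (possibly repeated) header with ", "."""
    values = [value.strip() for key, value in headers if key.lower() == name.lower()]
    return ", ".join(values) if values else None


def cors_origin_allowed(headers: Headers, request_origin: str) -> bool:
    """Simplified CORS origin check for a request sent without credentials."""
    allow = combined_header(headers, "access-control-allow-origin")
    if allow is None:
        return False
    # The combined value must be exactly "*" or exactly the request origin.
    return allow == "*" or allow == request_origin


single = [("access-control-allow-origin", "*")]
duplicated = [("access-control-allow-origin", "*"), ("access-control-allow-origin", "*")]

print(cors_origin_allowed(single, "https://app.example"))
print(combined_header(duplicated, "access-control-allow-origin"))
print(cors_origin_allowed(duplicated, "https://app.example"))
```

Under these assumptions the duplicated header is combined into "*, *", which matches neither "*" nor the origin, so the check fails even though each individual value would have passed.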

On Mon, 26 Oct 2020 17:02:39 +0100, Ivan Herman wrote:
> 
> 
> > On 26 Oct 2020, at 15:31, Alex Kreidler <alexkreidler2020@gmail.com> wrote:
> > 
> > Ivan, 
> > 
> > I've just tested again and can confirm that the header is properly applied to https://www.w3.org/2000/01/rdf-schema. However, it still isn't applied to http://www.w3.org/1999/02/22-rdf-syntax-ns when Accept: application/ld+json is set.
> > 
> > Sorry to bother you once again, but I would really appreciate it if you could add the same
> > Header set Access-Control-Allow-Origin "*"
> > to the http://www.w3.org/1999/02/22-rdf-syntax-ns configuration.
> 
> Should be o.k. now. It seems that content negotiation (CONNEG) does not work with some of the other mechanisms, so the header must be added explicitly. (Don't ask me why…)
> 
> Cheers
> 
> Ivan
> 
> > 
> > I also completely understand your position that the other CORS settings should not be changed until a consensus is reached. Hopefully this thread will provide some level of useful information to inform those future discussions. Until then, the current settings should be fine.
> > 
> > Thanks again for your help with this.
> > 
> > Best regards,
> > Alex
> > 
> > On Mon, Oct 26, 2020 at 4:16 AM Ivan Herman <ivan@w3.org> wrote:
> > Alex,
> > 
> > two separate things.
> > 
> > 1. The missing 
> > 
> >     Header set Access-Control-Allow-Origin "*"
> > 
> > seems to have been a difference in our own setups on the servers. I am not sure how, but that header was automatically added to the rdf vocabulary (the value does not come from a .htaccess setting) while, in contrast, the same mechanism did not work for the rdf schema vocabulary. Clearly there is some Apache setting somewhere (to which I do not have easy access). I've added that extra header setting explicitly, and it solved this specific issue as far as I can see.
> > 
> > (As the URL says, these are parts of the very old areas of our Web site, from a time when CORS did not even exist. Changes were added to the server settings later, but we have the general policy to change the past with extreme parsimony, which means that these settings are very much ad hoc. We do not want to rewrite history.)
> > 
> > 2. As for the general question on what CORS headers should be used in the first place: to be very honest, my knowledge and experience in this area is poor. While it was obviously a mistake to have a discrepancy between the rdf and rdf schema vocabularies, what I did was simply to copy the CORS settings of rdf to rdf schema: the former was set many years ago but the latter was never set. I had no problems setting that.
> > 
> > Changing the set is altogether a different issue, and I do not consider myself authoritative in this. These CORS settings were set, as I said, many years ago, and it is the first time this issue has come up. This does not mean that your questions are not justified; it only means that I cannot change that easily. If the RDF/Linked Data community comes to some sort of a conclusion as to the changes to apply, I am happy to do the mechanical part, but only if a consensus has been reached. Note also that removing something from the headers may be a major issue for deployed applications and, unless it is clearly wrong and/or harmful, I would strongly advise against it.
> > 
> > I hope you understand
> > 
> > Cheers
> > 
> > Ivan
> > 
> >> On 25 Oct 2020, at 20:34, Alex Kreidler <alexkreidler2020@gmail.com> wrote:
> >> 
> >> Tomasz, I actually do use your vocabulary package in the app for some sections. I do understand those problems you mentioned, but the situation in my app is one where the rdf and rdfs ontologies are one out of many properties on an object, and the user hovers over the IRI to dereference it to get additional data (e.g. a title from rdfs:label). Thus I could implement it differently by adding a cache of common ontologies, but I'm reaching for the ideal world described in the Linked Data note <https://www.w3.org/DesignIssues/LinkedData.html>. In fact, in my mind, most of the problems you mentioned would be "good problems to have," because they would indicate significant adoption of linked data technology via heavy load on the servers.
> >> 
> >> Ivan, thanks for your response. I did indeed test locally and the headers are the same for rdf-schema as for rdf-syntax-ns. However, the primary issue I was running into was that for some reason this header:
> >> access-control-allow-origin: *
> >> is present on the response to the regular request but not on the response to the one with "Accept: application/ld+json". I'm not sure why this would be the case if it's added with:
> >> Header set Access-Control-Allow-Origin "*"
> >> to the .htaccess file. This might require some more investigation.
> >> However, I also noticed some other issues with the headers. Firstly, I don't see any reason why we need access-control-allow-credentials.
> >> Then, with:
> >> access-control-allow-headers: Link, Location, Content-Type, Accept, Vary
> >> the TL;DR is that this line doesn't make a difference to browser clients:
> >> Link and Content-Type are entity headers and thus can be on a response or a POST request, so they don't make sense here.
> >> Location and Vary are both response headers, so they wouldn't be on a request being checked by CORS.
> >> Accept is a request header, but it is already allowed by CORS in "simple requests" and doesn't trigger a preflight CORS check <https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS#Preflighted_requests>, so it shouldn't be needed in access-control-allow-headers.
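
The preflight reasoning above can be sketched in Python. This is a simplified model, not the full rule set: it assumes only the basic CORS-safelisted request headers and the three "simple" Content-Type values, and ignores the value-length and byte restrictions that the Fetch Standard also applies. The function and constant names are hypothetical.

```python
from typing import Dict, List

# Request headers a browser may send cross-origin without a preflight (simplified).
SAFELISTED_REQUEST_HEADERS = {"accept", "accept-language", "content-language", "content-type"}

# Content-Type is only safelisted for these media types.
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def headers_needing_preflight(request_headers: Dict[str, str]) -> List[str]:
    """Return the request headers that would have to be listed in
    access-control-allow-headers for the request to succeed."""
    needing = []
    for name, value in request_headers.items():
        lowered = name.lower()
        if lowered not in SAFELISTED_REQUEST_HEADERS:
            needing.append(name)
        elif lowered == "content-type":
            media_type = value.split(";")[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                needing.append(name)
    return needing


print(headers_needing_preflight({"Accept": "application/ld+json"}))
print(headers_needing_preflight({"Accept": "application/ld+json", "If-None-Match": '"abc"'}))
```

In this model a plain GET with Accept: application/ld+json needs nothing in access-control-allow-headers, while adding If-None-Match does, which is the motivation for listing the conditional request headers.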
> >> 
> >> What I'd recommend instead are the following headers: 
> >> access-control-allow-headers: If-Match, If-None-Match, If-Modified-Since, If-Unmodified-Since, If-Range, Range
> >> access-control-allow-methods: GET, HEAD, OPTIONS
> >> access-control-allow-origin: *
> >> access-control-expose-headers: *
> >> 
> >> Since Apache handles conditional requests <https://developer.mozilla.org/en-US/docs/Web/HTTP/Conditional_requests> for us <https://httpd.apache.org/docs/trunk/compliance.html#policyconditional>, and they can improve performance and help with caching, I think it's valuable to add them. As indicated in that first link, these headers can help libraries with caching, but can also allow resumable downloads, which applies because the server already provides responses with accept-ranges: bytes
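
As a rough illustration of what a conditional request buys a caching client, here is a small Python sketch of the server-side decision for If-None-Match. It assumes weak ETag comparison and a naive comma split of the header (which would mis-handle entity tags that themselves contain commas). It is not Apache's implementation, and the function names are hypothetical. The ETag value is the one from the rdf-schema Turtle response quoted in this thread.

```python
from typing import Optional


def _opaque_tag(tag: str) -> str:
    """Strip whitespace and a leading weak-validator prefix (W/)."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def status_for(current_etag: str, if_none_match: Optional[str]) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if if_none_match is None:
        return 200  # unconditional request: send the full body
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    candidates = [_opaque_tag(tag) for tag in if_none_match.split(",")]
    return 304 if _opaque_tag(current_etag) in candidates else 200


etag = '"ee4-4f33230d4a800;586929c7c4f1d"'
print(status_for(etag, None))
print(status_for(etag, etag))
print(status_for(etag, '"some-older-tag"'))
```

A browser library can only send If-None-Match cross-origin if the preflight response lists it in access-control-allow-headers, and it can only read the ETag in the first place if access-control-expose-headers covers it.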
> >> 
> >> However, if for some reason we didn't want to expose conditional requests to browsers, we could just use:
> >> 
> >> access-control-allow-origin: *
> >> access-control-expose-headers: *
> >> 
> >> as a minimal default. Let me know what you think!
> >> 
> >> On Sun, Oct 25, 2020 at 1:48 AM Ivan Herman <ivan@w3.org> wrote:
> >> Alex,
> >> 
> >> I have added to the relevant .htaccess file the instruction to generate the same CORS headers for rdf-schema as for rdf-syntax-ns. It seems to work on my machine when doing a curl --head; however, I would appreciate it if you checked everything on your side, too.
> >> 
> >> Thanks for flagging this.
> >> 
> >> Ivan
> >> 
> >>> On 25 Oct 2020, at 02:25, Alex Kreidler <alexkreidler2020@gmail.com> wrote:
> >>> 
> >>> Sorry, just noticed that the link I provided auto-deletes after 24 hours.
> >>> 
> >>> I've included the full output of the issue below, along with another pastebin: https://pastebin.com/Jf3K7DW7
> >>> 
> >>> $ http HEAD http://www.w3.org/1999/02/22-rdf-syntax-ns#type "Accept: application/ld+json"
> >>> HTTP/1.1 200 OK
> >>> accept-ranges: bytes
> >>> access-control-allow-credentials: true
> >>> access-control-allow-headers: Link, Location, Content-Type, Accept, Vary
> >>> access-control-allow-methods: GET, HEAD, OPTIONS
> >>> access-control-expose-headers: Location, Link, Vary, Last-Modified, ETag, Allow, Content-Length, Accept
> >>> cache-control: max-age=21600
> >>> content-length: 9198
> >>> content-location: 22-rdf-syntax-ns.jsonld
> >>> content-type: application/ld+json
> >>> date: Sun, 25 Oct 2020 00:08:07 GMT
> >>> etag: "23ee-599d39942e300;599d3996211bf"
> >>> expires: Sun, 25 Oct 2020 06:08:07 GMT
> >>> last-modified: Mon, 16 Dec 2019 15:09:32 GMT
> >>> tcn: choice
> >>> vary: negotiate,accept,accept-charset,upgrade-insecure-requests
> >>> 
> >>> 
> >>> 
> >>> $ http HEAD http://www.w3.org/1999/02/22-rdf-syntax-ns#type
> >>> HTTP/1.1 200 OK
> >>> accept-ranges: bytes
> >>> access-control-allow-credentials: true
> >>> access-control-allow-headers: Link, Location, Content-Type, Accept, Vary
> >>> access-control-allow-methods: GET, HEAD, OPTIONS
> >>> access-control-allow-origin: *
> >>> access-control-expose-headers: Location, Link, Vary, Last-Modified, ETag, Allow, Content-Length, Accept
> >>> cache-control: max-age=21600
> >>> content-length: 6004
> >>> content-location: 22-rdf-syntax-ns.ttl
> >>> content-type: text/turtle; charset=utf-8
> >>> date: Sun, 25 Oct 2020 00:08:13 GMT
> >>> etag: "1774-599d39942e300;599d3996204ea"
> >>> expires: Sun, 25 Oct 2020 06:08:13 GMT
> >>> last-modified: Mon, 16 Dec 2019 15:09:32 GMT
> >>> tcn: choice
> >>> vary: negotiate,accept,accept-charset,upgrade-insecure-requests
> >>> 
> >>> 
> >>> 
> >>> $ http HEAD https://www.w3.org/2000/01/rdf-schema#range "Accept: application/ld+json"
> >>> HTTP/1.1 200 OK
> >>> accept-ranges: bytes
> >>> cache-control: max-age=21600
> >>> content-length: 5604
> >>> content-location: rdf-schema.jsonld
> >>> content-security-policy: upgrade-insecure-requests
> >>> content-type: application/ld+json
> >>> date: Sun, 25 Oct 2020 00:08:54 GMT
> >>> etag: "15e4-586929c3ace00;586929c7c4f1d"
> >>> expires: Sun, 25 Oct 2020 06:08:54 GMT
> >>> last-modified: Mon, 15 Apr 2019 14:38:48 GMT
> >>> strict-transport-security: max-age=15552000; includeSubdomains; preload
> >>> tcn: choice
> >>> vary: negotiate,accept,accept-charset
> >>> 
> >>> 
> >>> 
> >>> $ http HEAD https://www.w3.org/2000/01/rdf-schema#range
> >>> HTTP/1.1 200 OK
> >>> accept-ranges: bytes
> >>> access-control-allow-origin: *
> >>> cache-control: max-age=21600
> >>> content-length: 3812
> >>> content-location: rdf-schema.ttl
> >>> content-security-policy: upgrade-insecure-requests
> >>> content-type: text/turtle; charset=utf-8
> >>> date: Sun, 25 Oct 2020 00:09:03 GMT
> >>> etag: "ee4-4f33230d4a800;586929c6c4edd"
> >>> expires: Sun, 25 Oct 2020 06:09:03 GMT
> >>> last-modified: Tue, 25 Feb 2014 02:53:20 GMT
> >>> strict-transport-security: max-age=15552000; includeSubdomains; preload
> >>> tcn: choice
> >>> vary: negotiate,accept,accept-charset
> >>> 
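
The comparison above can be automated. A rough Python sketch, assuming header dumps in the HTTPie format shown above; the dumps are abbreviated copies of the rdf-syntax-ns output, and the labels and function names are hypothetical:

```python
from typing import Dict, List


def parse_header_block(block: str) -> Dict[str, str]:
    """Parse an HTTPie-style dump (status line, then 'name: value' lines)."""
    headers = {}
    for line in block.strip().splitlines()[1:]:  # skip the status line
        name, separator, value = line.partition(":")
        if separator:
            headers[name.strip().lower()] = value.strip()
    return headers


def variants_missing_cors(dumps: Dict[str, str]) -> List[str]:
    """Return the labels of responses that lack access-control-allow-origin."""
    return [
        label
        for label, block in dumps.items()
        if "access-control-allow-origin" not in parse_header_block(block)
    ]


DUMPS = {
    "rdf-syntax-ns, Accept: application/ld+json": """HTTP/1.1 200 OK
access-control-allow-methods: GET, HEAD, OPTIONS
content-location: 22-rdf-syntax-ns.jsonld
content-type: application/ld+json
""",
    "rdf-syntax-ns, default": """HTTP/1.1 200 OK
access-control-allow-methods: GET, HEAD, OPTIONS
access-control-allow-origin: *
content-location: 22-rdf-syntax-ns.ttl
content-type: text/turtle; charset=utf-8
""",
}

print(variants_missing_cors(DUMPS))
```

Running a check like this against every negotiated variant of a vocabulary URL would catch the case where only some content types carry the header.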
> >> 
> >> 
> >> ----
> >> Ivan Herman, W3C 
> >> Home: http://www.w3.org/People/Ivan/
> >> mobile: +33 6 52 46 00 43
> >> ORCID ID: https://orcid.org/0000-0003-0782-2704
> >> 

-- 
Michał Politowski

Received on Tuesday, 27 October 2020 11:35:53 UTC