Re: HTTPS and the Semantic Web from Harry Halpin on 2016-05-22 (semantic-web@w3.org from May 2016)

From: Harry Halpin <hhalpin@ibiblio.org>
Date: Sat, 21 May 2016 19:10:59 -1000
To: Melvin Carvalho <melvincarvalho@gmail.com>
Cc: Pat Hayes <phayes@ihmc.us>, Nathan Rixham <nathan@webr3.org>, Phil Archer <phila@w3.org>, Semantic Web IG <semantic-web@w3.org>
Message-ID: <CAE1ny+5SKQLZmWaN6umycYmfLQwsOwBGcM80WMbeLBWxrYURGw@mail.gmail.com>

On Sat, May 21, 2016 at 1:29 AM, Melvin Carvalho <melvincarvalho@gmail.com>
wrote:

>
>
> On 21 May 2016 at 06:44, Harry Halpin <hhalpin@ibiblio.org> wrote:
>
>> Given that the Semantic Web use of HTTP URIs basically means that any use
>> of 'follow your nose' is easily subverted by anyone with access to the raw
>> HTTP stream, we should just update the Semantic Web specs and reasoners so
>> that TLS is enforced by default and HTTP = HTTP(S).
>>
>> While it is true that some normal web-pages *can* serve different content
>> at TLS than non-TLS, it's currently considered pathological.
>>
>
>> If the Semantic Web doesn't gracefully deal with the upgrade from HTTP to
>> TLS, it will date itself quite quickly and will not be usable for any
>> real-world usage (notice almost all major sites now are moving to TLS)
>> outside of enterprise use within a firewall or usages where there's no
>> 'follow your nose' effort. In the latter case, I'm not sure if using HTTP
>> URIs makes sense to begin with.
>>
>
> Opinions such as "[the semantic web] will not be usable for any real-world
> usage" are unhelpful speculation and, IMHO, inappropriate for this list.
>


I think it's pretty self-evident that 'real world usage' precludes having
random people with an open source program like Wireshark intercept your
communication (which could include RDF) and change it. That's why the Web
is moving to TLS, and it would be sad if the Semantic Web was left behind.

The only reason *not* to do it would be if the community decided that
'follow your nose' locations were just different from identifiers. In which
case, it seems the potential of the SemWeb is hurt by not using identifiers
that do double-duty as locations.

Given the Semantic Web is still a maturing technology, I would highly
recommend folks just move to TLS, even if it means making a minor revision
to the specs to deal with the HTTPS/HTTP issue and updating reasoners/query
engines.
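
As a rough sketch of what "HTTP = HTTP(S)" could mean inside a reasoner or
query engine: compare IRIs on a key that ignores the http/https difference,
and always dereference over TLS. The function names below (canonical_iri,
same_iri, lookup_url) are made up for illustration and do not come from any
spec or existing tool. The sketch also assumes that an http IRI and its
https twin always name the same thing, which is exactly the point that
would need agreeing on.

```python
from urllib.parse import urldefrag


def canonical_iri(iri: str) -> str:
    """Return a comparison key in which http:// and https:// IRIs coincide."""
    scheme, sep, rest = iri.partition("://")
    if sep and scheme.lower() in ("http", "https"):
        # Assumption: both schemes name the same thing, so force one spelling.
        return "https://" + rest
    return iri  # other schemes (urn:, mailto:, ...) are left alone


def same_iri(a: str, b: str) -> bool:
    """True if two IRIs are equal once the http/https difference is ignored."""
    return canonical_iri(a) == canonical_iri(b)


def lookup_url(iri: str) -> str:
    """URL to fetch when following your nose: always TLS, fragment removed."""
    return urldefrag(canonical_iri(iri)).url
```

With this, same_iri("http://www.w3.org/ns/dcat#Dataset",
"https://www.w3.org/ns/dcat#Dataset") holds, and lookup_url on either form
gives the https document URL. It deliberately does nothing clever about
ports or hosts that serve different content over TLS.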

  cheers,
    harry


>
>
>>
>> Note that the upgrade should be relatively cost-free, see the "Let's
>> Encrypt" effort for free TLS certs.
>>
>> On Fri, May 20, 2016 at 6:04 PM, Pat Hayes <phayes@ihmc.us> wrote:
>>
>>>
>>> On May 20, 2016, at 5:02 PM, Nathan Rixham <nathan@webr3.org> wrote:
>>>
>>> ....
>>> An x:alias predicate which asserts that one name (IRI) is an alias of
>>> another name (IRI) would be very useful. <a#b> x:alias <c#d> .
>>>
>>> An x:canonical predicate which asserts <a#b> x:alias <c#d> . and that
>>> <a#b> is the preferred IRI more useful still.
>>>
>>>
>>> Just an observation - it may be that practical needs override formality
>>> - but this is not legal according to the RDF semantics. The truth of a
>>> triple aaa R bbb depends only on what the IRIs in the triple, in particular
>>> aaa and bbb, *denote*, not on their syntactic form. So x:alias would have
>>> the same semantics as owl:sameAs (and we all know what happened to *that*
>>> when it got out into the wide world.)
>>>
>>> We could sneak around this by declaring (contrary to the normative
>>> semantics, but still...) that x:alias is a new kind of property, one that
>>> quotes its arguments and is therefore referentially opaque. There would
>>> have been a time when I would have opposed this idea with some vigor, but
>>> age has mellowed me. And the internal semantic coherence of the Web can
>>> hardly get worse than it is already, so what the hell.  Just be ready for
>>> the truly awful muddle that will arise when x:alias bumps into owl:sameAs
>>> and reasoners try to figure out what the consequences might be.
>>>
>>> A better solution would be to invent, and have everyone adopt[**], an
>>> IRI-quoting-IRI convention, something like x:theIRI# , with the semantics
>>> that x:theIRI#someOtherIRI always denotes someOtherIRI. (Maybe this would
>>> need some clever character-escaping? I leave that to others to work out.)
>>> Then x:theIRI#a#b x:alias x:theIRI#c#d would mean what you want to express,
>>> above.
>>>
>>> Pat Hayes
>>>
>>> [**] There's the rub, of course.
>>>
>>>
>>> Using syntax shortcuts you could add the following triple to the turtle
>>> document at https://www.w3.org/1999/02/22-rdf-syntax-ns#
>>>
>>>    rdf: x:canonical <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
>>>
>>> Result:
>>> <http://www.w3.org/1999/02/22-rdf-syntax-ns#> a owl:Ontology .
>>> <https://www.w3.org/1999/02/22-rdf-syntax-ns#> a owl:Ontology .
>>>
>>>
>>> Point 2:
>>>
>>> Using a 307 redirect for the semantic is nice, but practically click
>>> http://www.w3.org/ns/dcat# and you are redirected, refresh and you find
>>> the client does use the redirected url for subsequent requests.
>>>
>>> As a general person or developer search w3.org for dcat and the results
>>> are https://www.google.com/search?q=site:w3.org%20dcat - the url listed
>>> is the https url.
>>>
>>> Usage of the https IRIs will enter the web of data ever increasingly,
>>> whether people say the http one should be used or not.
>>>
>>> Point 3:
>>>
>>> Practically taking a simple real world step like migrating to a CDN will
>>> often give http/2+tls thus https IRIs automatically.
>>>
>>> Test case:
>>>
>>> Alice has a wordpress/drupal site that publishes RDF automatically. She
>>> doesn't know about the RDF.
>>> Alice clicks the "free CDN" button in her hosting account.
>>> Alice now has https and http IRIs in RDF on both http:// and https://
>>> protocols.
>>>
>>> Personally I cannot think of anything easier than as best practise
>>> adding a single triple to rdf documents when migrating protocols. Anything
>>> within the black box will fail and be implemented incorrectly.
>>>
>>> On Sat, May 21, 2016 at 12:42 AM, Melvin Carvalho <
>>> melvincarvalho@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On 20 May 2016 at 20:08, Phil Archer <phila@w3.org> wrote:
>>>>
>>>>> Not a moan about spam, or a CfP, but an actual discussion point, yay!
>>>>>
>>>>> I've just blogged about our use of HTTPS across www.w3.org which
>>>>> raises some questions for this community. Please see
>>>>> https://www.w3.org/blog/2016/05/https-and-the-semantic-weblinked-data/
>>>>
>>>>
>>>> On the one hand more security is a nice to have, but on the other, Cool
>>>> URIs don't change.  It's really hard to estimate the cost, and unintended
>>>> consequences of changing URIs.  But my feeling is that we systematically
>>>> underestimate it.
>>>>
>>>> IMHO, it's kind of a shame that http wasn't made secure, and that a new
>>>> scheme https was invented.
>>>>
>>>>
>>>>>
>>>>>
>>>>> Comments welcome.
>>>>>
>>>>> Thanks
>>>>>
>>>>> --
>>>>>
>>>>>
>>>>> Phil Archer
>>>>> W3C Data Activity Lead
>>>>> http://www.w3.org/2013/data/
>>>>>
>>>>> http://philarcher.org
>>>>> +44 (0)7887 767755
>>>>> @philarcher1
>>>>>
>>>>>
>>>>
>>>
>>> ------------------------------------------------------------
>>> IHMC                                     (850)434 8903 home
>>> 40 South Alcaniz St.            (850)202 4416   office
>>> Pensacola                            (850)202 4440   fax
>>> FL 32502                              (850)291 0667   mobile (preferred)
>>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
>>>
>>>
>>>
>>>
>>>
>>>
>>
>
Received on Sunday, 22 May 2016 05:11:29 UTC