
Re: test URL redirections...

From: Ivan Herman <ivan@w3.org>
Date: Fri, 12 Jun 2015 11:21:25 +0200
Cc: Jeni Tennison <jeni@jenitennison.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-Id: <85CB2807-0811-47EE-BD18-F4BF1954D799@w3.org>
To: Gregg Kellogg <gregg@greggkellogg.net>

> On 11 Jun 2015, at 21:17 , Gregg Kellogg <gregg@greggkellogg.net> wrote:
> 
>> On Jun 11, 2015, at 4:13 AM, Ivan Herman <ivan@w3.org> wrote:
>> 
>> Gregg,
>> 
>> I must admit I am in a territory that I do not really know…
>> 
>> I did set up a redirection, through .htaccess, in http://www.w3.org/2013/csvw/ using:
> 
> Should this be 2014 or 2015? 2013 was based on the RDF WG active time, AFAIK.

/2013/csvw is the official homepage of the group (2013 being the year when it started).

> 
>> RewriteRule ^tests/(.*) http://w3c.github.io/csvw/tests/$1 [R=303]
>> 
>> However, I am not sure it will really work for the test suites. If, in a browser, I type in, say,
>> 
>> http://www.w3.org/2013/csvw/tests/test011/result.json
>> 
>> then I do indeed get to the relevant JSON file, whose URI is http://w3c.github.io/csvw/tests/test011/result.json. However, the redirection is made through a 303, which means that the browser address bar will show the w3c.github.io address, not the www.w3.org one. I do not know whether that matters.
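
To make the mapping concrete, here is a minimal Python sketch of what the RewriteRule quoted above does to a request path. It only mimics the pattern substitution and is not how Apache itself evaluates rules. The names `redirect_target` and `PREFIX` are made up for illustration, and the sketch assumes the .htaccess file sits in /2013/csvw/ so that the pattern is matched against the path relative to that directory.

```python
import re

# Pattern and substitution copied from the quoted RewriteRule.
RULE_PATTERN = re.compile(r"^tests/(.*)")
RULE_TARGET = r"http://w3c.github.io/csvw/tests/\1"

# Assumption: the .htaccess file lives in this directory, so the rule
# sees the request path with this prefix stripped.
PREFIX = "/2013/csvw/"


def redirect_target(request_path):
    """Return the Location the rule would produce, or None if it does not apply."""
    if not request_path.startswith(PREFIX):
        return None
    relative = request_path[len(PREFIX):]
    match = RULE_PATTERN.match(relative)
    if match is None:
        return None
    # Substitute the captured group into the target, as $1 does in the rule.
    return match.expand(RULE_TARGET)


if __name__ == "__main__":
    print(redirect_target("/2013/csvw/tests/test011/result.json"))
```

A client that receives the 303 still has to issue a second request to the returned location, which is where the concern about URI libraries comes in.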
> 
> It works for my test runner (mostly), and I think this is a reasonable way to go about it.
> 

Great.

> In the case of the JSON tests, because they contain absolute URLs, they will need to be updated with this location. (RDF tests can make use of the result location and use relative URLs internally).
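
As a rough illustration of the update described for the JSON tests, the following Python sketch rewrites every absolute URL that starts with one base so that it starts with another. Which base the expected-result files currently use is an assumption here, as are the function name `rebase_urls` and the sample document. Only string values are rewritten, not object keys.

```python
import json

# Assumption: results currently point at the GitHub Pages copy and should
# point at the W3C location instead. Swap the two if it is the other way round.
OLD_BASE = "http://w3c.github.io/csvw/tests/"
NEW_BASE = "http://www.w3.org/2013/csvw/tests/"


def rebase_urls(value, old_base=OLD_BASE, new_base=NEW_BASE):
    """Recursively replace old_base with new_base at the start of string values."""
    if isinstance(value, str):
        if value.startswith(old_base):
            return new_base + value[len(old_base):]
        return value
    if isinstance(value, list):
        return [rebase_urls(item, old_base, new_base) for item in value]
    if isinstance(value, dict):
        return {key: rebase_urls(item, old_base, new_base)
                for key, item in value.items()}
    # Numbers, booleans and null are returned unchanged.
    return value


if __name__ == "__main__":
    # Made-up sample, not an actual test result.
    sample = {"tables": [{"url": OLD_BASE + "test011/tree-ops.csv"}]}
    print(json.dumps(rebase_urls(sample), indent=2))
```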
> 
> Unfortunately, without doing a real proxy, we can’t also set HTTP response headers beyond the redirect. A caching proxy might also better allow HTTP caching of the results, so as to not burden the W3C infrastructure, and allow clients to reasonably perform client-side caching; it might be worth investigating with the systems team if something like this is possible.
> 

I am almost sure that the security aspects will dominate: a proxy that is, essentially, github.com would not be acceptable. And, I must admit, I understand...

> Another possibility would be a post-receive hook to pull the data from GitHub on commit; we do this with rdfa.info and json-ld.org to automatically update site contents on commit. This would avoid any redirect issues. It’s implementable using several different mechanisms available in PHP, Ruby and most other infrastructures. It works by setting up a URL to receive an HTTP POST when commits are made, which causes it to do a git pull to refresh a local directory. That might be the simplest thing, if it is possible. See https://github.com/json-ld/json-ld.org/tree/master/utils for how json-ld.org does it using PHP.
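
For readers unfamiliar with the mechanism, a hook of this kind could look roughly like the Python sketch below. It is not the json-ld.org PHP script. The checkout path, the port, and the names `pull_command`, `HookHandler` and `serve` are all hypothetical, and the sketch does not check who sent the POST, which a real deployment would need to do.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical local checkout that the web server publishes.
REPO_DIR = "/var/www/csvw"


def pull_command(repo_dir):
    """Build the git invocation that refreshes the local checkout."""
    return ["git", "-C", repo_dir, "pull", "--ff-only"]


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and discard the payload; this sketch ignores its contents.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        result = subprocess.run(pull_command(REPO_DIR), capture_output=True)
        # Report success or failure of the pull to the caller.
        self.send_response(200 if result.returncode == 0 else 500)
        self.end_headers()


def serve(port=8080):
    """Listen on localhost and run a pull for every POST received."""
    HTTPServer(("127.0.0.1", port), HookHandler).serve_forever()
```

Calling `serve()` starts the listener; the repository's webhook setting would then be pointed at a URL that reaches it.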

I will ask, but the issue is that all this is related to CVS, too (I presume in contrast to the json-ld.org site): the W3C Web site is, for better or worse, one giant CVS repository…

I think we should not rely on this for now.

Does the test suite work, for the most part, offline? I mean, if I download the test suite, then I should be able to test most of the features, right? If so, we should add a notice asking people to download things from GitHub to avoid network bottlenecks. We can then install the test suite manually on W3C when we issue a Proposed Recommendation.

> 
>> More importantly, what it requires is for the csvw clients to use a URI library that automatically handles redirection. Is that always the case? I do not know. (I tried to use the [P] flag, but that (understandably) does not work because it would instruct Apache to use an external server as proxy, which W3C does not allow for security reasons.)
> 
> Or it requires that developers do this on their own; I typically handle redirects myself to make sure that redirect semantics are handled properly, as most URL libraries don’t honor the details of this very well, IMO.
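
As an illustration of handling redirects by hand, here is a minimal Python sketch of a redirect loop. It is not taken from any test runner mentioned in this thread. The network call is passed in as a `fetch` callable (an assumption made so the logic can be read and tried without a network), which takes a URL and returns a status code and a dict of response headers.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def follow_redirects(url, fetch, max_hops=5):
    """Follow redirects for a GET request and return (final_url, status).

    fetch(url) must return (status, headers), where headers is a dict.
    """
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        if status not in REDIRECT_CODES:
            return url, status
        location = headers.get("Location")
        if location is None:
            # A redirect without a target cannot be followed.
            return url, status
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    # Canned responses standing in for the two servers.
    canned = {
        "http://www.w3.org/2013/csvw/tests/test011/result.json":
            (303, {"Location": "http://w3c.github.io/csvw/tests/test011/result.json"}),
        "http://w3c.github.io/csvw/tests/test011/result.json": (200, {}),
    }
    print(follow_redirects(
        "http://www.w3.org/2013/csvw/tests/test011/result.json", canned.get))
```

Returning the final URL lets a test runner decide which address to use as the base for resolving relative references. A real `fetch` could be built on `http.client`, which does not follow redirects on its own.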
> 
>> Strangely enough: I tried to do a wget on the w3c address, and I got a 404. I then realized that doing a wget on the github.io address also leads to a 404; I am not sure how GitHub handles these requests.
> 
> curl -L http://www.w3.org/2013/csvw/tests/test011/result.json works okay.

O.k. I will not try to understand the difference between curl and wget:-)

Cheers

Ivan

> 
>> So… do you think it is possible to use the test cases with such caveats? Of course it would be nice but I am a bit afraid some clients may have issues handling the 303…
>> 
>> Any good ideas?
> 
> I trust developers to work it out properly. Once we settle on a permanent location, we can make that change. I do want to track down my specific test failures, though.
> 
> Gregg
> 
>> Ivan
>> 
>> P.S. I am still waiting for our system people to set up /.well-known for me.
>> 
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>> 
>> 
>> 
>> 
> 


Received on Friday, 12 June 2015 09:21:37 UTC