- From: Wouter Beek <wouter@triply.cc>
- Date: Sun, 12 Nov 2017 23:20:20 +0100
- To: Jörn Hees <j_hees@cs.uni-kl.de>
- Cc: public-sparql-dev@w3.org
- Message-ID: <CAEh2WcMC=ZV6Fn1-r_F5RNMvdz3MqDv2XUFW11rcHaLHQwvL3g@mail.gmail.com>
On Sun, Nov 12, 2017 at 5:41 PM, Jörn Hees <j_hees@cs.uni-kl.de> wrote:
> > ```
> > $ wget --recursive --page-requisites --convert-links --no-parent https://www.w3.org/2009/sparql/docs/tests/
> > ```
>
> Hmm, most of the folders on that page actually show a directory listing in
> HTML, so that wget -r can follow those.
> The only two folders I found to behave in a different way are:
> - https://www.w3.org/2009/sparql/docs/tests/data-sparql11/entailment/
> - https://www.w3.org/2009/sparql/docs/tests/data-sparql11/http-rdf-update/
> They seem to have their own index.html.
>
> I think for rdflib we used these:
> https://github.com/w3c/rdf-tests/

Thanks for the pointer. When I overwrite the version that I obtained from the other web site with my wget method with this (newer?) version from GitHub, I get a sizable diff:

114 files changed, 27245 insertions(+), 6290 deletions(-)

This change is neither purely additive nor limited to trivial edits. For example, there are queries under `/bindings/` that are present on the web site I scraped earlier but that do not appear in the GitHub version.

---
Cheers,
Wouter.
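For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two copies could be compared file by file. It only reports which relative paths exist in one tree and not the other; it does not compare file contents. The function names and the two directory arguments are assumptions for illustration, not part of either test suite distribution.

```python
import os


def relative_files(root, skip_dirs=(".git",)):
    """Return the set of file paths under root, relative to root.

    Directories named in skip_dirs (by default the .git folder of a
    clone) are not descended into.
    """
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk ignores them.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Normalise separators so results look the same on every OS.
            found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found


def compare_trees(scraped_root, github_root):
    """Report which relative paths exist in only one of the two trees."""
    scraped = relative_files(scraped_root)
    github = relative_files(github_root)
    return {
        "only_in_scraped": sorted(scraped - github),
        "only_in_github": sorted(github - scraped),
        "in_both": sorted(scraped & github),
    }
```

Calling `compare_trees` with the directory that wget wrote and the directory of the GitHub checkout (both paths depend on where you ran the commands) would list, under `only_in_scraped`, files such as the `/bindings/` queries that are missing from the GitHub version.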
Received on Sunday, 12 November 2017 22:21:23 UTC