Re: 'Official' site? from Rob McCool on 2003-01-07 (public-rdf-tap@w3.org from January 2003)

From: Rob McCool <robm@robm.com>
Date: Tue, 7 Jan 2003 10:24:10 -0800
To: Scott MacKay <scottmackay@yahoo.com>
Cc: public-rdf-tap@w3.org
Message-ID: <20030107102410.B3374@flapjack.stanford.edu>

> [sorry if this is a repeat]

[it isn't]

> Hello!
> I was curious, where is the actual maintained
> 'official' site for TAP?  I poked around
> tap.stanford.edu and could download it, but was
> unsure if that was the main site.  

That's the site.

> I did not see any documentation (normal or javadoc) and was uncertain
> how maintained the product was.

The product is maintained, progress in feature addition is not as fast as
I'd like it to be but we're moving along steadily. There are tutorials
on the download page: http://tap.stanford.edu/tap/download.html

There is no javadoc for the toolkit, since the toolkit was written in Perl
first and I didn't follow the Javadoc conventions. However, if you look at
any of the source files, each function/method will have a small paragraph
describing what it does, what is expected to come in, what is expected to
come out, and what is expected to be returned. 

> Also, it seems to be mentioned that you could poll live services (current 
> CD prices comes to mind). Are there any examples distributed which shows 
> how one would do this?

The GetData toolkit for Perl or Java can do this, but the services on 
tap.stanford.edu are currently limited to the core TAP KB. The Semantic 
Search demo was assembled by querying services that scrape HTML sites for
data. These services are not yet queryable from TAP's GetData because they 
use an older version of the protocol. 

Until about a week ago, it didn't make sense to upgrade these services to
the newest version of GetData, because they were also built on an older
Web Services toolkit that was not based on Apache or Perl or Java. We don't
want to maintain the Web Services toolkit, and so the services must be moved
to the new protocol and the new toolkit. We further have had numerous
maintenance problems with the scrapers, as they tend to break when the 
underlying HTML site changes.

These services will be dynamically queryable via the newer GetData some time 
in the next two months. A new system has been developed (about a week ago) 
to perform the HTML scraping in a server based on Apache, and in a way such 
that the scrapers will be more able to dynamically fix themselves when the
underlying HTML changes.

The core GetData toolkit is basically finished. I'm hoping to write a Python
version in the next few months. The TAPache module works reasonably well, but
is difficult to set up and install because it requires Apache development
libraries. A solution to that, such as distributing binaries or RPM's or
using an alternate development system than Apache modules, is something I'm
working on.

Hope that helps.

Received on Tuesday, 7 January 2003 13:24:48 UTC