glean.py, a GRDDL implementation based on redland and xsltproc

""" glean.py -- GRDDL client implementation

USAGE:
  python glean.py output.rdf output.rdf http://.../input.html
  python glean.py output.rdf output.rdf http://.../input input-copy

  to glean RDF statements as per transformations in input
  and write them to output.rdf.

  input-copy.html is an optional local copy of the input document.


REFERENCES:
 GRDDL Data Views: Getting Started, Learning More
 http://www.w3.org/2003/g/data-view
and
 Gleaning Resource Descriptions from Dialects of Languages (GRDDL)
 http://www.w3.org/2004/01/rdxh/spec


REQUIREMENTS:
  /usr/bin/xsltproc

TODO: peeking-into-profile-docs (working; need to update reference docs)
TODO: general XML support
TODO: peeking-into-namespace docs
TODO: better command-line arg parsing/ordering

ISSUE: I'd rather use pipes than temp files,
 but spawn() is more secure than popen()
 since it allows args to be passed individually
 rather than as one command string to be parsed
 again by a shell.


LICENSE: Open Source: Share and Enjoy.

GRDDL Workspace: http://www.w3.org/2003/g/

Copyright 2002-2003 World Wide Web Consortium, (Massachusetts
Institute of Technology, European Research Consortium for
Informatics and Mathematics, Keio University). All Rights
Reserved. This work is distributed under the W3C(R) Software License
  http://www.w3.org/Consortium/Legal/2002/copyright-software-20021231
in the hope that it will be useful, but WITHOUT ANY WARRANTY;
without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE.
"""

# See change log at end
__version__ = '$Id: glean.py,v 1.6 2004/05/28 19:16:25 connolly Exp $'
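
The ISSUE note in the docstring above trades pipes for temp files so that
arguments reach the child process one by one, with no shell in between.
A minimal sketch of that pattern follows. It is not the code from
glean.py: the helper names xslt_argv and spawn_to_tempfile are
hypothetical, the xsltproc path is the one listed under REQUIREMENTS, and
the sketch assumes a Unix-like system where -o tells xsltproc which file
to write.

```python
import os
import tempfile

XSLTPROC = "/usr/bin/xsltproc"  # path taken from the REQUIREMENTS note


def xslt_argv(stylesheet, source, out_path):
    """Build the argument vector (hypothetical helper).

    Each value becomes exactly one argument of the child process, so
    spaces or semicolons in a name are never interpreted by a shell.
    """
    return [XSLTPROC, "-o", out_path, stylesheet, source]


def spawn_to_tempfile(make_argv):
    """Run a command without a shell; return (exit_status, output_path).

    make_argv is called with the temp file path and must return the
    argument list, whose first element is the program to run.
    """
    fd, out_path = tempfile.mkstemp(suffix=".rdf")
    os.close(fd)  # the child writes the file; we only reserve the name
    argv = make_argv(out_path)
    # os.spawnv hands argv to the program as-is; no command string is built.
    status = os.spawnv(os.P_WAIT, argv[0], argv)
    return status, out_path
```

A call such as
spawn_to_tempfile(lambda out: xslt_argv("t.xsl", "in.html; rm -rf x", out))
would pass the odd file name to xsltproc as a single argument, whereas a
popen() command string containing it would have been split and run by the
shell. The cost is the temp file: spawn gives no pipe to read from, so
the result has to be read back from out_path afterwards.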

http://www.w3.org/2003/g/glean.py

More on why I wrote it separately...

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Friday, 28 May 2004 15:19:45 UTC