Re: forgiving error-handle mode?

On Nov 6, 2013, at 11:06 PM, carmen <_@whats-your.name> wrote:
> 
> i'm wondering if anyone has favored approaches to taming the XHTMLish/pedantic RDF.rb libraries and use them in a manner similar to the way people who have been writing mildly-defective webpages have been getting them parsed/shown 

Both the rdf-rdfa and rdf-microdata gems do regular HTML tag-soup parsing in addition to stricter XHTML.

> i use JSON (with RDF schemas like SIOC/DC/FOAF URIs as key names, to ease export, should i be able to get any of this to work). data is parsed from a wide variety of non-RDF sources mostly either from the web via CSS selectors or from UNIX filesystem via custom triplrs. so none of it is RDF to begin with, and there's a certain wrangling effort. but this is the data i'm primarily interested in, not being published in RDF (tool friction is one piece of that)

Sounds like you'd like a GRDDL parser for the RDF.rb ecosystem. That could be a useful contribution, but you may be able to get what you want through a custom scraper.

Speaking of JSON, have you checked out JSON-LD? It tries to do much of what you discuss by interpreting JSON data.
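
As a rough illustration of the idea (not the JSON-LD algorithm itself, and not the json-ld gem's API), a context maps short JSON keys to vocabulary URIs. The sketch below uses only Ruby's stdlib; the document, the example.org URIs, and the expand_keys helper are all hypothetical.

```ruby
require 'json'

# Hypothetical input: ordinary JSON plus an @context mapping keys to vocabulary URIs.
doc = JSON.parse(<<~JSON)
  {
    "@context": {
      "title":   "http://purl.org/dc/terms/title",
      "creator": "http://rdfs.org/sioc/ns#has_creator"
    },
    "@id": "http://example.org/post/1",
    "title": "Hello",
    "creator": "http://example.org/user/carmen"
  }
JSON

# Simplified term expansion: swap each key that appears in the context for its URI.
# Real JSON-LD expansion does far more (nested contexts, type coercion, and so on).
def expand_keys(doc)
  context = doc.fetch('@context', {})
  doc.reject { |key, _| key == '@context' }
     .transform_keys { |key| context.fetch(key, key) }
end

expanded = expand_keys(doc)
puts JSON.pretty_generate(expanded)
```

Keys without a context entry (here "@id") pass through unchanged, so the data stays plain JSON until you decide to map it.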

> maybe a RDF::AllowMe option can be globally set? with warnings to stderr
> 
> Addressable::URI likes to throw exceptions about URIs not being valid URIs
>  - exceptions thrown on the document writer method,
>    not individual-triple level, thus losing entire doc because some URI was wonky (tried adding a rescue right on the add_triple call but the error'd already bubbled up to a containing context somehow)

The forthcoming 1.1 release discards Addressable in favor of a native URI parser; it should be much more tolerant of bad URIs when not validating.
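
In the meantime, one way to avoid losing a whole document to a single wonky URI is to filter triples yourself before handing them to a writer, warning on stderr as you go. A minimal sketch, assuming triples are plain arrays of strings; it uses Ruby's stdlib URI rather than anything in RDF.rb, and absolute_uri? and keep_valid are hypothetical helper names.

```ruby
require 'uri'

# True when +str+ parses as an absolute URI (one with a scheme).
def absolute_uri?(str)
  URI.parse(str).absolute?
rescue URI::InvalidURIError
  false
end

# Keep triples whose subject and predicate are absolute URIs; warn about the rest.
def keep_valid(triples)
  triples.select do |subject, predicate, _object|
    ok = absolute_uri?(subject) && absolute_uri?(predicate)
    warn "skipping invalid triple: #{subject} #{predicate}" unless ok
    ok
  end
end

triples = [
  ['http://example.org/a', 'http://purl.org/dc/terms/title', 'ok'],
  ['http://example.org/a', 'title', 'relative predicate'],
  ['http://example.org/a', 'http://bad uri/', 'unparseable predicate']
]

valid = keep_valid(triples)
puts valid.length
```

This checks only the subject and predicate positions; object URIs and literals would need their own rules.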

>  it hates statements now? im pretty sure this worked a few months ago:

It likes statements! It just prefers valid statements. If you hit a specific problem when not validating, please provide a use case.

> RDF::WriterError: Statement #<RDF::Statement:0x5400646(<http://www.w3.org/1999/02/22-rdf-syntax-ns#Alt> </frequency> "16"^^<http://www.w3.org/2001/XMLSchema#integer> .)> is invalid
> 
> this one looks OK  (globally-resolving predicate URI would be nice) 

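Resolving relative URIs is something done by a parser; in the abstract RDF syntax, all URIs must be valid. The predicate </frequency> in your statement is a relative reference, which is why the statement is rejected.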
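
If you are building statements by hand rather than going through a parser, you can resolve the reference against a base yourself before creating the statement. A small sketch with Ruby's stdlib URI.join; the base URI here is a made-up example.

```ruby
require 'uri'

base = 'http://example.org/data/' # hypothetical base URI for your data

# A reference starting with "/" resolves against the host root.
rooted = URI.join(base, '/frequency').to_s

# A bare reference resolves against the base path.
nested = URI.join(base, 'frequency').to_s

puts rooted
puts nested
```

Either result is an absolute URI that can serve as a predicate; which one you want depends on where the vocabulary is meant to live.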

> read source code and saw a "validate => :false" option, i sprinkled this into all my calls, but it didn't help
> so i guess some other syntactic candy that created the writer like "for" doesnt take those hash args, or who knows.

The :validate option is pretty universal, at least for readers, but it means different things to different readers. It's not always easy/possible to be as lax as you might like.

> likely one could trudge through and figure out a way around each and every error, and be a true pedantic masochist, but since i'm not getting paid, and this is just to see if the RDF i/o portion of my webserver still works , in case anyone would want it ("theoretical needs" is not what drives my design decisions, i wanted a simple minimalist ruby webmail/feed-aggregator/filesystem-triplizer and it turned out that SIOC was good for message data-fields, W3 had some POSIX schema via rww.io/data.fm, DC was good for primitive title/abstract properties etc)
> 
> consider myself a pedant, and this library is not disappointing me
> in short, XHTML2 vs TagSoup, with RDF

Send in some concrete use cases and let's see what we might be able to do. (I'm not getting paid for this either.)

Gregg

Received on Thursday, 7 November 2013 14:07:07 UTC