Re: Fuzz (Windows and Mac OS X builds available) from Manu Sporny on 2009-03-19 (public-rdf-in-xhtml-tf@w3.org from March 2009)

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Thu, 19 Mar 2009 01:29:00 -0400
To: RDFa Developers <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <49C1D81C.3090407@digitalbazaar.com>

KANZAKI Masahide wrote:
> Hi Manu, thanks for the good add-ons for popular platforms.

Hope it helps you out in some way :)

> Unfortunately, Fuzz seems to stop extracting triples from the <body> of a
> page with i18n characters, in addition to the issue in Fuzz Ticket #35
> (UTF-8 characters not displayed correctly). It does present some triples
> from the <head> element, though. I would be grateful if you could check
> this extraction issue along with #35.

If you can point me at a page that creates this error, that would be
very helpful.

Currently, there is working UTF-8 support in librdfa... but
unfortunately, all strings are down-converted to Mozilla CStrings in the
C-side code when translating the data into a form suitable for the
JavaScript callback. That should be a fairly easy fix.
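
To illustrate the kind of loss I mean, here is a minimal Python sketch. It
is not the Fuzz or librdfa code; the function names and the sample literal
are made up for illustration, and it assumes the narrow-string path ends up
treating each UTF-8 byte as one character, which is only one way such a
down-conversion can go wrong.

```python
def widen_bytewise(utf8_bytes):
    """Lossy path (assumed): treat every byte as its own character."""
    return utf8_bytes.decode("latin-1")


def decode_utf8(utf8_bytes):
    """Lossless path: decode the byte sequence as UTF-8."""
    return utf8_bytes.decode("utf-8")


# Hypothetical literal, standing in for the object of an extracted triple.
literal = "神崎正英"
utf8_bytes = literal.encode("utf-8")

mangled = widen_bytewise(utf8_bytes)   # one garbage character per byte
restored = decode_utf8(utf8_bytes)     # the original four characters
```

The bytes themselves are untouched in both cases; only the interpretation
differs, which is why the fix should be confined to the conversion step.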

The i18n characters issue might be Expat freaking out on entities that
it doesn't understand, as it does from time to time. In any case, a
solid example or link to a page would be very helpful in tracking down
and fixing these bugs.
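
For reference, the entity behaviour can be reproduced outside of Fuzz with
Python's standard Expat binding. This is only a sketch of the suspected
failure mode, not a confirmed diagnosis of the reported page; `try_parse`
and the two sample documents are hypothetical.

```python
import xml.parsers.expat as expat


def try_parse(document):
    """Return None if Expat accepts the document, else its error string."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        return expat.ErrorString(err.code)
    return None


# A named HTML entity with no DTD declaring it: Expat stops parsing here.
named_entity_result = try_parse("<p>caf&eacute;</p>")

# The same character as a numeric reference needs no declaration.
numeric_ref_result = try_parse("<p>caf&#233;</p>")
```

If a page fails in the first way, everything after the offending entity is
never seen by the parser, which would fit triples showing up from <head>
but not from <body>.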

Speaking of erroring out on pages, Fuzz has a hard time on malformed
HTML/XHTML web pages. Anybody know of a good TagSoup-to-XHTML transcoder
for C? HTML Tidy isn't cutting it and Taggle is the best I've been able
to find...

-- manu

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
twitter: http://twitter.com/manusporny

Received on Thursday, 19 March 2009 05:29:38 UTC