Re: On microformat 'schemas'

Dan,

I had this discussion with DanC while he was at the 'tute [1]. He's
trying to use profile URLs in microformats to generate an OWL ontology
for microformats. DanC assured me that the ontologies are very
straightforward (basic classes, properties and cardinality
restrictions). We also discussed a JSON schema to do validation, and he
already uses a very similar syntax in Python when parsing text
RFCs (like iCal). The point being: what *if* we could round-trip
between JSON, HTML and RDF and get some validation along the way?
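
To make the idea concrete, here is a rough sketch of what validating
against such a schema might look like. This is not DanC's code and not
Steve Farrell's syntax; the schema layout, the cardinalities and the
names SCHEMA_JSON and validate are all made up for illustration. The
only point is that the schema itself is plain JSON carrying a class,
its properties and a min/max cardinality for each.

```python
import json

# Hypothetical schema, written as plain JSON. The cardinalities are
# illustrative only and are not taken from the hCard specification.
SCHEMA_JSON = """
{
  "class": "vcard",
  "properties": {
    "fn":    {"min": 1, "max": 1},
    "url":   {"min": 0, "max": null},
    "email": {"min": 0, "max": null}
  }
}
"""


def validate(record, schema):
    """Check a parsed record (property name -> list of values) against
    the schema. Returns a list of problems; an empty list means valid."""
    problems = []
    props = schema["properties"]
    for name, card in props.items():
        count = len(record.get(name, []))
        if count < card["min"]:
            problems.append(f"{name}: expected at least {card['min']}, got {count}")
        # A max of null (None in Python) means unbounded.
        if card["max"] is not None and count > card["max"]:
            problems.append(f"{name}: expected at most {card['max']}, got {count}")
    for name in record:
        if name not in props:
            problems.append(f"{name}: not defined for class {schema['class']}")
    return problems


schema = json.loads(SCHEMA_JSON)
print(validate({"fn": ["Dan Brickley"], "url": ["http://danbri.org/"]}, schema))
print(validate({"url": ["http://danbri.org/"]}, schema))
```

The record passed in is whatever a microformat parser would hand back,
so the same check could in principle sit between the HTML, JSON and RDF
forms.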

I'll ask him for his Python source, which looked very similar to Steve
Farrell's JSON schema, except that it was valid JSON and not in BNF
format. It's funny how Sergio Tessaris asked me for a JSON schema the
first time he saw the SPARQL XML Results in JSON note during the 2006
TP. My first reaction was: do you really think so?

-Elias

[1] http://dig.csail.mit.edu/breadcrumbs/node/133

On 6/4/06, Dan Brickley <danbri@danbri.org> wrote:
>
> From the "I saw this and thought of you" department...
>
> http://smackman.com/2006/06/01/an-old-idea/
>
> Has an interesting account of some current difficulties with
> microformats, as well as a proposal for a schema system. I haven't
> digested the latter, but certainly agree with the problem analysis:
>
> Excerpting,
> [[
> I've been giving some thought to parsing microformats lately. A few
> threads seem to be converging
>
> The first is that it's hard to parse microformats. You can hand-write a
> parser in a little bit of time that's 80% right. But getting all of the
> hcard rules, e.g., encoded is tricky. It's reasonable to assume,
> therefore, that there are a lot of 80% parsers out there like the one I
> wrote for my Ray Ozzie Clipboard example.
>
> The second issue relates to hatom, which uses different class names for
> the same concept at different scopes. For example, the entry title is
> called entry-title not title. I asked Ryan about this when I saw him at
> www2006, and he told me that they vacillated on this decision, but they
> settled on entry-title because people can nest other microformats inside
> hatom, and so it would be easier for the parser writers if there were no
> colliding class names, even in different microformats. In fact, he
> suggested that they'd probably made a mistake with hcard, since the class
> names were so likely to collide with other microformats. Ok, so in other
> words entry-title is a hack around the problem of it being hard to parse
> microformats, and we can expect more of these.
>
> When I bumped into Brian at the same event, I commented that
> microformats really have a problem with nesting. He agreed. He said it
> put a burden on the parser writer to potentially have to understand all
> microformats in order to reliably parse web pages that contain them.
>
> So,
>
>    1. It's a lot of trouble to write a parser
>    2. Bad parsers will proliferate
>    3. Microformats are evolving toward being easier to parse, not easier
> to create
>    4. It's not clear how you can nest microformats w/o knowing how
> parsers will behave
>    5. Users are discouraged from inventing their own specialized
> microformats, presumably because of the risk of collisions and
> difficulty others will have in parsing them
>
> [...]
> ]]
>
>
> Dan
>
>
>
>

Received on Sunday, 4 June 2006 12:46:02 UTC