[whatwg] foreign attributes Re: several messages about XML syntax and HTML5

From: Elias Torres <elias@torrez.us>
Date: Tue, 5 Dec 2006 19:02:53 -0500
Message-ID: <905f7c910612051602q2dc52b24saaa1666959d2c076@mail.gmail.com>
On 12/5/06, Ian Hickson <ian at hixie.ch> wrote:
> On Tue, 5 Dec 2006, Elias Torres wrote:
> >
> >    <p class="ibm-order">
> >      <span property="ibm-customer">
> >       <span property="ex-name">Ian Hickson</span>
> >       (<span property="acme-id">95237032895</span>)
> >      </span>
> >      has purchased a
> >      <span property="ibm-part">
> >       <span property="ex-name">Widget x12</span>
> >       (part ID <span class="acme-id">295250X12</span>)
> >      </span>
> >     </p>
> >     <p property="ibm-order ibm-deleted">
> >      ...
> >     </p>
>
> So basically the same thing, ok. So we agree on the syntax.
>
>
> > > What would this look like in your ideal world? Could you give some
> > > examples of what the above would be like, with code samples?
> >
> > The "generic" extractor example I have in python. There's also a
> > Javascript equivalent to that code.
> >
> > http://svn.rdflib.net/trunk/rdflib/syntax/parsers/RDFaParser.py
> >
> > I'm very familiar with the code required to parse it and it's not hard
> > at all; the problem is that the code is specific to that structure.
> > Every time we have a new structure, we have to write that code again.
> > Also, that code is very dependent on the tree structure.
>
> Ok... could you give an example of what the code to process data like the
> above would look like? Not the generic parser part, I mean the code that
> makes a list of the orders as {customer id, part id} tuple, with deleted
> orders omitted, or whatever it is you would do with this data.
>
>
> >    <p id="order1" class="ibm-order">
> >      <span property="ibm-customer">
> >       <span property="ex-name">Ian Hickson</span>
> >       (<span property="acme-id">95237032895</span>)
> >      </span>
> >     </p>
> >     ....
> >    <p>
> >      has purchased a
> >      <span about="order1" property="ibm-part">
> >       <span property="ex-name">Widget x12</span>
> >       (part ID <span class="acme-id">295250X12</span>)
> >      </span>
> >     </p>
>
> The key point here being the reference to an earlier blob in the same
> page, right?
>
> Interesting. That's something that currently can only really be done with
> tables, <output>, and hyperlinks; I wonder if we should add a fourth way
> that is more convenient for Microformat-like data.
>
> --
> Ian Hickson               U+1047E                )\._.,--....,'``.    fL
> http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
> Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
>

A few comments on your code. It can handle elements n levels deep, but
not really nested properties. You are also assuming that the content of
the element is the entire value of the property. Consider:

<div> <span class="ibm-part-description">our part number <span
class="part-id">123</span></span></div>

Microformats are very restrictive in how you can parse the data. We
need the flexibility to specify the content anywhere, and to have a
property apply to any element on the page, not just the parent
element.
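
As a rough illustration of the kind of extraction described above, here
is a minimal Python sketch using only the standard library's html.parser.
It is not the rdflib RDFaParser and does not implement RDFa. The names
PropertyExtractor and extract_triples, and the subject rule (an "about"
attribute wins, then an "id", otherwise the subject is inherited from the
parent element), are assumptions made for this example. The markup is the
order example quoted earlier in this message.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (subject, property, text) triples from `property` attributes.

    Hypothetical helper for illustration only; not a real RDFa parser.
    """

    def __init__(self):
        super().__init__()
        self.stack = []    # one frame per currently open element
        self.triples = []  # (subject, property, text) in closing order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        parent_subject = self.stack[-1]["subject"] if self.stack else None
        # Assumed rule: `about` names the subject explicitly, an `id`
        # starts a new subject, anything else inherits from the parent.
        subject = attrs.get("about") or attrs.get("id") or parent_subject
        self.stack.append({
            "tag": tag,
            "subject": subject,
            "props": (attrs.get("property") or "").split(),
            "text": [],
        })

    def handle_data(self, data):
        # Text counts toward every open element, so an outer property
        # keeps the text of the properties nested inside it.
        for frame in self.stack:
            frame["text"].append(data)

    def handle_endtag(self, tag):
        # Ignore stray end tags that match no open element.
        if not any(frame["tag"] == tag for frame in self.stack):
            return
        # Close everything up to and including the matching element.
        while self.stack:
            frame = self.stack.pop()
            self._emit(frame)
            if frame["tag"] == tag:
                break

    def _emit(self, frame):
        # Collapse runs of whitespace so indentation does not leak in.
        text = " ".join("".join(frame["text"]).split())
        for prop in frame["props"]:
            self.triples.append((frame["subject"], prop, text))


def extract_triples(html):
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.triples


SAMPLE = """
<p id="order1" class="ibm-order">
  <span property="ibm-customer">
    <span property="ex-name">Ian Hickson</span>
    (<span property="acme-id">95237032895</span>)
  </span>
</p>
<p>
  has purchased a
  <span about="order1" property="ibm-part">
    <span property="ex-name">Widget x12</span>
    (part ID <span class="acme-id">295250X12</span>)
  </span>
</p>
"""

for triple in extract_triples(SAMPLE):
    print(triple)
```

Nothing in the extractor names a specific property or depends on where
in the tree a property sits. The span marked only with class="acme-id"
produces no triple of its own, though its text still contributes to the
enclosing ibm-part value.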

Also, remember, we are going after a declarative mechanism that binds
"structure" to presentation and we don't know ahead of time all of the
properties that are attached to a structure.
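
To make the "we don't know the properties ahead of time" point concrete,
here is a small Python sketch of consumer code that groups whatever
properties it is handed by subject, without naming any of them. The
TRIPLES list and the group_by_subject function are hypothetical; the
values are taken from the order example earlier in this message.

```python
from collections import defaultdict

# Hypothetical input: (subject, property, value) triples such as a
# generic extractor might produce for the order example.
TRIPLES = [
    ("order1", "ibm-customer", "Ian Hickson (95237032895)"),
    ("order1", "ibm-part", "Widget x12 (part ID 295250X12)"),
    ("order1", "acme-id", "95237032895"),
]


def group_by_subject(triples):
    """Return {subject: {property: [values]}} for any set of properties."""
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, prop, value in triples:
        grouped[subject][prop].append(value)
    # Convert to plain dicts so callers get ordinary KeyError behaviour.
    return {subject: dict(props) for subject, props in grouped.items()}


orders = group_by_subject(TRIPLES)
for subject, props in orders.items():
    print(subject, sorted(props))
```

A new property appearing in the markup shows up as a new key here with
no change to the code.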

-Elias
Received on Tuesday, 5 December 2006 16:02:53 UTC
