Re: HTML and XML from Bijan Parsia on 2009-02-16 (www-tag@w3.org from February 2009)

From: Bijan Parsia <bparsia@cs.manchester.ac.uk>
Date: Mon, 16 Feb 2009 21:13:41 +0000
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Bijan Parsia <bparsia@cs.man.ac.uk>, www-tag@w3.org
Message-Id: <5EDC23AB-5B71-47A7-ADB5-4C96E75C0F0A@cs.manchester.ac.uk>
One more time...

On 16 Feb 2009, at 20:54, Julian Reschke wrote:

> Bijan Parsia wrote:
>>> The procedure is to run the XML through an XML parser. How to  
>>> invoke that parser is platform-specific.
>> I feel confident that if those are your instructions, that will not  
>> be sufficient.
>
> These are not instructions. And I'm not supposed to come up with them.

Then I don't understand what you hope to accomplish with this  
exchange. First, you doubt my claim. I try to substantiate it and you  
move on. I point out problems with some of your recommendations and  
observations and you, afaict, just disavow them.

Given that it's evidently non-trivial to come up with bulletproof  
instructions for this "simple task", I take that as a concession.

>>> I only mentioned IE because that's something available to  
>>> something like 90% of the users, out of the box.
>> I think you're off track. The question, as I understood it, was  
>> whether XML is basically usable enough to warrant requiring and  
>> expecting producers to produce only well-formed XML.
>> It's pretty clear, from our discussion alone, that it is not at all  
>> obvious that XML is remotely usable for broad swaths of the population.  
>> It's unclear, of course, whether heroic parsing would help. But  
>> I've presented a real case where it would have.
>
> Could you please define "population"? The full population? Software  
> developers?

I would say both.

> I'll be the first one to agree that XML is not something for  
> everybody, but I thought we were talking about CS students?

I pointed to CS students since they were more likely, I think a  
priori, to do better. Prima facie, if they have difficulty, it's  
unlikely that arbitrary populations will have an easy time.

>>>> using a browser in a way that many (most) users of browsers would  
>>>> not expect to use it or a rather obscure tool. Furthermore, your  
>>>> instructions are incomplete, as I'm pretty sure that a .txt  
>>>> suffix on the file name for this content:
>>>> """<test>
>>>>   <foo>dfdf<b>fd</foo></b>
>>>> </test ref="dfsdf>"""
>>>> will load it without giving any errors. (Checked, so it did.) And  
>>>> if I serve it with the right mime type, even the .xml won't help.
>>>
>>> Yes. So? Works as designed. Teach people how to do it right.
>> I see that you aren't interested in investigating the usability of  
>> XML. Oh well.
>
> Yes, "oh well".
>
> The fact that feeding text/plain into IE causes it to process it  
> as text/plain is a feature.
>
> If you think this is a problem, tell the students not to.

I think you are confused. I'm not asking you for pedagogic advice. If  
you read what I've written and come to the conclusion that the  
explanation for the observations I reported is that I Am The Problem,  
well, that's your business. I don't think it's a fruitful methodology.

>>>> I reiterate that it is, prima facie, non-trivial in many  
>>>> computing environments to produce well formed XML.
>>>
>>> It may not be trivial to produce it, but it *is* trivial to test it.
>> My example above shows that that's false. Furthermore, testing  
>> doesn't mean that producing it is easy. If correcting is too  
>> difficult, people will give up and either publish what they have or  
>> not publish at all.
>
> Testing is easy for anybody who really wants to.

Really. Interesting. I would be interested in your evidence for that.  
I would also be interested in the metric for *really* wanting to.

> And testing will tell you whether it's well-formed.

Sure. But I fail to see how this is connected to this thread.

> Now interpreting well-formedness error messages may be tricky. In  
> case of obscure messages, the problem usually can be managed by  
> making the input smaller. Just as in any other computer language.

Ok, so we agree that well-formed XML is not particularly easy to  
produce. Great.

>>>> ...
>>>> In fact, the problems tended to occur in elements I didn't *care*  
>>>> about. So, in order to extract some data, I have to fix all the  
>>>> well-formedness errors *then* use my XQuery?
>>>> ...
>>>
>>> Actually, the producer is supposed to fix the bug, not the  
>>> consumer :-)
>> Thus, I should leave that data inaccessible to me until the  
>> producer fixes it?
>
> Depends on your priorities.
>
> If you decide to fix the problem yourself, how can you be sure that  
> your interpretation of the data is correct?

Interpreting data is always an issue. Being well-formed won't  
eliminate that. But, for example, I can (perhaps) communicate better  
with the producer. I can compare with other sources. I can estimate  
the probable severity of potential errors. Etc.
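
To illustrate the earlier point about errors sitting in elements I
don't care about, here is a rough sketch of error-tolerant extraction.
It is an assumption-laden stand-in, not XQuery and not an XML parser:
it uses Python's standard-library html.parser.HTMLParser, which
tokenizes tags without enforcing well-formedness. The class name
ElementText and the function text_of are hypothetical.

```python
from html.parser import HTMLParser


class ElementText(HTMLParser):
    """Collect character data found inside elements with a given name.

    Hypothetical helper: tags are tokenized leniently, so mismatched or
    broken markup elsewhere does not abort the extraction.
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted.lower()  # HTMLParser lowercases tag names
        self.depth = 0                # how many wanted elements are open
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == self.wanted:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.wanted and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def text_of(document, element):
    """Return the concatenated text inside every `element` in document."""
    parser = ElementText(element)
    parser.feed(document)
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    sample = '<test>\n  <foo>dfdf<b>fd</foo></b>\n</test ref="dfsdf>'
    print(text_of(sample, "foo"))
```

Whether the text recovered this way is what the producer meant is
exactly the interpretation question above; the sketch only shows that
the data need not be wholly inaccessible.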

Cheers,
Bijan.
Received on Monday, 16 February 2009 21:14:15 UTC