- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Tue, 5 Aug 2003 17:24:03 +0300
- To: "Brian McBride" <bwm@hplb.hpl.hp.com>, "ext Martin Duerst" <duerst@w3.org>
- Cc: "rdf core" <w3c-rdfcore-wg@w3.org>, "i18n" <w3c-i18n-ig@w3.org>
- Message-ID: <000801c35b5d$3861de50$f89216ac@NOE.Nokia.com>
[sorry, Brian, for jumping in here, but...]

Martin,

I appreciate the position you present in the post below, but I must stress the point that the problem you present is a general problem relating to working with XML fragments, no matter what the context, and *not* a problem with RDF, nor a problem for RDF to fix.

By saying this, I do not mean to suggest that the problem is not important to solve. It is. But not by RDF, and while we have bent over backwards to try to figure out some way to lessen the problem insofar as RDF is concerned, we have not come up with any solution that, all things considered, is better than what is now on the table and reflected in the latest editors' drafts.

Anytime an XML user wishes to deal with anything smaller than a complete XML instance, they will encounter these sorts of issues. RDF is not creating this problem. If RDF were to provide one solution, then that would simply be inconsistent with another solution provided for some other context. You seem to be big on having consistent treatment, so it puzzles me that you would seek so specialized a solution by RDF specifically.

You appear to be asking us to make RDF inferior for SW purposes in order to address this problem, just a little bit, insofar as RDF alone is concerned, for the sake of some indeterminable number of XML users.

Let's not try to treat the symptoms rather than find a cure. Let the XML folks tell XML users how to deal with this problem in a *general* way when dealing with XML fragments, regardless of the language of encapsulation. E.g., have someone dust off the XML Fragment Interchange [1] spec, make sure it does the right things, and then tell folks to use it *everywhere* they deal with XML fragments, including with RDF.

RDF is not going to be able to solve this general XML problem. Certainly not at this point, given the fact that we should have been finished with all this stuff well over a *year* ago!

Can we *please* stop spinning our wheels on this and move on?

Thank you.

Regards,

Patrick

[1] http://www.w3.org/TR/xml-fragment

----- Original Message -----
From: ext Martin Duerst
To: Brian McBride
Cc: rdf core; i18n
Sent: 05 August, 2003 16:34
Subject: Re: RDF Use Case: scraping metadata from the web

Hello Brian,

At 15:16 03/08/04 +0100, Brian McBride wrote:

>I'm still at the point of looking for a use case to demonstrate that
>markup integrity is a real problem.

For some people, it is important. For others, it may not be important. For the RDF Core WG, the graph is obviously very important, and the triples. If somebody created a new language to serialize RDF, and this new language would mess up graphs, I guess you would not be happy. If this currently happened with RDF/XML, or if some XML group changed XML so that it could happen, I guess you would not be happy.

So I guess you should be able to understand that other people will not be happy at all if their markup is arbitrarily changed. It's not necessarily the people in the I18N WG who are most concerned with markup integrity (although I think we actually are). But assume some third party wants to use RDF to scrape metadata from XML documents, and this third party is concerned about markup integrity, either because s/he is just convinced that markup is crucial, or because of concerns for various round trip scenarios.

After all, any user can scrape plain text literals (with language information), put them into RDF, and get them back unchanged. Do you (the RDF Core WG) or we (the I18N WG) have a detailed use case for this? Or do we all just agree, even without ever really talking about it, that it would be a very bad idea if plain literals suddenly got changed, e.g. if RDF suddenly upper-cased all plain literals?

So let's assume that people in the XML community are concerned in a similar way about markup integrity as people in the RDF community are concerned about triple and graph integrity.

So a person who is concerned about markup integrity does some scraping or something similar. They are faced with the following alternatives:

1) Preserve the markup, ignore the language information
2) Change the markup, squeeze in an additional element to attach language information.
3) Put the language information somewhere else

3) does not work because then language information is lost for purposes such as glyph disambiguation and text-to-speech. So the user is faced with the question: Do I preserve markup, or do I preserve language information?

Seen from an I18N viewpoint, if we get to this point, we already have lost. From our experience, we know that the users unfortunately in most cases will just take the easy way out, even if they don't explicitly weigh the alternatives. That means choosing 1), and thus losing language information, the wrong thing from an i18n perspective.

The fact that the users not only have to change the markup, but that they have to think about how to change it (which element to use) and that this may depend on circumstances (e.g. <div> vs. <span> in a very simple HTML case), which may significantly complicate the extraction logic, doesn't help at all in pushing people towards conserving language information. Bad for i18n.

There is a fourth alternative that users may take in some cases, which is to strip all the markup so that they can maybe use some language info (or maybe not). Of course, losing markup is also bad for i18n.

>You suggested that your issue has to do with multiple users doing the same
>thing differently and I asked you to refine the use case we have been
>discussing to better illustrate your issue.
>
>I don't see how this use case illustrates a problem with markup integrity;
>rather it assumes that problem.

Yes, to some extent, we have to assume it as a problem because we know that others see it as a problem.

Hope this helps.

Regards,

Martin.

Received on Tuesday, 5 August 2003 10:24:06 UTC