- From: Matt May <mcmay@bestkungfu.com>
- Date: Thu, 28 Dec 2000 18:54:45 -0800
- To: "Kynn Bartlett" <kynn-edapta@idyllmtn.com>
- Cc: "Anne Pemberton" <apembert@crosslink.net>, "WAI GL" <w3c-wai-gl@w3.org>
----- Original Message -----
From: "Kynn Bartlett" <kynn-edapta@idyllmtn.com>

> Matt, not everyone on this list has the same depth of technical
> know-how -- can you explain what the dangers of search-and-replace
> are, and why the IBM webmasters would not want to run a sed script
> over their content storage system?

1) human error

The script that I run replaces DOS with <acronym title="Disk Operating System>. (Note the missing close quote.) Result: every instance of "DOS" in my database is replaced with invalid markup. The entire database now contains corrupt HTML. If I did this programmatically for every acronym, I might as well burn the whole database down and start fresh, because everything is corrupt. (In many applications, such as knowledge bases, where lots of content is stored and updated concurrently, going to a backup database could lose current data, and data loss to a DBA means roughly the same as "malpractice" to a doctor.)

2) false positives

"...causing a distributed <acronym title="Disk Operating System">DOS</acronym> attack..."
"...a list of <acronym title="Disk Operating System">DOS</acronym> AND DON'TS..."

3) unforeseen replaces

Say a tool like this defines "IF" as "infielder". (Did I mention that acronymfinder.com has 20 acronyms for "IF"?) Replace "if" with "infielder" everywhere in a document, and you've basically destroyed every script known to man:

Java/JavaScript/C/Perl:
    <acronym title="infielder">if</acronym> (a != 4) {...}
VBScript:
    <acronym title="infielder">IF</acronym> MYVAR <> 4 THEN ...

...and so on. Every script is broken. If I had a nickel for every time I've seen it happen...

4) impact on other systems

When content is modified wholesale in a database, particularly on a large scale, every related system needs to be tested, and possibly updated: data entry apps will need to be modified to support defining terms as they're added; scripts to dump and load the data will need to be checked; client apps need to be tested as well.
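To make points 2 and 3 concrete, here is a minimal Python sketch of the kind of blind replacement described above, followed by a read-only "dry run" that only lists candidate matches for a person to review. The function names (naive_markup, find_candidates) and the sample strings are hypothetical, chosen for illustration; this is not the script anyone on the list actually runs, and the dry run is just one possible precaution, not a fix.

```python
import re


def naive_markup(text, term, expansion):
    """Blind, case-insensitive search-and-replace (the risky approach).

    Wraps every occurrence of `term` in <acronym> markup without looking
    at context, so it also hits code keywords and unrelated words.
    """
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    return pattern.sub(
        lambda m: '<acronym title="%s">%s</acronym>' % (expansion, m.group(0)),
        text,
    )


def find_candidates(text, term, width=15):
    """Dry run: list whole-word, case-sensitive matches with some context.

    Nothing is modified. Each hit is (offset, surrounding text) so that a
    human can accept or reject it before any content is rewritten.
    """
    pattern = re.compile(r"\b" + re.escape(term) + r"\b")
    hits = []
    for m in pattern.finditer(text):
        start = max(0, m.start() - width)
        end = min(len(text), m.end() + width)
        hits.append((m.start(), text[start:end]))
    return hits


# Point 3: a script fragment is rewritten into something that no longer runs.
broken_script = naive_markup("if (a != 4) {...}", "IF", "infielder")

# Point 2: "DOS" here is not the operating system, but it is tagged anyway.
false_positive = naive_markup(
    "a list of DOS AND DON'TS", "DOS", "Disk Operating System"
)

# The dry run skips the lowercase keyword, but still reports the
# "DOS AND DON'TS" hit, which only a reviewer can recognize as wrong.
script_hits = find_candidates("if (a != 4) {...}", "IF")
prose_hits = find_candidates("a list of DOS AND DON'TS", "DOS")
```

Whole-word, case-sensitive matching removes the "if" problem in this sample, but it does nothing for the false positive: the pattern cannot tell which meaning of "DOS" is intended, which is why the sketch stops at listing candidates rather than writing anything back.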

I'm sure there's more. This is just the stuff I've witnessed.

----
matt
Received on Thursday, 28 December 2000 21:55:01 UTC