Re: XMP and HTML 2.0 and 3.0

Michal Young (young@cs.purdue.edu)
Tue, 2 May 1995 10:01:05 -0500


Message-Id: <199505021457.JAA18240@aleta.cs.purdue.edu>
Date: Tue, 2 May 1995 10:01:05 -0500
To: babafou@ensta.fr
From: young@cs.purdue.edu (Michal Young)
Subject: Re: XMP and HTML 2.0 and 3.0
Cc: www-html@www10.w3.org

>OK but what should I use if I want a block of text to be inserted without
>any interpretation of HTML tags? I really need it. Did I miss something?

It's relatively easy to write a filter that will "protect" each piece of
HTML code by replacing special characters with decimal character codes
(numeric character references, e.g. &#060; for <).  I use the following C
function in a filter for program listings (I'm cutting and pasting from a
larger file, so I hope I don't mess it up).



#include <ctype.h>              /* isprint */
#include <stdio.h>              /* sprintf */

/*
 * Purify a string: escape whatever needs escaping. Pass it a buffer large
 * enough for the purified string to fit (each escaped character becomes the
 * 6 characters "&#ddd;", so allow 6 times the max string length, plus 1 for
 * the terminating NUL). The return value is a shared reference to the
 * buffer, for convenience in using Purify_String in a call to printf. Do
 * NOT use the same buffer in two nested function calls (e.g., a printf with
 * two string arguments), since this would alias the buffer.
 */
char           *Purify_String(char *s, char *buffer);

/*
 * escmap[ch] = 0 if ch can be echoed as itself, -1 if ch should be printed
 * as a decimal character code (&#ddd;)
 */
static int      escmap[256];

void
init_charmap(void)
{
        /*
         * We will escape: non-graphic characters, every code from 127 up,
         * and the html special characters < > & " (plus backslash)
         */
        static const char *impure = "<>&\"\\";
        int             i;      /* int, not char: a signed char would give
                                 * negative indices for codes 128..255 */
        for (i = 0; i <= 126; ++i) {
                /* Default: just print it; escape anything non-printable */
                escmap[i] = isprint(i) ? 0 : -1;
        }
        for (i = 127; i <= 255; ++i) {
                escmap[i] = -1;
        }
        for (i = 0; impure[i] != '\0'; ++i) {
                escmap[(unsigned char) impure[i]] = -1;
        }
}



/* other stuff here ... */

/*
 * Purify a string: escape whatever needs escaping. Pass it a buffer large
 * enough for the purified string to fit (max 6 times string length, plus 1
 * for the terminating NUL). The return value is a shared reference to the
 * buffer, for convenience in using Purify_String in a call to printf. Do
 * NOT use the same buffer in two nested function calls (e.g., a printf with
 * two string arguments), since this would alias the buffer.
 */
char           *
Purify_String(char *s, char *buffer)
{
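        unsigned char   ch;     /* Unsigned, so that codes 128..255 index
                                 * escmap safely and print as 128..255 */
        int             si, bi; /* Index into input string, output buffer */
        for (si = 0, bi = 0; s[si]; si++) {
                ch = (unsigned char) s[si];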
                if (escmap[ch] == 0) {
                        buffer[bi++] = ch;
                } else if (escmap[ch] == -1) {
                        sprintf(&buffer[bi], "&#%03d;", ch);
                        bi += 6;
                } else {
                        buffer[bi++] = '?';
                };
        };
        buffer[bi] = 0;
        return buffer;
}
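
For anyone who would rather not size the buffer by hand, here is a rough
sketch of a wrapper that allocates the worst case (6 bytes per input
character, plus the NUL) and escapes into it.  The names escaped_capacity,
escape_into and escape_alloc are made up for this illustration and are not
part of my filter; escape_into is a compact stand-in for Purify_String that
assumes plain ASCII rules (32..126 printable) instead of the escmap table,
so that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: worst-case size of the escaped form of s, counting
 * 6 bytes ("&#ddd;") per input character plus 1 for the terminating NUL.
 */
size_t
escaped_capacity(const char *s)
{
        return 6 * strlen(s) + 1;
}

/*
 * Hypothetical stand-in for Purify_String: escapes < > & " \ and anything
 * outside printable ASCII (32..126) as a decimal reference &#ddd;
 */
char *
escape_into(const char *s, char *buffer)
{
        size_t          bi = 0;
        for (; *s; ++s) {
                unsigned char   ch = (unsigned char) *s;
                if (ch >= 32 && ch <= 126 && strchr("<>&\"\\", ch) == NULL) {
                        buffer[bi++] = (char) ch;
                } else {
                        /* sprintf returns the number of characters written */
                        bi += (size_t) sprintf(&buffer[bi], "&#%03d;", ch);
                }
        }
        buffer[bi] = '\0';
        return buffer;
}

/*
 * Allocate a worst-case buffer, escape s into it, and return it.  The
 * caller frees the result; returns NULL if malloc fails.
 */
char *
escape_alloc(const char *s)
{
        char           *buffer;
        assert(s != NULL);
        buffer = malloc(escaped_capacity(s));
        return buffer ? escape_into(s, buffer) : NULL;
}
```

With this, escape_alloc("if (a < b)") yields the string "if (a &#060; b)",
which can be printed between <PRE> and </PRE> and then freed.  Allocating
per call also sidesteps the buffer-aliasing problem mentioned in the
comments above, at the cost of a malloc for each string.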

----------------------
Michal Young
Purdue University
Software Engineering Research Center
Department of Computer Sciences
1398 Computer Science Building
West Lafayette, IN  47907-1398
voice: 317-494-6023
fax:   317-494-0739
URL:   http://www.cs.purdue.edu/people/young
-----------------------