xHTML output

I'm using tidy (4Aug00 source) to produce xHTML output and then trying
to validate it against the W3C DTDs.
I have a number of problems/solutions that relate to attribute handling
when the tidy output is given to an XML parser.

1. If XmlOut is on then I remove any attributes which are marked as
PROPRIETARY. I do this in CheckAttribute (attrs.c) as follows, though I'm
not sure of the validity of using RemoveAttribute here:

        if (XmlOut && (attribute->versions & VERS_PROPRIETARY))
        {
                ReportAttrError(lexer, node, attval->attribute,
                                PROPRIETARY_ATTR_VALUE);
                RemoveAttribute(node, attval);
        }
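
On the RemoveAttribute worry: one thing that could go wrong, assuming the
caller walks the attribute list with attval = attval->next and
RemoveAttribute frees the node, is that the loop then reads freed memory.
A minimal standalone sketch of removing nodes safely while walking a singly
linked list follows. The names here (struct attr, attr_push,
drop_proprietary) are made up for illustration and are not tidy's own types
or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an attribute list node (not tidy's AttVal). */
struct attr {
    const char *name;     /* not owned by the node */
    int proprietary;      /* non-zero if the attribute is proprietary */
    struct attr *next;
};

/* Prepend a node; returns the new head, or the old head if malloc fails. */
struct attr *attr_push(struct attr *head, const char *name, int proprietary)
{
    struct attr *node = malloc(sizeof *node);
    if (node == NULL)
        return head;
    node->name = name;
    node->proprietary = proprietary;
    node->next = head;
    return node;
}

/* Remove every proprietary node and return how many were removed.
   The walk keeps a pointer to the link that leads to the current node,
   so a node can be unlinked and freed without touching it afterwards. */
size_t drop_proprietary(struct attr **head)
{
    struct attr **link = head;
    size_t removed = 0;

    assert(head != NULL);
    while (*link != NULL) {
        struct attr *cur = *link;
        if (cur->proprietary) {
            *link = cur->next;   /* unlink first ... */
            free(cur);           /* ... then free */
            ++removed;
        } else {
            link = &cur->next;   /* advance only when nothing was removed */
        }
    }
    return removed;
}
```

If tidy's loop reads attval->next after the call, saving the next pointer
before calling RemoveAttribute would be the equivalent precaution.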

2. If an attribute has a valid name but is not found in the table
(FindAttribute returns null), I remove it in ParseAttrs (lexer.c):

            av->value = value;
            av->dict = FindAttribute(av);
            if (XmlOut && av->dict == NULL)
            {
                    ReportAttrError(lexer, lexer->token, attribute,
                                    BAD_ATTRIBUTE_VALUE);
                    MemFree(attribute);
                    MemFree(value);
                    MemFree(av);    /* assuming av was heap-allocated, free it too or it leaks */
            }
            else
            {
                    list = av;
            }

3. At the moment tidy leaves the case of all attribute values as is. However,
in certain cases the DTD expects lower-case input, e.g. <form
method="POST"> gives an error. There is no simple solution to this, since
the case of a title attribute's value, for example, should remain as is. I
was intending to extend the attribute table with a tolower flag which would
then be checked in PPrintAttributeValue (pprint.c):
        if (XmlOut && attrflag_tolower)
            PPrintChar(ToLower(c), mode);
        else
            PPrintChar(c, mode);

Suggestions on this one welcome.
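
To make the flag idea a bit more concrete, here is a rough standalone
sketch of a per-attribute "fold value to lower case" table. Everything in
it (attr_case_rule, case_rules, attr_value_folds, copy_attr_value) is
hypothetical and separate from tidy's real attribute table; the entries are
only a few examples of attributes whose values the XHTML DTDs declare as
lower-case enumerations.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rule table: which attribute values get lower-cased. */
struct attr_case_rule {
    const char *name;
    int fold_to_lower;    /* non-zero: emit the value in lower case */
};

static const struct attr_case_rule case_rules[] = {
    { "method", 1 },      /* form method: get | post */
    { "align",  1 },      /* left | center | right | justify */
    { "shape",  1 },      /* rect | circle | poly | default */
    { "title",  0 },      /* free text, case must be preserved */
    { "alt",    0 }       /* free text, case must be preserved */
};

/* Returns non-zero if the named attribute's value should be lower-cased.
   Attributes missing from the table are left alone. */
int attr_value_folds(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof case_rules / sizeof case_rules[0]; ++i)
        if (strcmp(case_rules[i].name, name) == 0)
            return case_rules[i].fold_to_lower;
    return 0;
}

/* Copies value into out (truncating to outsize - 1 characters),
   lower-casing it only when the attribute's rule asks for that. */
void copy_attr_value(const char *name, const char *value,
                     char *out, size_t outsize)
{
    int fold = attr_value_folds(name);
    size_t i = 0;

    if (outsize == 0)
        return;
    for (; value[i] != '\0' && i + 1 < outsize; ++i)
        out[i] = fold ? (char)tolower((unsigned char)value[i]) : value[i];
    out[i] = '\0';
}
```

With this, copy_attr_value("method", "POST", ...) yields "post" while a
title value is copied unchanged. In tidy itself the flag would presumably
live in the existing attribute table entry rather than in a separate table.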

4. Perhaps not surprisingly, tidy does not verify attributes against a
tag, i.e. it does not check whether an attribute is allowed on that tag.
This is causing me some grief. There are, I suppose, at least two solutions
to this: (a) maintain a table within the tags/attrs module, check each
attribute against it, and remove the attribute if it is not valid;
(b) perhaps more elegantly, allow tidy to verify against a given DTD
(become a validating parser) and remove disallowed attributes!!
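
Option (a) could look roughly like the standalone sketch below. The table
contents are a deliberately incomplete, illustrative subset, and all names
(tag_rule, tag_rules, is_attr_allowed, filter_attrs) are hypothetical, not
part of tidy. For simplicity the common attributes are treated as allowed
on every listed tag, and a tag missing from the table allows nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative, incomplete attribute lists (NULL-terminated). */
static const char *const common_attrs[] = {
    "id", "class", "style", "title", NULL
};
static const char *const form_attrs[] = {
    "action", "method", "enctype", "accept", "accept-charset", NULL
};
static const char *const img_attrs[] = {
    "src", "alt", "height", "width", "usemap", "ismap", NULL
};

/* Hypothetical per-tag rule: the attributes specific to one tag. */
struct tag_rule {
    const char *tag;
    const char *const *attrs;
};

static const struct tag_rule tag_rules[] = {
    { "form", form_attrs },
    { "img",  img_attrs }
};

static int in_list(const char *const *list, const char *name)
{
    for (; *list != NULL; ++list)
        if (strcmp(*list, name) == 0)
            return 1;
    return 0;
}

/* Returns non-zero if attr is listed for tag (or is a common attribute
   on a listed tag). Unknown tags return 0. */
int is_attr_allowed(const char *tag, const char *attr)
{
    size_t i;

    assert(tag != NULL && attr != NULL);
    for (i = 0; i < sizeof tag_rules / sizeof tag_rules[0]; ++i)
        if (strcmp(tag_rules[i].tag, tag) == 0)
            return in_list(common_attrs, attr)
                || in_list(tag_rules[i].attrs, attr);
    return 0;
}

/* Compacts attrs in place, keeping only names allowed on tag.
   Returns the number of names kept. */
size_t filter_attrs(const char *tag, const char **attrs, size_t count)
{
    size_t i, kept = 0;

    for (i = 0; i < count; ++i)
        if (is_attr_allowed(tag, attrs[i]))
            attrs[kept++] = attrs[i];
    return kept;
}
```

A hand-maintained table like this has to be kept in step with each DTD
version, which is the main argument for option (b).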

Thanks for comments/suggestions
Tony

--
===============
Tony Goodwin
mailto:tony.goodwin@bfs.phone.com

Received on Tuesday, 3 October 2000 09:15:09 UTC