Tidy users: please provide feedback about expected behaviours

Cross-posted to
[1]: https://lists.w3.org/Archives/Public/public-htacg/ (public-htacg@w3.org)
[2]: https://sourceforge.net/p/tidy/mailman/tidy-develop (tidy-develop@lists.sourceforge.net)
[3]: https://lists.w3.org/Archives/Public/html-tidy/ (html-tidy@w3.org)


Good day list subscribers:

This is a request for your feedback on what Tidy’s default behaviour should be under certain conditions.

To date, there has been some discussion about how Tidy’s behaviour should differ between HTML5 and previous versions of HTML: https://github.com/htacg/tidy-html5/issues/169

However, in order to bring it to a broader audience, I’m asking for your suggestions here.

The general question is this: assuming no hints from the `--doctype` option and a missing `<!DOCTYPE html>` declaration, should Tidy 5.0.0 assume that it is attempting to tidy HTML5, or a previous version of HTML? Please consider that one of Tidy’s use cases is tidying/diagnosing “snippets” of HTML, such as with the `--body-only` option, and so it’s unsafe to assume that a doctype declaration will always be present.
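
To make the question concrete, here is a rough Python sketch of the decision being discussed. It is only an illustration and not how Tidy is implemented: the function names, the `fallback` parameter, and the "html5"/"legacy" labels are all made up for this example.

```python
import re

# Hypothetical helpers for illustration only; this is not Tidy's code.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+([^>]*)>", re.IGNORECASE)


def declared_doctype(markup):
    """Return the text inside the first <!DOCTYPE ...> declaration, or None."""
    match = DOCTYPE_RE.search(markup)
    return match.group(1).strip() if match else None


def assumed_version(markup, doctype_option=None, fallback="html5"):
    """Pick the HTML version to assume for a document or snippet.

    doctype_option stands in for a hint such as the --doctype option;
    fallback is the default under discussion ("html5" or "legacy").
    """
    if doctype_option is not None:
        return doctype_option  # an explicit hint always wins
    declared = declared_doctype(markup)
    if declared is None:
        return fallback  # no hint and no declaration: the open question
    # Simplification: a bare "html" declaration is taken as HTML5, and
    # anything longer (a PUBLIC identifier, say) is treated as legacy.
    return "html5" if declared.lower() == "html" else "legacy"
```

With this sketch, a snippet such as `<a href="l1"><p>one</p></a>` carries no hint and no declaration, so the result depends entirely on the chosen `fallback`, which is exactly the default we are asking about.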

The assumption behind Tidy’s default behaviour affects validation, particularly for anchors surrounding block-level elements. This example, taken from the tracker above:

<a href="l1"><p>one</p></a>

…can result in:

<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<a href="l1"></a>
<p>one</p>
</body>
</html>

The future (and indeed the present) is HTML5, so at first this seems like a reasonable assumption to make. On the other hand, making this assumption can _seriously_ affect backwards compatibility for legacy HTML that still lacks legacy DTDs.


Rather than repeat all of the pros and cons here, I will direct you again to the [tracker](https://github.com/htacg/tidy-html5/issues/169) in the event you would like to see current discussion.

We welcome your feedback on any of these mailing lists, or on the GitHub tracker.


Thank you.

---
Jim Derry
HTACG

Received on Tuesday, 10 March 2015 09:55:08 UTC