Re: The Future of the Build System from Linss, Peter on 2015-11-02 (public-css-testsuite@w3.org from November 2015)

From: Linss, Peter <peter.linss@hp.com>
Date: Mon, 2 Nov 2015 12:25:53 +0000
To: Geoffrey Sneddon <me@gsnedders.com>
CC: Public CSS Test suite mailing list <public-css-testsuite@w3.org>
Message-ID: <1E1D2D9A-DE6D-489F-8866-5D403683FBF8@hp.com>
> On Nov 2, 2015, at 2:03 AM, Geoffrey Sneddon <me@gsnedders.com> wrote:
> 
> On Mon, Nov 2, 2015 at 7:45 AM, Linss, Peter <peter.linss@hp.com> wrote:
> 
> > On Nov 1, 2015, at 12:53 AM, Geoffrey Sneddon <me@gsnedders.com> wrote:
> >
> > Hi!
> >
> > Yeah, it's me again. (I feel like I've single-handedly increased the traffic on here sevenfold. Sorry!)
> >
> > I believe that we have rough consensus (mostly from TPAC rather than on-list discussions) that we should ensure that it's possible to run the test suite without building anything except a manifest. To that end (mostly also with rough consensus):
> >
> >  * We should ensure the build system is capable of handling HTML as source files and generating XHTML from them. (We're now at a point where every known user of the CSS test suites supports HTML, but not all support XHTML; Servo is the notable exception here. Furthermore, HTML content vastly dominates and hence should be the preferred format.)
> 
> It already does.
> 
> There's a comment in w3ctestlib saying that that is untested—hence the ensuring it's capable. (On the other hand, we seem to rely on this in some places anyway!)

Probably old cruft. The w3ctestlib code is fairly crufty. We most definitely rely on the format conversion in both directions.

> 
> >  * We should convert the source files to HTML. (We should probably move away from HTML 4.01/XHTML 1.1 to HTML 5 while we're at it.)
> 
> My concern here is introducing unnecessary churn. We have a lot of result data in the test harness and a change to the test source causes the old results to no longer apply.
> 
> There’s a path forward here, as the test harness has the capability to know that different revisions are equivalent (for metadata changes, etc.). We just need to make sure that the conversion process gives this data to the test harness so we don’t lose thousands of tests’ result data.
> 
> Hmm, yeah, that sounds like something we need to work out.
> 
> (It should also go without saying that the build tools should be used to do the conversion so that the new source files are identical to the build output.)
> 
> I can help here to make sure we do this right.
> 
> Does the build system not do some path rewriting within tests, or am I misremembering how much it actually does? I'm also unsure if we should use the build output anyway, given that'll significantly increase the size of the diffs (due to attribute reordering and whitespace changes).

It does do path rewriting for the references as part of a normal build. But that’s easy to disable for a conversion script.

IIRC the w3ctestlib currently preserves attribute ordering and doesn’t affect whitespace. (At points in the past it didn’t preserve attribute ordering, so again there may be out-of-date comments.) If you diff built tests against their original source files, the only differences you should see would be the reference paths and maybe some DTD changes.
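
As a rough illustration of that kind of check (this is not w3ctestlib code; the function name and the sample markup are hypothetical), a few lines of Python with the standard difflib module can list exactly which lines differ between a source file and its built counterpart:

```python
import difflib
from itertools import islice


def changed_lines(source_text, built_text):
    """Return only the removed/added lines between source and built text."""
    diff = difflib.unified_diff(
        source_text.splitlines(),
        built_text.splitlines(),
        fromfile="source",
        tofile="built",
        lineterm="",
        n=0,  # no context lines, only the changes themselves
    )
    # The first two lines are the "---"/"+++" file headers; skip them,
    # then drop the "@@" hunk markers by keeping only +/- lines.
    body = islice(diff, 2, None)
    return [line for line in body if line.startswith(("+", "-"))]


# Hypothetical sample: only the reference path differs.
source = '<link rel="match" href="reference/ref.html">\n<p class="a" id="b">Test</p>\n'
built = '<link rel="match" href="ref.html">\n<p class="a" id="b">Test</p>\n'

for line in changed_lines(source, built):
    print(line)
```

If attribute ordering and whitespace really are preserved, running something like this over the whole suite should report only reference-path and DTD lines; anything else would point at a place where the build output and the source diverge.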

> 
> >  * We should introduce a lint tool given building now becomes optional, and we need to introduce a manifest builder for the unbuilt test suite. We should make sure the lint tool is a superset of the warnings from the build system.
> 
> Sure. There’s already some code for this in Shepherd and w3ctestlib, we can do some factoring to make a stand-alone linter.
> 
> We should probably make sure that everything agrees on what's an error and what's a warning. And mark things as expected failures (so that we can simply check that any new commit doesn't add any new failures).
> 
> The build tools also already generate manifests, so making a manifest-only build should be a trivial matter of turning off parts of the build process. Let me know if you need a different manifest format.
> 
> I *think* the only thing we need different is the whole path for the test/ref fields, because obviously an extensionless id isn't very useful there.

Ok, should be trivial to write.
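
For illustration only, here is a minimal Python sketch of what "whole path" entries could look like. The field names, directory names, and function name are all hypothetical and do not reflect the actual manifest format the build tools produce:

```python
import posixpath


def manifest_entry(test_path, ref_hrefs):
    """Build one manifest row keyed by full paths rather than extensionless ids.

    test_path is relative to the repository root; ref_hrefs are the
    reference links exactly as written inside the test file.
    """
    test_dir = posixpath.dirname(test_path)
    # Resolve each reference link against the directory of the test itself.
    refs = [
        posixpath.normpath(posixpath.join(test_dir, href))
        for href in ref_hrefs
    ]
    return {"test": test_path, "references": refs}


# Hypothetical paths, purely to show the shape of the output.
entry = manifest_entry(
    "css-foo/a/test-001.html",
    ["reference/test-001-ref.html", "../common-ref.html"],
)
print(entry)
```

The point is just that the test and ref fields carry the file extension and directory, so an unbuilt checkout can be run directly from the manifest.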

> 
> >  * We should make the build system capable of grabbing authorship from hg/git if it isn't explicitly defined.
> 
> Shepherd already has the concept of a file ‘owner’ in addition to the authors (the owner is the first person to commit the file). I can add an API to expose that data. Note that (especially for older files) the owner often had nothing to do with creating or editing the test. Furthermore, there are often edits (by myself, for example) that do nothing but correct spec links, reference links, or other minor metadata issues (or rename files); these kinds of edits should not be considered ‘authorship’.
> 
> Also, Shepherd already has all the revision information for every file in the repo, so again, I can add an API to expose this data easily.
> 
> Hmm, this goes back to the whole metadata discussion… I don't really know!
> 
> /g
Received on Monday, 2 November 2015 12:26:28 UTC