Validator from Tal Leming on 2010-11-04 (public-webfonts-wg@w3.org from November 2010)

From: Tal Leming <tal@typesupply.com>
Date: Thu, 4 Nov 2010 10:39:37 -0400
To: WOFF Working Group FONT <public-webfonts-wg@w3.org>
Message-Id: <0E0FD003-5C59-4544-AD30-FBF74557E627@typesupply.com>

Hello Everyone,

Since the validator was mentioned in the F2F minutes, I thought I'd chime in about it. It is a bit of a work in progress, but you can get it from my SVN here:

http://svn.typesupply.com/packages/woffTools/trunk/Lib/woffTools/tools/validate.py

You can view it here:

http://code.typesupply.com/browser/packages/woffTools/trunk/Lib/woffTools/tools/validate.py

It operates as a stand-alone Python script. (It can, and should, move out of my woffTools package. It isn't really related to that package anymore.) It shouldn't have any dependencies outside of the Python standard library. To run it, do the following in the command line environment of your choice:

python validate.py yourfile.woff

An HTML file will appear next to the WOFF file.

I have been going over the spec to make sure that everything that can be tested is covered by the validator. I still have a few testable assertions to add and I have questions about some others. Specifically, these are not yet implemented:

http://dev.w3.org/webfonts/WOFF/spec/#conform-private-last
http://dev.w3.org/webfonts/WOFF/spec/#conform-diroverlap-reject
http://dev.w3.org/webfonts/WOFF/spec/#conform-overlap-reject

I'm going to add support for these soon. I'm not an XML expert, so I need some advice on this one:

http://dev.w3.org/webfonts/WOFF/spec/#conform-metadata-encoding

Would it be safe to do the following to determine if the encoding is UTF-8 or UTF-16?

if the metadata string starts with "<?xml ":
    consider everything before the first instance of ">" to be the first line:
        if the first line contains "encoding="UTF-8"" or "encoding="UTF-16"":
            encoding is valid
        if not, if the first line contains "encoding="[A-Za-z0-9._-]+"":
            encoding is invalid
        if not:
            the assumption is that the encoding is UTF-8 and therefore valid
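
For concreteness, the check above could be sketched in Python roughly as follows. This is only an illustration, not the code in validate.py: the function name is hypothetical, it assumes the metadata has already been decompressed and decoded to a text string, and it treats a missing XML declaration the same as a missing encoding attribute (UTF-8 is assumed).

```python
import re

# Hypothetical helper for illustration; not taken from validate.py.
def metadata_encoding_is_valid(metadata):
    """Apply the heuristic described above to already-decoded metadata text."""
    if not metadata.startswith("<?xml "):
        # No XML declaration at all: assume UTF-8, which is acceptable.
        return True
    # Treat everything before the first ">" as the "first line".
    first_line = metadata.split(">", 1)[0]
    if 'encoding="UTF-8"' in first_line or 'encoding="UTF-16"' in first_line:
        return True
    if re.search(r'encoding="[A-Za-z0-9._-]+"', first_line):
        # Some other encoding was declared explicitly.
        return False
    # Declaration present but no encoding attribute: assume UTF-8.
    return True
```

As written, the match is case-sensitive and only recognizes double-quoted values, so a declaration such as encoding='utf-8' would fall through to the UTF-8 assumption.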

I have some other things to do in addition to these. You can read them at the top of the code if you are interested.

I've written an extensive set of unit tests for the validator. You can see them, and be bored to tears by them, here:

http://code.typesupply.com/browser/packages/woffTools/trunk/Lib/woffTools/test/test_validate.py

I'll write more about these later.

Any comments, advice, bug reports, patches, etc. would be appreciated.

Tal

Received on Thursday, 4 November 2010 14:40:09 UTC