I think this one is unwinnable.  I'm sure that Jonathon Chetwynd would 
object to this one if he were still on this list, as it would lock out his 
clients, young adults with learning difficulties.

More generally, this is security by obscurity.  It only works because 
not enough people are using your question pool to make it worth anyone's 
while to catalogue it.  If you use patterns instead, like:

What is the sum of %d and %d

and it becomes common, they will parse the question and do the arithmetic.
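
To show how little effort that takes, here is a rough sketch in Python.  
It assumes the question text reaches the bot verbatim in the form above; 
the names PATTERN and answer are mine, purely for illustration, and are 
not taken from any real tool.

```python
import re

# Hypothetical template matching "What is the sum of %d and %d".
# Assumes the numbers appear as plain decimal digits.
PATTERN = re.compile(r"What is the sum of (-?\d+) and (-?\d+)")

def answer(question):
    """Return the sum as a string, or None if the question does not match."""
    match = PATTERN.search(question)
    if match is None:
        return None
    return str(int(match.group(1)) + int(match.group(2)))

print(answer("What is the sum of 17 and 25?"))
```

Each new template costs the attacker about one more regular expression, 
which is far less work than writing the questions in the first place.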

The standard tests work because they require particularly human skills 
which are not well emulated by software.

In practice, your system will work, but only because a very small 
number of people are using it and the value of compromising the sites 
that use it is not high enough.
