Testing lists

We show here how to use the test generator with a very simple specification of lists. The module LIST contains the constructors nil and __ for lists, and the definition of the function reverse:

(fmod LIST is
  pr NAT .
  
  sort List .
  subsort Nat < List .
  
  op nil : -> List [ctor] .
  op __ : List List -> List [ctor assoc id: nil] .
  
  var N : Nat .
  var L : List .
  
  op reverse : List -> List .
  eq [rev1] : reverse(nil) = 0 .
  eq [rev2] : reverse(N L) = reverse(L) N .
endfm)

Although we could generate test cases for the reverse function shown above, since we want to test whether it fulfills some properties, we can define them in another module. For example, the function test in the module REV_LIST below states a property on the concatenation of lists. Note that the equation for test has not been labeled, since we expect the property to be correctly specified:

(fmod REV_LIST is
  pr LIST .
  
  vars L1 L2 : List .
  var  N : Nat .

  op test : List List -> Bool .
  eq test(L1, L2) = reverse(L2) reverse(L1) == reverse(L1 L2) .
endfm)

Now, we can use the test case generator to check this property. We start by using the default global branch coverage strategy with the command:

Maude> (test in REV_LIST : test .)

1 test cases were generated.

1 test cases have to be checked by the user:

	1. The term test(nil,0) has been reduced to false

With only one term the tool has covered all the labeled statements (in this case, they are the equations rev1 and rev2). We notice that this term leads to an incorrect reduction, since the property fails. We can invoke the debugger with the command:

Maude> (invoke debugger with user test case 1 .)

Declarative debugging of wrong answers started.

Is this reduction (associated with the equation rev2) correct?

reverse(0) -> 0 0

The debugger allows the user to find the bug by answering questions about the computation, see here for details and here for this session.

We can also compute test cases for this specification with the function coverage (also known as call coverage) strategy. First, we change the strategy with:

Maude> (function coverage .)

Function Coverage selected

Once the strategy has been selected, the test cases are generated with the command:

Maude> (test in REV_LIST : test .)

0 test cases were generated.

Note that no test cases have been generated! This is due to the fact that it searches for test cases that execute all the recursive calls of test, but in our case there is no such calls. To deal with these cases, the tool provides a more powerful command that allows the user to specify the function whose calls he wants to tests; note that this option does not make sense with global branch coverage, since it simply tries to execute all possible statements. In our case, we want to test the calls to the fuction reverse, so we have to use the command:

Maude> (test in REV_LIST : test wrt reverse .)

2 test cases were generated.

2 test cases have to be checked by the user:

	1. The term test(0,0) has been reduced to false

	2. The term test(nil,nil) has been reduced to false

In this case, two terms are necessary to cover all the possible calls. Again, we could invoke the debugger to debug any of these wrong reductions. However, do we need to check whether the reductions are correct? Although it is not usual for Maude users to have a specification of their specifications in order to check the correctness of test cases against it, it is really easy to create one when we are dealing with boolean properties: it always has to return true. The module CORRECT_LIST below illustrates how to do that:

(fmod CORRECT_LIST is
  pr LIST .
  
  vars L1 L2 : List .
  
  op test : List List -> Bool .
  eq test(L1, L2) = true .
endfm)

We indicate the test generator to consider this module as correct with:

Maude> (correct test module CORRECT_LIST .)

CORRECT_LIST selected as correct module for testing.

Now, if the user tries to compute test cases for a specification the tool will not compute a coverage. Instead, it will check all the terms against the correct module. So we obtain now the following result:

Maude> (test in REV_LIST : test .)

144 test cases were generated.

144 test cases are incorrect with respect to the correct module.

These erroneous test cases can be displayed with the command

and debugged with

Maude> (invoke debugger with incorrect test case 1 .)

Declarative debugging of wrong answers started.

Is this reduction (associated with the equation rev2) correct?

reverse(0 0) -> 0 0 0

Moreover, since the tool has not to compute a coverage, the testing process is considerably faster, and thus we can test much terms (although in this case, with 144 erroneous terms, we are pretty sure the specification is wrong). We increment the number of test cases with:

Maude> (set test bound 4 .)

Bound for tests fixed.

Maude> (test in REV_LIST : test .)

8464 test cases were generated.

8464 test cases are incorrect with respect to the correct module.