paint-brush
Verification of a Rust Implementation of Knuth’s Dancing Links Using ACL2: Dancing Links in Rustby@gitflow

Verification of a Rust Implementation of Knuth’s Dancing Links Using ACL2: Dancing Links in Rust

tldt arrow

Too Long; Didn't Read

In this paper, researchers describe an implementation of the Dancing Links optimization in the Rust programming language.
featured image - Verification of a Rust Implementation of Knuth’s Dancing Links Using ACL2: Dancing Links in Rust
What is GitFlow? The Collaborative Git Alternative HackerNoon profile picture

Author:

(1) David S. Hardin, Cedar Rapids, IA USA [email protected].

In this section, we describe an array-based circular doubly-linked list (CDLL) employing Knuth’s “Dancing Links” optimization, realized using our RAR Rust subset. The CDLL data structure implementation constitutes over 700 lines of Rust code, which becomes 890 lines of code when translated to ACL2.

6.1 Definitions

First, we present the basic RAR declaration for the CDLL.



Rust data structure declarations are similar to those in C, but struct elements are declared by specifying the element name, followed by the : separator, then the element type. Also note that Rust pragmas may be given using the derive attribute. In the declaration above, the array nodeArr holds the list element nodes. Each element has next and prev indices. Note that indices in Rust are normally declared to be of the usize type. Note also that by using array indices instead of references, we get around


Figure 4: cdll_remove() function in RAR.


Rust ownership model issues with circular data structures. The alloc field of the CDLLNode structure is declared to be a two bit unsigned field, but its only allowed values are two non-zero values: 2 (not currently allocated), and 3 (allocated). The reason for this has to do with the details of ACL2 untyped record reasoning, which will be discussed in Section 6.2.


The Dancing Links operators cdll_remove and cdll_restore are presented in Figures 4 and 5, respectively. Rust functions begin with the fn keyword, followed by the function name, a parenthesized list of parameters, the -> (returns) symbol, the return type name, followed by the function body (delimited by a curly brace pair). A function parameter list element consists of the parameter name, the : symbol, then the parameter type. Additional parameter modifiers, for example mut, may be present to indicate that the parameter is changed in the function body. Within the function body, the syntax is similar to other C-like languages, but local variable declarations begin with let, and use the variable name, :, variable type declaration syntax. A local variable declaration may also require the mut modifier if that local variable is updated after its initialization.

6.2 Translation to ACL2

We use Plexi to transpile the RAR source to RAC (not shown), then use the RAC translator to convert the resulting RAC source to ACL2. The translation of cdll_restore() appears in Fig. 6.


Figure 5: cdll_restore () function in RAR.


Figure 6: cdll_restore () function translated to ACL2 using the RAC tools.


The first thing to note about Fig. 6 is that, even though we are two translation steps away from the original RAR source, the translated function is nonetheless quite readable, which is a rare thing for machine-generated code. Another notable observation is that struct and array ‘get’ and ‘set’ operations become untyped record operators, AG and AS, respectively — these are slight RAC-specific customizations of the usual ACL2 untyped record operators. Further, IF1 is a RAC-specific macro, and LOG>, LOG=, LOG<, and LOGIOR1 are all RTL functions. Thus, much of the proof effort involved with RAR code is reasoning about untyped records and RTL — although not a lot of RTL-specific knowledge is needed, at least in our experience.


One aspect of untyped records that can be tricky is that record elements that take on the default value are not explicitly stored in the association list for the record. For RAC untyped records, that default value is zero. Thus, it is easy for a given record to attain a nil value. When reasoning about arrays of such records, it is often desirable to be able to state that the array size remains constant. Thus, for example, for the CDLL array nodeArr of Section 6.1, we ensure that all CDLLNode elements of that array are non-nil by making sure that the alloc fields of the CDLLNode elements are always non-zero (2 or 3).

Once we have translated the circular doubly-linked list functions into ACL2, we can begin to prove theorems about the data structure implementation. We begin by defining a “well-formedness” predicate for CDLLs.



Given this definition of a good CDLL state, we can prove functional correctness theorems for Dancing Links operations, of the sort stated below. Note that this proof requires some detailed well-formedness hypotheses related to the prev and next indices for the nth element:



ACL2 performs the correctness proof for this cdll_restore of cdll_remove theorem automatically. In addition to the Dancing Links operator proofs, we have proved approximately 160 theorems related to the CDLL data structure, including theorems about cdll_cns() (cons equivalent), cdll_- rst() (cdr equivalent), cdll_snc() (add to end of data structure), cdll_tsr() (delete from end of data structure), cdll_nth(), etc. All of these proofs will be made publicly available in the ACL2 workshop books repository.


This paper is available on arxiv under CC 4.0 license.