Authors:
(1) Zhan Ling, UC San Diego (equal contribution);
(2) Yunhao Fang, UC San Diego (equal contribution);
(3) Xuanlin Li, UC San Diego;
(4) Zhiao Huang, UC San Diego;
(5) Mingu Lee, Qualcomm AI Research;
(6) Roland Memisevic, Qualcomm AI Research;
(7) Hao Su, UC San Diego.
Motivation and Problem Formulation
Deductively Verifiable Chain-of-Thought Reasoning
Conclusion, Acknowledgements and References
A Deductive Verification with Vicuna Models
C More Details on Answer Extraction
E More Deductive Verification Examples
For the results in Tab. 2 of the main paper, we use “Do you think the above reasoning process is correct? Let’s think step by step.” as the zero-shot prompt to verify an entire reasoning chain at once. We also design a two-shot prompt for reasoning chain verification, shown in Tab. 12, which covers one correct reasoning chain and one incorrect reasoning chain.
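As an illustrative sketch only, the zero-shot whole-chain verification above can be wrapped in a small helper. The model-call function `query_llm` and the response-parsing heuristic are assumptions for illustration, not part of the paper's released pipeline:

```python
# Hypothetical sketch of zero-shot whole-chain verification.
# `query_llm` stands in for any chat-completion call; the answer-parsing
# heuristic below is an assumption, not the paper's actual parser.
ZERO_SHOT_VERIFY = (
    "Do you think the above reasoning process is correct? "
    "Let's think step by step."
)

def verify_chain(question: str, reasoning_chain: str, query_llm) -> bool:
    """Ask the model to judge an entire reasoning chain at once."""
    prompt = f"{question}\n\n{reasoning_chain}\n\n{ZERO_SHOT_VERIFY}"
    response = query_llm(prompt)
    # Simple heuristic: accept the chain unless the model flags an error.
    return "incorrect" not in response.lower()
```

Because the whole chain is judged in a single call, an error buried in a long chain can be missed; this is the limitation that motivates the step-by-step verification described next.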
To instruct models to generate reasoning chains in the Natural Program format that facilitates step-by-step deductive verification, we have designed four distinct prompts to address different types of problems. These include:
Math word problems, as illustrated in Tab. 13, covering the GSM8K, MATH, and AddSub datasets.
Math word problems with multiple-choice options, as illustrated in Tab. 14, covering the AQuA dataset.
Date-related problems, as illustrated in Tab. 15, covering the Date dataset.
Last Letters problems, as illustrated in Tab. 16, covering the Last Letters dataset.
We have designed a general one-shot prompt for the deductive verification of a single reasoning step on different datasets, as shown in Tab. 17. This prompt serves to instruct language models to generate the deductive validity of each reasoning step as illustrated in Sec. 4.2 and the top-right box of Fig. 1 of the main paper.
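The per-step verification described above can be sketched as follows. This is a minimal illustration under stated assumptions: `query_llm`, the prompt wording, and the Yes/No parsing are all hypothetical stand-ins, and the paper's actual one-shot prompt is the one given in Tab. 17:

```python
# Hypothetical sketch of step-by-step deductive verification:
# each step is checked against only its listed premises, and the
# chain is accepted only if every step passes.
def verify_step(premises: list[str], step: str, query_llm) -> bool:
    """Verify one reasoning step against its grounding premises."""
    context = "\n".join(premises)
    prompt = (
        f"{context}\n\nStep: {step}\n\n"
        "Is this reasoning step deductively valid? Answer Yes or No."
    )
    # Assumed parsing convention: a response starting with "yes" passes.
    return query_llm(prompt).strip().lower().startswith("yes")

def verify_chain_stepwise(steps, premises_per_step, query_llm) -> bool:
    """Accept a reasoning chain only if all of its steps are valid."""
    return all(
        verify_step(premises, step, query_llm)
        for step, premises in zip(steps, premises_per_step)
    )
```

Checking each step in isolation, with only the premises it cites, is what makes the verification deductive rather than a single holistic judgment of the full chain.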
This paper is available on arxiv under CC BY 4.0 DEED license.