Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.
Authors:
(1) Gladys Tyen, University of Cambridge, Dept. of Computer Science & Technology, ALTA Institute; work done during an internship at Google Research (e-mail: gladys.tyen@cl.cam.ac.uk);
(2) Hassan Mansoor, Google Research (e-mail: hassan@google.com);
(3) Victor Carbune, Google Research (e-mail: vcarbune@google.com);
(4) Peter Chen, Google Research, equal leadership contribution (e-mail: chenfeif@google.com);
(5) Tony Mak, Google Research, equal leadership contribution (e-mail: tonymak@google.com).
Conclusion, Limitations, and References
Table 8: Mistake finding accuracy across 5 tasks for correct_mis and incorrect_mis traces. The combined scores of Table 8a and Table 8b make up Table 4.
Figure 3: Screenshot of the user interface for a question from the tracking shuffled objects task.
This paper is available on arXiv under a CC 4.0 license.