
Qualitative Analysis


Authors:

(1) Hyungjoo Chae, Yonsei University;

(2) Yeonghyeon Kim, Yonsei University;

(3) Seungone Kim, KAIST AI;

(4) Kai Tzu-iunn Ong, Yonsei University;

(5) Beong-woo Kwak, Yonsei University;

(6) Moohyeon Kim, Yonsei University;

(7) Seonghwan Kim, Yonsei University;

(8) Taeyoon Kwon, Yonsei University;

(9) Jiwan Chung, Yonsei University;

(10) Youngjae Yu, Yonsei University;

(11) Jinyoung Yeo, Yonsei University.

Table of Links

Abstract and 1. Introduction

2 Think-and-Execute

3 Experimental Setup

4 Results

5 Analysis

6 Related Work

7 Limitations and Discussion

8 Conclusion and References


A Experimental Details

B Details of Think-and-Execute

C Prompts Used in Our Experiments

D Human-written Pseudocode Prompts

E Generated Analyses

F Generated Pseudocode Prompts

G Qualitative Analysis

G Qualitative Analysis

We conduct a qualitative analysis by comparing the outputs of our approach (THINK-AND-EXECUTE) with those of the baseline methods. The comparison is presented in Tables 7, 8, 9, 10, 11, 12, and 13.


Table 7: A comparison of results for Dyck Languages between the baseline methods and THINK-AND-EXECUTE.


Table 8: A comparison of results for Geometric Shapes between the baseline methods and THINK-AND-EXECUTE.


Table 9: A comparison of results for Navigate between the baseline methods and THINK-AND-EXECUTE.


Table 10: A comparison of results for Reasoning about Colored Objects between the baseline methods and THINK-AND-EXECUTE.


Table 11: A comparison of results for Temporal Sequences between the baseline methods and THINK-AND-EXECUTE.


Table 12: A comparison of results for Tracking Shuffled Objects between the baseline methods and THINK-AND-EXECUTE.


Table 13: A comparison of results for Web of Lies between the baseline methods and THINK-AND-EXECUTE.


This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.

