paint-brush
Orca 2: Enhancing Reasoning in Smaller Language Models - Prompts used in Evaluationby@textmodels
113 reads

Orca 2: Enhancing Reasoning in Smaller Language Models - Prompts used in Evaluation

by Writings, Papers and Blogs on Text Models
Writings, Papers and Blogs on Text Models HackerNoon profile picture

Writings, Papers and Blogs on Text Models

@textmodels

We publish the best academic papers on rule-based techniques, LLMs,...

May 29th, 2024
Read on Terminal Reader
Read this story in a terminal
Print this story
Read this story w/o Javascript
Read this story w/o Javascript
tldt arrow

Too Long; Didn't Read

This paper is available on arxiv(https://arxiv.org/pdf/2311.11045.pdf) under CC 4.0 license. Authors: Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadall.
featured image - Orca 2: Enhancing Reasoning in Smaller Language Models - Prompts used in Evaluation
1x
Read by Dr. One voice-avatar

Listen to this story

Writings, Papers and Blogs on Text Models HackerNoon profile picture
Writings, Papers and Blogs on Text Models

Writings, Papers and Blogs on Text Models

@textmodels

We publish the best academic papers on rule-based techniques, LLMs, & the generation of text that resembles human text.

About @textmodels
LEARN MORE ABOUT @TEXTMODELS'S
EXPERTISE AND PLACE ON THE INTERNET.
0-item

STORY’S CREDIBILITY

Academic Research Paper

Academic Research Paper

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Authors:

(1) Arindam Mitra;

(2) Luciano Del Corro, work done while at Microsoft;

(3) Shweti Mahajan, work done while at Microsoft;

(4) Andres Codas, denote equal contributions;

(5) Clarisse Simoes, denote equal contributions;

(6) Sahaj Agarwal;

(7) Xuxi Chen, work done while at Microsoft;;

(8) Anastasia Razdaibiedina, work done while at Microsoft;

(9) Erik Jones, work done while at Microsoft;

(10) Kriti Aggarwal, work done while at Microsoft;

(11) Hamid Palangi;

(12) Guoqing Zheng;

(13) Corby Rosset;

(14) Hamed Khanpour;

(15) Ahmed Awadall.

Abstract and Introduction

Preliminaries

Teaching Orca 2 to be a Cautious Reasoner

Technical Details

Experimental Setup

Evaluation Results

Limitations

Conclusions and References

A. AGIEval Subtask Metrics

B. BigBench-Hard Subtask Metrics

C. Evaluation of Grounding in Abstractive Summarization

D. Evaluation of Safety

E. Prompts used in Evaluation

F. Illustrative Example from Evaluation Benchmarks and Corresponding Model Outpu

E Prompts used in Evaluation

We provide a list of prompts used for evaluation below:


Table 15: Table describes the prompts used for evaluating all models with empty. The prompts are simple and only aim at giving the models hints about answer format to improve the parsing of model responses. For tasks, where the question were formatted as a prompt, the input is used as is. Examples from all datasets are shown in Appendix F

Table 15: Table describes the prompts used for evaluating all models with empty. The prompts are simple and only aim at giving the models hints about answer format to improve the parsing of model responses. For tasks, where the question were formatted as a prompt, the input is used as is. Examples from all datasets are shown in Appendix F


This paper is available on arxiv under CC 4.0 license.


L O A D I N G
. . . comments & more!

About Author

Writings, Papers and Blogs on Text Models HackerNoon profile picture
Writings, Papers and Blogs on Text Models@textmodels
We publish the best academic papers on rule-based techniques, LLMs, & the generation of text that resembles human text.

TOPICS

THIS ARTICLE WAS FEATURED IN...

Permanent on Arweave
Read on Terminal Reader
Read this story in a terminal
 Terminal
Read this story w/o Javascript
Read this story w/o Javascript
 Lite
X REMOVE AD