Authors:
(1) Lewis Tunstall, Equal contribution and The H4 (Helpful, Honest, Harmless, Huggy) Team (email: lewis@huggingface.co);
(2) Edward Beeching, Equal contribution and The H4 (Helpful, Honest, Harmless, Huggy) Team;
(3) Nathan Lambert, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(4) Nazneen Rajani, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(5) Kashif Rasul, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(6) Younes Belkada, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(7) Shengyi Huang, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(8) Leandro von Werra, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(9) Clementine Fourrier, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(10) Nathan Habib, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(11) Nathan Sarrazin, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(12) Omar Sanseviero, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(13) Alexander M. Rush, The H4 (Helpful, Honest, Harmless, Huggy) Team;
(14) Thomas Wolf, The H4 (Helpful, Honest, Harmless, Huggy) Team.

Table of Links

Abstract and Introduction
Related Work
Method
Experimental Details
Results and Ablations
Conclusions and Limitations, Acknowledgements and References
Appendix

A APPENDIX

A.1 QUALITATIVE EXAMPLES

To qualitatively compare the responses from our dSFT and dDPO models, we choose prompts from a few domains of MT-Bench, as well as some adversarial prompts to test each model’s capability to follow instructions with false premises or harmful intent. Completions for the adversarial prompts were generated with nucleus sampling (top-p = 0.95) and temperature T = 0.7.

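The decoding settings in A.1 (nucleus sampling with top-p = 0.95 and T = 0.7) can be illustrated with a minimal sketch. This is not the authors' generation code: the function names `nucleus_filter` and `sample_token` and the toy logits are hypothetical, chosen only to show how temperature scaling and top-p truncation combine when picking one token.

```python
import math
import random


def nucleus_filter(logits, top_p=0.95, temperature=0.7):
    """Return {token_index: probability} restricted to the top-p nucleus.

    Logits are divided by the temperature, converted to probabilities with a
    softmax, and the smallest set of most-likely tokens whose cumulative
    probability reaches top_p is kept and renormalized.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Visit tokens from most to least likely until the mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    return {i: probs[i] / mass for i in kept}


def sample_token(logits, top_p=0.95, temperature=0.7, rng=random):
    """Draw one token index from the renormalized nucleus."""
    nucleus = nucleus_filter(logits, top_p, temperature)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Toy logits over a five-token vocabulary (made-up numbers, not model output).
toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]
print(nucleus_filter(toy_logits))
print(sample_token(toy_logits, rng=random.Random(0)))
```

In this sketch a temperature below 1 sharpens the distribution before truncation, so fewer tokens are needed to reach the top-p mass than at T = 1; the low-probability tail is then never sampled.
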
A.2 SFT IS A REQUIRED STEP BEFORE DPO

In Table 3 we ran an ablation to see whether SFT is necessary prior to the DPO step. We observed a significant reduction in performance in both the MT-Bench and AlpacaEval scores when the SFT step is skipped. After a qualitative evaluation of the MT-Bench generations, we observe that the pure DPO model struggles to learn the chat template:

This paper is available on arXiv under CC 4.0 license.

Zephyr: Direct Distillation of LM Alignment: Appendix
