Author:
(1) Mingda Chen.
3.1 Improving Language Representation Learning via Sentence Ordering Prediction
3.2 Improving In-Context Few-Shot Learning via Self-Supervised Training
4.2 Learning Discourse-Aware Sentence Representations from Document Structures
5 DISENTANGLING LATENT REPRESENTATIONS FOR INTERPRETABILITY AND CONTROLLABILITY
5.1 Disentangling Semantics and Syntax in Sentence Representations
5.2 Controllable Paraphrase Generation with a Syntactic Exemplar
Learning from Improved Self-Supervision (Chapter 3). Deriving training signals from plain text (also known as self-supervision) is the driving force behind recent breakthroughs in NLP. Approaches like BERT (Devlin et al., 2019) and GPT-3 (Brown et al., 2020) effectively transfer knowledge from plain text to various downstream NLP tasks. Recent research has identified potential flaws in BERT's learning objectives (Yang et al., 2019; Liu et al., 2019) and has improved GPT-3's downstream task performance using human-annotated resources (Mishra et al., 2021; Wei et al., 2022). In this thesis, we present techniques that improve self-supervised training objectives without requiring extra human annotation.
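As a concrete illustration, the sketch below shows how a sentence-ordering signal can be extracted from raw text with no human labels: consecutive sentence pairs are kept in order or swapped, and the original order itself supplies the label. The function name and the 50/50 swap rate are illustrative assumptions, not the exact recipe used in the thesis.

```python
import random

def make_sop_examples(sentences, rng=random.Random(0)):
    """Build sentence-ordering-prediction examples from one document.

    Each example pairs two consecutive sentences; the label is 1 when
    they appear in their original order and 0 when they are swapped.
    The document itself supplies the labels, so no annotation is needed.
    """
    examples = []
    for a, b in zip(sentences, sentences[1:]):
        if rng.random() < 0.5:
            examples.append(((a, b), 1))  # kept in original order
        else:
            examples.append(((b, a), 0))  # swapped
    return examples

doc = [
    "The model is pretrained on plain text.",
    "It is then fine-tuned on downstream tasks.",
    "Finally, it is evaluated on held-out data.",
]
for pair, label in make_sop_examples(doc):
    print(label, pair)
```

A classifier trained on such pairs must model inter-sentence coherence rather than surface word overlap, which is what makes ordering prediction a useful pretraining signal.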
Learning from Rich Data Structures: Wikipedia Articles (Chapter 4). Pretrained language models rely primarily on learning objectives that predict words from nearby context (Mikolov et al., 2013b; Peters et al., 2018; Devlin et al., 2019). In this thesis, we present approaches that leverage the rich article structure in Wikipedia to learn vector representations of various kinds of text.
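To make this concrete, here is a minimal sketch of one way article structure can supply free supervision: every sentence is paired with the title of the section that contains it. The article dictionary schema and the sentence-to-title pairing rule are assumptions for illustration; section titles are only one of several structural cues Wikipedia provides.

```python
def structure_pairs(article):
    """Derive (sentence, section title) supervision from article structure.

    The pairs come for free from the document layout and can serve as a
    weak discourse-level signal for representation learning.
    """
    for section in article["sections"]:
        for sentence in section["sentences"]:
            yield sentence, section["title"]

# A toy article in the assumed schema.
article = {
    "title": "Transformer (machine learning)",
    "sections": [
        {"title": "Architecture",
         "sentences": ["The encoder maps tokens to vectors.",
                       "Self-attention mixes information across positions."]},
        {"title": "Training",
         "sentences": ["Models are pretrained on large text corpora."]},
    ],
}
for sentence, title in structure_pairs(article):
    print(f"{title!r} <- {sentence!r}")
```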
Learning from Rich Data Structures: Paired Data (Chapter 5). Much of the recent NLP work on disentangled representations and controllable generation has focused on attributes such as sentiment (Hu et al., 2017; Shen et al., 2017a) or formality (Ficler and Goldberg, 2017). In this thesis, we show that leveraging paired data enables us to disentangle semantics from syntax in sentence representations and to control the syntax of output sentences using a sentential exemplar.
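The sketch below illustrates, under strong simplifications, how a paraphrase pair can drive such disentanglement: each sentence is reconstructed from the other sentence's semantic code and its own syntactic code, so meaning is forced through the semantic encoder and form through the syntactic one. The toy linear encoders, fixed-size sentence vectors, and MSE reconstruction are placeholders, not the models used in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes; real models operate over token sequences.
dim, sem_dim, syn_dim = 16, 8, 8
sem_enc = nn.Linear(dim, sem_dim)            # intended to capture meaning
syn_enc = nn.Linear(dim, syn_dim)            # intended to capture form
decode = nn.Linear(sem_dim + syn_dim, dim)   # reconstructs a sentence vector

def swap_loss(x1, x2):
    """x1, x2: a paraphrase pair (same meaning, different syntax).

    Reconstruct each sentence from the OTHER sentence's semantic code
    and its OWN syntactic code; this pairing makes meaning flow through
    sem_enc and sentence-specific form flow through syn_enc.
    """
    s1, s2 = sem_enc(x1), sem_enc(x2)
    z1, z2 = syn_enc(x1), syn_enc(x2)
    r1 = decode(torch.cat([s2, z1], dim=-1))
    r2 = decode(torch.cat([s1, z2], dim=-1))
    return F.mse_loss(r1, x1) + F.mse_loss(r2, x2)

x1, x2 = torch.randn(4, dim), torch.randn(4, dim)  # stand-in sentence vectors
print(swap_loss(x1, x2).item())
```

The same pairing idea underlies exemplar-based control: at generation time, the syntactic code can be taken from a separate exemplar sentence to dictate the form of the output.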
Building Evaluation Tasks from Textual Resources (Chapter 6). We construct various text generation datasets from fan-contributed websites. The rich information on these websites gives the new datasets different focuses (e.g., long-form generation rather than single-sentence generation) and domains (e.g., television series rather than news) compared to prior work in the same task settings. We show that these unique characteristics lead to challenging research questions.
This paper is available on arXiv under a CC 4.0 license.