Leveraging Natural Supervision for Language Representation Learning: Background Summary

Written by textmodels | Published 2024/06/01
Tech Story Tags: llm-natural-supervision | llm-self-supervision | llm-language-pretraining | llm-word-prediction | ai-language-modeling | ai-vector-representations | ai-neural-models | ai-sentence-representations

TLDR: In this study, researchers describe three lines of work that seek to improve the training and evaluation of neural models using naturally-occurring supervision.

Author:

(1) Mingda Chen.

2.4 Summary

In this chapter, we describe the background material needed for the remainder of this thesis. In Chapter 3, we present our contributions to improving self-supervised training objectives for language model pretraining. The new training objectives enhance the quality of general language representations and improve model performance on few-shot learning. Chapter 4 presents our contributions that exploit naturally-occurring data structures on Wikipedia for entity representations, sentence representations, and textual entailment. Chapter 5 presents our contributions on leveraging freely available parallel corpora to disentangle semantic and syntactic representations; we then apply the technique to controlling the syntax of generated sentences using a sentential exemplar. Chapter 6 presents our contributed datasets for data-to-text generation, abstractive summarization, and story generation. They are tailored from naturally-occurring textual resources and pose unique challenges in their respective task settings.
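To make the idea of self-supervised word prediction concrete, the sketch below shows a toy masked-token pretraining objective: a random subset of tokens is hidden and a small encoder is trained to recover them from context. This is only an illustrative sketch, not the thesis's actual objectives or code; the model TinyMaskedLM, the constants VOCAB_SIZE, MASK_ID, and MASK_PROB, and the choice of PyTorch are assumptions made for this example.

# Minimal sketch of a masked word prediction (self-supervised) objective.
import torch
import torch.nn as nn

VOCAB_SIZE = 30000   # assumed vocabulary size
MASK_ID = 0          # assumed id reserved for the mask token
MASK_PROB = 0.15     # fraction of tokens hidden per sequence

class TinyMaskedLM(nn.Module):
    # Toy Transformer encoder that predicts the identity of masked tokens.
    def __init__(self, d_model=128):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, token_ids):
        return self.lm_head(self.encoder(self.embed(token_ids)))

def masked_lm_loss(model, token_ids):
    # Corrupt a random subset of positions and score the reconstruction.
    mask = torch.rand(token_ids.shape) < MASK_PROB
    corrupted = token_ids.masked_fill(mask, MASK_ID)
    logits = model(corrupted)
    # Only masked positions contribute; -100 is ignored by cross_entropy.
    targets = token_ids.masked_fill(~mask, -100)
    return nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1), ignore_index=-100)

# Usage: one training step on a batch of already-tokenized sequences.
model = TinyMaskedLM()
batch = torch.randint(1, VOCAB_SIZE, (8, 64))  # 8 sequences of 64 token ids
loss = masked_lm_loss(model, batch)
loss.backward()

The supervision here is "natural" in the sense that no human labels are needed: the training signal comes entirely from the text itself.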

This paper is available on arXiv under a CC 4.0 license.

