Transfer Learning for Natural Language Processing

by Pranoy Radhakrishnan, February 17th, 2018

Transfer learning aims to make use of valuable knowledge from a source domain to improve model performance in a target domain.

Why do we need Transfer Learning for NLP?

In NLP applications, when we do not have a large enough dataset to solve a task (called the target task T), we would like to transfer knowledge from another task S to avoid overfitting and to improve performance on T.

Two Scenarios

Transferring knowledge to a semantically similar/same task but with a different dataset.

  • Source task (S): a large dataset for binary sentiment classification
  • Target task (T): a small dataset for binary sentiment classification

Transferring knowledge to a task that is semantically different but shares the same neural network architecture, so that the neural parameters can be transferred (see the sketch after this list).

  • Source task (S): a large dataset for binary sentiment classification
  • Target task (T): a small dataset for 6-way question classification (e.g., location, time, and number)
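
To make the second scenario concrete, here is a minimal sketch of a shared sentence encoder whose parameters could be transferred between the two tasks, with a separate output layer per task. It is written in PyTorch, and every name and layer size is illustrative rather than taken from the paper.

    import torch.nn as nn

    # Illustrative shared encoder: embedding + LSTM producing a sentence vector.
    class SentenceEncoder(nn.Module):
        def __init__(self, vocab_size=10000, embed_dim=100, hidden_dim=128):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

        def forward(self, token_ids):
            embedded = self.embedding(token_ids)   # (batch, seq, embed_dim)
            _, (hidden, _) = self.lstm(embedded)
            return hidden[-1]                      # (batch, hidden_dim)

    encoder = SentenceEncoder()
    sentiment_head = nn.Linear(128, 2)  # source task S: binary sentiment
    question_head = nn.Linear(128, 6)   # target task T: 6-way question classification

Because both tasks consume the same encoder output, the encoder's parameters can be moved from one task to the other; only the output layer differs.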

Transfer Methods

Parameter initialization (INIT)

The INIT approach first trains the network on S, and then directly uses the tuned parameters to initialize the network for T. After transfer, we may either fix (freeze) the transferred parameters in the target domain, or fine-tune them on T.
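
A minimal sketch of INIT in PyTorch; the layer sizes are illustrative, and the simple "encoder" here stands in for whatever sentence model is shared between the two tasks.

    import copy
    import torch.nn as nn

    # Train this encoder + head on the large source dataset S first.
    encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())  # e.g. over averaged word vectors
    source_head = nn.Linear(128, 2)                          # binary sentiment head for S

    # ... training loop on S goes here ...

    # INIT: initialize the target network with the tuned source parameters.
    target_encoder = copy.deepcopy(encoder)
    target_head = nn.Linear(128, 6)                          # fresh 6-way head for T

    # Option 1: freeze the transferred parameters and train only the new head.
    for p in target_encoder.parameters():
        p.requires_grad = False

    # Option 2: fine-tune on T instead -- leave requires_grad as True and
    # pass all parameters to the optimizer.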

Multi-task learning (MULT)

MULT, on the other hand, trains the network on samples from both domains simultaneously.
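
In the referenced paper, the MULT objective interpolates the two costs with a hyperparameter λ: J = λ·J_T + (1 − λ)·J_S. Below is a minimal sketch of one simple way to optimize such a joint objective; the shared encoder, task heads, and λ value are all illustrative.

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(300, 128), nn.ReLU())  # shared across S and T
    head_s = nn.Linear(128, 2)    # source task: binary sentiment
    head_t = nn.Linear(128, 6)    # target task: 6-way question classification
    criterion = nn.CrossEntropyLoss()
    params = list(encoder.parameters()) + list(head_s.parameters()) + list(head_t.parameters())
    optimizer = torch.optim.Adam(params)
    lam = 0.7                     # illustrative value; balances target vs. source cost

    def mult_step(x_s, y_s, x_t, y_t):
        # One simultaneous update on a batch from each domain:
        # J = lam * J_T + (1 - lam) * J_S
        loss_s = criterion(head_s(encoder(x_s)), y_s)
        loss_t = criterion(head_t(encoder(x_t)), y_t)
        loss = lam * loss_t + (1 - lam) * loss_s
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()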

Combination (MULT+INIT)

We first pretrain on the source domain S for parameter initialization, and then train on S and T simultaneously.
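
Reusing the pieces from the MULT sketch above (plus a hypothetical pair of data loaders, source_loader and target_loader), the combined procedure looks roughly like this:

    # Phase 1 (INIT): pretrain the shared encoder and source head on S alone.
    for x_s, y_s in source_loader:
        loss = criterion(head_s(encoder(x_s)), y_s)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Phase 2 (MULT): continue from the pretrained weights, training on
    # batches from both domains with the interpolated objective.
    for (x_s, y_s), (x_t, y_t) in zip(source_loader, target_loader):
        mult_step(x_s, y_s, x_t, y_t)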

Model Performance of INIT, MULT, and MULT+INIT

  • Transfer between semantically equivalent tasks (Scenario 1) appears to be successful.
  • For semantically different tasks (Scenario 2), transfer brings little improvement.

Conclusion

Neural transfer learning in NLP depends largely on how semantically similar the source and target tasks are.

Reference

Mou et al., "How Transferable are Neural Networks in NLP Applications?" (EMNLP 2016): https://arxiv.org/abs/1603.06111