Transfer learning aims to leverage valuable knowledge from a source domain to improve model performance in a target domain.
In NLP applications, especially when we do not have a large enough dataset for the task we want to solve (called the target task T), we would like to transfer knowledge from another task S to avoid overfitting and to improve performance on T. Two common transfer scenarios are:
- Transferring knowledge to a semantically similar or identical task that uses a different dataset.
- Transferring knowledge to a semantically different task that shares the same neural network architecture, so that the neural parameters can be transferred.
The INIT approach first trains the network on S, and then directly uses the tuned parameters to initialize the network for T. After transfer, we may either fix (freeze) the transferred parameters in the target domain or fine-tune them on T.
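Below is a minimal PyTorch sketch of INIT, assuming a small bag-of-embeddings classifier; the Classifier model, its dimensions, and the class counts are illustrative placeholders, and the actual training loop on S is omitted.

```python
import torch
import torch.nn as nn

# Hypothetical model: a shared encoder plus a task-specific output layer.
class Classifier(nn.Module):
    def __init__(self, n_classes, vocab_size=10000, emb_dim=128):
        super().__init__()
        self.encoder = nn.EmbeddingBag(vocab_size, emb_dim)  # mean-pools token embeddings
        self.head = nn.Linear(emb_dim, n_classes)

    def forward(self, tokens):          # tokens: (batch, seq_len) token ids
        return self.head(self.encoder(tokens))

# INIT: train on the source task S first (training loop omitted) ...
source_model = Classifier(n_classes=5)
# ... then use the tuned encoder parameters to initialize the target model.
target_model = Classifier(n_classes=2)  # T may have a different label set
target_model.encoder.load_state_dict(source_model.encoder.state_dict())

# Option 1: freeze the transferred parameters; only the new head is trained on T.
for p in target_model.encoder.parameters():
    p.requires_grad = False

# Option 2: fine-tune instead; leave requires_grad=True (often with a lower
# learning rate) so the transferred parameters keep adapting to T.
optimizer = torch.optim.Adam(
    (p for p in target_model.parameters() if p.requires_grad), lr=1e-3)
```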
MULT, on the other hand, trains the network on samples from both domains simultaneously, typically by optimizing a weighted combination of the source and target objectives.
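A minimal sketch of one MULT training step follows, assuming a shared encoder with two task-specific heads; the random tensors stand in for real S and T batches, and the weight lambda_t on the target objective is a hyperparameter.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

shared = nn.EmbeddingBag(10000, 128)   # encoder shared across both tasks
head_s = nn.Linear(128, 5)             # source-task output layer
head_t = nn.Linear(128, 2)             # target-task output layer
params = [*shared.parameters(), *head_s.parameters(), *head_t.parameters()]
optimizer = torch.optim.Adam(params, lr=1e-3)
lambda_t = 0.7                         # weight on the target objective

# Dummy batches standing in for real data: token ids (batch, seq_len) and labels.
xs, ys = torch.randint(0, 10000, (8, 20)), torch.randint(0, 5, (8,))
xt, yt = torch.randint(0, 10000, (8, 20)), torch.randint(0, 2, (8,))

# One simultaneous training step on both domains.
optimizer.zero_grad()
loss_s = F.cross_entropy(head_s(shared(xs)), ys)
loss_t = F.cross_entropy(head_t(shared(xt)), yt)
loss = lambda_t * loss_t + (1 - lambda_t) * loss_s  # weighted joint objective
loss.backward()
optimizer.step()
```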
Combining INIT and MULT
The two methods can also be combined: we first pretrain on the source domain S for parameter initialization (as in INIT), and then train on S and T simultaneously (as in MULT).
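Continuing the MULT sketch above, the combination is just two phases over the same parameters: a source-only pretraining phase followed by simultaneous training. The step counts below are arbitrary.

```python
# Phase 1 (INIT-style pretraining): train the shared encoder and head_s on S alone.
for step in range(100):
    optimizer.zero_grad()
    F.cross_entropy(head_s(shared(xs)), ys).backward()
    optimizer.step()

# Phase 2 (MULT): continue with the weighted joint objective on S and T.
for step in range(100):
    optimizer.zero_grad()
    loss_s = F.cross_entropy(head_s(shared(xs)), ys)
    loss_t = F.cross_entropy(head_t(shared(xt)), yt)
    (lambda_t * loss_t + (1 - lambda_t) * loss_s).backward()
    optimizer.step()
```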
How well neural transfer learning works in NLP depends largely on how semantically similar the source and target datasets are.