How Effective Are Standard Regularization and Replay Methods for Class-Incremental Learning?

by The FewShot Prompting Publication
August 27th, 2024
Too Long; Didn't Read

Regularization methods like LwF and SI struggle with class-incremental learning, showing performance similar to naive fine-tuning. Replay-based methods, while effective initially, face challenges with growing memory and computation needs. An analysis of experience replay with varying buffer sizes reveals diminishing returns, with performance dropping after extensive task sequences.
Authors:

(1) Sebastian Dziadzio, University of Tübingen (sebastian.dziadzio@uni-tuebingen.de);

(2) Çagatay Yıldız, University of Tübingen;

(3) Gido M. van de Ven, KU Leuven;

(4) Tomasz Trzcinski, IDEAS NCBR, Warsaw University of Technology, Tooploox;

(5) Tinne Tuytelaars, KU Leuven;

(6) Matthias Bethge, University of Tübingen.

Abstract and 1. Introduction

2. Two problems with the current approach to class-incremental continual learning

3. Methods and 3.1. Infinite dSprites

3.2. Disentangled learning

4. Related work

4.1. Continual learning and 4.2. Benchmarking continual learning

5. Experiments

5.1. Regularization methods and 5.2. Replay-based methods

5.3. Do we need equivariance?

5.4. One-shot generalization and 5.5. Open-set classification

5.6. Online vs. offline

Conclusion, Acknowledgments and References

Supplementary Material

5.1. Regularization methods

In this section, we compare our method to standard regularization methods: Learning without Forgetting (LwF) [17] and Synaptic Intelligence (SI) [39]. We use implementations from Avalanche [21]. We provide details of the hyperparameter choice in the supplementary material. As shown in Fig. 4, such regularization methods are ill-equipped to deal with the class-incremental learning scenario and perform no better than naive fine-tuning.

5.2. Replay-based methods

Replay-based methods retain a subset of past training data that is then mixed with the current training data for every task. While this can be a viable strategy to preserve accuracy over many tasks, it results in ever-growing memory and computation requirements unless the buffer size is bounded. In this section, we investigate the effect of buffer size on performance for standard experience replay with reservoir sampling. While there are replay-based methods that improve on this baseline, we are interested in the fundamental limits of rehearsal over long time horizons, so we strip away the confounding effects of data augmentation, generative replay, sampling strategies, pseudo-rehearsal, etc.

Figure 5 shows test accuracy for experience replay with different buffer sizes. Storing enough past samples lets the model maintain high test accuracy, but even with a buffer of 20,000 images the performance eventually starts to deteriorate. Note that after 200 tasks a balanced buffer will only contain 10 samples per class.
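The bounded buffer and the per-class arithmetic above can be illustrated with a minimal sketch. This is not the authors' code or the Avalanche implementation: the class `ReservoirBuffer` and the helper `expected_samples_per_class` are hypothetical names, and the value of 10 classes per task is an assumption inferred from the figures quoted in the text (20,000 images, 200 tasks, 10 samples per class).

```python
import random


class ReservoirBuffer:
    """Fixed-capacity replay buffer filled by reservoir sampling (Algorithm R).

    Hypothetical illustration: every item seen so far has the same
    probability, capacity / seen, of being in the buffer.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0  # total number of items offered to the buffer
        self.rng = random.Random(seed)

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            # Buffer not full yet: always store the item.
            self.items.append(item)
        else:
            # Pick a slot among all items seen so far; the new item is
            # kept only if that slot falls inside the buffer.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item


def expected_samples_per_class(buffer_size, num_tasks, classes_per_task):
    """Samples per class in a perfectly balanced buffer."""
    return buffer_size / (num_tasks * classes_per_task)


# Assumed setting: 10 classes per task (inferred, not stated in this section).
per_class = expected_samples_per_class(20_000, 200, 10)
```

Under these assumptions the buffer size stays constant while the number of classes keeps growing, so the number of stored examples per class shrinks in inverse proportion to the number of tasks seen.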


Figure 5. A comparison of our approach to experience replay with different buffer sizes. The plot for our method is an average of 5 runs.

A note on implementation. In an attempt to make the replay baseline stronger, we first add the data from the current task to the buffer and then train the model exclusively on the buffer, effectively discounting the influence of the current task over time [28]. A more standard version of experience replay would mix current and past data in equal proportions in each mini-batch, likely leading to diminished performance on previous tasks. The supplementary material includes a comparison to this other replay baseline, as well as to a version of experience replay with no memory constraint but with a compute budget matching our approach.
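The difference between the two replay variants described in this note comes down to how each mini-batch is composed. The sketch below is a hypothetical illustration, not the authors' implementation: the function names, the representation of samples as `(task_id, index)` tuples, and the batch size are all assumptions made for the example.

```python
import random


def buffer_only_batch(buffer, batch_size, rng):
    """Variant used in the paper's baseline: the current task's data has
    already been added to the buffer, and batches are drawn from the
    buffer alone."""
    return rng.sample(buffer, min(batch_size, len(buffer)))


def mixed_batch(current, past, batch_size, rng):
    """More standard variant: half of each mini-batch comes from the
    current task and half from the stored past data."""
    half = batch_size // 2
    n_current = min(half, len(current))
    n_past = min(batch_size - half, len(past))
    return rng.sample(current, n_current) + rng.sample(past, n_past)


def current_fraction(batch, task_id):
    """Share of a batch that belongs to the given task."""
    return sum(1 for t, _ in batch if t == task_id) / len(batch)


rng = random.Random(0)
# Assumed toy setting: 9 past tasks and 1 current task, 100 samples each.
past = [(t, i) for t in range(9) for i in range(100)]
current = [(9, i) for i in range(100)]

mixed = mixed_batch(current, past, 32, rng)
buffer_only = buffer_only_batch(past + current, 32, rng)
```

With equal mixing, the current task always fills half of every batch. When sampling from a task-balanced buffer instead, the current task's expected share is roughly one over the number of tasks seen so far, which is one way to read the statement that its influence is discounted over time.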


This paper is available on arXiv under a CC 4.0 license.

