    How Effective Are Standard Regularization and Replay Methods for Class-Incremental Learning?

    by The FewShot Prompting Publication, August 27th, 2024
    Too Long; Didn't Read

    Regularization methods like LwF and SI struggle with class-incremental learning, showing performance similar to naive fine-tuning. Replay-based methods, while effective initially, face challenges with growing memory and computation needs. An analysis of experience replay with varying buffer sizes reveals diminishing returns, with performance dropping after extensive task sequences.
    Authors:

    (1) Sebastian Dziadzio, University of Tübingen ([email protected]);

    (2) Çagatay Yıldız, University of Tübingen;

    (3) Gido M. van de Ven, KU Leuven;

    (4) Tomasz Trzcinski, IDEAS NCBR, Warsaw University of Technology, Tooploox;

    (5) Tinne Tuytelaars, KU Leuven;

    (6) Matthias Bethge, University of Tübingen.

    Abstract and 1. Introduction

    2. Two problems with the current approach to class-incremental continual learning

    3. Methods and 3.1. Infinite dSprites

    3.2. Disentangled learning

    4. Related work

    4.1. Continual learning and 4.2. Benchmarking continual learning

    5. Experiments

    5.1. Regularization methods and 5.2. Replay-based methods

    5.3. Do we need equivariance?

    5.4. One-shot generalization and 5.5. Open-set classification

    5.6. Online vs. offline

    Conclusion, Acknowledgments and References

    Supplementary Material

    5.1. Regularization methods

    In this section, we compare our method to standard regularization methods: Learning without Forgetting (LwF) [17] and Synaptic Intelligence (SI) [39]. We use implementations from Avalanche [21]. We provide details of the hyperparameter choice in the supplementary material. As shown in Fig. 4, such regularization methods are ill-equipped to deal with the class-incremental learning scenario and perform no better than naive fine-tuning.

    5.2. Replay-based methods

    Replay-based methods retain a subset of past training data that is then mixed with the current training data for every task. While this can be a viable strategy to preserve accuracy over many tasks, it results in ever-growing memory and computation requirements unless the buffer size is bounded. In this section, we investigate the effect of buffer size on performance for standard experience replay with reservoir sampling. While there are replay-based methods that improve on this baseline, we are interested in the fundamental limits of rehearsal over long time horizons, so we strip away the confounding effects of data augmentation, generative replay, sampling strategies, pseudo-rehearsal, etc.

    Figure 5 shows test accuracy for experience replay with different buffer sizes. Storing enough past samples lets the model maintain high test accuracy, but even with a buffer of 20,000 images the performance eventually starts to deteriorate. Note that after 200 tasks a balanced buffer will only contain 10 samples per class.
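
To make the buffer mechanics concrete, here is a minimal sketch of a fixed-capacity buffer filled by reservoir sampling, together with the per-class arithmetic behind the "10 samples per class" figure. The class and function names are hypothetical and are not taken from the paper or from Avalanche. The value of 10 classes per task is an assumption inferred from the numbers quoted above (20,000 images, 200 tasks, 10 samples per class).

```python
import random


class ReservoirBuffer:
    """Fixed-capacity replay buffer filled by reservoir sampling (illustrative only)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.num_seen = 0  # total number of samples offered to the buffer so far
        self.rng = random.Random(seed)

    def add(self, item):
        self.num_seen += 1
        if len(self.items) < self.capacity:
            # Buffer not full yet: always store the new sample.
            self.items.append(item)
        else:
            # Buffer full: keep the new sample with probability capacity / num_seen,
            # overwriting a uniformly chosen slot.
            j = self.rng.randrange(self.num_seen)
            if j < self.capacity:
                self.items[j] = item


def expected_samples_per_class(capacity, num_tasks, classes_per_task):
    """Average number of buffer slots per class in a balanced buffer."""
    return capacity / (num_tasks * classes_per_task)


# Assumed setting: 10 classes per task (inferred from the text, not stated in it).
per_class = expected_samples_per_class(capacity=20_000, num_tasks=200, classes_per_task=10)
print(per_class)  # 10.0
```

Because every sample seen so far has the same chance of remaining in the buffer, the share of slots available to any one class shrinks as the number of classes grows, which matches the eventual drop in accuracy described above.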


    Figure 5. A comparison of our approach to experience replay with different buffer sizes. The plot for our method is an average of 5 runs.

    A note on implementation. In an attempt to make the replay baseline stronger, we first add the data from the current task to the buffer and then train the model exclusively on the buffer, effectively discounting the influence of the current task over time [28]. A more standard version of experience replay would mix current and past data in equal proportions in each mini-batch, likely leading to diminished performance on previous tasks. The supplementary material includes a comparison to this other replay baseline, as well as to a version of experience replay with no memory constraint but with a compute budget matching our approach.
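
The difference between the two replay variants in the implementation note can be illustrated with a toy mini-batch sampler. This is a sketch under stated assumptions, not the authors' code: the function names are hypothetical, the data are placeholder tuples, and the buffer is left uncapped for simplicity.

```python
import random


def buffer_only_batch(buffer, batch_size, rng):
    """Draw a mini-batch from the buffer alone.

    Assumes the current task's data has already been added to `buffer`.
    """
    return rng.sample(buffer, min(batch_size, len(buffer)))


def mixed_batch(current_data, buffer, batch_size, rng):
    """Draw a mini-batch that is half current-task data and half buffered past data."""
    half = batch_size // 2
    new = rng.sample(current_data, min(half, len(current_data)))
    old = rng.sample(buffer, min(batch_size - half, len(buffer)))
    return new + old


rng = random.Random(0)
past = [("old", i) for i in range(90)]     # placeholder samples from earlier tasks
current = [("new", i) for i in range(10)]  # placeholder samples from the current task

# Variant used for the baseline above: current data joins the buffer, then train on the buffer.
batch_a = buffer_only_batch(past + current, 20, rng)

# More standard variant: equal proportions of current and past data in each mini-batch.
batch_b = mixed_batch(current, past, 20, rng)

share_new_b = sum(tag == "new" for tag, _ in batch_b) / len(batch_b)
print(share_new_b)  # 0.5
```

In the buffer-only variant, the current task makes up only its share of the buffer (10% on average in this toy setup), and that share falls as more tasks arrive. In the mixed variant, the current task always fills half of each mini-batch.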


    This paper is available on arXiv under a CC 4.0 license.

