Authors:
(1) Sebastian Dziadzio, University of Tübingen (sebastian.dziadzio@uni-tuebingen.de);
(2) Çagatay Yıldız, University of Tübingen;
(3) Gido M. van de Ven, KU Leuven;
(4) Tomasz Trzcinski, IDEAS NCBR, Warsaw University of Technology, Tooploox;
(5) Tinne Tuytelaars, KU Leuven;
(6) Matthias Bethge, University of Tübingen.

Table of Links

Abstract and 1. Introduction
2. Two problems with the current approach to class-incremental continual learning
3. Methods and 3.1. Infinite dSprites
3.2. Disentangled learning
4. Related work
4.1. Continual learning and 4.2. Benchmarking continual learning
5. Experiments
5.1. Regularization methods and 5.2. Replay-based methods
5.3. Do we need equivariance?
5.4. One-shot generalization and 5.5. Open-set classification
5.6. Online vs. offline
Conclusion, Acknowledgments and References
Supplementary Material

3. Methods

In this section, we describe two contributions of this work: a software package for generating arbitrarily long continual learning benchmarks and a conceptual disentangled learning framework accompanied by an example implementation. We emphasize that this work aims to provide a new perspective on knowledge transfer in continual learning and to propose new benchmarks for evaluating continual learning methods. Our implementation serves as a proof of concept, spotlighting the potential of equivariance learning, and is not intended as a practical method for general use.

3.1. Infinite dSprites

We introduce idSprites, a novel framework inspired by dSprites [23], designed for easy creation of arbitrarily long continual learning benchmarks. A single idSprites benchmark consists of T tasks, where each task is an n-fold classification of procedurally generated shapes. As in dSprites, each shape is observed in all possible combinations of the following factors of variation (FoVs): color, scale, orientation, horizontal position, and vertical position. Figure 2 shows an example batch of images with four FoVs and two values per factor (in general, our implementation allows for arbitrary granularity). The canonical form corresponds to a scale of 1, an orientation of 0, and horizontal and vertical positions of 0.5. We use only a single color in our experiments for simplicity and to save computation.
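To make "all possible combinations" concrete, the following is a minimal sketch of how a grid of factor values could be enumerated for one shape. It is not the idSprites API: the function name `fov_grid`, the dictionary keys, and the specific factor values are illustrative assumptions. Only the canonical values (scale 1, orientation 0, positions 0.5) and the four-factor, two-values-per-factor setup come from the text.

```python
import itertools
import math

# Canonical form as described in the text.
CANONICAL = {"scale": 1.0, "orientation": 0.0, "position_x": 0.5, "position_y": 0.5}


def fov_grid(fov_values):
    """Return every combination of factor values as a list of dicts.

    `fov_values` maps a factor name to the list of values it can take.
    The name and signature are hypothetical, chosen for illustration.
    """
    names = list(fov_values)
    combos = itertools.product(*(fov_values[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Four factors with two values each; the values themselves are made up.
fovs = {
    "scale": [0.5, 1.0],
    "orientation": [0.0, math.pi],
    "position_x": [0.25, 0.75],
    "position_y": [0.25, 0.75],
}
grid = fov_grid(fovs)
print(len(grid))  # 2**4 = 16 views of a single shape
```

The grid grows multiplicatively with the number of values per factor, which is why a finer granularity quickly increases the number of images generated per shape.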
The shapes are generated by first randomly sampling the number of vertices from a discrete uniform distribution over a closed integer interval [a, b], then constructing a regular polygon on a unit circle, randomly perturbing the polar coordinates of each vertex, and finally connecting the perturbed vertices with a closed spline whose order is randomly chosen from {1, 3}. All shapes are then scaled and centered so that their bounding boxes are the same size and their centers of mass align in the canonical form. We also make orientation identifiable by painting one half of the shape black. The number of tasks T, the number of shapes per task n, the vertex number interval [a, b], the exact FoV ranges, and the parameters of the noise distributions for radial and angular coordinates are set by the user, providing the flexibility to control the length and difficulty of the benchmark. The framework also provides access to the ground truth values of the individual FoVs. We will release idSprites as a Python package and hope it will unlock new research directions in continual classification, transfer learning, and continual disentanglement.
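The vertex sampling and normalization steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the Gaussian noise model, the default noise scales, the clamping of angular noise, and the choice to scale the longer bounding-box side to 1 are all assumptions. The spline-fitting step is omitted, so the result corresponds to the order-1 case, where the vertices are joined by straight segments.

```python
import math
import random


def sample_shape_vertices(a, b, radial_std=0.2, angular_std=0.1, rng=None):
    """Sample perturbed polygon vertices (hypothetical helper).

    The noise distributions and their parameters are assumptions.
    """
    rng = rng or random.Random()
    n = rng.randint(a, b)  # discrete uniform over the closed interval [a, b]
    step = 2 * math.pi / n
    vertices = []
    for k in range(n):
        # Start from a regular polygon on the unit circle, then perturb
        # the radius and the angle of each vertex.
        radius = max(0.1, 1.0 + rng.gauss(0.0, radial_std))
        # Clamp the angular noise so the vertices keep their circular order.
        noise = max(-0.4 * step, min(0.4 * step, rng.gauss(0.0, angular_std)))
        angle = k * step + noise
        vertices.append((radius * math.cos(angle), radius * math.sin(angle)))
    return vertices


def polygon_centroid(vertices):
    """Area centroid of a simple polygon, computed with the shoelace formula."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)


def normalize(vertices):
    """Scale the longer bounding-box side to 1, then move the centroid to the origin."""
    xs, ys = zip(*vertices)
    size = max(max(xs) - min(xs), max(ys) - min(ys))
    scaled = [(x / size, y / size) for x, y in vertices]
    cx, cy = polygon_centroid(scaled)
    return [(x - cx, y - cy) for x, y in scaled]


rng = random.Random(0)  # fixed seed for reproducibility
shape = normalize(sample_shape_vertices(3, 8, rng=rng))
print(len(shape))
```

Widening [a, b] or increasing the noise scales yields more varied shapes, which is one way the user-set parameters described above could control benchmark difficulty.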
This paper is available on arXiv under CC 4.0 license.

