Anchor Data Augmentation (ADA) adapts anchor regression into a domain-independent, low-cost data augmentation scheme for regression problems.
Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.
Authors:
(1) Nora Schneider, Computer Science Department, ETH Zurich, Zurich, Switzerland (nschneide@student.ethz.ch);
(2) Shirin Goshtasbpour, Computer Science Department, ETH Zurich, Zurich, Switzerland and Swiss Data Science Center, Zurich, Switzerland (shirin.goshtasbpour@inf.ethz.ch);
(3) Fernando Perez-Cruz, Computer Science Department, ETH Zurich, Zurich, Switzerland and Swiss Data Science Center, Zurich, Switzerland (fernando.perezcruz@sdsc.ethz.ch).
Table of Links
2 Background
3 Anchor Data Augmentation and 3.1 Comparison to C-Mixup and 3.2 Preserving nonlinear data structure
4 Experiments and 4.1 Linear synthetic data
4.2 Housing nonlinear regression
4.3 In-distribution Generalization
4.4 Out-of-distribution Robustness
5 Conclusion, Broader Impact, and References
A Additional information for Anchor Data Augmentation
In this section, we introduce Anchor Data Augmentation (ADA), a domain-independent data augmentation method inspired by anchor regression (AR). ADA requires neither prior knowledge of the data invariances nor manually engineered transformations. Unlike existing domain-agnostic data augmentation methods [10, 45, 46], it does not require training an expensive generative model, and the augmentation adds only marginally to the computational cost of training. Moreover, because ADA originates from a causal regression problem, it can be readily applied to regression tasks. Even in cases where ADA does not improve performance, its negative impact remains minimal.
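To make the mechanics concrete, the sketch below shows one way the anchor transformation could be implemented, assuming the anchor matrix A is built from k-means cluster indicators so that the projection onto its column space maps each sample to its cluster mean; the function name ada_augment, the choice to cluster on the joint (X, y) space, and the default hyperparameters are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def ada_augment(X, y, n_clusters=10, gamma=1.5, random_state=0):
    """Anchor-style augmentation (illustrative sketch): scale each sample's
    cluster-mean component by sqrt(gamma) while keeping its within-cluster
    residual unchanged, i.e. x_aug = x + (sqrt(gamma) - 1) * cluster_mean(x),
    and likewise for y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)

    # Build the anchor partition with k-means; clustering on the joint
    # (X, y) space is an assumption made for this sketch.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    labels = km.fit_predict(np.hstack([X, y]))

    scale = np.sqrt(gamma) - 1.0
    X_aug, y_aug = X.copy(), y.copy()
    for k in range(n_clusters):
        idx = labels == k
        if not idx.any():
            continue
        # Projecting onto the cluster-indicator columns replaces each sample
        # by its cluster mean, so the transform adds a scaled cluster mean.
        X_aug[idx] += scale * X[idx].mean(axis=0)
        y_aug[idx] += scale * y[idx].mean(axis=0)
    return X_aug, y_aug.ravel()
```

Drawing several values of γ from a range around 1 then yields a family of perturbed copies of the dataset rather than a single augmented set.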
Figure 1: Comparison of ADA augmentations on a nonlinear cosine data model. For larger partition sizes, the ADA augmentations are more accurate because of the high local variability of the cosine function. We used k-means clustering to construct A and γ ∈ {1/2, 2/3, 1.0, 3/2, 2.0}.
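A usage example along the lines of the figure's setting, reusing the ada_augment sketch above with the γ grid from the caption; the input range, cosine frequency, noise level, and the choice of 20 clusters are assumed values picked only for illustration:

```python
# Illustrative recreation of the Figure 1 setting: augment noisy cosine
# data with several gamma values over a fixed k-means partition.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(200, 1))
y = np.cos(2.0 * X[:, 0]) + 0.1 * rng.standard_normal(200)

for gamma in (1/2, 2/3, 1.0, 3/2, 2.0):
    X_aug, y_aug = ada_augment(X, y, n_clusters=20, gamma=gamma)
```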
This paper is available on arxiv under CC0 1.0 DEED license.