
Evaluating ADA: Experimental Results on Linear and Housing Datasets

by Anchoring

November 14th, 2024
Too Long; Didn't Read

Experiments with ADA show improved performance in low-data regimes across multiple models, including linear regression and MLPs. ADA is compared to C-Mixup and other augmentation methods, with results highlighting its effectiveness in boosting model performance even in overparameterized settings.

Part of HackerNoon's growing list of open-source research papers, promoting free access to academic material.

Authors:

(1) Nora Schneider, Computer Science Department, ETH Zurich, Zurich, Switzerland (nschneide@student.ethz.ch);

(2) Shirin Goshtasbpour, Computer Science Department, ETH Zurich, Zurich, Switzerland and Swiss Data Science Center, Zurich, Switzerland (shirin.goshtasbpour@inf.ethz.ch);

(3) Fernando Perez-Cruz, Computer Science Department, ETH Zurich, Zurich, Switzerland and Swiss Data Science Center, Zurich, Switzerland (fernando.perezcruz@sdsc.ethz.ch).

Abstract and 1 Introduction

2 Background

2.1 Data Augmentation

2.2 Anchor Regression

3 Anchor Data Augmentation

3.1 Comparison to C-Mixup and 3.2 Preserving nonlinear data structure

3.3 Algorithm

4 Experiments and 4.1 Linear synthetic data

4.2 Housing nonlinear regression

4.3 In-distribution Generalization

4.4 Out-of-distribution Robustness

5 Conclusion, Broader Impact, and References


A Additional information for Anchor Data Augmentation

B Experiments

4 Experiments

We experimentally investigate and compare the performance of ADA. First, we use ADA in an in-distribution setting for a linear regression problem (Section 4.1), in which we show that even in this case ADA provides improved performance in the low-data regime. Second, in Section 4.2, we apply ADA and C-Mixup to the California and Boston Housing datasets as we increase the number of training samples. In the last two subsections, we replicate the in-distribution generalization (Section 4.3) and out-of-distribution robustness (Section 4.4) experiments from the C-Mixup paper [49]. The authors of [49] additionally assess a task-generalization experiment, but the corresponding code was not publicly provided, so a comparison could not easily be made.

4.1 Linear synthetic data

Using synthetic linear data, we investigate whether ADA can improve model performance in an overparameterized setting compared to C-Mixup, vanilla augmentation, and classical empirical risk minimization (ERM). Additionally, we analyze the sensitivity of our approach to the choice of γ and to the number of augmentations.


Data: The generated data follows a standard linear structure


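A minimal sketch of such a linear generating process, assuming y = Xβ + ε with Gaussian noise; the sample size, dimension, and noise scale below are chosen for illustration and are not the paper's exact settings:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_linear_data(n, d, noise_std=1.0):
    """Synthetic data from a standard linear model y = X @ beta + eps.
    Dimensions and noise level are illustrative, not the paper's settings."""
    X = rng.normal(size=(n, d))
    beta = rng.normal(size=d)
    y = X @ beta + noise_std * rng.normal(size=n)
    return X, y, beta

# Overparameterized regime: more parameters (d) than training samples (n).
X_train, y_train, beta_true = make_linear_data(n=50, d=100)
```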


Models and Comparisons: We investigate and compare the impact of ADA using two models of varying complexity: linear Ridge regression and a multilayer perceptron (MLP) with one hidden layer of 10 units with ReLU activation. Using an MLP with more hidden layers yields similar results (see Appendix B.1 for details).
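A hedged sketch of the two model classes in scikit-learn; the ridge penalty and MLP training settings below are illustrative assumptions, not the paper's hyperparameters (the paper trains the MLP with minibatch augmentation, described next, so MLPRegressor serves only to make the architectures concrete):

```python
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor

# Linear Ridge regression; alpha is an illustrative regularization strength.
ridge = Ridge(alpha=1.0)

# MLP with one hidden layer of 10 ReLU units, matching the description above.
mlp = MLPRegressor(hidden_layer_sizes=(10,), activation="relu",
                   max_iter=2000, random_state=0)
```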




For the Ridge regression model, we enlarge the dataset by a factor of 10 by sampling from the respective augmentation methods and subsequently compute the regression estimators. For the MLP, in contrast, we apply the augmentation methods at the minibatch level: vanilla augmentation adds Gaussian noise to each batch, C-Mixup mixes samples after drawing a mixing weight from the beta distribution in each batch, and ADA transforms each batch after sampling from the predefined γ values.
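A sketch of the three batch-level schemes, assuming the anchor transform x̃ = x + (√γ − 1)·x̄_g from Section 3, where x̄_g is the sample's cluster mean and the transform is applied to inputs and targets alike; the noise scale, beta parameter, γ grid, and the use of k-means on the inputs to form the anchor groups are illustrative assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def vanilla_augment(X, y, noise_std=0.1):
    """Vanilla augmentation: add Gaussian noise to each batch of inputs."""
    return X + noise_std * rng.normal(size=X.shape), y

def cmixup_augment(X, y, alpha=2.0, bandwidth=1.0):
    """C-Mixup-style augmentation: draw one mixing weight per batch from
    Beta(alpha, alpha), then mix each sample with a partner chosen with
    probability proportional to a Gaussian kernel on label distance."""
    n = len(y)
    lam = rng.beta(alpha, alpha)
    X_mix, y_mix = np.empty_like(X), np.empty_like(y)
    for i in range(n):
        w = np.exp(-((y - y[i]) ** 2) / (2 * bandwidth**2))
        j = rng.choice(n, p=w / w.sum())
        X_mix[i] = lam * X[i] + (1 - lam) * X[j]
        y_mix[i] = lam * y[i] + (1 - lam) * y[j]
    return X_mix, y_mix

def ada_augment(X, y, gammas=(0.5, 1.0, 2.0), n_clusters=5):
    """ADA-style augmentation: draw gamma from a predefined grid and shift
    each sample (input and target) by (sqrt(gamma) - 1) times its cluster
    mean. Clustering the inputs with k-means is an assumption here."""
    groups = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(X)
    gamma = rng.choice(gammas)
    shift = np.sqrt(gamma) - 1.0
    X_aug, y_aug = X.astype(float).copy(), y.astype(float).copy()
    for g in np.unique(groups):
        m = groups == g
        X_aug[m] += shift * X[m].mean(axis=0)
        y_aug[m] += shift * y[m].mean()
    return X_aug, y_aug
```

For the Ridge experiments, the same transforms would instead be applied repeatedly to the full training set to produce the 10× enlarged dataset, rather than per batch.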




In summary, even in this simplest of cases, where we should not expect gains from ADA (or C-Mixup), these data augmentation strategies improve performance whenever the number of training examples is insufficient to reach the error floor.


This paper is available on arXiv under the CC0 1.0 DEED license.

