
A Self-explaining Neural Architecture for Generalizable Concept Learning: Hyperparameter Settings


Authors:

(1) Sanchit Sinha, University of Virginia ([email protected]);

(2) Guangzhi Xiong, University of Virginia ([email protected]);

(3) Aidong Zhang, University of Virginia ([email protected]).

Table of Links

Abstract and 1 Introduction

2 Related Work

3 Methodology and 3.1 Representative Concept Extraction

3.2 Self-supervised Contrastive Concept Learning

3.3 Prototype-based Concept Grounding

3.4 End-to-end Composite Training

4 Experiments and 4.1 Datasets and Networks

4.2 Hyperparameter Settings

4.3 Evaluation Metrics and 4.4 Generalization Results

4.5 Concept Fidelity and 4.6 Qualitative Visualization

5 Conclusion and References

Appendix

4.2 Hyperparameter Settings

RCE Framework: We use the Mean Squared Error (MSE) as the reconstruction loss and set the sparsity regularizer λ to 1e-5 for all datasets. The weights are set to ω1 = ω2 = 0.5 for digit tasks and to ω1 = 0.8, ω2 = 0.2 for object tasks.
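The hyperparameters above can be sketched as a composite objective. This is a minimal illustrative sketch, not the authors' code: the exact roles of ω1 and ω2 (here, weights on two generic loss terms) and the form of the sparsity penalty (here, an L1 norm on concept activations) are assumptions.

```python
def rce_objective(x, x_hat, concepts, loss_a, loss_b,
                  lam=1e-5, w1=0.5, w2=0.5):
    """Hypothetical composite RCE loss with the reported defaults:
    MSE reconstruction + lam-scaled L1 sparsity on concept
    activations + weighted combination of two task loss terms."""
    # Mean squared reconstruction error between input and reconstruction
    mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
    # Sparsity regularizer on concept activations, scaled by lambda
    sparsity = lam * sum(abs(c) for c in concepts)
    # Weighted sum of the remaining loss components (w1, w2 from the text)
    return mse + sparsity + w1 * loss_a + w2 * loss_b
```

For object tasks the call would pass `w1=0.8, w2=0.2`, matching the settings reported above.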


Self-supervised Contrastive Concept Learning: We use the lightly[1] library to implement the SimCLR transformations [Chen, 2020]. The temperature parameter τ is set to 0.5 by default [Xu et al., 2019] for all datasets, and the hyperparameters for each transformation are the SimCLR defaults. The training objective is the contrastive cross-entropy (NT-Xent) loss [Chen, 2020]. Figure 4 depicts examples of the various transformations along with the resulting positive and negative pairs. For training, we use the SGD optimizer with momentum 0.9 and a cosine decay scheduler with an initial learning rate of 0.01. We train on each dataset for 10000 iterations with early stopping. The regularization parameters λ1 and λ2 are both set to 0.1. For digit tasks, β is set to 1, while for object tasks it is set to 0.5. For further details, refer to the Appendix.
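Two of the settings above can be written out concretely: the NT-Xent loss with τ = 0.5 and the cosine decay schedule starting from 0.01 over 10000 iterations. This is a hedged pure-Python sketch for a single anchor example; the actual training uses lightly's batched implementation, and the function names here (`nt_xent_pair`, `cosine_lr`) are illustrative, not library APIs.

```python
import math

def nt_xent_pair(sim_pos, sims_neg, tau=0.5):
    """NT-Xent loss for one anchor: negative log-softmax of the
    positive cosine similarity over all candidates, each similarity
    scaled by the temperature tau (0.5 in the text)."""
    logits = [sim_pos / tau] + [s / tau for s in sims_neg]
    denom = sum(math.exp(l) for l in logits)
    return -math.log(math.exp(sim_pos / tau) / denom)

def cosine_lr(step, total_steps=10000, lr0=0.01):
    """Cosine decay from lr0 (0.01 in the text) to 0 over
    total_steps (10000 iterations in the text)."""
    return 0.5 * lr0 * (1.0 + math.cos(math.pi * step / total_steps))
```

When the positive and a single negative have equal similarity, the loss reduces to log 2, which is a quick sanity check on the softmax normalization.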


Table 2: Domain generalization performance for the Office-Home Dataset with domains Art (A), Clipart (C), Product (P) and Real (R).


Table 3: Domain generalization performance for the [Left] DomainNet dataset with domains Real (R), Clipart (C), Picture (P), and Sketch (S) and [Right] VisDA dataset with domains Real (R) and 3-Dimensional visualizations (3D).


Table 4: Domain generalization performance for the Digit datasets with domains MNIST (M), USPS (U) and SVHN (S). In addition, we also report the results of multiple source domain adaptation to the target domains in the Appendix.


This paper is available on arxiv under CC BY 4.0 DEED license.


[1] https://github.com/lightly-ai/lightly
