Authors:
(1) Sanchit Sinha, University of Virginia (sanchit@virginia.edu);
(2) Guangzhi Xiong, University of Virginia (hhu4zu@virginia.edu);
(3) Aidong Zhang, University of Virginia (aidong@virginia.edu).

Table of Links
Abstract and 1 Introduction
2 Related Work
3 Methodology and 3.1 Representative Concept Extraction
3.2 Self-supervised Contrastive Concept Learning
3.3 Prototype-based Concept Grounding
3.4 End-to-end Composite Training
4 Experiments and 4.1 Datasets and Networks
4.2 Hyperparameter Settings
4.3 Evaluation Metrics and 4.4 Generalization Results
4.5 Concept Fidelity and 4.6 Qualitative Visualization
5 Conclusion and References
Appendix

3 Methodology
In this section, we first provide a detailed description of our proposed learning pipeline, including (a) the Representative Concept Extraction (RCE) framework, which incorporates a novel Salient Concept Selection Network in addition to the Concept and Relevance Networks, (b) Self-Supervised Contrastive Concept Learning (CCL), which enforces domain invariance among learned concepts, and (c) a Prototype-based Concept Grounding (PCG) regularizer that mitigates the problem of concept shift among domains. We then provide details of the end-to-end training procedure with additional Concept Fidelity regularization, which ensures concept consistency among similar samples.
3.1 Representative Concept Extraction

3.2 Self-supervised Contrastive Concept Learning
Even though the RCE framework generates representative concepts, the extracted concepts are adulterated with domain noise, which limits their generalization. In addition, with limited training data, the concept extraction process is not robust. Self-supervised contrastive training objectives are the most commonly used paradigm [Thota and Leontidis, 2021] for learning robust visual features in images. We therefore incorporate self-supervised contrastive learning to learn domain-invariant concepts, termed CCL.

3.3 Prototype-based Concept Grounding
Concept Fidelity Regularization. Concept fidelity attempts to enforce the similarity of concepts, through a similarity measure s(·, ·), for data instances from the same class in the same domain.

3.4 End-to-end Composite Training
Overall, the training objective can be formalized as a weighted sum of the CCL and PCG objectives, where λ1 and λ2 are tunable hyperparameters controlling the strength of the contrastive learning and prototype grounding regularization, respectively. In the end-to-end training objective, the tunable hyperparameter β controls the effect of generalization and robustness on the RCE framework. Note that a higher value of β makes the concept learning procedure brittle and unable to adapt to target domains, whereas a very low value of β makes the concept learning procedure overfit to the source domain, implying a tradeoff between concept generalization and performance.

This paper is available on arXiv under the CC BY 4.0 DEED license.
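
As a rough illustration of the composite objective described in Section 3.4, the Python sketch below combines scalar loss values. The function name, the argument names, and the exact functional form (the RCE loss plus β times the weighted sum) are assumptions made for illustration only. The text states just that the CCL and PCG objectives are combined as a weighted sum with λ1 and λ2, and that β controls how strongly this term affects the RCE framework.

```python
def composite_objective(loss_rce, loss_ccl, loss_pcg, lambda1, lambda2, beta):
    """Hypothetical sketch of an end-to-end composite objective.

    loss_rce: loss of the Representative Concept Extraction framework
    loss_ccl: contrastive concept learning loss
    loss_pcg: prototype-based concept grounding loss
    lambda1, lambda2: weights of the CCL and PCG terms
    beta: weight of the combined regularizer (assumed additive form)
    """
    # Weighted sum of the CCL and PCG objectives, as described in the text.
    regularizer = lambda1 * loss_ccl + lambda2 * loss_pcg
    # Assumption: the regularizer is added to the RCE loss, scaled by beta.
    return loss_rce + beta * regularizer


if __name__ == "__main__":
    total = composite_objective(
        loss_rce=1.0, loss_ccl=2.0, loss_pcg=3.0,
        lambda1=0.5, lambda2=0.5, beta=0.1,
    )
    print(f"composite loss: {total:.4f}")
```

Under this assumed form, setting beta to 0 reduces the objective to the RCE loss alone, which matches the remark that a very low β leaves the concepts fitted to the source domain.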