4 Steps to Achieve Domain-Invariant Concept Learning in AI Systems

by Activation Function, April 8th, 2025

Too Long; Didn't Read

Discover a novel AI methodology combining representative concept extraction, contrastive learning, and prototype grounding for improved concept generalization.

Authors:

(1) Sanchit Sinha, University of Virginia (sanchit@virginia.edu);

(2) Guangzhi Xiong, University of Virginia (hhu4zu@virginia.edu);

(3) Aidong Zhang, University of Virginia (aidong@virginia.edu).

Abstract and 1 Introduction

2 Related Work

3 Methodology and 3.1 Representative Concept Extraction

3.2 Self-supervised Contrastive Concept Learning

3.3 Prototype-based Concept Grounding

3.4 End-to-end Composite Training

4 Experiments and 4.1 Datasets and Networks

4.2 Hyperparameter Settings

4.3 Evaluation Metrics and 4.4 Generalization Results

4.5 Concept Fidelity and 4.6 Qualitative Visualization

5 Conclusion and References

Appendix

3 Methodology

In this section, we first provide a detailed description of our proposed learning pipeline: (a) the Representative Concept Extraction (RCE) framework, which adds a novel Salient Concept Selection Network to the Concept and Relevance Networks; (b) Self-Supervised Contrastive Concept Learning (CCL), which enforces domain invariance among the learned concepts; and (c) a Prototype-based Concept Grounding (PCG) regularizer, which mitigates concept shift across domains. We then detail the end-to-end training procedure, which adds a Concept Fidelity regularizer to ensure concept consistency among similar samples.

3.1 Representative Concept Extraction

Figure 1: The proposed Representative Concept Extraction (RCE) framework. The networks F and H extract concepts and their associated relevance scores, respectively, and A aggregates them. Network G reconstructs the original input from the concepts, while T selects the concepts most representative of the prediction.
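To make the roles of the networks in Figure 1 concrete, here is a minimal PyTorch sketch of the RCE forward pass, assuming a precomputed backbone feature vector. The layer types, dimensions, and activation choices are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RCESketch(nn.Module):
    """Illustrative RCE forward pass; sizes and activations are assumptions."""
    def __init__(self, feat_dim=512, n_concepts=20, n_classes=10):
        super().__init__()
        self.F = nn.Linear(feat_dim, n_concepts)   # concept network: features -> concept scores
        self.H = nn.Linear(feat_dim, n_concepts)   # relevance network: features -> relevance scores
        self.G = nn.Linear(n_concepts, feat_dim)   # decoder: concepts -> reconstructed features
        self.T = nn.Linear(n_concepts, n_classes)  # salient-concept selection / prediction head

    def forward(self, z):
        # z: backbone features of shape (batch, feat_dim)
        concepts = torch.tanh(self.F(z))              # concept activations
        relevance = torch.softmax(self.H(z), dim=-1)  # per-concept relevance scores
        aggregated = concepts * relevance             # A: relevance-weighted aggregation
        recon = self.G(aggregated)                    # reconstruction for a reconstruction loss
        logits = self.T(aggregated)                   # predict from the most representative concepts
        return concepts, relevance, recon, logits
```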
Figure 2: Self-supervised contrastive concept learning. Images are sampled from a set of positive samples X+ and negative samples X− associated with an anchor image x. Green arrows depict the direction of maximizing similarity; red arrows depict the direction of minimizing similarity.


3.2 Self-supervised Contrastive Concept Learning

Even though the RCE framework generates representative concepts, the extracted concepts are contaminated with domain noise, which limits their generalization. In addition, with limited training data, the concept extraction process is not robust. Self-supervised contrastive training objectives are among the most commonly used paradigms for learning robust visual features from images [Thota and Leontidis, 2021]. We therefore incorporate self-supervised contrastive learning to learn domain-invariant concepts, a procedure we term Contrastive Concept Learning (CCL).
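The CCL equations are not reproduced in this version of the article, so the snippet below sketches one standard realization: an NT-Xent-style contrastive loss applied to concept vectors, where each anchor's positive is the concept representation of an augmented view of the same image and the remaining samples in the batch act as negatives. The function name and temperature are assumptions.

```python
import torch
import torch.nn.functional as F

def contrastive_concept_loss(c_anchor, c_positive, temperature=0.1):
    """NT-Xent-style loss over concept vectors (a sketch, not the paper's exact objective).

    c_anchor, c_positive: (batch, n_concepts) concept representations of two
    views of the same images; other batch entries serve as negatives."""
    a = F.normalize(c_anchor, dim=-1)
    p = F.normalize(c_positive, dim=-1)
    logits = a @ p.t() / temperature                    # pairwise cosine similarities
    targets = torch.arange(a.size(0), device=a.device)  # i-th anchor matches i-th positive
    return F.cross_entropy(logits, targets)
```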




3.3 Prototype-based Concept Grounding

Figure 3: Prototype-based concept grounding (PCG). Concept grounding ensures that the concept representations learned from both the source and target domains are grounded to a representative concept prototype (green).
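One simple way to realize the grounding in Figure 3 is to maintain an exponential-moving-average (EMA) prototype of the concept representations and penalize how far the concepts from source- and target-domain batches drift from it. The EMA update, momentum value, and squared-error penalty are illustrative assumptions, not the paper's exact formulation.

```python
import torch

class ConceptPrototypes:
    """EMA concept prototypes for grounding (illustrative sketch)."""
    def __init__(self, n_concepts, momentum=0.99):
        self.proto = torch.zeros(n_concepts)  # one representative value per concept
        self.momentum = momentum

    @torch.no_grad()
    def update(self, concepts):
        # concepts: (batch, n_concepts); slowly track the batch mean
        batch_mean = concepts.mean(dim=0)
        self.proto = self.momentum * self.proto + (1 - self.momentum) * batch_mean

    def grounding_loss(self, concepts):
        # pull source/target concept representations toward the prototype (green in Figure 3)
        return ((concepts - self.proto) ** 2).mean()
```

In training, update() would be called on source-domain batches to form the prototype, while grounding_loss() is applied to both source- and target-domain concepts.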



Concept Fidelity Regularization. Concept fidelity enforces the similarity of concepts, measured by a similarity function s(·, ·), between data instances of the same class in the same domain. Formally, over the set of pairs P = {(i, j) : y_i = y_j, d_i = d_j} (same class y, same domain d), the regularizer takes the form:

L_fid = (1 / |P|) · Σ_(i,j)∈P (1 − s(c_i, c_j))

where c_i denotes the concept representation of instance i.
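Reading s(·, ·) as cosine similarity (an assumption; the text leaves the measure abstract), the regularizer can be sketched as the mean of (1 − similarity) over same-class pairs in a single-domain batch:

```python
import torch
import torch.nn.functional as F

def concept_fidelity_loss(concepts, labels):
    """Penalize dissimilar concepts for same-class samples; s(.,.) = cosine (assumed).

    concepts: (batch, n_concepts) from a single domain; labels: (batch,) class ids."""
    c = F.normalize(concepts, dim=-1)
    sim = c @ c.t()                                        # pairwise similarities s(c_i, c_j)
    same_class = labels.unsqueeze(0) == labels.unsqueeze(1)
    off_diag = ~torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    mask = same_class & off_diag                           # same-class pairs, i != j
    if not mask.any():
        return concepts.new_zeros(())                      # no same-class pairs in this batch
    return (1.0 - sim[mask]).mean()
```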
3.4 End-to-end Composite Training

Overall, the training objective for generalization can be formalized as a weighted sum of the CCL and PCG objectives:

L_gen = λ1 · L_CCL + λ2 · L_PCG
where λ1 and λ2 are tunable hyperparameters controlling the strength of the contrastive-learning and prototype-grounding regularization, respectively. The end-to-end training objective can then be represented as:

L = L_RCE + β · L_gen
The tunable hyperparameter β controls how strongly the generalization and robustness terms influence the RCE framework. Note that too high a value of β makes the concept-learning procedure brittle and unable to adapt to target domains, while a very low value of β lets it overfit to the source domain, implying a tradeoff between concept generalization and performance.
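Putting the pieces together, a single step of the composite objective can be wired up as below. The function name, the default weights, and the choice to add the fidelity term outside the β-weighted generalization loss are assumptions for illustration.

```python
import torch

def composite_loss(loss_rce, loss_ccl, loss_pcg, loss_fid,
                   lambda1=1.0, lambda2=1.0, beta=0.5):
    """Weighted combination of the training objectives (weights are placeholders).

    lambda1/lambda2 weight the CCL and PCG terms; beta trades off
    generalization against fitting the source domain."""
    loss_gen = lambda1 * loss_ccl + lambda2 * loss_pcg
    return loss_rce + beta * loss_gen + loss_fid

# Example wiring with dummy scalar losses:
total = composite_loss(torch.tensor(1.2), torch.tensor(0.8),
                       torch.tensor(0.5), torch.tensor(0.1))
print(float(total))  # 1.2 + 0.5 * (0.8 + 0.5) + 0.1 = 1.95
```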


This paper is available on arXiv under a CC BY 4.0 DEED license.

