Authors:
(1) Sanchit Sinha, University of Virginia (sanchit@virginia.edu);
(2) Guangzhi Xiong, University of Virginia (hhu4zu@virginia.edu);
(3) Aidong Zhang, University of Virginia (aidong@virginia.edu).
Table of Links
3 Methodology and 3.1 Representative Concept Extraction
3.2 Self-supervised Contrastive Concept Learning
3.3 Prototype-based Concept Grounding
3.4 End-to-end Composite Training
4 Experiments and 4.1 Datasets and Networks
4.3 Evaluation Metrics and 4.4 Generalization Results
4.5 Concept Fidelity and 4.6 Qualitative Visualization
3 Methodology
In this section, we first provide a detailed description of our proposed learning pipeline, including (a) the Representative Concept Extraction (RCE) framework, which incorporates a novel Salient Concept Selection Network in addition to the Concept and Relevance Networks, (b) Self-Supervised Contrastive Concept Learning (CCL), which enforces domain invariance among learned concepts, and (c) a Prototype-based Concept Grounding (PCG) regularizer that mitigates the problem of concept shift across domains. We then provide details of the end-to-end training procedure with an additional Concept Fidelity regularization, which ensures concept consistency among similar samples.
3.1 Representative Concept Extraction
3.2 Self-supervised Contrastive Concept Learning
Even though the RCE framework generates representative concepts, the extracted concepts are adulterated with domain noise, which limits their generalization. In addition, with limited training data, the concept extraction process is not robust. Self-supervised contrastive training is the most commonly used paradigm [Thota and Leontidis, 2021] for learning robust visual features from images. We therefore incorporate self-supervised contrastive learning, termed CCL, to learn domain-invariant concepts.
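To make the contrastive idea concrete, the following is a minimal sketch of a generic NT-Xent (normalized temperature-scaled cross-entropy) loss applied to concept vectors. It is not necessarily the exact CCL objective used in this work. The function names, the `temperature` value, and the use of cosine similarity are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent(view_a, view_b, temperature=0.5):
    """Generic NT-Xent loss over concept vectors from two augmented views.

    view_a[i] and view_b[i] are assumed to be concept vectors of the same
    image under two different augmentations (a positive pair). Every other
    vector in the batch is treated as a negative.
    """
    z = list(view_a) + list(view_b)
    n = len(view_a)
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of sample i
        logits = [cosine(z[i], z[k]) / temperature
                  for k in range(2 * n) if k != i]
        pos_logit = cosine(z[i], z[pos]) / temperature
        # -log( exp(pos) / sum_k exp(logit_k) )
        total += math.log(sum(math.exp(l) for l in logits)) - pos_logit
    return total / (2 * n)
```

Under these assumptions, the loss is lower when the two views of each image map to similar concept vectors, which is the behavior a domain-invariant concept extractor is meant to exhibit.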
3.3 Prototype-based Concept Grounding
Concept Fidelity Regularization. Concept fidelity regularization encourages data instances from the same class in the same domain to have similar concepts, as measured by a similarity measure s(·, ·). Formally,
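As an illustration of the idea only (the formal definition is not shown here), the sketch below penalizes dissimilarity between concept vectors that share both a class label and a domain. The choice of cosine similarity for s(·, ·), the `1 - s` penalty form, and the averaging over pairs are all assumptions, as are the function and argument names.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine(u, v):
    """Cosine similarity, used here as one possible choice of s(., .)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def concept_fidelity_penalty(concepts, labels, domains, s=cosine):
    """Average (1 - s) over pairs with the same class and the same domain.

    Hypothetical form of the regularizer: pairs from different classes or
    different domains contribute nothing.
    """
    groups = defaultdict(list)
    for c, y, d in zip(concepts, labels, domains):
        groups[(y, d)].append(c)

    total, pairs = 0.0, 0
    for members in groups.values():
        for u, v in combinations(members, 2):
            total += 1.0 - s(u, v)
            pairs += 1
    return total / pairs if pairs else 0.0
```

With this form, the penalty approaches zero when same-class, same-domain samples produce near-identical concept vectors, and grows as their concepts diverge.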
3.4 End-to-end Composite Training
Overall, the training objective can be formalized as a weighted sum of CCL and PCG objectives:
where λ1 and λ2 are tunable hyperparameters controlling the strength of contrastive learning and prototype grounding regularization. The end-to-end training objective can be represented as:
The tunable hyperparameter β controls the effect of generalization and robustness on the RCE framework. Note that a higher value of β makes the concept learning procedure brittle and unable to adapt to target domains. However, a very low value of β makes the concept learning procedure overfit to the source domain, implying a tradeoff between concept generalization and performance.
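The way the hyperparameters combine can be sketched as follows. The weighted sum with λ1 and λ2 follows the description above. The additive form `loss_rce + beta * ssl`, the function names, and the default values are assumptions chosen for illustration, not a statement of the exact objective.

```python
def ssl_objective(loss_ccl, loss_pcg, lambda1, lambda2):
    """Weighted sum of the contrastive (CCL) and prototype (PCG) terms."""
    return lambda1 * loss_ccl + lambda2 * loss_pcg


def total_objective(loss_rce, loss_ccl, loss_pcg,
                    lambda1=1.0, lambda2=1.0, beta=0.5):
    """Assumed end-to-end objective: RCE loss plus a beta-scaled SSL term.

    The additive form is a hypothetical reading of the text; beta sets how
    strongly the self-supervised term weighs against the RCE term.
    """
    return loss_rce + beta * ssl_objective(loss_ccl, loss_pcg,
                                           lambda1, lambda2)


# Example with made-up loss values: 1.0 + 2.0 * (0.5 * 2.0 + 0.1 * 3.0)
example = total_objective(1.0, 2.0, 3.0, lambda1=0.5, lambda2=0.1, beta=2.0)
```

In practice, λ1, λ2, and β would be tuned jointly, since β rescales both self-supervised terms at once while λ1 and λ2 set their relative strength.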
This paper is available on arXiv under a CC BY 4.0 DEED license.