Authors:

(1) Goran Muric, InferLink Corporation, Los Angeles, California (gmuric@inferlink.com);
(2) Ben Delay, InferLink Corporation, Los Angeles, California (bdelay@inferlink.com);
(3) Steven Minton, InferLink Corporation, Los Angeles, California (sminton@inferlink.com).

Table of Links

Abstract and 1 Introduction
1.1 Motivation
2 Related Work and 2.1 Prompting techniques
2.2 In-context learning
2.3 Model interpretability
3 Method
3.1 Generating questions
3.2 Prompting LLM
3.3 Verbalizing the answers and 3.4 Training a classifier
4 Data and 4.1 Clinical trials
4.2 Catalonia Independence Corpus and 4.3 Climate Detection Corpus
4.4 Medical health advice data and 4.5 The European Court of Human Rights (ECtHR) Data
4.6 UNFAIR-ToS Dataset
5 Experiments
6 Results
7 Discussion
7.1 Implications for Model Interpretability
7.2 Limitations and Future Work
Reproducibility
Acknowledgment and References
A Questions used in ICE-T method

3.2 Prompting LLM

The LLM is prompted on two occasions. First, it is prompted to obtain the set of secondary questions Q, as described in Section 3.1. Second, for each document, we prompt the LLM with the document and the corresponding secondary questions. For each question q_i, the output a_i of the LLM is collected, yielding a set of outputs for each document. The textual outputs are then assigned numerical values and transformed into a feature vector v_i through the verbalization process explained in Section 3.3.

3.3 Verbalizing the answers

The output of the LLM in response to each prompt is limited to one of three possible values: Yes, No, or Unknown, depending on the answer to the question posed in the prompt. These responses are subsequently assigned numerical values for analysis, with "Yes" mapped to 1, "No" to 0, and "Unknown" to 0.5.

3.4 Training a classifier

To train a classifier, we use a set V of low-dimensional numerical vectors, where each vector v_i has dimension n + 1, together with corresponding labels X, where each vector v_i has a corresponding binary label x_i. The vectors in V are obtained from the training textual data by prompting the LLM to generate n + 1 outputs per document, which are then assigned numerical values. A classifier is then trained using 5-fold cross-validation and a grid search for the best parameters. The choice of classification algorithm depends on the size of the training data, the distribution of the feature values, and the desired performance on a specific classification metric.
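To make the prompting and verbalization steps (Sections 3.2 and 3.3) concrete, here is a minimal Python sketch. The ask_llm function is a hypothetical placeholder for whatever LLM client is used, and the exact prompt wording is an assumption; only the Yes/No/Unknown to 1/0/0.5 mapping comes directly from the paper.

```python
# Minimal sketch of the prompting and verbalization steps (Sections 3.2-3.3).
# `ask_llm` is a hypothetical placeholder for the actual LLM client;
# the Yes/No/Unknown -> 1/0/0.5 mapping is taken from the paper.

from typing import Callable, List

ANSWER_TO_SCORE = {"yes": 1.0, "no": 0.0, "unknown": 0.5}

def verbalize(answer: str) -> float:
    """Map a constrained LLM answer to its numeric feature value.
    Any unexpected output is treated as Unknown (0.5)."""
    return ANSWER_TO_SCORE.get(answer.strip().lower(), 0.5)

def featurize(document: str,
              questions: List[str],
              ask_llm: Callable[[str], str]) -> List[float]:
    """Prompt the LLM once per question for this document and collect
    the verbalized answers into a single feature vector."""
    vector = []
    for q in questions:
        prompt = (
            f"Document:\n{document}\n\n"
            f"Question: {q}\n"
            "Answer with exactly one of: Yes, No, Unknown."
        )
        vector.append(verbalize(ask_llm(prompt)))
    return vector
```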
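The classifier-training step (Section 3.4) could look like the following scikit-learn sketch. The logistic-regression estimator, its parameter grid, and the scoring metric are illustrative assumptions; the paper only fixes the 5-fold cross-validation and grid search, and leaves the choice of algorithm to the data and target metric.

```python
# Minimal sketch of the classifier-training step (Section 3.4).
# The estimator, grid, and scoring metric are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

def train_classifier(V: np.ndarray, X: np.ndarray) -> GridSearchCV:
    """Fit a classifier on the (n + 1)-dimensional feature vectors V
    with binary labels X, using 5-fold CV and a parameter grid search."""
    param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}  # illustrative grid
    search = GridSearchCV(
        estimator=LogisticRegression(max_iter=1000),
        param_grid=param_grid,
        cv=5,            # 5-fold cross-validation, as described in the paper
        scoring="f1",    # assumed metric; choose per task
    )
    search.fit(V, X)
    return search
```

Because GridSearchCV refits the best configuration on the full training set by default, the returned object can be used directly to predict labels for new feature vectors.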
This paper is available on arxiv under CC BY 4.0 Deed (Attribution 4.0 International) license.