
LLMs Rely on Contextual Knowledge Over Background Knowledge


Table of Links

  1. Abstract and Introduction
  2. SylloBio-NLI
  3. Empirical Evaluation
  4. Related Work
  5. Conclusions
  6. Limitations and References


A. Formalization of the SylloBio-NLI Resource Generation Process

B. Formalization of Tasks 1 and 2

C. Dictionary of gene and pathway membership

D. Domain-specific pipeline for creating NL instances and E Accessing LLMs

F. Experimental Details

G. Evaluation Metrics

H. Prompting LLMs - Zero-shot prompts

I. Prompting LLMs - Few-shot prompts

J. Results: Misaligned Instruction-Response

K. Results: Ambiguous Impact of Distractors on Reasoning

L. Results: Models Prioritize Contextual Knowledge Over Background Knowledge

M. Supplementary Figures and N. Supplementary Tables

L. Results: Models Prioritize Contextual Knowledge Over Background Knowledge

The lack of statistically significant differences in accuracy (Fig. 7) between the biologically factual and artificial datasets, across both generalized modus ponens and generalized modus tollens schemes, suggests that the models’ reasoning relies more on the stated contextual knowledge and logical structure than on pre-existing background knowledge. This holds for both accuracy and reasoning accuracy, and in both zero-shot (ZS) and few-shot (FS) settings: models that perform well on a given scheme maintain their performance when factual gene names are replaced by synthetic ones, and the same consistency is observed for models with weaker performance. The ability to maintain accuracy with synthetic gene names in the artificial set indicates that models can abstract and apply logical reasoning independently of their internal domain-specific knowledge.
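The claim above rests on testing whether per-model accuracies differ between the factual and synthetic-name datasets. One way to run such a paired comparison is an exact sign-flip permutation test on the mean accuracy difference; the sketch below is illustrative (the function name and any example accuracy values are assumptions, not taken from the paper, which does not specify its test in this section).

```python
import itertools
import statistics

def paired_permutation_test(acc_factual, acc_artificial):
    """Exact two-sided paired sign-flip permutation test.

    Each element of the inputs is one model's accuracy on the
    biologically factual / artificial (synthetic-name) dataset.
    Under H0 (no difference), each paired difference is equally
    likely to have either sign, so we enumerate all 2^n sign
    assignments and count how often the permuted |mean difference|
    is at least as large as the observed one.
    """
    diffs = [a - b for a, b in zip(acc_factual, acc_artificial)]
    observed = abs(statistics.mean(diffs))
    n = len(diffs)
    count = 0
    for signs in itertools.product([1, -1], repeat=n):
        permuted = abs(statistics.mean(s * d for s, d in zip(signs, diffs)))
        if permuted >= observed:
            count += 1
    return count / 2 ** n  # p-value: large => no significant difference
```

A large p-value from such a test, as in Fig. 7, is consistent with the models' accuracy being unaffected by the swap from factual to synthetic gene names. The exact enumeration is only practical for small model counts (2^n sign assignments); with many models one would sample random sign flips instead.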


Figure 7: Accuracy comparison between two datasets: Biologically Factual vs. Artificial with Synthetic Names for Task 1 (top) and Task 2 (bottom); ZS (left) and FS (right). Lines connect the accuracy for each model, with green indicating an increase and red indicating a decrease. Gray boxplots display the median, Q1, Q3, and the range (minimum to maximum) of the data.


Figure 8: Percentage distribution of model response types under zero-shot settings for prompts with no distractors for the set of biologically factual argumentative texts.


Authors:

(1) Magdalena Wysocka, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom;

(2) Danilo S. Carvalho, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom and Department of Computer Science, Univ. of Manchester, United Kingdom;

(3) Oskar Wysocki, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom;

(4) Marco Valentino, Idiap Research Institute, Switzerland;

(5) André Freitas, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom, Department of Computer Science, Univ. of Manchester, United Kingdom and Idiap Research Institute, Switzerland.


This paper is available on arxiv under CC BY-NC-SA 4.0 license.