Few-shot Prompting for Logical Reasoning Tasks in Biological Pathways

Table of Links

  1. Abstract and Introduction
  2. SylloBio-NLI
  3. Empirical Evaluation
  4. Related Work
  5. Conclusions
  6. Limitations and References


A. Formalization of the SylloBio-NLI Resource Generation Process

B. Formalization of Tasks 1 and 2

C. Dictionary of gene and pathway membership

D. Domain-specific pipeline for creating NL instances and E. Accessing LLMs

F. Experimental Details

G. Evaluation Metrics

H. Prompting LLMs - Zero-shot prompts

I. Prompting LLMs - Few-shot prompts

J. Results: Misaligned Instruction-Response

K. Results: Ambiguous Impact of Distractors on Reasoning

L. Results: Models Prioritize Contextual Knowledge Over Background Knowledge

M. Supplementary Figures and N. Supplementary Tables

I Prompting LLMs - Few-shot prompts

I.1 TASK 1

Context: Suppose you are a specialist with existing knowledge about a signaling and metabolic molecules and their relations organized into biological pathways and processes.


Instructions: Given premises marked with the letter P and the following number and the conclusion marked with the letter C, determine whether the conclusion logically follows from these premises.


Relevance: If the conclusion logically follows from the premises, you need to return ’True’. If the conclusion does not follow logically from the premises, you need to return ’False’.


Constraint: The output should be a single word <True> or <False>.


Demonstration:


"P1: " "Every member of Diseases of hemostasis pathway is a member of Disease pathway"


"P2: " "Gene GP1BB is a member of Diseases of hemostasis pathway"


"C:" "Gene GP1BB is a member of Disease pathway"


The correct answer is: True


"P1: " "Every member of Infectious disease pathway is a member of Disease pathway"


"P2: " "Gene PKQQ is a member of Infectious disease pathway"


"C:" "Gene PKQQ is a member of Infectious disease pathway"


The correct answer is: True


"P1: " "Every member of SLC transporter disorders pathway is a member of Disorders of transmembrane transporters pathway"


"P2: " "Gene AXZY is a member of SLC transporter disorders pathway"


"C:" "Gene AXZY is not a member of Disorders of transmembrane transporters pathway"


The correct answer is: False


"P1: " "Every member of HIV Life Cycle pathway is a member of HIV Infection pathway"


"P2: " "Gene MLLX is a member of HIV Life Cycle pathway"


"C:" "Gene MLLW is a member of HIV Infection pathway"


The correct answer is: False


"P1: " "Every member of ABC transporter disorders pathway is a member of Disorders of transmembrane transporters pathway"


"P2: " "Gene PSMC5 is a member of ABC transporter disorders pathway"


"C:" "It is true that Gene PSMC5 is a member of Disorders of transmembrane transporters pathway"
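The Task 1 prompt above follows a fixed layout: a header (Context, Instructions, Relevance, Constraint), a series of answered demonstrations, and a final unanswered instance. A minimal sketch of how such a prompt could be assembled is given below. The names `format_instance` and `build_prompt` are hypothetical helpers introduced for illustration, not the authors' code, and the header is passed in as a plain string so that the original wording can be reused verbatim.

```python
from typing import List, Optional, Sequence, Tuple

# (premises, conclusion, answer); answer is None for the instance to be solved.
Instance = Tuple[List[str], str, Optional[str]]


def format_instance(premises: Sequence[str], conclusion: str,
                    answer: Optional[str] = None) -> str:
    """Render one instance in the quoted "P1: " / "C:" layout used above."""
    lines = [f'"P{i}: " "{p}"' for i, p in enumerate(premises, start=1)]
    lines.append(f'"C:" "{conclusion}"')
    if answer is not None:
        lines.append(f"The correct answer is: {answer}")
    return "\n".join(lines)


def build_prompt(header: str, demos: Sequence[Instance],
                 query: Tuple[List[str], str]) -> str:
    """Join the header, the answered demonstrations, and the open query."""
    parts = [header, "Demonstration:"]
    parts += [format_instance(p, c, a) for p, c, a in demos]
    parts.append(format_instance(query[0], query[1]))  # no answer line
    return "\n\n".join(parts)


if __name__ == "__main__":
    header = "Constraint: The output should be a single word <True> or <False>."
    demos = [(
        ["Every member of Diseases of hemostasis pathway is a member of Disease pathway",
         "Gene GP1BB is a member of Diseases of hemostasis pathway"],
        "Gene GP1BB is a member of Disease pathway",
        "True",
    )]
    query = (
        ["Every member of ABC transporter disorders pathway is a member of "
         "Disorders of transmembrane transporters pathway",
         "Gene PSMC5 is a member of ABC transporter disorders pathway"],
        "It is true that Gene PSMC5 is a member of Disorders of "
        "transmembrane transporters pathway",
    )
    print(build_prompt(header, demos, query))
```

Leaving the answer line off the last instance is what turns it into the question the model is expected to complete.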

I.2 TASK 2

Context: Suppose you are a specialist with existing knowledge about a signaling and metabolic molecules and their relations organized into biological pathways and processes.


Instructions: Given premises marked with the letter P and the following number and the conclusion marked with the letter C, determine whether the conclusion logically follows from these premises.


Relevance: If the conclusion logically follows from the premises, you need to return ’True’. If the conclusion does not follow logically from the premises, you need to return ’False’. Specify the premises you used to determine whether the conclusion logically follows from the premises, and only these premises.


Constraint: The output should be a single word <True> or <False> and the numbers of the selected premises after the decimal point, like <True, P1, P2>.


"P1: " "Every member of Diseases of hemostasis pathway is a member of Disease pathway"


"P2: " "Every member of NS1 Mediated Effects on Host Pathways pathway is a member of Influenza Infection pathway"


"P3: " "Gene AABC is a member of Diseases of hemostasis pathway"


"C:" "Gene AABC is a member of Disease pathway"


The correct answer is: True, P1, P3


"P1: " "Every member of SARS-CoV Infections pathway is a member of Viral Infection Pathways pathway"


"P2: " "Every member of Infectious disease pathway is a member of Disease pathway"


"P3: " "Gene PKQQ is a member of Infectious disease pathway"


"C:" "Gene PKQQ is a member of Infectious disease pathway"


The correct answer is: True, P2, P3


"P1: " "Every member of SLC transporter disorders pathway is a member of Disorders of transmembrane transporters pathway"


"P2: " "Gene AXZY is a member of SLC transporter disorders pathway"


"C:" "Gene AXZY is not a member of Disorders of transmembrane transporters pathway"


The correct answer is: False


"P1: " "Every member of HIV Life Cycle pathway is a member of HIV Infection pathway"


"P2: " "Gene MLLX is a member of HIV Life Cycle pathway"


"C:" "Gene MLLW is a member of HIV Infection pathway"


The correct answer is: False


"P1: " "Every member of ABC transporter disorders pathway is a member of Disorders of transmembrane transporters pathway"


"P2: " "Gene PSMC5 is a member of ABC transporter disorders pathway"


"C:" "It is true that Gene PSMC5 is a member of Disorders of transmembrane transporters pathway"
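Task 2 asks for a truth label followed by the premises used, for example `<True, P1, P3>`, so a response has to be parsed before it can be compared with the expected answer. The sketch below shows one way to do that. The function names and the strict exact-match rule are assumptions made for illustration; they are not the evaluation metrics defined in Appendix G. Following the demonstrations above, a gold answer of False is taken to carry no premises.

```python
import re
from typing import FrozenSet, Iterable, Optional, Tuple

Parsed = Tuple[bool, FrozenSet[int]]


def parse_task2_response(text: str) -> Optional[Parsed]:
    """Parse 'True, P1, P3' or '<False>' into (label, premise numbers).

    Returns None when the response does not start with True or False.
    """
    cleaned = text.strip().strip("<>").strip()
    match = re.match(r"(true|false)\b(.*)", cleaned,
                     flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    label = match.group(1).lower() == "true"
    # Collect premise identifiers such as P1, P3 from the rest of the answer.
    premises = frozenset(int(n) for n in re.findall(r"\bP(\d+)\b", match.group(2)))
    return label, premises


def is_exact_match(response: str, gold_label: bool,
                   gold_premises: Iterable[int]) -> bool:
    """Assumed scoring rule: label and premise set must both match exactly."""
    return parse_task2_response(response) == (gold_label, frozenset(gold_premises))


if __name__ == "__main__":
    print(parse_task2_response("<True, P1, P3>"))
    print(is_exact_match("True, P3, P1", True, {1, 3}))
    print(is_exact_match("True, P1, P2", True, {1, 3}))
```

Comparing premise sets rather than lists makes the check independent of the order in which the model names the premises. A response that buries the label inside a longer sentence is treated as unparseable here, which is a deliberately strict choice.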


Authors:

(1) Magdalena Wysocka, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom;

(2) Danilo S. Carvalho, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom and Department of Computer Science, Univ. of Manchester, United Kingdom;

(3) Oskar Wysocki, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom and Idiap Research Institute, Switzerland;

(4) Marco Valentino, Idiap Research Institute, Switzerland;

(5) André Freitas, National Biomarker Centre, CRUK-MI, Univ. of Manchester, United Kingdom, Department of Computer Science, Univ. of Manchester, United Kingdom and Idiap Research Institute, Switzerland.


This paper is available on arXiv under the CC BY-NC-SA 4.0 license.