Authors:
(1) Senthujan Senkaiahliyan M. Mgt, is with the Institute for Health Policy Management and Evaluation, Faculty of Public Health, University of Toronto and Peter Munk Cardiac Centre, University Health Network, Toronto ON, Canada;
(2) Augustin Toma MD, is with the Department of Medical Biophysics, Faculty of Medicine, University of Toronto, Toronto, ON, Canada;
(3) Jun Ma PhD, is with Peter Munk Cardiac Centre, University Health Network; Department of Laboratory Medicine and Pathobiology, University of Toronto; Vector Institute, Toronto, ON, Canada;
(4) An-Wen Chan MD, is with the Institute for Health Policy Management and Evaluation, Faculty of Public Health and with the Division of Dermatology, Department of Medicine, University of Toronto, Toronto, ON, Canada;
(5) Andrew Ha MD, is with Peter Munk Cardiac Centre, University Health Network and the Division of Cardiology, Department of Medicine, University of Toronto, Toronto, ON, Canada;
(6) Kevin R. An MD, is with the Division of Cardiac Surgery, Department of Surgery, University of Toronto, Toronto, ON, Canada;
(7) Hrishikesh Suresh MD, is with the Division of Neurosurgery, Department of Surgery, University of Toronto, Toronto, ON, Canada;
(8) Barry Rubin MD, is with Peter Munk Cardiac Centre, University Health Network and the Division of Vascular Surgery, Department of Surgery, University of Toronto, Toronto, ON, Canada;
(9) Bo Wang PhD (Corresponding Author) is with Peter Munk Cardiac Centre, University Health Network; Department of Laboratory Medicine and Pathobiology and Department of Computer Science, University of Toronto; Vector Institute, Toronto, Canada. E-mail: [email protected].
2. DATA COLLECTION
2.1 General Conditions
In the data collection phase, a diverse set of multimodal medical images was gathered to assess the performance of GPT-4V across a range of medical scenarios and specialties. Table 1 breaks the images down by modality and count. All images were sourced from open-source libraries and repositories available online.
2.2 Cardiology
The cardiology dataset consisted of ECG waveforms sourced from ECG Wave-Maven: A Self-Assessment Program for Students and Clinicians[1]. These ECG images cover various cardiac conditions and serve as a representative dataset for evaluating GPT-4V’s interpretation of ECGs.
2.3 Dermatology
In dermatology, clinical photos were collected from the Hellenic Dermatological Atlas[2] to curate a comprehensive set of images of dermatological conditions for assessing GPT-4V’s interpretive performance.
This paper is available on arXiv under the CC BY-NC-ND 4.0 DEED license.