Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/36966
Full metadata record
DC Field  Value  Language
dc.contributor.author  Khan, Sulaiman  en_UK
dc.contributor.author  Biswas, Md Rafiul  en_UK
dc.contributor.author  Murad, Alina  en_UK
dc.contributor.author  Ali, Hazrat  en_UK
dc.contributor.author  Shah, Zubair  en_UK
dc.date.accessioned  2025-04-03T00:03:01Z  -
dc.date.available  2025-04-03T00:03:01Z  -
dc.date.issued  2024-08-07  en_UK
dc.identifier.uri  http://hdl.handle.net/1893/36966  -
dc.description.abstract  Recent developments in multimodal large language models (MLLMs) have spurred significant interest in their potential applications across various medical imaging domains. On the one hand, there is a temptation to use these generative models to synthesize realistic-looking medical image data, while on the other hand, the ability to identify synthetic image data in a pool of data is also important. In this study, we explore the potential of the Gemini (gemini-1.0-pro-vision-latest) and GPT-4V (gpt-4-vision-preview) models for medical image analysis using two modalities of medical image data. Utilizing synthetic and real imaging data, both Gemini AI and GPT-4V are first used to classify real versus synthetic images, followed by an interpretation and analysis of the input images. Experimental results demonstrate that both Gemini and GPT-4V could perform some interpretation of the input images. In this specific experiment, Gemini was able to perform slightly better than GPT-4V on the classification task. In contrast, responses associated with GPT-4V were mostly generic in nature. Our early investigation presented in this work provides insights into the potential of MLLMs to assist with the classification and interpretation of retinal fundoscopy and lung X-ray images. We also identify key limitations associated with the early investigation study on MLLMs for specialized tasks in medical image analysis.  en_UK
dc.language.iso  en  en_UK
dc.publisher  IEEE  en_UK
dc.relation  Khan S, Biswas MR, Murad A, Ali H & Shah Z (2024) An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging. 2024 IEEE International Conference on Information Reuse and Integration for Data Science (IRI), San Jose, CA, USA, 07.08.2024-09.08.2024. https://doi.org/10.1109/iri62200.2024.00056  en_UK
dc.rights.uri  http://www.rioxx.net/licenses/under-embargo-all-rights-reserved  en_UK
dc.subject  LLM  en_UK
dc.subject  ChatGPT  en_UK
dc.subject  Gemini AI  en_UK
dc.subject  Multimodal data  en_UK
dc.subject  Retina  en_UK
dc.subject  Lung  en_UK
dc.title  An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging  en_UK
dc.type  Conference Paper  en_UK
dc.rights.embargodate  2999-12-31  en_UK
dc.identifier.doi  10.1109/iri62200.2024.00056  en_UK
dc.citation.issn  2835-5776  en_UK
dc.citation.publicationstatus  Published  en_UK
dc.citation.peerreviewed  Refereed  en_UK
dc.type.status  VoR - Version of Record  en_UK
dc.author.email  ali.hazrat@stir.ac.uk  en_UK
dc.citation.conferencedates  2024-08-07 - 2024-08-09  en_UK
dc.citation.conferencelocation  San Jose, CA, USA  en_UK
dc.citation.conferencename  2024 IEEE International Conference on Information Reuse and Integration for Data Science (IRI)  en_UK
dc.citation.date  08/10/2024  en_UK
dc.citation.isbn  979-8-3503-5118-7  en_UK
dc.contributor.affiliation  Hamad Bin Khalifa University  en_UK
dc.contributor.affiliation  Hamad Bin Khalifa University  en_UK
dc.contributor.affiliation  Foundation University, Islamabad  en_UK
dc.contributor.affiliation  Sohar University  en_UK
dc.contributor.affiliation  Hamad Bin Khalifa University  en_UK
dc.identifier.wtid  2069246  en_UK
dc.contributor.orcid  0000-0003-3058-5794  en_UK
dc.date.accepted  2024-06-17  en_UK
dcterms.dateAccepted  2024-06-17  en_UK
dc.date.filedepositdate  2024-11-12  en_UK
rioxxterms.apc  not required  en_UK
rioxxterms.type  Conference Paper/Proceeding/Abstract  en_UK
rioxxterms.version  VoR  en_UK
local.rioxx.author  Khan, Sulaiman|  en_UK
local.rioxx.author  Biswas, Md Rafiul|  en_UK
local.rioxx.author  Murad, Alina|  en_UK
local.rioxx.author  Ali, Hazrat|0000-0003-3058-5794  en_UK
local.rioxx.author  Shah, Zubair|  en_UK
local.rioxx.project  Internal Project|University of Stirling|https://isni.org/isni/0000000122484331  en_UK
local.rioxx.freetoreaddate  2024-11-12  en_UK
local.rioxx.licence  http://www.rioxx.net/licenses/under-embargo-all-rights-reserved||  en_UK
local.rioxx.filename  An_Early_Investigation_into_the_Utility_of_Multimodal_Large_Language_Models_in_Medical_Imaging.pdf  en_UK
local.rioxx.filecount  1  en_UK
local.rioxx.source  979-8-3503-5118-7  en_UK
Appears in Collections: Computing Science and Mathematics Conference Papers and Proceedings

Files in This Item:
File  Description  Size  Format
An_Early_Investigation_into_the_Utility_of_Multimodal_Large_Language_Models_in_Medical_Imaging.pdf  Fulltext - Published Version  1.15 MB  Adobe PDF  Under Permanent Embargo


This item is protected by original copyright



Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/
