A Neuro-Symbolic Incremental Learner Model for the Visual Question Answering Task

Johnston, Penny

Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/35988

Appears in Collections:	Computing Science and Mathematics eTheses
Title:	A Neuro-Symbolic Incremental Learner Model for the Visual Question Answering Task
Author(s):	Johnston, Penny
Supervisor(s):	Swingler, Kevin
Keywords:	Neuro-Symbolic; Incremental Learning; Deep Learning; Autoencoders; Gaussian Mixture Models; Digital Assistant
Issue Date:	30-Sep-2023
Publisher:	University of Stirling
Citation:	P. Johnston, K. Nogueira and K. Swingler, "GMM-IL: Image Classification Using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes," in IEEE Access, vol. 11, pp. 25492-25501, 2023, https://doi.org/10.1109/ACCESS.2023.3255795
	P. Johnston, K. Nogueira and K. Swingler, "NS-IL: Neuro-Symbolic Visual Question Answering Using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes," in IEEE Access, vol. 11, pp. 141406-141420, 2023, https://doi.org/10.1109/ACCESS.2023.3341007
Abstract:	This research is motivated by the challenge of providing accurate and contextually relevant answers to natural language questions about visual scenes, particularly in support of individuals with visual impairments. Neural-Symbolic computing aims to integrate the robust learning capabilities of neural networks with the reasoning and interpretability of symbolic representation, unlocking the potential of both. This thesis introduces a Neuro-Symbolic Incremental Learner designed specifically for the Visual Question Answering Task. The system incrementally learns visual classes and symbolic facts to answer natural language questions about visual scenes. Using Deep Learning, a feature space is created from which visual classes are learnt as independent probability distributions. This allows new classes to be added easily, even with limited data, mitigating the catastrophic forgetting typical of traditional neural networks. Classification by category means that visual classes are not limited to objects and can also include other categories, such as attributes. A knowledge graph explicitly stores facts about regions of interest, detailing objects, attributes, actions, locations, and inter-relations, so that knowledge can be added incrementally. Leveraging a large language model, the system translates natural language questions into knowledge graph queries, ensuring a fluid visual question-answering experience.
Type:	Thesis or Dissertation
URI:	http://hdl.handle.net/1893/35988
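
Illustrative note: the abstract describes learning each visual class as an independent probability distribution over a learnt feature space, so that a new class can be added without retraining the existing ones. The sketch below is not taken from the thesis. It assumes feature vectors have already been extracted, and for brevity it swaps the Gaussian Mixture Models named in the keywords for a single diagonal Gaussian per class. The names DiagonalGaussian, IncrementalClassifier, add_class and predict are hypothetical.

```python
import math
from typing import Dict, List, Sequence


class DiagonalGaussian:
    """Density for one class (assumed stand-in for a per-class mixture model)."""

    def __init__(self, samples: Sequence[Sequence[float]], var_floor: float = 1e-3):
        n = len(samples)
        dim = len(samples[0])
        self.mean: List[float] = [sum(s[d] for s in samples) / n for d in range(dim)]
        # Floor the variance so that very small sample sets stay well-conditioned.
        self.var: List[float] = [
            max(sum((s[d] - self.mean[d]) ** 2 for s in samples) / n, var_floor)
            for d in range(dim)
        ]

    def log_likelihood(self, x: Sequence[float]) -> float:
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, self.mean, self.var)
        )


class IncrementalClassifier:
    """Keeps one independent model per class; adding a class touches no other model."""

    def __init__(self) -> None:
        self.models: Dict[str, DiagonalGaussian] = {}

    def add_class(self, label: str, samples: Sequence[Sequence[float]]) -> None:
        self.models[label] = DiagonalGaussian(samples)

    def predict(self, x: Sequence[float]) -> str:
        # Pick the class whose model assigns the highest log-likelihood to x.
        return max(self.models, key=lambda label: self.models[label].log_likelihood(x))


# Toy 2-D "features" (made up for illustration, not thesis data).
clf = IncrementalClassifier()
clf.add_class("cat", [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]])
clf.add_class("dog", [[3.0, 3.1], [2.9, 3.0], [3.2, 2.8]])
cat_mean_before = list(clf.models["cat"].mean)
before = clf.predict([0.1, 0.0])

# Add a third class later; the existing models are left untouched.
clf.add_class("bird", [[-3.0, 3.0], [-3.1, 2.9], [-2.8, 3.2]])
after = clf.predict([0.1, 0.0])
new_class = clf.predict([-3.0, 3.0])
```

Because each class owns its own parameters, fitting "bird" cannot alter the "cat" or "dog" models, which is the property the abstract credits with mitigating catastrophic forgetting.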

Files in This Item:

File	Description	Size	Format
Thesis_2631677.pdf	Thesis pdf (Sept 2023)	11.13 MB	Adobe PDF

This item is protected by original copyright

Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.

The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/

If you believe that any material held in STORRE infringes copyright, please contact library@stir.ac.uk providing details and we will remove the Work from public display in STORRE and investigate your claim.

STORRE: Stirling Online Research Repository