Please use this identifier to cite or link to this item:
http://hdl.handle.net/1893/35988
Appears in Collections: | Computing Science and Mathematics eTheses |
Title: | A Neuro-Symbolic Incremental Learner Model for the Visual Question Answering Task |
Author(s): | Johnston, Penny |
Supervisor(s): | Swingler, Kevin |
Keywords: | Neuro-Symbolic; Incremental Learning; Deep Learning; Autoencoders; Gaussian Mixture Models; Digital Assistant |
Issue Date: | 30-Sep-2023 |
Publisher: | University of Stirling |
Citation: | (1) P. Johnston, K. Nogueira and K. Swingler, "GMM-IL: Image Classification Using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes," in IEEE Access, vol. 11, pp. 25492-25501, 2023, https://doi.org/10.1109/ACCESS.2023.3255795 (2) P. Johnston, K. Nogueira and K. Swingler, "NS-IL: Neuro-Symbolic Visual Question Answering Using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes," in IEEE Access, vol. 11, pp. 141406-141420, 2023, https://doi.org/10.1109/ACCESS.2023.3341007 |
Abstract: | This research is motivated by the challenge of providing accurate and contextually relevant answers to natural language questions about visual scenes, particularly in support of individuals with visual impairments. Neural-Symbolic computing aims to combine the robust learning capabilities of neural networks with the reasoning and interpretability of symbolic representation. This thesis introduces a Neuro-Symbolic Incremental Learner designed specifically for the Visual Question Answering task. The system incrementally learns visual classes and symbolic facts to answer natural language questions about visual scenes. Using Deep Learning, a feature space is created from which visual classes are learnt as independent probability distributions. This allows new classes to be added easily, even with limited data, and mitigates the catastrophic forgetting typical of traditional neural networks. Classification by category means that visual classes are not limited to objects but can also include other categories, such as attributes. A knowledge graph stores facts about regions of interest, detailing objects, attributes, actions, locations, and inter-relations, so that facts are stored explicitly and can be added incrementally. Leveraging a large language model, the system translates natural language questions into knowledge graph queries, ensuring a fluid visual question-answering experience. |
Type: | Thesis or Dissertation |
URI: | http://hdl.handle.net/1893/35988 |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Thesis_2631677.pdf | Thesis pdf (Sept 2023) | 11.13 MB | Adobe PDF |
This item is protected by original copyright.
Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/