In collaboration with the Centre for Computational and Animal Learning Research, we have developed computational models of associative learning and reinforcement learning, with an emphasis on their implementation in platform-independent software simulators as essential tools in the cycle of theory formation and refinement. This includes simulators of existing models, such as the seminal Rescorla-Wagner model, as well as extensions and new models that introduce novel representational formalisms and learning rules, for example our Serial and Simultaneous Configural-Cue Compound Stimuli Representation for Temporal Difference Learning Model (SSCC TD) and the Double Error Dynamic Asymptote Model (DDA). We aim to maximise and exploit the full predictive power of the error-correction learning paradigm and to accommodate a wide range of behavioural and neural experimental findings.
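As an illustration of the error-correction idea behind the Rescorla-Wagner model, the following is a minimal sketch and not the code of our simulators. It applies the standard update, in which each cue present on a trial changes its associative strength in proportion to the shared prediction error. The function name, the parameter values and the trial sequence are assumptions chosen for the example.

```python
def rescorla_wagner(trials, alpha, beta=1.0, lam=1.0):
    """Return the associative strengths after each trial.

    trials: list of (cues_present, reinforced) pairs.
    alpha:  dict mapping each cue name to its salience (illustrative values).
    beta:   learning-rate parameter of the outcome.
    lam:    asymptote supported by the outcome when it is present.
    """
    strengths = {cue: 0.0 for cue in alpha}
    history = []
    for cues, reinforced in trials:
        target = lam if reinforced else 0.0
        # Prediction error is shared by all cues present on the trial.
        error = target - sum(strengths[c] for c in cues)
        for c in cues:
            strengths[c] += alpha[c] * beta * error
        history.append(dict(strengths))
    return history


# Hypothetical blocking design: cue A is trained alone, then A and B together.
trials = [(("A",), True)] * 20 + [(("A", "B"), True)] * 20
history = rescorla_wagner(trials, alpha={"A": 0.3, "B": 0.3})
final = history[-1]
print(final)
```

In this design cue A absorbs almost all of the available strength during the first phase, so cue B gains very little afterwards, which is the blocking effect the model is known for.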
The simulators are accessible via this page:
The Single Shot Model (SSM) segments lesions in chest CT scans and predicts the condition of the patient at the scan slice level (COVID-19, Common Pneumonia, Normal). SSM extends the state-of-the-art instance segmentation Mask R-CNN model to make global (image class) predictions. The Region of Interest (RoI) module consists of two parallel branches (instance segmentation and classification) that adapt to each problem. The segmentation branch outputs predictions at the instance level (separate lesions), while the classification branch outputs the distribution of ranked regional predictions, from which the model learns the class of the image. The final model, with just 8.27M parameters, has a COVID-19 sensitivity of 93.16%, an F1 score of 96.76% and an average segmentation precision of 42.45% (main MS COCO 2017 criterion). The classification part of the model was trained on a fraction of the data (3K images) and evaluated on the full test split (21K images). Weights are provided for evaluation and further modelling.
The Single Shot Model code is accessible via this page:
Single Shot Model
Simple-playgrounds is a library for quickly designing 2D environments where agents can move around and interact with objects.
The game engine includes simple physics, such as collision and friction.
Agents can act through continuous movements and discrete interactive actions. They perceive with realistic first-person view sensors, top-down view sensors, and semantic sensors.
Simple-playgrounds is easy to handle, allows very fast design of AI experiments and runs very quickly.
Simple-playgrounds is accessible via this page:
An out-of-distribution detection method for images that combines density and restoration-based approaches using Vector-Quantized Variational Auto-Encoders (VQ-VAEs). The VQ-VAE model learns to encode images in a categorical latent space. The prior distribution of latent codes is then modelled using an Auto-Regressive model. This approach enables the estimation of both sample and pixel-wise anomaly scores. This method was tested on medical imaging datasets, including brain MRI and abdominal CT scans.
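The step that gives a VQ-VAE its categorical latent space is vector quantisation: each encoder output vector is replaced by the index of its nearest codebook entry. The sketch below shows only that lookup, under the assumption of a squared Euclidean distance. The codebook, the encoder outputs and the function name are made up for the example and are not taken from the released code.

```python
def quantize(vectors, codebook):
    """Map each vector to the index of its nearest codebook entry.

    Distance is squared Euclidean; ties go to the lowest index.
    """
    indices = []
    for v in vectors:
        dists = [sum((a - b) ** 2 for a, b in zip(v, entry)) for entry in codebook]
        indices.append(dists.index(min(dists)))
    return indices


# Hypothetical 2-D codebook with three entries and three encoder outputs.
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
encoder_outputs = [[0.1, -0.2], [0.9, 1.2], [-0.8, 0.7]]
codes = quantize(encoder_outputs, codebook)
print(codes)
```

The resulting grid of integer codes is what an auto-regressive prior would then model.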
The VQ-VAE model code is accessible via this page:
This is an application of Multi-Agent Reinforcement Learning (MARL) to approximate Load-Frequency Control (LFC), a classic power generation control problem, in a fully decentralized way. More precisely, we use Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to design and train a controller that keeps generation and demand balanced in a power electronics network in a cost-efficient way. The novelty of our approach is that the agents learn to operate in a close-to-optimal way without exchanging any type of information between them, while the state-of-the-art algorithms typically require central authorities to regulate the system.
Code is accessible via this page:
DMARL for Load Frequency Control
HetSAGE is a graph neural network architecture for efficiently dealing with heterogeneous graph structures. General knowledge graphs can contain different types of nodes (e.g. movie, director, actor), which leads to having different feature spaces for each node type. This architecture encodes each unique feature space to a shared latent space to efficiently reason about each node. The performance and scalability of HetSAGE are reported in our AAAI paper, which also makes the connection to neuro-symbolic learning that is often used to deal with similar data structures.
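The idea of encoding each node type's feature space into a shared latent space can be sketched with one linear map per node type. This is an illustration only, not the HetSAGE implementation: the weights are random rather than learned, and the feature dimensions, the latent size and all names are assumptions.

```python
import random


def make_encoder(in_dim, out_dim, rng):
    """Create a random (untrained) weight matrix of shape out_dim x in_dim."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]


def encode(features, weight):
    """Apply a linear map: one output value per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, features)) for row in weight]


rng = random.Random(0)
LATENT_DIM = 4  # assumed size of the shared latent space

# Hypothetical node types, each with its own feature dimensionality.
feature_dims = {"movie": 3, "director": 2, "actor": 5}
encoders = {t: make_encoder(d, LATENT_DIM, rng) for t, d in feature_dims.items()}

nodes = [
    ("movie", [1.0, 0.0, 2.0]),
    ("director", [0.5, 1.5]),
    ("actor", [1.0, 1.0, 1.0, 1.0, 1.0]),
]
latents = [encode(x, encoders[node_type]) for node_type, x in nodes]
print([len(z) for z in latents])
```

Once every node lives in the same latent space, a single message-passing scheme can aggregate over neighbours regardless of their type.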
The source code for this architecture is accessible at:
An unsupervised out-of-distribution detection method for medical images based on implicit fields image representations. The distribution of images is learnt using an auto-decoder model that, for normal images, learns to map spatial coordinates to probabilities over a proxy of tissue types. At inference time, we obtain from the model a normal image maximally consistent with a given test image and we use the voxel-wise probabilities to define the anomaly score as the negative log-likelihood for the test image. The proposed approach significantly outperformed other VAE-based anomaly detection methods (average DICE 0.640 vs 0.518 for the best performing VAE-based alternative) while also requiring considerably less computing time.
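The anomaly score described above can be sketched as a per-voxel negative log-likelihood. The example assumes that the model output is already available as a probability distribution over tissue classes for each voxel and that the test image has been mapped to one class label per voxel. The numbers, the small clipping constant and the function name are assumptions for the example.

```python
import math


def voxel_anomaly_scores(probs, labels, eps=1e-12):
    """Return -log p(observed class) for every voxel.

    probs:  list of per-voxel probability distributions over tissue classes.
    labels: list of the class index observed at each voxel of the test image.
    eps:    lower bound on probabilities to avoid log(0).
    """
    return [-math.log(max(p[k], eps)) for p, k in zip(probs, labels)]


# Hypothetical predictions for three voxels and two tissue classes.
probs = [[0.9, 0.1], [0.8, 0.2], [0.05, 0.95]]
labels = [0, 0, 0]
scores = voxel_anomaly_scores(probs, labels)
print(scores)
```

The third voxel is observed as a class to which the model assigns low probability, so it receives the highest score.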
The Implicit Fields anomaly detection implementation and instructions to run the model are accessible via this page:
To fully leverage renewable energy resources, there is a need for a new market design with improved coordination between transmission system operators (TSOs) and distribution system operators (DSOs). Our work proposes two coordination schemes between TSOs and DSOs, one centralised and one decentralised, that facilitate the integration of distributed generation, minimise operational cost, relieve congestion and promote a sustainable system. In the resulting decentralised scheme, the TSO and DSO collaborate to optimally allocate all resources in the system. We study the interaction of TSOs and DSOs and the existence of any conflicting objectives with the centralised scheme. We approximate the Pareto front of the multi-objective optimal power flow problem in which the entire system, i.e., both the transmission and the distribution systems, is modelled. The proposed ideas are illustrated through a five-bus transmission system connected to distribution systems represented by the IEEE 33-bus and 69-bus feeders.
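Approximating a Pareto front amounts to keeping only the candidate solutions that no other candidate improves on in every objective. The sketch below filters a list of candidates for a minimisation problem. It is not the optimal power flow code: the two objectives and their values are invented for the example, and the function name is an assumption.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimised."""

    def dominates(a, b):
        # a dominates b if it is no worse everywhere and strictly better somewhere.
        no_worse = all(x <= y for x, y in zip(a, b))
        better = any(x < y for x, y in zip(a, b))
        return no_worse and better

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective 1, objective 2) values for five candidate solutions.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 8.0), (11.0, 6.0)]
front = pareto_front(candidates)
print(front)
```

Candidates that survive the filter represent the trade-offs between conflicting objectives, and the remaining ones can be discarded.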
The code is accessible via this page:
This is a novel probabilistic framework for predicting short-term PV output that takes into account the variability of weather data over different seasons. To this end, we go beyond existing prediction methods by building a pipeline of processes, namely feature selection, clustering and Gaussian Process Regression (GPR). The average error follows a normal distribution and, at the 95% confidence level, takes values between 1.6% and 1.4%. The proposed framework decreases the normalised root mean square error and the mean absolute error by 54.6% and 55.5%, respectively, when compared with other relevant works.
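For readers unfamiliar with the two error measures, the sketch below computes the mean absolute error and a normalised root mean square error. Normalisation conventions differ between works. This example divides by the range of the observed values, which is an assumption and may not match the convention used in our evaluation. The data are invented.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def nrmse(actual, predicted):
    """Root mean square error divided by the range of the observed values."""
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse) / (max(actual) - min(actual))


# Hypothetical observed and predicted PV output values.
actual = [0.0, 2.0, 4.0, 3.0]
predicted = [0.5, 2.0, 3.5, 3.0]
print(mae(actual, predicted), nrmse(actual, predicted))
```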
The code for this proposal is accessible via this page:
HexaJungle is a multi-agent reinforcement learning simulation environment that captures and encourages complex agent interactions in a non-symmetrical grid world.
It is designed to allow learning agents to share information, agree on strategies, or even lie to each other.
The goal is for both agents to exit the ‘jungle’. They may face obstructions such as boulders and rivers. To cross a river, agents need to build a bridge (cooperation), and to get past a boulder, they need to relay its location (information sharing). What makes the environment interesting is that the two agents receive different rewards, which can result in adversarial behaviour.
The environment is fully compatible with RLlib.
The code can be found at this page: