Ongoing projects
Closing the sim-to-real gap: A hybrid framework for HVAC simulation and fault detection.
Abstract
Buildings account for 40% of global energy consumption, with 36% attributed to heating, ventilation, and air-conditioning (HVAC). Therefore, optimizing HVAC control in buildings is crucial in the transition to a more sustainable society. Model predictive control (MPC) and (deep) reinforcement learning (DRL) have been explored for optimal control strategies, producing promising results. However, their performance depends on the accuracy of the underlying simulation model, which is why an accurate model throughout the building's lifecycle is important. Physics-based models introduce discrepancies due to necessary simplifications, called the sim-to-real gap. Closing this gap requires expert knowledge to increase the model's complexity, which is often not feasible. Given the emergence of smart, sensor-equipped buildings, data-driven solutions are possible, enabling a hybrid model that exploits the advantages of both data-driven and physics-based models. First, data-driven models, such as deep neural networks (DNNs), are added to the physics-based model at the component level to close the sim-to-real gap. Second, as components degrade, the sim-to-real gap will grow again and is closed using the same approach. Third, the hybrid model facilitates automatic fault detection and diagnosis (AFDD) using results from the adjustment process. Finally, an assessment of energy loss due to component degradation guides cost-optimal maintenance strategies.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Verhaert Ivan
- Fellow: Houben Pieter Jan
Research team(s)
Project type(s)
- Research Project
Strengthening the research capacities for extreme weather events in Romania (SCEWERO).
Abstract
The SCEWERO project will be developed by a consortium of 5 organizations from 4 countries: Babeș-Bolyai University (UBB), a research institution located in Romania, a widening country, and acting as coordinator; three top-class leading partners, Fondazione Centro Euro-Mediterraneo Sui Cambiamenti Climatici (IT), Universiteit Antwerpen (BE), and Justus-Liebig-Universität Giessen (DE); and a private partner (SME), Indeco Soft (RO). The consortium aims to improve the excellence capacity in research, to raise the scientific reputation, research profile and attractiveness through networking, and to strengthen the research management capacity and administrative skills of the UBB team. The SCEWERO project will be implemented in 9 work packages (WPs): one is dedicated to ethics issues, two to project management, and two to dissemination, exploitation, and communication. Two dedicated WPs focus on comprehensive training for UBB researchers, provided by the top-class partners, on the topics of (i) extreme temperature and precipitation events, compound events, and the use of artificial intelligence to better analyze and forecast them (WP4) and (ii) science communication on weather extremes and artificial intelligence (WP5). WP6 is dedicated to consolidating research management capacity and administrative skills and to providing instruction for a dedicated working group to be created at UBB. WP7 covers a small research project with the UBB team that holds the potential to make a significant impact. It will put into practice the knowledge transferred through the instruction provided in WPs 4 and 5 by the high-performing partners. Aligned with the Early Warnings for All Initiative, the EU mission on Adaptation to Climate Change, and the European Green Deal objectives, the research component aims to establish a new methodology and provide relevant results, obtained through a complex approach, that contribute to the enhancement of the early warning systems for heat events in Romania.
The methodology could then be replicated in other European countries, paving the way for a more resilient future.
Researcher(s)
- Promoter: Tabari Hossein
Research team(s)
Project type(s)
- Research Project
Improving Wind and Solar Energy Forecasting Through Physics-Informed Machine Learning.
Abstract
Renewable energy sources are emerging as a crucial alternative to traditional energy sources, driven by the pressing need to reduce greenhouse gas emissions and mitigate the effects of climate change. Accurate forecasting of renewable energy resources is essential for effective decision-making in the energy sector, particularly in deeply decarbonized energy systems. Machine learning (ML) can play a significant role in improving the accuracy of renewable energy forecasting when it is integrated with numerical weather prediction (NWP) models, an approach known as physics-informed ML. This approach can address the poor extrapolation and generalization capability of ML models by leveraging the foundation of physics-based models to generalize better to new situations. This project aims to develop a novel physics-informed ML model by integrating physical equations from NWP models with ML models to enhance the accuracy and reliability of renewable energy forecasting, focusing on wind and solar energy production forecasting. The successful implementation of this model has the potential to promote the sustainability of the energy system, lower balancing costs, and combat climate change.
Researcher(s)
- Promoter: Tabari Hossein
Research team(s)
Project type(s)
- Research Project
SLICES Flanders 2022 - Flemish participation in Scientific Large-Scale Infrastructure for Computing/Communication Experimental Studies.
Abstract
Our society is undoubtedly evolving rapidly towards a fully digital society. These changes and new technologies such as 5G, (I)IoT, cloud computing, edge computing, Big Data and many other new concepts are becoming embedded in our society and daily life. As a consequence, our communication networks and the internet become very complex and rely on a heterogeneity of technologies never seen or experienced before. Research on new concepts and new aspects of this Next Generation Internet, as well as the development of tools, techniques and applications, cannot be carried out without experimentation. Testing of these newly researched and developed technologies cannot be carried out on systems active in the real world but requires experimentation facilities which can mimic the real network in all its aspects. Flemish universities and research organizations have invested in and established a collection of world-class experimentation facilities for these purposes, covering a wide range of technologies, and this proposal aims at establishing a Flemish and Belgian node in a European Research Infrastructure which would integrate all of these testbeds into one single research infrastructure. Scientific communication
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Marquez-Barja Johann
Research team(s)
Project type(s)
- Research Project
Accident-prone Vision-based Simulation for Autonomous Safety-critical Systems
Abstract
Autonomous navigation has been gaining much traction recently. As a result, we see autonomy developing in vehicles and finding its way into many transportation sectors (including smart shipping). Nevertheless, the current state-of-the-art (SOTA) technology is not mature enough for widespread application at a higher autonomy level (e.g. level 4 and above). The main reason is that these systems are trained on a lot of real-world data, which often lacks accident-prone scenarios. In order to solve this problem, I propose a solution based on data-driven neural simulations that provide realistic data based on real-world samples and generate unsafe scenarios (collisions, accidents, etc.). Moreover, my system also provides safety checks to validate unsafe scenarios and provide safe boundaries for the current autonomous systems.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Anwar Ali
- Co-promoter: Mercelis Siegfried
- Co-promoter: Oramas Mogrovejo José Antonio
- Fellow: Duym Jens
Research team(s)
Project type(s)
- Research Project
Extensible Tools for Renewable ENergy Decision making (E-TREND).
Abstract
E-TREND is a research and development initiative focusing on creating decision-making tools that integrate expertise in meteorological forecasting and climate projections for renewable energy sources (RES) in Belgium. It aims to enhance the modeling of wind and photovoltaic energy production and electricity consumption through meteorological ensemble forecasting, climate services, and advanced modeling techniques. The project involves collaboration among Belgian federal scientific institutes and universities to develop and integrate RES generation models into a comprehensive forecasting chain. This effort addresses the integration of current best practices and explores advanced topics beyond conventional methods. The outcomes are designed to support energy sector stakeholders in their operational and planning decision-making processes, with a particular emphasis on incorporating input from Belgian stakeholders to guide research and development efforts. E-TREND's primary research priority aligns with developing forecasting tools for renewable energy production, linked to high-resolution atmospheric weather prediction and regional climate models, aiming to improve the predictability of essential variables for managing renewable energy power production. The project differentiates between "forecasting" for short-term meteorological predictions and "projections" for long-term climate outlooks, offering tools for both applications. Additionally, it contributes to understanding the impact of climate change on energy resources, assisting in the creation of future scenarios for sustainable energy production balance. E-TREND aligns with Belgian and European commitments to increase renewable energy usage, supporting the transition to a net-zero emissions economy by 2050 under the Horizon Europe Framework Program.
Researcher(s)
- Promoter: Hellinckx Peter
Research team(s)
Project type(s)
- Research Project
Knowledge Based Neural Network Compression: Context-Aware Model Abstractions
Abstract
In state-of-the-practice IoT platforms, complex decisions based on sensor information are made in a centralized data center. Each sensor sends its information over, after which a decision is sent to the actuators. In certain applications the latency imposed by this communication can lead to problems. For this reason, decisions should be made on the edge devices themselves. This is what the research track on resource- and context-aware AI is about. We want to develop edge inference systems that dynamically reconfigure to adapt to changing environments and resource constraints. This work is focused on compressing neural networks. We want to extend the current state-of-the-art in neural network compression by incorporating a knowledge-based pruning method. By knowledge-based we mean that we first determine the locations of specific task-related knowledge in the network and use this to guide the pruning. This way we can make the networks adjustable to environmental characteristics and hardware constraints. For some tasks in a specific environment, it might be favorable to reduce the accuracy of certain classes in favor of resource gain. For example, the classification of certain types of traffic signs can be less accurate on highways than in a city center. Based on these requirements we want to prune selectively by locating specific task-related concepts. By removing them we expect to achieve higher compression ratios compared to the state-of-the-art.
Researcher(s)
- Promoter: Hellinckx Peter
- Promoter: Mercelis Siegfried
- Co-promoter: Steckel Jan
- Fellow: Balemans Dieter
Research team(s)
Project type(s)
- Research Project
Distributed multi-modal data fusion using graph-based deep learning for situational awareness in intelligent transport systems.
Abstract
Reliability and accuracy are the two fundamental requirements for intelligent transport systems (ITS). The reliability of active perception for situational awareness algorithms has significantly improved in the past few years due to AI developments. Situational awareness can be improved through the exchange of information between multiple agents. Accomplishing high accuracy at low computational cost cooperatively is complex, yet critical to ensuring safe and reliable transport systems. This research will tackle the main challenges for shared situational awareness, which requires perception from multiple sensor streams and multiple agents. It will first tackle the local sensor fusion problem with graph-based deep learning. Local sensor fusion is fusion at the agent level, where multiple mounted sensors are used to solve a defined task. By exploiting the structural information in multiple modalities, the proposed solution will construct graph-based deep learning models. Distributed fusion will then be accomplished by fusing predictions from multiple agents, producing a richer situational awareness. The advantage of distributed fusion is evident in situations where a single agent's perception is not enough. This will be achieved by modeling spatio-temporal graph networks and studying dynamic updates in the graphs. The results will be validated using real-life benchmark datasets and a simulation engine.
Researcher(s)
- Promoter: Hellinckx Peter
- Promoter: Mercelis Siegfried
- Co-promoter: Anwar Ali
- Fellow: Ahmed Ahmed
Research team(s)
Project type(s)
- Research Project
Knowledge Based Neural Network Compression: Quality-Aware Model Abstractions.
Abstract
In state-of-the-practice IoT platforms, complex decisions based on sensor information are made in a centralized data center. Each sensor sends its information over, after which a decision is sent to the actuators. In certain applications the latency imposed by this communication can lead to problems. In real-time applications it is crucial for the decision to be taken immediately. For this reason, complex decisions should be made on the edge devices themselves. This is what the research track on resource- and context-aware AI is about. In this track we want to develop edge inference systems that dynamically reconfigure to adapt to changing environments and resource constraints. This work is focused on compressing AI processing blocks, specifically neural networks. We want to extend the current state-of-the-art methods for neural network compression by incorporating a knowledge-based pruning method. By knowledge-based we mean that we want to prune a neural network in a context-aware manner. A certain application context will impose requirements on the outputs of the network. For example, on a highway the detection of pedestrians is less important than the detection of cars. Based on these requirements we want to selectively prune a network by locating knowledge concepts related to the outputs. By selectively pruning them we expect to achieve higher compression ratios compared to the state-of-the-art for context-specific networks.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Mercelis Siegfried
- Co-promoter: Steckel Jan
Research team(s)
Project type(s)
- Research Project
Artificial Intelligence in Meteorological Applications (AIM).
Abstract
A main part of the mission of the RMI is to produce permanent services in order to ensure the security and the information of the population and to support the political authorities in their decision making. The development of numerical weather prediction (NWP) models has long been a crucial part of this service. Important developments of the last years are the ever increasing amount of meteorological observations used to improve NWP forecasts through data assimilation and statistical postprocessing, the use of probabilistic ensemble models that enable better decision support, the ever increasing resolution of the models, and the incorporation of urban effects through land surface schemes. The RMI also operationally runs a dedicated road weather model since winter 2018-2019 for Belgian highways, giving decision support to traffic agencies such as Agentschap Wegen en Verkeer (AWV) in Flanders. High-resolution NWP models and data assimilation techniques, ensemble models and the RMI road weather model must continue to take advantage of the newest scientific developments. Artificial intelligence is impacting numerous scientific fields, and meteorology is no exception. For example, techniques and software libraries from Deep Learning are being used in the field of data assimilation, and neural networks are starting to be applied to statistical postprocessing of ensemble forecasts. Another important evolution is the availability of crowdsourced meteorological data, such as from volunteer stations, and new types of sensors, such as vehicle sensors, which will be tested in the RMI road weather model in the context of the SARWS project. Assimilation of such data can only improve model forecasts if adequate quality control is applied. An innovative new approach is the use of distributed intelligence to perform part of the necessary computations at the level of the sensors, before centralizing the data.
It is obvious that the RMI would benefit greatly from a university partner with expertise in artificial intelligence and data science. IDLab University of Antwerp brings such expertise to the table. IDLab performs fundamental and applied research on internet technologies and data science. Within UA, the distributed intelligence group focuses on topics such as distributed and agent-based intelligence, scientific machine learning, resource-aware AI, and deep reinforcement learning.
Researcher(s)
- Promoter: Hellinckx Peter
- Fellow: Casteels Wim
- Fellow: Tabari Hossein
Research team(s)
Project type(s)
- Research Project
Learning to communicate efficiently with multi-agent reinforcement learning for distributed control applications.
Abstract
In recent years, there has been increased interest in the field of multi-agent reinforcement learning. For tasks where cooperation between agents is required, researchers are looking towards techniques that allow the agents to learn to communicate while simultaneously learning how to act in the environment. Current state-of-the-art techniques often use broadcast communication. However, this is not scalable to real-world applications. Therefore, I want to develop methods to make this communication more efficient. The goal of this research project is to reduce the number of messages that are sent, while still maintaining the same performance. To reach this goal, I will look at techniques to communicate with a variable number of agents, at techniques to limit communication using relevance metrics and signatures, and at techniques to encourage hopping behavior in agents. The methods proposed in this research project are essential to create scalable control applications by distributing them in combination with scalable learned communication. The developed methods will be validated on simulations of traffic light control.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Mercelis Siegfried
- Fellow: Vanneste Astrid
Research team(s)
Project type(s)
- Research Project
Support maintenance scientific equipment (IDLab).
Abstract
This project is devoted to the maintenance of the City of Things Hercules infrastructure. Within this project, we have developed the CityLab testbed, which is a wireless edge computing platform for smart cities. It provides experimental access to wireless networking infrastructure, edge computing infrastructure and smart city sensors.
Researcher(s)
- Promoter: Hellinckx Peter
- Promoter: Latré Steven
- Promoter: Mannens Erik
Research team(s)
Project type(s)
- Research Project
Past projects
Goal-Oriented Process Control by Including Expert Knowledge in Model-Based Reinforcement Learning using Soft Constraints.
Abstract
Due to its strong economic impact, the field of process control has received much research interest over the years. Whilst traditional control methods have been used in the industry for decades, the application of Machine Learning (ML) has not been properly assessed. An interesting novel field within ML is Reinforcement Learning (RL), which has repeatedly improved the state-of-the-art (SOTA) in the control of complex systems. Consequently, applying this technique to industrial process control has the potential to strongly improve process efficiency. On the one hand, this leads to reduced cost, resource usage and energy requirements for some of the biggest industries worldwide. On the other hand, this opens a new avenue for collaboration between academia and industry. This project aims to research techniques that are centered around applying RL to industrial process control by developing goal-oriented agents that effectively capture the expectations of the user. (1) An agent with an accurate latent world model will be developed with SOTA performance and strong reasoning capabilities. (2) This agent is extended with a reverse imagination model to reconstruct physical states from latent states. State constraints are applied to these physical states based on expert knowledge to create an intuitive framework for guiding the agent. (3) The agent is then transferred from simulation to reality using offline data to align the internal world model with the real-world environment.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Mercelis Siegfried
- Fellow: Troch Arne
Research team(s)
Project type(s)
- Research Project
Using Model-Based Reinforcement Learning combined with Monte-Carlo Tree Search to optimize Neural Networks for Embedded Devices.
Abstract
Currently, most AI systems are run in cloud environments. For some systems, like real-time systems, this can be troublesome, and moving these AI algorithms to the edge can provide a solution to these problems. The aim of my research is to use reinforcement learning techniques to design neural networks with performance rivalling that of modern, state-of-the-art systems, while reducing the resource consumption of these systems to a level that is manageable for edge devices. In order to achieve this goal, my work is split into 3 large components: multi-objective optimization, hardware embeddings and model-based reinforcement learning (MBRL) using Monte Carlo tree search (MCTS). The first component of my research will deal with the scalarization of a multi-objective reward function into a scalar reward. This is necessary for reinforcement learning systems, since they take a single reward value as feedback. For the second component of my research I will try to find a way to represent a certain piece of hardware in a neural-network-friendly manner. This is necessary for our system to be able to exploit the architectural features of a specific piece of hardware. Finally, I will introduce MBRL using MCTS to the field of neural architecture search. In this component, I will utilize the developed scalarization techniques and hardware representation from the first two components and an MBRL system to generate neural network architectures targeted at specific devices.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Mercelis Siegfried
- Fellow: Cassimon Amber
Research team(s)
Project type(s)
- Research Project
Sustainable Internet of Batteryless Things (IoBaLeT).
Abstract
The Internet of Things (IoT) vision has enabled the wireless connection of billions of battery-powered devices to the Internet. However, batteries are expensive, bulky, cause pollution and degrade after a few years. Replacing and disposing of billions of dead batteries every year is costly and unsustainable. We posit the vision of a sustainable Internet of Battery-Less Things (IoBaLeT). We imagine battery-less devices storing small amounts of energy in capacitors, harvested from their environment or obtained through simultaneous wireless information and power transfer (SWIPT). Using this energy, these intermittently-powered devices are able to cooperatively perform sensing, actuation and communication tasks. Existing battery-less technology has many shortcomings. Such devices, usually based on passive RFID and backscatter, only support simple sensing and are unable to handle more complex application logic. Networks do not scale, have a short range and a very low throughput. The goal of IoBaLeT is to bring battery-less technology to the next level. We envision battery-less devices and networks that support complex sensing and actuation applications, and offer throughput, scalability and range on par with their battery-powered counterparts. To achieve this, we propose a novel battery-less IoT device design that relies on a combination of SWIPT, hybrid energy harvesting, active transmissions and wake-up radios. The project will innovate in terms of SWIPT efficiency, battery-less networking protocols, and distributed intermittent computing paradigms and scheduling algorithms. Leaving batteries behind will enable IoT applications at an unprecedented scale, with a significantly extended lifetime and in hard-to-reach places.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Mercelis Siegfried
Research team(s)
Project type(s)
- Research Project
Multi-modal transfer learning through self-supervision for real-time venue mapping.
Abstract
Venue mapping is a special case of the reverse geocoding problem. Given a user's GPS coordinates, an accuracy radius and a list of venues located inside that radius, we want to derive which venue the user visited. Unfortunately, noise in the signal, especially in dense urban areas, limits our ability to achieve satisfactory results. Recent research shows that it is possible to improve the results by incorporating temporal and behavioral knowledge into the venue mapping model. As a company specializing in analyzing sensor data from mobile devices, such as accelerometer, gyroscope and GPS data, Sentiance has a vast amount of data for thousands of users. An open question is how to represent the data so that the model can be trained in a fully data-driven fashion. Manually creating rules or labelling millions of venues is not an option and would not result in a scalable, future-proof solution. Restricted by the lack of labelled data, we studied the latest achievements in deep self-supervised learning in order to design a model that would be able to autonomously reveal the internal patterns available in the unlabeled data. In order to guarantee rich generalization capabilities, we searched for ways to incorporate more knowledge into our model by means of publicly available data and transfer learning. Although such datasets exist, we faced another problem: the format of the data is so different from our in-house data that none of the existing transfer learning techniques could be applied directly. Finally, to tackle this challenge we studied the fields of multimodal learning and multi-task learning. In this project we propose training a series of deep learning models with a novel architecture that would result in a new state-of-the-art solution for the venue mapping problem.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Latré Steven
- Fellow: Musaev Gadzhi
Research team(s)
Project type(s)
- Research Project
IMEC-Next generation connectivity for enhanced, safe & efficient transport & logistics (5G-Blueprint).
Abstract
The overall objective of 5G-Blueprint is to design and validate a technical architecture, business and governance model for uninterrupted cross-border teleoperated transport based on 5G connectivity. 5G-Blueprint will explore and define:
- The economics of 5G tools in cross border transport & logistics as well as passenger transport: bringing CAPEX and OPEX into view, both on the supply (Telecom) side and on the demand (Transport & Logistics) side for transformation of current business practices as well as new value propositions
- The Governance issues and solutions pertaining to responsibilities and accountability within the value chain dependent on cross border connectivity and seamless services relating to the Dutch & Belgian regulatory framework (telecommunications, traffic and CAM experimentation laws, contracts, value chain management)
- Tactical and operational (pre-) conditions that need to be in place to get full value of 5G tooled transport & logistics. This includes implementing use cases that increase cooperative awareness to guarantee safe and responsible tele-operated transport
- Preparing and piloting tele-operated and tele-monitored transport on roadways and waterways to alleviate the increasing shortage of manpower and bring transport and logistics on a higher level of efficiency through data sharing in the supply chain and use of AI.
- Exploring the possibilities of increasing the volume of freight being transported during the night where excess physical infrastructure capacity is abundant; the lowering of personnel costs would make this feasible on a cost effective basis
- Tele-operation will be enabled by the following 5G qualities, such as low latency, reliable connectivity and high bandwidth that current 4G LTE cannot deliver sufficiently.
The project's outcome will be the blueprint for subsequent operational pan-European deployment of teleoperated transport solutions in the logistics sector and beyond.
Researcher(s)
- Promoter: Hellinckx Peter
Research team(s)
Project type(s)
- Research Project
IMEC-Novel inland waterway transport concepts for moving freight effectively (NOVIMOVE).
Abstract
The advantages of Inland Waterborne Transport (IWT) as a low-energy and low-CO2-emitting transport mode are not fully exploited today due to gaps in the logistics system. Inland container vessels pay 6-8 calls at seaport terminals with long waiting times. More time is lost through sub-optimal navigation on rivers and waiting at bridges and locks. In addition, low load factors of containers and vessels impact the logistics system, with unnecessarily high numbers of containers being transported and trips being made. NOVIMOVE's strategy is to "condense" the logistics system by improving container load factors and by reducing waiting times in seaports, by improved river voyage planning and execution, and by facilitating smooth passages through bridges and locks. NOVIMOVE's innovations are: (1) cargo reconstruction to raise container load factors, (2) mobile terminals feeding inland barges, (3) smart river navigation by merging satellite (Galileo) and real-time river water depth data, (4) smooth passage through bridges/locks by a dynamic scheduling system for better corridor management along the TEN-T Rhine-Alpine (R-A) route, (5) concepts for innovative vessels that can adapt to low water conditions while maintaining a full payload, and (6) close cooperation with logistic stakeholders, ports and water authorities along the R-A route: Antwerp, Rotterdam, Duisburg, Basel. NOVIMOVE technology developments will be demonstrated by virtual simulation, scaled model tests and full-scale demonstrations. NOVIMOVE innovations will impact the quantity of freight moved by IWT along the R-A corridor by 30% with respect to 2010 baseline data. The 21-member NOVIMOVE consortium combines logistics operators, ports, system developers and research organisations from 4 EU member states and two associate countries. The work plan contains 4 technical Work Packages. The project duration is four years; the requested funding is 8,9 MIO.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Latré Steven
Research team(s)
Project type(s)
- Research Project
Multi-Agent Communication and Behaviour Training using Reinforcement Learning.
Abstract
Many real-world applications require intelligent cooperative agents that can work together to solve a problem. An example of such a cooperative multi-agent application is the control of multiple autonomous vehicles. Multi-agent reinforcement learning is a well-researched topic and many solutions exist in the state-of-the-art. Recently, the research community was able to create agents that learn how to communicate with each other to reach their goal. This is a new subfield of the multi-agent reinforcement learning domain, in which we will research how we can achieve decentralised training of these communicating agents. This will allow us to create heterogeneous agents that can communicate, or to continue training the communicating agents after their deployment, which is not possible with the current state-of-the-art methods. In this project, I will extend the state-of-the-art by investigating how we can communicate with an unknown number of other agents, which is a problem with state-of-the-art methods. Next, I will work on the feedback structure that is used to train the communication between the agents. After the feedback structure, I will work on splitting the agents' architecture to create environment-specific agents. I hypothesise that this will decrease the training time of the communication policy, which is required for decentralised training. These advancements will be combined to create agents that we can train in a decentralised setting.
Researcher(s)
- Promoter: Hellinckx Peter
- Co-promoter: Mercelis Siegfried
- Fellow: Vanneste Simon
Research team(s)
Project type(s)
- Research Project