Researcher profile

Damien Ernst

Damien Ernst contributes to research discovery and scholarly infrastructure.


Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
14 works
0 followers
11 topics
4 close collaborators

Published work

14 published items

preprint, 2026, arXiv

Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

Always-on AI applications, from environmental sensors to biomedical implants, require ultra-low power consumption. Analog circuits offer a path to sub-microwatt inference, yet existing analog implementations are limited to feedforward architectures: extending them to recurrent dynamics has been considered impractical due to noise accumulation through temporal feedback. We demonstrate that this barrier can be overcome through hardware-software co-design. Specifically, we identify that Bistable Memory Recurrent Units (BMRUs), a class of Recurrent Neural Networks (RNNs) with discrete-valued outputs and hysteretic dynamics, admit an ultra-low power current-mode analog implementation which we design from first principles. The resulting circuit establishes a one-to-one correspondence between each learned parameter and a circuit element. The discrete outputs suppress analog noise by at least 20-fold at each cell boundary, breaking the noise accumulation that prevents analog recurrence. We reformulate BMRUs for first-quadrant operation with fixed thresholds, enabling the direct correspondence while preserving expressivity and trainability. Transistor-level simulations in 180 nm Complementary Metal-Oxide-Semiconductor (CMOS) show near-perfect agreement between software predictions and circuit-level behavior, with the software model thereby serving as a high-fidelity simulator of the physical hardware at low computational cost. We leverage this fidelity to conduct large-scale noise immunity and power scaling analyses: the power cost of adding recurrence scales linearly with state dimension, while the feedforward layers dominating total power scale quadratically, meaning recurrence is added at linear marginal cost relative to the feedforward backbone. End-to-end keyword spotting achieves sub-microwatt inference at the RNN core.

preprint, 2026, arXiv

Improving the Performance and Learning Stability of Parallelizable RNNs Designed for Ultra-Low Power Applications

Sequence learning is dominated by Transformers and parallelizable recurrent neural networks (RNNs) such as state-space models, yet learning long-term dependencies remains challenging, and state-of-the-art designs trade power consumption for performance. The Bistable Memory Recurrent Unit (BMRU) was introduced to enable hardware-software co-design of ultra-low power RNNs: quantized states with hysteresis provide persistent memory while mapping directly to analog primitives. However, BMRU performance lags behind parallelizable RNNs on complex sequential tasks. In this paper, we identify gradient blocking during state updates as a key limitation and propose a cumulative update formulation that restores gradient flow while preserving persistent memory, creating skip-connections through time. This leads to the Cumulative Memory Recurrent Unit (CMRU) and its relaxed variant, the $α$CMRU. Experiments show that the cumulative formulation dramatically improves convergence stability and reduces initialization sensitivity. The CMRU and $α$CMRU match or outperform Linear Recurrent Units (LRUs) and minimal Gated Recurrent Units (minGRUs) across diverse benchmarks at small model sizes, with particular advantages on tasks requiring discrete long-range retention, while the CMRU retains quantized states, persistent memory, and noise-resilient dynamics essential for analog implementation.

preprint, 2022, arXiv

Allocation of locally generated electricity in renewable energy communities

Local electricity markets represent a way of supplementing traditional retailing contracts for end consumers; among these markets, the renewable energy community has gained momentum over the last few years. This paper proposes a practical, readily adoptable modelling solution for these communities, one that allows their members to share the economic benefits derived from them. The proposed solution relies on an ex-post allocation of the electricity that is generated within energy communities (i.e., local electricity) based on the optimisation of repartition keys. Repartition keys are optimally computed to represent the proportion of total local electricity to be allocated to each community member, and aim to minimise the sum of the electricity bills of all community members. Since the optimisation takes place ex-post, the repartition keys do not modify the actual electricity flows, but rather the financial flows of the community members. The billing process of the community then takes these keys into account to send the correct electricity bill to each member. Building on this concept, we also introduce two additions to the basic algorithm to enhance the stability of the community, which a global bill minimisation may fail to ensure (e.g., very asymmetrical solutions between members may lead to some of them opting out).
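
The repartition-key idea can be illustrated for a single metering period. The sketch below is not the paper's formulation: it assumes each member has a flat retail price, that local electricity is billed at one flat local price, and that no member can be allocated more than it consumed. Under those assumptions, minimising the sum of bills reduces to serving the members with the highest retail price first. The function name and all parameters are hypothetical.

```python
def allocate_local_electricity(consumption, retail_price, local_generation, local_price):
    """Greedy ex-post allocation for one metering period (illustrative only).

    Returns (keys, bills), where keys[i] is the share of the total local
    generation allocated to member i and bills[i] is that member's bill.
    """
    n = len(consumption)
    allocation = [0.0] * n
    remaining = local_generation

    # Serve members with the highest retail price first: each allocated kWh
    # saves (retail_price - local_price), so this order minimises the total bill.
    for i in sorted(range(n), key=lambda j: retail_price[j], reverse=True):
        if retail_price[i] <= local_price or remaining <= 0:
            break
        allocation[i] = min(consumption[i], remaining)
        remaining -= allocation[i]

    # Keys are fractions of the total local generation; they change money
    # flows only, not physical electricity flows.
    keys = [a / local_generation if local_generation > 0 else 0.0 for a in allocation]
    bills = [
        retail_price[i] * (consumption[i] - allocation[i]) + local_price * allocation[i]
        for i in range(n)
    ]
    return keys, bills


keys, bills = allocate_local_electricity(
    consumption=[4.0, 6.0, 5.0],
    retail_price=[0.30, 0.25, 0.35],
    local_generation=8.0,
    local_price=0.10,
)
print(keys, bills)
```

In this toy case one member receives no local electricity at all, which is the kind of asymmetric outcome the abstract says can threaten the stability of the community.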

preprint, 2022, arXiv

Churn prediction in online gambling

In business retention, churn prevention has always been a major concern. This work contributes to this domain by formalizing the problem of churn prediction in the context of online gambling as a binary classification task. We also propose an algorithmic answer to this problem based on recurrent neural networks. This algorithm is tested with online gambling data that take the form of time series, which can be efficiently processed by recurrent neural networks. To evaluate the performance of the trained models, standard machine learning metrics were used, such as accuracy, precision and recall. For this problem in particular, the experiments showed that the choice of a specific architecture depends on which metric is given the greatest importance. Architectures using nBRC favour precision, those using LSTM give better recall, while GRU-based architectures achieve higher accuracy and balance the other two metrics. Moreover, further experiments showed that using only the more recent time-series histories to train the networks decreases the quality of the results. We also study the performance of models learned at a specific instant $t$ when applied at other times $t^{\prime} > t$. The results show that the performance of the models learned at time $t$ remains good at the following instants $t^{\prime} > t$, suggesting that there is no need to refresh the models at a high rate. However, the performance of the models was subject to noticeable variance due to one-off events impacting the data.
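
The three metrics named in the abstract can be computed from binary labels as follows. This is a generic sketch, assuming label 1 means "churned"; the function name is hypothetical and nothing here comes from the paper's code.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # churners correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-churners correctly ignored
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed churners

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


print(classification_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 0, 1, 0, 1, 0, 0, 1]))
```

Precision penalises false alarms and recall penalises missed churners, which is why the preferred architecture can change with the metric that matters most.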

preprint, 2022, arXiv

Computing Necessary Conditions for Near-Optimality in Capacity Expansion Planning Problems

In power systems, large-scale optimisation problems are extensively used to plan for capacity expansion at the supra-national level. However, their cost-optimal solutions are often not exploitable by decision-makers who are preferably looking for features of solutions that can accommodate their different requirements. This paper proposes a generic framework for addressing this problem. It is based on the concept of the epsilon-optimal feasible space of a given optimisation problem and the identification of necessary conditions over this space. This framework has been developed in a generic case, and an approach for solving this problem is subsequently described for a specific case where conditions are constrained sums of variables. The approach is tested on a case study about capacity expansion planning of the European electricity network to determine necessary conditions on the minimal investments in transmission, storage and generation capacity.
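
One common way to write the objects named in this abstract is given below. The notation is an assumption, not taken from the paper: $f$ is a cost to be minimised over the feasible set $\mathcal{X}$ with $f(x^{\star}) \ge 0$, $\epsilon \ge 0$ is the tolerated relative cost increase, and $I$ indexes the variables entering a constrained sum.

```latex
% epsilon-optimal feasible space: feasible points within a factor (1 + epsilon) of the optimum
\mathcal{X}^{\epsilon} = \left\{ x \in \mathcal{X} \;:\; f(x) \le (1+\epsilon)\, f(x^{\star}) \right\},
\qquad x^{\star} \in \arg\min_{x \in \mathcal{X}} f(x)

% a condition of the form "sum of selected variables >= c" is necessary
% if it holds for every epsilon-optimal solution
\sum_{i \in I} x_i \ge c \quad \forall x \in \mathcal{X}^{\epsilon}

% the tightest such condition is obtained by minimising the sum over the epsilon-optimal space
c^{\star} = \min_{x \in \mathcal{X}^{\epsilon}} \sum_{i \in I} x_i
```

Read this way, $c^{\star}$ would be the minimal investment (for example in transmission capacity) that every near-optimal plan must include.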

preprint, 2022, arXiv

Jointly Learning Environments and Control Policies with Projected Stochastic Gradient Ascent

We consider the joint design and control of discrete-time stochastic dynamical systems over a finite time horizon. We formulate the problem as a multi-step optimization problem under uncertainty seeking to identify a system design and a control policy that jointly maximize the expected sum of rewards collected over the time horizon considered. The transition function, the reward function and the policy are all parametrized, assumed known and differentiable with respect to their parameters. We then introduce a deep reinforcement learning algorithm combining policy gradient methods with model-based optimization techniques to solve this problem. In essence, our algorithm iteratively approximates the gradient of the expected return via Monte-Carlo sampling and automatic differentiation and takes projected gradient ascent steps in the space of environment and policy parameters. This algorithm is referred to as Direct Environment and Policy Search (DEPS). We assess the performance of our algorithm in three environments concerned with the design and control of a mass-spring-damper system, a small-scale off-grid power system and a drone, respectively. In addition, our algorithm is benchmarked against a state-of-the-art deep reinforcement learning algorithm used to tackle joint design and control problems. We show that DEPS performs at least as well in all three environments, consistently yielding solutions with higher returns in fewer iterations. Finally, solutions produced by our algorithm are also compared with solutions produced by an algorithm that does not jointly optimize environment and policy parameters, highlighting the fact that higher returns can be achieved when joint optimization is performed.
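
The projected gradient ascent step over joint parameters can be shown on a toy problem. This is not DEPS: the sketch swaps the Monte-Carlo and automatic-differentiation gradient estimate for an analytical gradient of a made-up deterministic objective, and the names `psi` (environment parameter, constrained to [0, 1]) and `theta` (policy parameter, unconstrained) are hypothetical.

```python
def project(value, lower, upper):
    """Euclidean projection of a scalar onto the interval [lower, upper]."""
    return max(lower, min(upper, value))


def joint_projected_gradient_ascent(steps=500, lr=0.1):
    """Maximise J(psi, theta) = -(psi - 2)^2 - (theta - psi)^2 with psi in [0, 1]."""
    psi, theta = 0.0, 0.0
    for _ in range(steps):
        # Analytical partial derivatives of the toy objective J.
        grad_psi = -2.0 * (psi - 2.0) + 2.0 * (theta - psi)
        grad_theta = -2.0 * (theta - psi)
        # Ascent step on both parameter groups, then projection of the
        # environment parameter back onto its feasible design range.
        psi = project(psi + lr * grad_psi, 0.0, 1.0)
        theta = theta + lr * grad_theta
    return psi, theta


psi, theta = joint_projected_gradient_ascent()
print(psi, theta)
```

The unconstrained maximiser would sit at `psi = 2`, outside the feasible range, so the projection holds `psi` at the boundary value 1 and `theta` then adapts to the design that was actually reachable.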

preprint, 2022, arXiv

M4Depth: Monocular depth estimation for autonomous vehicles in unseen environments

Estimating the distance to objects is crucial for autonomous vehicles when using depth sensors is not possible. In this case, the distance has to be estimated from on-board mounted RGB cameras, which is a complex task especially in environments such as natural outdoor landscapes. In this paper, we present a new method named M4Depth for depth estimation. First, we establish a bijective relationship between depth and the visual disparity of two consecutive frames and show how to exploit it to perform motion-invariant pixel-wise depth estimation. Then, we detail M4Depth which is based on a pyramidal convolutional neural network architecture where each level refines an input disparity map estimate by using two customized cost volumes. We use these cost volumes to leverage the visual spatio-temporal constraints imposed by motion and to make the network robust for varied scenes. We benchmarked our approach both in test and generalization modes on public datasets featuring synthetic camera trajectories recorded in a wide variety of outdoor scenes. Results show that our network outperforms the state of the art on these datasets, while also performing well on a standard depth estimation benchmark. The code of our method is publicly available at https://github.com/michael-fonder/M4Depth.

preprint, 2022, arXiv

Optimal Connection Phase Selection of Residential Distributed Energy Resources and its Impact on Aggregated Demand

The recent major increase in decentralized energy resources (DERs) such as photovoltaic (PV) panels alters the loading profile of distribution systems (DS) and impacts higher voltage levels. Distribution system operators (DSOs) try to manage the deployment of new DERs to decrease the operational costs. However, DER location and size are factors beyond any DSO's reach. This paper presents a practical method to minimize the DS operational costs due to new DER deployments, through optimal selection of their connection phase. The impact of such distribution grid management efforts on aggregated demand for higher voltage levels is also evaluated and discussed in this paper. Simulation results on a real-life Belgian network show the effectiveness of optimal connection phase selection in decreasing DS operational costs, and the considerable impact of such simple DS management efforts on the aggregated demand.

preprint, 2022, arXiv

Recurrent networks, hidden states and beliefs in partially observable environments

Reinforcement learning aims to learn optimal policies from interaction with environments whose dynamics are unknown. Many methods rely on the approximation of a value function to derive near-optimal policies. In partially observable environments, these functions depend on the complete sequence of observations and past actions, called the history. In this work, we show empirically that recurrent neural networks trained to approximate such value functions internally filter the posterior probability distribution of the current state given the history, called the belief. More precisely, we show that, as a recurrent neural network learns the Q-function, its hidden states become more and more correlated with the beliefs of state variables that are relevant to optimal control. This correlation is measured through their mutual information. In addition, we show that the expected return of an agent increases with the ability of its recurrent architecture to reach a high mutual information between its hidden states and the beliefs. Finally, we show that the mutual information between the hidden states and the beliefs of variables that are irrelevant for optimal control decreases through the learning process. In summary, this work shows that in its hidden states, a recurrent neural network approximating the Q-function of a partially observable environment reproduces a sufficient statistic from the history that is correlated to the relevant part of the belief for taking optimal actions.
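
Mutual information, the quantity used above to relate hidden states to beliefs, can be estimated from paired samples. The abstract does not say which estimator the authors use; the sketch below is a simple plug-in estimate for discrete variables, which would require discretising hidden states and beliefs first. The function name is hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from paired discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)

    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Fully dependent samples carry one bit; independent ones carry none.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```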

preprint, 2022, arXiv

Risk-Sensitive Policy with Distributional Reinforcement Learning

Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. Nevertheless, this approach does not take into consideration the potential risk associated with the actions taken, which may be critical in certain applications. To address that issue, the present research work introduces a novel methodology based on distributional RL to derive sequential decision-making policies that are sensitive to risk, the latter being modelled by the tail of the return probability distribution. The core idea is to replace the $Q$ function generally standing at the core of learning schemes in RL by another function taking into account both the expected return and the risk. Named the risk-based utility function $U$, it can be extracted from the random return distribution $Z$ naturally learnt by any distributional RL algorithm. This makes it possible to span the complete potential trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a truly practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, and with an emphasis on the interpretability of the resulting decision-making process.
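
One way to build a utility that blends expected return with tail risk is sketched below. The abstract only states that $U$ accounts for both quantities and is extracted from $Z$; the convex combination with weight `alpha` and the use of a lower-tail average (CVaR) as the risk term are assumptions made here for illustration, as are all function and parameter names.

```python
def cvar_lower_tail(returns, rho):
    """Average of the worst rho-fraction of sampled returns (at least one sample)."""
    ordered = sorted(returns)
    k = max(1, int(len(ordered) * rho))
    return sum(ordered[:k]) / k


def risk_based_utility(returns, alpha, rho=0.1):
    """Blend of mean return and lower-tail risk; alpha = 1 recovers the usual Q-value."""
    mean = sum(returns) / len(returns)
    return alpha * mean + (1.0 - alpha) * cvar_lower_tail(returns, rho)


# Samples of the return distribution Z for two candidate actions.
safe = [1.0] * 10
risky = [3.0] * 9 + [-10.0]

for alpha in (1.0, 0.5):
    best = max(("safe", safe), ("risky", risky),
               key=lambda item: risk_based_utility(item[1], alpha))
    print(alpha, best[0])
```

Sweeping `alpha` between 0 and 1 moves the greedy choice between the risk-minimising and the return-maximising action, which is the trade-off the abstract refers to.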

preprint, 2021, arXiv

Sparse Training Theory for Scalable and Efficient Agents

A fundamental task for artificial intelligence is learning. Deep Neural Networks have proven to cope perfectly with all learning paradigms, i.e. supervised, unsupervised, and reinforcement learning. Nevertheless, traditional deep learning approaches make use of cloud computing facilities and do not scale well to autonomous agents with low computational resources. Even in the cloud, they suffer from computational and memory limitations, and they cannot be used to adequately model large physical worlds for agents, which would require networks with billions of neurons. These issues have been addressed in the last few years by the emerging topic of sparse training, which trains sparse networks from scratch. This paper discusses the state of the art in sparse training, its challenges and limitations, while introducing a couple of new theoretical research directions which have the potential to alleviate the limitations of sparse training and push deep learning scalability well beyond its current boundaries. The impact of these theoretical advancements in complex multi-agent settings is also discussed from a real-world perspective, using the smart grid case study.

preprint, 2020, arXiv

A Deep Reinforcement Learning Framework for Continuous Intraday Market Bidding

The large-scale integration of variable energy resources is expected to shift a large part of the energy exchanges closer to real-time, where more accurate forecasts are available. In this context, the short-term electricity markets and in particular the intraday market are considered a suitable trading floor for these exchanges to occur. A key component for the successful integration of renewable energy sources is the use of energy storage. In this paper, we propose a novel modelling framework for the strategic participation of energy storage in the European continuous intraday market where exchanges occur through a centralized order book. The goal of the storage device operator is the maximization of the profits received over the entire trading horizon, while taking into account the operational constraints of the unit. The sequential decision-making problem of trading in the intraday market is modelled as a Markov Decision Process. An asynchronous distributed version of the fitted Q iteration algorithm is chosen for solving this problem due to its sample efficiency. The large and variable number of existing orders in the order book motivates the use of high-level actions and an alternative state representation. Historical data are used to generate a large number of artificial trajectories in order to address exploration issues during the learning process. The resulting policy is back-tested and compared against a benchmark strategy that is the current industrial standard. Results indicate that the agent converges to a policy that achieves, on average, higher total revenues than the benchmark strategy.

preprint, 2020, arXiv

An Application of Deep Reinforcement Learning to Algorithmic Trading

This scientific research paper presents an innovative approach based on deep reinforcement learning (DRL) to solve the algorithmic trading problem of determining the optimal trading position at any point in time during a trading activity in stock markets. It proposes a novel DRL trading strategy so as to maximise the resulting Sharpe ratio performance indicator on a broad range of stock markets. Named the Trading Deep Q-Network algorithm (TDQN), this new trading strategy is inspired by the popular DQN algorithm and significantly adapted to the specific algorithmic trading problem at hand. The training of the resulting reinforcement learning (RL) agent is entirely based on the generation of artificial trajectories from a limited set of stock market historical data. In order to objectively assess the performance of trading strategies, the research paper also proposes a novel, more rigorous performance assessment methodology. Following this new performance assessment approach, promising results are reported for the TDQN strategy.
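
The Sharpe ratio mentioned above is commonly computed from a series of periodic returns as sketched here. The annualisation over 252 trading days and the zero default risk-free rate are conventional assumptions, not details confirmed by the abstract; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualised Sharpe ratio: mean excess return per unit of return volatility."""
    excess = [r - risk_free_daily for r in daily_returns]
    volatility = stdev(excess)  # sample standard deviation, needs >= 2 returns
    if volatility == 0:
        return 0.0  # undefined for a constant series; reported as 0 here
    return mean(excess) / volatility * sqrt(periods_per_year)


print(sharpe_ratio([0.01, -0.005, 0.02, 0.0, 0.005]))
```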

preprint, 2020, arXiv

An Artificial Intelligence Solution for Electricity Procurement in Forward Markets

Retailers and major consumers of electricity generally purchase a large share of their estimated electricity needs years ahead in the forward market. This long-term electricity procurement task consists of determining when to buy electricity so that the resulting energy cost is minimised and the forecast consumption is covered. In this scientific article, the focus is set on a yearly base load product from the Belgian forward market, named calendar (CAL), which is tradable up to three years ahead of the delivery period. This research paper introduces a novel algorithm providing recommendations to either buy electricity now or wait for a future opportunity based on the history of CAL prices. This algorithm relies on deep learning forecasting techniques and on an indicator quantifying the deviation from a perfectly uniform reference procurement policy. On average, the proposed approach surpasses the benchmark procurement policies considered and achieves a reduction in costs of 1.65% with respect to the perfectly uniform reference procurement policy achieving the mean electricity price. Moreover, in addition to automating the complex electricity procurement task, this algorithm demonstrates more consistent results throughout the years. Finally, the generality of the solution presented makes it well suited for solving other commodity procurement problems.
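
The comparison against the uniform reference policy can be made concrete with a small calculation. This is a sketch under stated assumptions: the uniform policy buys an equal share on every trading day and therefore pays the mean price, and a procurement policy is described by the fraction of the total volume bought each day. The function name and the toy prices are hypothetical and unrelated to the paper's data.

```python
def procurement_cost_reduction(prices, purchased_fractions):
    """Relative cost reduction of a policy versus uniform purchasing.

    prices[t] is the forward price on day t; purchased_fractions[t] is the
    share of the total volume bought on day t (fractions must sum to 1).
    """
    if abs(sum(purchased_fractions) - 1.0) > 1e-9:
        raise ValueError("purchased fractions must sum to 1")
    uniform_price = sum(prices) / len(prices)  # price paid by the uniform policy
    achieved_price = sum(p * f for p, f in zip(prices, purchased_fractions))
    return (uniform_price - achieved_price) / uniform_price


prices = [50.0, 40.0, 60.0, 50.0]
print(procurement_cost_reduction(prices, [0.0, 1.0, 0.0, 0.0]))   # buys everything at the low
print(procurement_cost_reduction(prices, [0.25, 0.25, 0.25, 0.25]))  # the uniform policy itself
```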