Source author record

Joan Lasenby

Joan Lasenby appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Machine Learning, Artificial Intelligence, Robotics, Computational Geometry, eess.IV, Human-Computer Interaction, physics.flu-dyn, Quantitative Methods

Catalog footprint

What is connected

13 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Efficient Camera-Controlled Video Generation of Static Scenes via Sparse Diffusion and 3D Rendering

Modern video generative models based on diffusion models can produce very realistic clips, but they are computationally inefficient, often requiring minutes of GPU time for just a few seconds of video. This inefficiency poses a critical barrier to deploying generative video in applications that require real-time interactions, such as embodied AI and VR/AR. This paper explores a new strategy for camera-conditioned video generation of static scenes: using diffusion-based generative models to generate a sparse set of keyframes, and then synthesizing the full video through 3D reconstruction and rendering. By lifting keyframes into a 3D representation and rendering intermediate views, our approach amortizes the generation cost across hundreds of frames while enforcing geometric consistency. We further introduce a model that predicts the optimal number of keyframes for a given camera trajectory, allowing the system to adaptively allocate computation. Our final method, SRENDER, uses very sparse keyframes for simple trajectories and denser ones for complex camera motion. This results in video generation that is more than 40 times faster than the diffusion-based baseline in generating 20 seconds of video, while maintaining high visual fidelity and temporal stability, offering a practical path toward efficient and controllable video synthesis.

preprint · 2025 · arXiv

SpaceTimePilot: Generative Rendering of Dynamic Scenes Across Space and Time

We present SpaceTimePilot, a video diffusion model that disentangles space and time for controllable generative rendering. Given a monocular video, SpaceTimePilot can independently alter the camera viewpoint and the motion sequence within the generative process, re-rendering the scene for continuous and arbitrary exploration across space and time. To achieve this, we introduce an effective animation time-embedding mechanism in the diffusion process, allowing explicit control of the output video's motion sequence with respect to that of the source video. As no datasets provide paired videos of the same dynamic scene with continuous temporal variations, we propose a simple yet effective temporal-warping training scheme that repurposes existing multi-view datasets to mimic temporal differences. This strategy effectively supervises the model to learn temporal control and achieve robust space-time disentanglement. To further enhance the precision of dual control, we introduce two additional components: an improved camera-conditioning mechanism that allows altering the camera from the first frame, and CamxTime, the first synthetic space-and-time full-coverage rendering dataset that provides fully free space-time video trajectories within a scene. Joint training on the temporal-warping scheme and the CamxTime dataset yields more precise temporal control. We evaluate SpaceTimePilot on both real-world and synthetic data, demonstrating clear space-time disentanglement and strong results compared to prior work. Project page: https://zheninghuang.github.io/Space-Time-Pilot/ Code: https://github.com/ZheningHuang/spacetimepilot

preprint · 2022 · arXiv

Closed-form solutions for the inverse kinematics of serial robots using conformal geometric algebra

This work addresses the inverse kinematics of serial robots using conformal geometric algebra. Classical approaches include either the use of homogeneous matrices, which entails high computational cost and execution time, or the development of particular geometric strategies that cannot be generalized to arbitrary serial robots. In this work, we present a compact, elegant and intuitive formulation of robot kinematics based on conformal geometric algebra that provides a suitable framework for the closed-form resolution of the inverse kinematic problem for manipulators with a spherical wrist. For serial robots of this kind, the inverse kinematics problem can be split into two subproblems: the position and orientation problems. The latter is solved by appropriately splitting the rotor that defines the target orientation into three simpler rotors, while the former is solved by developing a geometric strategy for each combination of prismatic and revolute joints that forms the position part of the robot. Finally, the inverse kinematics of 7 DoF redundant manipulators with a spherical wrist is solved by extending the geometric solutions obtained in the non-redundant case.

preprint · 2022 · arXiv

ECLIPSE : Envisioning CLoud Induced Perturbations in Solar Energy

Efficient integration of solar energy into the electricity mix depends on a reliable anticipation of its intermittency. A promising approach to forecast the temporal variability of solar irradiance resulting from the cloud cover dynamics is based on the analysis of sequences of ground-taken sky images or satellite observations. Despite encouraging results, a recurrent limitation of existing deep learning approaches lies in the ubiquitous tendency of reacting to past observations rather than actively anticipating future events. This leads to a frequent temporal lag and limited ability to predict sudden events. To address this challenge, we introduce ECLIPSE, a spatio-temporal neural network architecture that models cloud motion from sky images to not only predict future irradiance levels and associated uncertainties, but also segmented images, which provide richer information on the local irradiance map. We show that ECLIPSE anticipates critical events and reduces temporal delay while generating visually realistic futures. The model characteristics and properties are investigated with an ablation study and a comparative study on the benefits and different ways to integrate auxiliary data into the modelling. The model predictions are also interpreted through an analysis of the principal spatio-temporal components learned during network training.

preprint · 2022 · arXiv

Omnivision forecasting: combining satellite observations with sky images for improved intra-hour solar energy predictions

Integration of intermittent renewable energy sources into electric grids in large proportions is challenging. A well-established approach aimed at addressing this difficulty involves the anticipation of the upcoming energy supply variability to adapt the response of the grid. In solar energy, short-term changes in electricity production caused by occluding clouds can be predicted at different time scales from all-sky cameras (up to 30-min ahead) and satellite observations (up to 6h ahead). In this study, we integrate these two complementary points of view on the cloud cover in a single machine learning framework to improve intra-hour (up to 60-min ahead) irradiance forecasting. Both deterministic and probabilistic predictions are evaluated in different weather conditions (clear-sky, cloudy, overcast) and with different input configurations (sky images, satellite observations and/or past irradiance values). Our results show that the hybrid model benefits predictions in clear-sky conditions and improves longer-term forecasting. This study lays the groundwork for future novel approaches of combining sky images and satellite observations in a single learning framework to advance solar nowcasting.

preprint · 2022 · arXiv

Pre-training Molecular Graph Representation with 3D Geometry

Molecular graph representation learning is a fundamental problem in modern drug and material discovery. Molecular graphs are typically modeled by their 2D topological structures, but it has been recently discovered that 3D geometric information plays a more vital role in predicting molecular functionalities. However, the lack of 3D information in real-world scenarios has significantly impeded the learning of geometric graph representation. To cope with this challenge, we propose the Graph Multi-View Pre-training (GraphMVP) framework where self-supervised learning (SSL) is performed by leveraging the correspondence and consistency between 2D topological structures and 3D geometric views. GraphMVP effectively learns a 2D molecular graph encoder that is enhanced by richer and more discriminative 3D geometry. We further provide theoretical insights to justify the effectiveness of GraphMVP. Finally, comprehensive experiments show that GraphMVP can consistently outperform existing graph SSL methods.

preprint · 2022 · arXiv

SPIN: Simplifying Polar Invariance for Neural networks Application to vision-based irradiance forecasting

Translational invariance induced by pooling operations is an inherent property of convolutional neural networks, which facilitates numerous computer vision tasks such as classification. Yet to handle rotationally invariant tasks, convolutional architectures require specific rotationally invariant layers or extensive data augmentation to learn from diverse rotated versions of a given spatial configuration. Unwrapping the image into its polar coordinates provides a more explicit representation to train a convolutional architecture, as the rotational invariance becomes translational; hence the visually distinct but otherwise equivalent rotated versions of a given scene can be learnt from a single image. We show with two common vision-based solar irradiance forecasting challenges (i.e. using ground-taken sky images or satellite images) that this preprocessing step significantly improves prediction results by standardising the scene representation, while decreasing training time by a factor of 4 compared to augmenting data with rotations. In addition, this transformation magnifies the area surrounding the centre of the rotation, leading to more accurate short-term irradiance predictions.
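
The claim that polar unwrapping turns a rotation into a translation can be checked on a toy example. The sketch below is not the SPIN implementation: the scene is a hypothetical analytic function (`gaussian_blob`) rather than a sky image, and the helper names, grid sizes and the 64-column angular resolution are all assumptions made for illustration. It samples the scene on a (radius, angle) grid and compares a rotated scene with a circular shift of the unrotated one.

```python
import numpy as np

def gaussian_blob(x, y, cx=0.4, cy=0.1, sigma=0.15):
    """Toy scene (assumption): one bright blob away from the rotation centre."""
    return np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))

def rotate_scene(scene, alpha):
    """Return the scene rotated counter-clockwise by alpha radians about the origin."""
    c, s = np.cos(alpha), np.sin(alpha)
    # rotated(x, y) = scene(R(-alpha) applied to (x, y))
    return lambda x, y: scene(c * x + s * y, -s * x + c * y)

def polar_unwrap(scene, n_r=32, n_theta=64, r_max=1.0):
    """Sample the scene on a polar grid: rows are radii, columns are angles."""
    r = np.linspace(0.0, r_max, n_r)[:, None]
    theta = (2 * np.pi * np.arange(n_theta) / n_theta)[None, :]
    return scene(r * np.cos(theta), r * np.sin(theta))

n_theta = 64
k = 5                                   # rotate by k angular grid steps
alpha = 2 * np.pi * k / n_theta

base = polar_unwrap(gaussian_blob, n_theta=n_theta)
rotated = polar_unwrap(rotate_scene(gaussian_blob, alpha), n_theta=n_theta)

# In polar coordinates the rotation is a circular shift along the angle axis.
shift_matches = np.allclose(rotated, np.roll(base, k, axis=1))
print(shift_matches)
```

Because the shift is along one array axis, a convolution applied to the unwrapped image treats rotated versions of a scene as translated versions, which is the property the abstract relies on.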

preprint · 2021 · arXiv

Singularities of serial robots: Identification and distance computation using geometric algebra

The singularities of serial robotic manipulators are those configurations in which the robot loses the ability to move in at least one direction. Hence, their identification is fundamental to enhance the performance of current control and motion planning strategies. While classical approaches entail the computation of the determinant of either a 6×n or n×n matrix for an n degrees of freedom serial robot, this work addresses a novel singularity identification method based on modelling the twists defined by the joint axes of the robot as vectors of the six-dimensional and three-dimensional geometric algebras. In particular, it consists of identifying which configurations cause the exterior product of these twists to vanish. In addition, since rotors represent rotations in geometric algebra, once these singularities have been identified, a distance function is defined in the configuration space C such that its restriction to the set of singular configurations S allows us to compute the distance of any configuration to a given singularity. This distance function is used to enhance how the singularities are handled in three different scenarios, namely motion planning, motion control and bilateral teleoperation.
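
The idea of detecting singularities through a vanishing exterior product can be illustrated on a much smaller case than the one the paper treats. The sketch below is an assumption-laden stand-in: it uses a planar two-link arm with revolute joints and the two columns of its position Jacobian as 2D vectors, not the twists in six-dimensional geometric algebra described in the abstract. The function names and unit link lengths are hypothetical. For this arm the exterior product of the two columns reduces to l1 * l2 * sin(q2), so it vanishes when the arm is fully stretched or folded.

```python
import math

def jacobian_columns(q1, q2, l1=1.0, l2=1.0):
    """Columns of the end-effector position Jacobian of a planar 2R arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    j1 = (-l1 * s1 - l2 * s12, l1 * c1 + l2 * c12)  # effect of joint 1
    j2 = (-l2 * s12, l2 * c12)                      # effect of joint 2
    return j1, j2

def wedge_2d(a, b):
    """Coefficient of e1^e2 in the exterior product a ^ b of two 2D vectors."""
    return a[0] * b[1] - a[1] * b[0]

def is_singular(q1, q2, tol=1e-9):
    """A configuration is flagged singular when the exterior product vanishes."""
    j1, j2 = jacobian_columns(q1, q2)
    return abs(wedge_2d(j1, j2)) < tol

print(is_singular(0.3, 0.0))      # stretched arm
print(is_singular(0.3, math.pi))  # folded arm
print(is_singular(0.3, 1.0))      # generic configuration
```

In two dimensions the exterior product of two vectors carries the same information as a 2×2 determinant; the formulation in the abstract extends the same test to twists in higher-dimensional algebras.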

preprint · 2020 · arXiv

Convolutional Neural Networks applied to sky images for short-term solar irradiance forecasting

Despite the advances in the field of solar energy, improvements of solar forecasting techniques, addressing the intermittent electricity production, remain essential for securing its future integration into a wider energy supply. A promising approach to anticipate irradiance changes consists of modeling the cloud cover dynamics from ground-taken or satellite images. This work presents preliminary results on the application of deep Convolutional Neural Networks for 2 to 20 min irradiance forecasting using hemispherical sky images and exogenous variables. We evaluate the models on a set of irradiance measurements and corresponding sky images collected in Palaiseau (France) over 8 months with a temporal resolution of 2 min. To outline the learning of neural networks in the context of short-term irradiance forecasting, we implemented visualisation techniques revealing the types of patterns recognised by trained algorithms in sky images. In addition, we show that training models with past samples of the same day improves their forecast skill, relative to the smart persistence model based on the Mean Square Error, by around 10% on a 10 min ahead prediction. These results emphasise the benefit of integrating previous same-day data in short-term forecasting. This, in turn, can be achieved through model fine-tuning or using recurrent units to facilitate the extraction of relevant temporal features from past data.
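
The abstract does not spell out how forecast skill is computed, so the sketch below rests on two stated assumptions: the smart persistence baseline keeps the clear-sky index (measured irradiance divided by clear-sky irradiance) constant over the forecast horizon, and skill is one minus the ratio of the model's Mean Square Error to the baseline's. The function names and all numbers are made-up toy values, not data from the paper.

```python
import numpy as np

def smart_persistence(irradiance_now, clear_sky_now, clear_sky_future):
    """Baseline (assumed form): the clear-sky index stays constant over the horizon."""
    return irradiance_now / clear_sky_now * clear_sky_future

def forecast_skill(y_true, y_model, y_reference):
    """Skill relative to a reference forecast, based on Mean Square Error (assumed form)."""
    mse_model = np.mean((y_true - y_model) ** 2)
    mse_reference = np.mean((y_true - y_reference) ** 2)
    return 1.0 - mse_model / mse_reference

# Toy values (W/m^2), chosen only to make the arithmetic easy to follow.
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_reference = y_true + np.array([10.0, -10.0, 10.0, -10.0])  # baseline errors, MSE = 100
y_model = y_true + np.array([9.0, -9.0, 9.0, -9.0])          # model errors, MSE = 81

skill = forecast_skill(y_true, y_model, y_reference)
print(round(skill, 2))                          # 0.19
print(smart_persistence(400.0, 800.0, 600.0))   # 300.0
```

A skill of 0 means the model is no better than the baseline, and positive values give the fraction by which the error is reduced under this definition.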

preprint · 2020 · arXiv

Neural Random Subspace

The random subspace method, known as the pillar of random forests, is good at making precise and robust predictions. However, there is not a straightforward way yet to combine it with deep learning. In this paper, we therefore propose Neural Random Subspace (NRS), a novel deep learning based random subspace method. In contrast to previous forest methods, NRS enjoys the benefits of end-to-end, data-driven representation learning, as well as pervasive support from deep learning software and hardware platforms, hence achieving faster inference speed and higher accuracy. Furthermore, as a non-linear component to be encoded into Convolutional Neural Networks (CNNs), NRS learns non-linear feature representations in CNNs more efficiently than previous higher-order pooling methods, producing good results with negligible increase in parameters, floating point operations (FLOPs) and real running time. Compared with random subspaces, random forests and gradient boosting decision trees (GBDTs), NRS achieves superior performance on 35 machine learning datasets. Moreover, on both 2D image and 3D point cloud recognition tasks, integration of NRS with CNN architectures achieves consistent improvements with minor extra cost. Code is available at https://github.com/CupidJay/NRS_pytorch.

preprint · 2015 · arXiv

An acoustic space-time and the Lorentz transformation in aeroacoustics

In this paper we introduce concepts from relativity and geometric algebra to aeroacoustics. We do this using an acoustic space-time transformation within the framework of sound propagation in uniform flows. By using Geometric Algebra we are able to provide a simple geometric interpretation to the space-time transformation, and are able to give neat and lucid derivations of the free-field Green's function for the convected wave equation and the Doppler shift for a stationary observer and a source in uniform rectilinear motion in a uniform flow.

preprint · 2014 · arXiv

Single camera pose estimation using Bayesian filtering and Kinect motion priors

Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach.
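
The transition model described above, a random walk that drifts towards typically observed poses, can be sketched for a single mixture component. This is an illustrative simplification and not the paper's model: it uses one attractor instead of a Gaussian mixture, and the function name, the `pull` strength, the noise level and the two-dimensional state are all assumptions.

```python
import numpy as np

def simulate_drifting_walk(x0, mean_pose, pull=0.1, noise_std=0.0, steps=200, seed=0):
    """Discrete Ornstein-Uhlenbeck-style walk: random steps plus a pull towards mean_pose."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    mean_pose = np.asarray(mean_pose, dtype=float)
    path = [x.copy()]
    for _ in range(steps):
        noise = noise_std * rng.standard_normal(x.shape)
        # Random-walk step plus a drift proportional to the distance from the attractor.
        x = x + pull * (mean_pose - x) + noise
        path.append(x.copy())
    return np.array(path)

# With the noise switched off, the state decays geometrically towards the attractor.
path = simulate_drifting_walk([1.0, -2.0], [0.0, 0.5])
print(path[-1])

# With noise, the state wanders but stays near the attractor.
noisy_path = simulate_drifting_walk([1.0, -2.0], [0.0, 0.5], noise_std=0.05)
print(noisy_path[-1])
```

Setting `pull` to zero recovers a plain random walk, which is one way to see the model as a random walk with an added drift term.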

preprint · 2013 · arXiv

ChESS - Quick and Robust Detection of Chess-board Features

Localization of chess-board vertices is a common task in computer vision, underpinning many applications, but relatively little work focusses on designing a specific feature detector that is fast, accurate and robust. In this paper the 'Chess-board Extraction by Subtraction and Summation' (ChESS) feature detector, designed to exclusively respond to chess-board vertices, is presented. The method proposed is robust against noise, poor lighting and poor contrast, requires no prior knowledge of the extent of the chess-board pattern, is computationally very efficient, and provides a strength measure of detected features. Such a detector has significant application both in the key field of camera calibration, as well as in Structured Light 3D reconstruction. Evidence is presented showing its robustness, accuracy, and efficiency in comparison to other commonly used detectors both under simulation and in experimental 3D reconstruction of flat plate and cylindrical objects.
