Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
13 works
0 followers
15 topics
4 close collaborators

Published work

13 published items

preprint, 2026, arXiv

TowerMind: A Tower Defence Game Learning Environment and Benchmark for LLM as Agents

Recent breakthroughs in Large Language Models (LLMs) have positioned them as a promising paradigm for agents, with long-term planning and decision-making emerging as core general-purpose capabilities for adapting to diverse scenarios and tasks. Real-time strategy (RTS) games serve as an ideal testbed for evaluating these two capabilities, as their inherent gameplay requires both macro-level strategic planning and micro-level tactical adaptation and action execution. Existing RTS game-based environments either suffer from relatively high computational demands or lack support for textual observations, which has constrained the use of RTS games for LLM evaluation. Motivated by this, we present TowerMind, a novel environment grounded in the tower defense (TD) subgenre of RTS games. TowerMind preserves the key evaluation strengths of RTS games for assessing LLMs, while featuring low computational demands and a multimodal observation space, including pixel-based, textual, and structured game-state representations. In addition, TowerMind supports the evaluation of model hallucination and provides a high degree of customizability. We design five benchmark levels to evaluate several widely used LLMs under different multimodal input settings. The results reveal a clear performance gap between LLMs and human experts across both capability and hallucination dimensions. The experiments further highlight key limitations in LLM behavior, such as inadequate planning validation, a lack of multifinality in decision-making, and inefficient action use. We also evaluate two classic reinforcement learning algorithms: Ape-X DQN and PPO. By offering a lightweight and multimodal design, TowerMind complements the existing RTS game-based environment landscape and introduces a new benchmark for the AI agent field. The source code is publicly available on GitHub (https://github.com/tb6147877/TowerMind).

preprint, 2025, arXiv

Towards a deeper fundamental understanding of (Al,Sc)N ferroelectric nitrides

Density Functional Theory (DFT) calculations, within the virtual crystal alloy approximation, are performed, along with the development of a Landau-type model employing a symmetry-allowed analytical expression of the internal energy, with parameters determined from first principles, to investigate the properties and energetics of Al1-xScxN ferroelectric nitrides in their hexagonal forms. These DFT computations and this model predict the existence of two different types of minima, namely the 4-fold-coordinated wurtzite (WZ) polar structure and a 5-fold-coordinated paraelectric hexagonal phase (denoted H5), for any Sc composition up to 40%. The H5 minimum progressively becomes the lowest-energy state within hexagonal symmetry as the Sc concentration increases from 0 to 40%. Furthermore, the model points to several key findings. Examples include the crucial role of the coupling between polarization and strains in creating the WZ minimum, in addition to polar and elastic energies, and the fact that the H5 state overtakes the WZ phase as the global minimum within hexagonal symmetry with increasing Sc composition mostly because of the compositional dependence of only two parameters, one linked to the polarization and the other purely elastic in nature. Other examples are that forcing Al1-xScxN systems to have no change, or only a weak change, in lattice parameters when heated reproduces their finite-temperature polar properties well, and that a value of the axial ratio close to that of the ideal WZ structure does imply a large polarization at low temperatures but not necessarily at high temperatures, because of the order-disorder character of the temperature-induced formation of the WZ state. Such findings should allow for a better fundamental understanding of (Al,Sc)N ferroelectric nitrides, which may be used to design efficient devices operating at low voltages.

preprint, 2023, arXiv

A high-order deferred correction method for the solution of free boundary problems using penalty iteration, with an application to American option pricing

This paper presents a high-order deferred correction algorithm combined with penalty iteration for solving free and moving boundary problems, using a fourth-order finite difference method. Typically, when free boundary problems are solved on a fixed computational grid, the order of the solution is low due to the discontinuity in the solution at the free boundary, even if a high-order method is used. Using a detailed error analysis, we observe that the order of convergence of the solution can be increased to fourth-order by solving successively corrected finite difference systems, where the corrections are derived from the previously computed lower order solutions. The penalty iterations converge quickly given a good initial guess. We demonstrate the accuracy and efficiency of our algorithm using several examples. Numerical results show that our algorithm gives fourth-order convergence for both the solution and the free boundary location. We also test our algorithm on the challenging American put option pricing problem. Our algorithm gives the expected high-order convergence.
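
The penalty-iteration ingredient named in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses a plain second-order finite difference stencil with no deferred correction, and it solves a toy obstacle problem rather than an option-pricing problem. The obstacle `g`, the penalty factor `rho`, the grid size and all function names are hypothetical choices made for illustration only.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0], upper[-1] unused)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[n - 1] = d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def penalty_iteration(n=99, rho=1e8, max_iter=100):
    """Toy obstacle problem on (0, 1): -u'' >= 0, u >= g, u(0) = u(1) = 0.

    Each pass penalizes the nodes where the previous iterate violates
    u >= g, then re-solves the linear system. The loop stops when the
    set of penalized nodes no longer changes.
    """
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    g = [0.5 - 8.0 * (x - 0.5) ** 2 for x in xs]  # hypothetical obstacle
    off = -1.0 / h ** 2                           # off-diagonal of -d2/dx2
    u = [0.0] * n                                 # initial guess
    active = None
    for it in range(1, max_iter + 1):
        new_active = [u[i] < g[i] for i in range(n)]
        if new_active == active:
            return xs, g, u, it - 1, True         # it - 1 linear solves done
        active = new_active
        # (A + rho * P) u = rho * P g, with P the 0/1 penalty indicator
        diag = [2.0 / h ** 2 + (rho if a else 0.0) for a in active]
        rhs = [rho * g[i] if active[i] else 0.0 for i in range(n)]
        u = solve_tridiagonal([off] * n, diag, [off] * n, rhs)
    return xs, g, u, max_iter, False


if __name__ == "__main__":
    xs, g, u, solves, converged = penalty_iteration()
    contact = [x for x, ui, gi in zip(xs, u, g) if abs(ui - gi) < 1e-6]
    print("converged:", converged, "linear solves:", solves)
    print("approximate contact region:", contact[0], "to", contact[-1])
```

In this sketch the constraint is enforced only up to an error of order 1/rho, and the ends of the contact region play the role of the free boundary. Starting from a poor initial guess, the penalized set shrinks gradually, which is consistent with the abstract's remark that a good initial guess helps the iteration converge quickly.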

preprint, 2022, arXiv

A Concept and Argumentation based Interpretable Model in High Risk Domains

Interpretability has become an essential topic for artificial intelligence in some high-risk domains such as healthcare, banking and security. For commonly used tabular data, traditional methods trained end-to-end machine learning models with numerical and categorical data only, and did not leverage human-understandable knowledge such as data descriptions. Yet mining human-level knowledge from tabular data and using it for prediction remains a challenge. Therefore, we propose a concept and argumentation based model (CAM) that includes the following two components: a novel concept mining method to obtain human-understandable concepts and their relations from both descriptions of features and the underlying data, and a quantitative argumentation-based method for knowledge representation and reasoning. As a result, CAM provides decisions that are based on human-level knowledge, and the reasoning process is intrinsically interpretable. Finally, to visualize the proposed interpretable model, we provide a dialogical explanation that contains the dominant reasoning path within CAM. Experimental results on both an open source benchmark dataset and a real-world business dataset show that (1) CAM is transparent and interpretable, and the knowledge inside CAM is coherent with human understanding; (2) our interpretable approach can reach competitive results compared with other state-of-the-art models.

preprint, 2022, arXiv

Double-Pulse Generation of Indistinguishable Single Photons with Optically Controlled Polarization

Single-photon sources play a key role in photonic quantum technologies. Semiconductor quantum dots can emit indistinguishable single photons under resonant excitation. However, the resonance fluorescence technique typically requires cross-polarization filtering, which causes a 50% loss of the unpolarized quantum dot emission. To solve this problem, we demonstrate a method to generate indistinguishable single photons with optically controlled polarization using two laser pulses off-resonant with the neutral exciton states. This scheme is realized by exciting the quantum dot to the biexciton state and subsequently driving the quantum dot to an exciton eigenstate. In combination with a magnetic field, we demonstrate the generation of photons with optically controlled polarization (polarization degree of 101(2)%), laser-neutral exciton detuning up to 0.81 meV, high single-photon purity (99.6(1)%) and indistinguishability (85(4)%). Laser pulses can be blocked using polarization and spectral filtering. Our work makes an important step towards indistinguishable single-photon sources with near-unity collection efficiency.

preprint, 2022, arXiv

Dual function spin-wave logic gates based on electric field control magnetic anisotropy boundary

Spin waves (SWs) have been considered a promising candidate for encoding information with lower power consumption. Here, we propose the dual function SW logic gates based on the electric field controlling the SW propagation in the Fe film of Fe/BaTiO3 heterostructure with the motion of magnetic anisotropy boundary (MAB). We show micromagnetic simulations to validate the AND-OR and NAND-NOR logic gates. Our research may find a path for simplifying integrated logic circuits using such dual function SW logic gates.

preprint, 2022, arXiv

Towards 3D Scene Understanding by Referring Synthetic Models

Promising performance has been achieved for visual perception on point clouds. However, current methods typically rely on labour-intensive annotations of the scene scans. In this paper, we explore how synthetic models can alleviate the real scene annotation burden, i.e., taking the labelled 3D synthetic models as reference for supervision, the neural network aims to recognize specific categories of objects on a real scene scan (without scene annotation for supervision). The problem studies how to transfer knowledge from synthetic 3D models to real 3D scenes and is named Referring Transfer Learning (RTL). The main challenge is bridging the model-to-scene (from a single model to the scene) and synthetic-to-real (from synthetic model to real scene's object) gaps between the synthetic model and the real scene. To this end, we propose a simple yet effective framework to perform two alignment operations. First, physical data alignment aims to make the synthetic models cover the diversity of the scene's objects with data processing techniques. Then a novel convex-hull regularized feature alignment introduces learnable prototypes to project the point features of both synthetic models and real scenes to a unified feature space, which alleviates the domain gap. These operations ease the model-to-scene and synthetic-to-real difficulty for a network to recognize the target objects on a real unseen scene. Experiments show that our method achieves an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively, by learning the synthetic models from the ModelNet dataset. Code will be publicly available.

preprint, 2021, arXiv

An Intelligent Self-driving Truck System For Highway Transportation

Recently, there have been many advances in the autonomous driving community, attracting a lot of attention from academia and industry. However, existing works mainly focus on cars; extra development is still required for self-driving truck algorithms and models. In this paper, we introduce an intelligent self-driving truck system. Our system consists of three main components: 1) a realistic traffic simulation module for generating realistic traffic flow in testing scenarios, 2) a high-fidelity truck model designed and evaluated to mimic real truck response in real-world deployment, and 3) an intelligent planning module with a learning-based decision-making algorithm and a multi-mode trajectory planner, taking into account the truck's constraints, road slope changes, and the surrounding traffic flow. We provide quantitative evaluations for each component individually to demonstrate the fidelity and performance of each part. We also deploy our proposed system on a real truck and conduct real-world experiments, which show our system's capacity to mitigate the sim-to-real gap. Our code is available at https://github.com/InceptioResearch/IITS

preprint, 2021, arXiv

Crowd-Driven Mapping, Localization and Planning

Navigation in dense crowds is a well-known open problem in robotics with many challenges in mapping, localization, and planning. Traditional solutions consider dense pedestrians as passive/active moving obstacles that are the cause of all troubles: they negatively affect the sensing of static scene landmarks and must be actively avoided for safety. In this paper, we provide a new perspective: the crowd flow locally observed can be treated as a sensory measurement about the surrounding scenario, encoding not only the scene's traversability but also its social navigation preference. We demonstrate that even using the crowd-flow measurement alone without any sensing about static obstacles, our method still accomplishes good results for mapping, localization, and social-aware planning in dense crowds. Videos of the experiments are available at https://sites.google.com/view/crowdmapping.

preprint, 2021, arXiv

F-CAD: A Framework to Explore Hardware Accelerators for Codec Avatar Decoding

Creating virtual avatars with realistic rendering is one of the most essential and challenging tasks to provide highly immersive virtual reality (VR) experiences. It requires not only sophisticated deep neural network (DNN) based codec avatar decoders to ensure high visual quality and precise motion expression, but also efficient hardware accelerators to guarantee smooth real-time rendering using lightweight edge devices, like untethered VR headsets. Existing hardware accelerators, however, fail to deliver sufficient performance and efficiency targeting such decoders which consist of multi-branch DNNs and require demanding compute and memory resources. To address these problems, we propose an automation framework, called F-CAD (Facebook Codec avatar Accelerator Design), to explore and deliver optimized hardware accelerators for codec avatar decoding. Novel technologies include 1) a new accelerator architecture to efficiently handle multi-branch DNNs; 2) a multi-branch dynamic design space to enable fine-grained architecture configurations; and 3) an efficient architecture search for picking the optimized hardware design based on both application-specific demands and hardware resource constraints. To the best of our knowledge, F-CAD is the first automation tool that supports the whole design flow of hardware acceleration of codec avatar decoders, allowing joint optimization on decoder designs in popular machine learning frameworks and corresponding customized accelerator design with cycle-accurate evaluation. Results show that the accelerators generated by F-CAD can deliver up to 122.1 frames per second (FPS) and 91.6% hardware efficiency when running the latest codec avatar decoder. Compared to the state-of-the-art designs, F-CAD achieves 4.0X and 2.8X higher throughput, 62.5% and 21.2% higher efficiency than DNNBuilder and HybridDNN by targeting the same hardware device.

preprint, 2021, arXiv

Multi-Rate Nyquist-SCM for C-Band 100Gbit/s Signal over 50km Dispersion-Uncompensated Link

In this paper, we propose what is, to the best of our knowledge, the first multi-rate Nyquist subcarrier modulation (SCM) for C-band 100Gbit/s signal transmission over a 50km dispersion-uncompensated link. Chromatic dispersion (CD) introduces severe spectral nulls on the optical double-sideband signal, which greatly degrades the performance of intensity-modulation and direct-detection systems. Based on prior knowledge of the dispersive channel, Nyquist-SCM with multi-rate subcarriers is proposed to flexibly avoid the CD-caused spectral nulls. The signal on each subcarrier can be individually recovered by digital signal processing, including a feed-forward equalizer with no more than 31 taps, a two-tap post filter, and maximum likelihood sequence estimation with a memory length of one. Combined with entropy loading based on probabilistic constellation shaping to maximize the capacity-reach, the C-band 100Gbit/s multi-rate Nyquist-SCM signal over the 50km dispersion-uncompensated link can achieve the 7% hard-decision forward error correction limit and an average normalized generalized mutual information of 0.967 at a received optical power of -4dBm and an optical signal-to-noise ratio of 47.67dB. In conclusion, the multi-rate Nyquist-SCM shows great potential for solving the CD-caused spectral distortions.

preprint, 2020, arXiv

Comparison of Machine Learning Methods for Predicting Karst Spring Discharge in North China

Quantitative analyses of karst spring discharge typically rely on physical-based models, which are inherently uncertain. To improve the understanding of the mechanism of spring discharge fluctuation and the relationship between precipitation and spring discharge, three machine learning methods were developed to reduce the predictive errors of physical-based groundwater models, simulate the discharge of Longzici Spring's karst area, and predict changes in the spring on the basis of long time series of precipitation monitoring and spring water flow data from 1987 to 2018. The three machine learning methods included two artificial neural networks (ANNs), namely, multilayer perceptron (MLP) and long short-term memory-recurrent neural network (LSTM-RNN), and support vector regression (SVR). A normalization method was introduced for data preprocessing to make the three methods robust and computationally efficient. To compare and evaluate the capability of the three machine learning methods, the mean squared error (MSE), mean absolute error (MAE), and root-mean-square error (RMSE) were selected as the performance metrics. Simulations showed that MLP reduced MSE, MAE, and RMSE to 0.0010, 0.0254, and 0.0318, respectively. Meanwhile, LSTM-RNN reduced MSE to 0.0010, MAE to 0.0272, and RMSE to 0.0329. For SVR, MSE, MAE, and RMSE were reduced to 0.0910, 0.1852, and 0.3017, respectively. Results indicated that MLP performed slightly better than LSTM-RNN, and that MLP and LSTM-RNN performed considerably better than SVR. Furthermore, ANNs were demonstrated to be the preferable machine learning methods for simulating and predicting karst spring discharge.
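
For readers unfamiliar with the three metrics named in this abstract, the following minimal sketch shows how MSE, MAE and RMSE are conventionally computed. The function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def mse(observed, predicted):
    """Mean squared error: average of squared residuals."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error: average of absolute residuals."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-square error: square root of the MSE."""
    return math.sqrt(mse(observed, predicted))


if __name__ == "__main__":
    # Hypothetical normalized discharge values, for illustration only.
    observed = [0.2, 0.4, 0.6, 0.8]
    predicted = [0.25, 0.35, 0.65, 0.7]
    print("MSE :", round(mse(observed, predicted), 6))
    print("MAE :", round(mae(observed, predicted), 6))
    print("RMSE:", round(rmse(observed, predicted), 6))
```

Because RMSE is the square root of MSE, the two reported values can be cross-checked against each other: for SVR, the square root of 0.0910 is about 0.3017, which matches the RMSE quoted in the abstract.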

preprint, 2019, arXiv

Phase evolution and thermal stability of high Curie temperature BiScO$_3$-PbTiO$_3$-Pb(Cd$_{1/3}$Nb$_{2/3}$)O$_3$ ceramics near MPB

Piezoelectric and ferroelectric ceramics with a high Curie temperature (Tc) have attracted growing attention owing to their applications in severe environments. In this work, the phase structure and the dielectric, ferroelectric and piezoelectric properties of (0.975-x)BiScO3-xPbTiO3-0.025Pb(Cd1/3Nb2/3)O3 ceramics (x = 0.58-0.64) were studied. A composition-induced structural transformation occurs from the rhombohedral phase to the tetragonal phase through an intermediate monoclinic phase with increasing PT concentration. The relationship between the structure and the electrical properties of the system was discussed. The BS-xPT-PCN system near the morphotropic phase boundary (x = 0.62) exhibits excellent piezoelectric and ferroelectric performance, with d33 = 508 pC/N, kp = 56%, and Pr = 40 uC/cm2. The high-temperature piezoelectricity of the sample at the MPB (x = 0.62) was characterized by in situ XRD. The excellent thermal stability of the crystal structure and the piezoelectric property indicates that the BS-xPT-PCN system is a promising candidate for high-temperature piezoelectric applications.