Researcher profile

Yicheng Zhang

Yicheng Zhang contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
8 works
0 followers
9 topics
4 close collaborators

Published work

8 published items

preprint · 2026 · arXiv

Deep Pre-Alignment for VLMs

Most Vision Language Models (VLMs) directly map outputs from ViT encoders to the LLM via a lightweight projector. While effective, recent analysis suggests this architecture suffers from an alignment challenge: visual features remain distant from the text space in the initial layers of the LLM, forcing the model to waste critical depth (Zhang et al., 2024; Artzy and Schwartz, 2024) on superficial modality alignment rather than deep understanding and complex reasoning. In this work, we propose Deep Pre-Alignment (DPA), a novel architecture that replaces the standard ViT encoder with a small VLM as perceiver, ensuring visual features are deeply aligned with the text space of the target large language model. Comprehensive experiments demonstrate the effectiveness of DPA. On the 4B parameter scale, DPA outperforms baselines by 1.9 points across 8 multimodal benchmarks, with gains widening to 3.0 points at the 32B scale. Moreover, by offloading alignment to the perceiver, DPA achieves a 32.9% reduction in language capability forgetting over 3 text benchmarks. We further demonstrate that these gains are consistent across different LLM families including Qwen3 and LLaMA 3.2, highlighting the generality of our approach. Beyond performance, DPA also offers a seamless upgrade path for current VLM development, requiring only a modular replacement for the visual encoder with marginal computation overhead.

preprint · 2026 · arXiv

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI

Modern AI progress has been driven by ML methods that are generalizable across settings and scalable to larger regimes. As large language models demonstrate advanced capabilities in reasoning, coding, and engineering tasks, it is increasingly important to understand whether they can discover such methods rather than only apply existing ones. We introduce MLS-Bench, a benchmark for evaluating whether AI systems can invent generalizable and scalable ML methods. MLS-Bench contains 140 tasks across 12 domains, each requiring an agent to improve one targeted component of an ML system or algorithm and demonstrate that the improvement generalizes across controlled settings and scales. We find that current agents remain far from reliably surpassing human-designed methods, and that engineering-style tuning is easier for them than genuine method invention. We further study the effects of test-time scaling, adaptive compute allocation, and context provision on agents' discovery performance, together with case studies of their behavior. Our analyses suggest that the bottleneck is not only in proposing new methods, but also in the scientific insight needed to plan, validate, and scale claims about them. More search, compute, or context alone does not remove this bottleneck. We build and maintain a community platform for cumulative and comparable iteration, and release the data and code at https://mls-bench.com.

preprint · 2024 · arXiv

Aircraft Landing Time Prediction with Deep Learning on Trajectory Images

Aircraft landing time (ALT) prediction is crucial for air traffic management, especially for arrival aircraft sequencing on the runway. In this study, a trajectory image-based deep learning method is proposed to predict ALTs for the aircraft entering the research airspace that covers the Terminal Maneuvering Area (TMA). Specifically, the trajectories of all airborne arrival aircraft within the temporal capture window are used to generate an image with the target aircraft trajectory labeled as red and all background aircraft trajectories labeled as blue. The trajectory images contain various information, including the aircraft position, speed, heading, relative distances, and arrival traffic flows. This enables us to use state-of-the-art deep convolutional neural networks for ALT modeling. We also use real-time runway usage obtained from the trajectory data and external information such as aircraft types and weather conditions as additional inputs. Moreover, a convolutional neural network (CNN)-based module is designed for automatic holding-related featurization, which takes the trajectory images, the leading aircraft holding status, and their time and speed gap at the research airspace boundary as its inputs. Its output is further fed into the final end-to-end ALT prediction. The proposed ALT prediction approach is applied to Singapore Changi Airport (ICAO Code: WSSS) using one-month Automatic Dependent Surveillance-Broadcast (ADS-B) data from November 1 to November 30, 2022. Experimental results show that by integrating the holding featurization, we can reduce the mean absolute error (MAE) from 82.23 seconds to 43.96 seconds, and achieve an average accuracy of 96.1%, with 79.4% of the prediction errors being less than 60 seconds.
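The trajectory-image idea described in this abstract can be illustrated with a minimal sketch: background tracks are drawn in blue and the target track in red on a small pixel grid. This is not the paper's pipeline. The function name `rasterize`, the 8x8 grid, the black canvas, and the assumption that positions are already normalized to [0, 1] are all hypothetical choices made for illustration, and only sample points are plotted (no line interpolation).

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), assumed already normalized to [0, 1]
RED, BLUE, BLACK = (255, 0, 0), (0, 0, 255), (0, 0, 0)


def rasterize(target: List[Point], background: List[List[Point]], size: int = 8):
    """Return a size x size RGB grid: background tracks blue, target track red."""
    image = [[BLACK for _ in range(size)] for _ in range(size)]

    def draw(track: List[Point], color) -> None:
        for x, y in track:
            # Map normalized coordinates to pixel indices, clamping x = 1.0 or y = 1.0.
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            image[row][col] = color

    for track in background:
        draw(track, BLUE)
    # The target is drawn last so it stays visible where tracks overlap.
    draw(target, RED)
    return image


# Hypothetical sample points: both tracks pass through the centre of the grid.
image = rasterize(
    target=[(0.1, 0.1), (0.5, 0.5)],
    background=[[(0.9, 0.1), (0.5, 0.5)]],
)
```

An image built this way could then be passed to a convolutional network as an ordinary three-channel input; the drawing order is the one design choice that matters here, since it decides which colour wins on shared pixels.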

preprint · 2022 · arXiv

A SUMO Framework for Deep Reinforcement Learning Experiments Solving Electric Vehicle Charging Dispatching Problem

In modern cities, the number of electric vehicles (EVs) is increasing rapidly owing to their low emissions and better dynamic performance, leading to increasing demand for EV charging. However, due to the limited number of EV charging facilities, catering to the huge demand for time-consuming EV charging becomes a critical problem. It is quite a challenge to dispatch EVs in the dynamic traffic environment and coordinate interaction among agents. To better serve further research on various related Deep Reinforcement Learning (DRL) EV dispatching algorithms, an efficient simulation environment is necessary. Because Simulation of Urban Mobility (SUMO) is one of the most widely used open-source simulators, creating an environment on SUMO that satisfies research requirements is of great significance. We aim to improve the efficiency of EV charging station usage and save time for EV users in further work. To this end, in this paper we design an EV navigation system on the basis of the traffic simulator SUMO, using the Jurong area of Singapore. Various state-of-the-art DRL algorithms are deployed on the designed testbed to validate the feasibility of the framework in terms of EV charging dispatching problems. Besides EV dispatching problems, the environment can also serve other reinforcement learning (RL) traffic control problems.

preprint · 2022 · arXiv

AST-GIN: Attribute-Augmented Spatial-Temporal Graph Informer Network for Electric Vehicle Charging Station Availability Forecasting

Electric Vehicle (EV) charging demand and charging station availability forecasting is one of the challenges in intelligent transportation systems. With accurate prediction of the EV station situation, suitable charging behaviors could be scheduled in advance to relieve range anxiety. Many deep learning methods have been proposed to address this issue. However, due to the complex road network structure and comprehensive external factors, such as points of interest (POIs) and weather effects, many commonly used algorithms extract only the historical usage information without considering the comprehensive influence of external factors. To enhance the prediction accuracy and interpretability, the Attribute-Augmented Spatial-Temporal Graph Informer (AST-GIN) structure is proposed in this study by combining the Graph Convolutional Network (GCN) layer and the Informer layer to extract both external and internal spatial-temporal dependence of relevant transportation data. The external factors are modeled as dynamic attributes by the attribute-augmented encoder for training. The AST-GIN model is tested on data collected in Dundee City, and experimental results show the effectiveness of our model, which considers the influence of external factors, over various horizon settings compared with other baselines.
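As a rough illustration of the GCN layer named in this abstract, the sketch below applies one standard graph-convolution step, ReLU(Â H W), where Â is the symmetrically normalized adjacency matrix with self-loops. This is the commonly used textbook GCN form and is not claimed to match the AST-GIN implementation; the function names, the two-station graph, and the feature values are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def normalized_adjacency(adj: Matrix) -> Matrix:
    """Return D^(-1/2) (A + I) D^(-1/2), with D the degree matrix of A + I."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product, sufficient for small illustrative graphs."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj: Matrix, features: Matrix, weights: Matrix) -> Matrix:
    """One propagation step: mix neighbour features, project, apply ReLU."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Hypothetical example: two connected stations with one feature each.
adjacency = [[0.0, 1.0], [1.0, 0.0]]
station_features = [[1.0], [3.0]]
hidden = gcn_layer(adjacency, station_features, weights=[[1.0]])
```

With an identity weight, each station ends up with the average of its own feature and its neighbour's, which is the spatial smoothing a GCN layer contributes before any temporal model is applied.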

preprint · 2022 · arXiv

Statistical properties of the off-diagonal matrix elements of observables in eigenstates of integrable systems

We study the statistical properties of the off-diagonal matrix elements of observables in the energy eigenstates of integrable quantum systems. They have been found to be dense in the spin-1/2 XXZ chain, while they are sparse in noninteracting systems. We focus on the quasimomentum occupation of hard-core bosons in one dimension, and show that the distributions of the off-diagonal matrix elements are well described by generalized Gamma distributions, in both the presence and absence of translational invariance but not in the presence of localization. We also show that the results obtained for the off-diagonal matrix elements of observables in the spin-1/2 XXZ model are well described by a generalized Gamma distribution.
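For readers unfamiliar with the generalized Gamma distribution mentioned in this abstract, the sketch below evaluates its density in the common Stacy parametrization, f(x) = (p / a^d) x^(d-1) exp(-(x/a)^p) / Γ(d/p) for x > 0. The parametrization used in the paper may differ; the function name `gen_gamma_pdf` and the parameter names `a`, `d`, `p` are assumptions made for illustration.

```python
import math


def gen_gamma_pdf(x: float, a: float, d: float, p: float) -> float:
    """Generalized Gamma density (Stacy form) with scale a and shapes d, p."""
    if x <= 0.0:
        return 0.0  # the density is supported on x > 0
    return (p / a**d) * x ** (d - 1.0) * math.exp(-((x / a) ** p)) / math.gamma(d / p)


# Special cases recover familiar distributions:
#   d = p = 1 gives the exponential density exp(-x / a) / a
#   d = 1, p = 2, a = sqrt(2) gives the half-normal density with unit scale
exponential_value = gen_gamma_pdf(1.0, a=1.0, d=1.0, p=1.0)
half_normal_value = gen_gamma_pdf(0.5, a=math.sqrt(2.0), d=1.0, p=2.0)
```

The two shape parameters are what make the family flexible: one controls the behaviour near zero and the other the decay of the tail.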

preprint · 2020 · arXiv

A Distributed Architecture for Real-time Hybrid Traffic Light Control in Urban Transportation Networks

A macroscopic model is proposed to depict the traffic dynamics involved in urban traffic systems. The link dynamics are described based on the cell-transmission model and bounded by the link capacities, while the flow dynamics are proposed based on the discharge headways and saturation flow at intersections. To fulfill the requirement of a closed-loop traffic light control strategy, an approach to estimate the branching ratios at intersections is proposed, and simulations show that convergence is achieved under constant cyclic flow profiles. Furthermore, a system partitioning approach is proposed based on congestion-level identification, which is achieved via a machine learning method, and a hybrid traffic network control strategy is proposed to integrate different traffic light control schemes.
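The cell-transmission model referred to in this abstract can be illustrated with a minimal sketch of one update step on a single link. It follows the textbook form of the model (flow into a cell is limited by upstream occupancy, the flow capacity, and the remaining space downstream) and is not necessarily the paper's formulation. The function name `ctm_step` and the parameters `q_max`, `n_max` and `delta` are hypothetical, and the link exit is assumed to be unobstructed.

```python
from typing import List


def ctm_step(n: List[float], inflow: float, q_max: float, n_max: float,
             delta: float = 1.0) -> List[float]:
    """Advance cell occupancies by one time step.

    n      vehicles currently in each cell, upstream to downstream
    inflow vehicles wanting to enter the first cell this step
    q_max  maximum vehicles that can cross any cell boundary per step
    n_max  maximum vehicles a cell can hold
    delta  ratio of backward wave speed to free-flow speed
    """
    cells = len(n)
    # flows[i] is the number of vehicles entering cell i;
    # flows[cells] is the number leaving the link.
    flows = [0.0] * (cells + 1)
    flows[0] = min(inflow, q_max, delta * (n_max - n[0]))
    for i in range(1, cells):
        flows[i] = min(n[i - 1], q_max, delta * (n_max - n[i]))
    flows[cells] = min(n[-1], q_max)  # assumed free exit, capped by capacity
    # Conservation of vehicles: occupancy changes by inflow minus outflow.
    return [n[i] + flows[i] - flows[i + 1] for i in range(cells)]


# Hypothetical link of three cells with a jammed middle cell.
next_state = ctm_step([4, 10, 0], inflow=5, q_max=3, n_max=10)
```

In this example the full middle cell accepts no vehicles from the first cell, so the queue grows upstream while the middle cell discharges at capacity into the empty cell ahead of it.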

preprint · 2019 · arXiv

Observation of Dynamical Fermionization

We observe dynamical fermionization, where the momentum distribution of a Tonks-Girardeau (T-G) gas of strongly interacting bosons in 1D evolves from bosonic to fermionic after its axial confinement is removed. The asymptotic momentum distribution after expansion in 1D is the distribution of rapidities, which are the conserved quantities associated with many-body integrable systems. Rapidities have not previously been measured in any interacting many-body quantum system. Our measurements agree well with T-G gas theory. We also study momentum evolution after the trap depth is suddenly changed to a new non-zero value. We observe the predicted bosonic-fermionic oscillations and see deviations from the theory outside of the T-G gas limit.