Source author record

Chenxi Liu

Chenxi Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, Information Theory, math.IT, Machine Learning, eess.SP, Computation and Language, Robotics, Computational Geometry, cond-mat.stat-mech, cond-mat.str-el, Graphics, Networking and Internet Architecture, physics.comp-ph, quant-ph

Catalog footprint

What is connected

18 works

14 topics

4 close collaborators

preprint | 2026 | arXiv

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

Test-time scaling (TTS) has become an effective approach for improving large language model performance by allocating additional computation during inference. However, existing TTS strategies are largely hand-crafted: researchers manually design reasoning patterns and tune heuristics by intuition, leaving much of the computation-allocation space unexplored. We propose an environment-driven framework, AutoTTS, that changes what researchers design: from individual TTS heuristics to environments where TTS strategies can be discovered automatically. The key to AutoTTS lies in environment construction: the discovery environment must make the control space tractable and provide cheap, frequent feedback for TTS search. As a concrete instantiation, we formulate width-depth TTS as controller synthesis over pre-collected reasoning trajectories and probe signals, where controllers decide when to branch, continue, probe, prune, or stop and can be evaluated cheaply without repeated LLM calls. We further introduce beta parameterization to make the search tractable and fine-grained execution trace feedback to improve discovery efficiency by helping the agent diagnose why a TTS program fails. Experiments on mathematical reasoning benchmarks show that the discovered strategies improve the overall accuracy-cost tradeoff over strong manually designed baselines. The discovered strategies generalize to held-out benchmarks and model scales, while the entire discovery costs only $39.9 and 160 minutes. Our data and code will be open-sourced at https://github.com/zhengkid/AutoTTS.

preprint | 2022 | arXiv

ConTesse: Accurate Occluding Contours for Subdivision Surfaces

This paper proposes a method for computing the visible occluding contours of subdivision surfaces. The paper first introduces new theory for contour visibility of smooth surfaces. Necessary and sufficient conditions are introduced for when a sampled occluding contour is valid, that is, when it may be assigned consistent visibility. Previous methods do not guarantee these conditions, which helps explain why smooth contour visibility has been such a challenging problem in the past. The paper then proposes an algorithm that, given a subdivision surface, finds sampled contours satisfying these conditions, and then generates a new triangle mesh matching the given occluding contours. The contours of the output triangle mesh may then be rendered with standard non-photorealistic rendering algorithms, using the mesh for visibility computation. The method can be applied to any triangle mesh, by treating it as the base mesh of a subdivision surface.

preprint | 2022 | arXiv

IRS-Aided Non-Orthogonal ISAC Systems: Performance Analysis and Beamforming Design

Intelligent reflecting surface (IRS) has shown its effectiveness in facilitating orthogonal time-division integrated sensing and communications (TD-ISAC), in which the sensing task and the communication task occupy orthogonal time-frequency resources, while the role of IRS in the more interesting scenarios of non-orthogonal ISAC (NO-ISAC) systems has so far remained unclear. In this paper, we consider an IRS-aided NO-ISAC system, where a distributed IRS is deployed to assist concurrent communication and location sensing for a blind-zone user, occupying non-orthogonal/overlapped time-frequency resources. We first propose a modified Cramer-Rao lower bound (CRLB) to characterize the performances of both communication and location sensing in a unified manner. We further derive the closed-form expressions of the modified CRLB in our considered NO-ISAC system, enabling us to identify the fundamental trade-off between the communication and location sensing performances. In addition, by exploiting the modified CRLB, we propose a joint active and passive beamforming design algorithm that achieves a good communication and location sensing trade-off. Through numerical results, we demonstrate the superiority of the IRS-aided NO-ISAC systems over the IRS-aided TD-ISAC systems, in terms of both communication and localization performances. Besides, it is shown that the IRS-aided NO-ISAC system with random communication signals can achieve comparable localization performance to the IRS-aided localization system with dedicated positioning reference signals. Moreover, we investigate the trade-off between communication performance and localization performance and show how the performance of the NO-ISAC system can be significantly boosted by increasing the number of the IRS elements.

preprint | 2022 | arXiv

IRS-Based Integrated Location Sensing and Communication for mmWave SIMO Systems

In this paper, we establish an integrated sensing and communication (ISAC) system based on a distributed semi-passive intelligent reflecting surface (IRS), which allows location sensing and data transmission to be carried out simultaneously, sharing the same frequency and time resources. The detailed working process of the proposed IRS-based ISAC system is designed, including the transmission protocol, location sensing and beamforming optimization. Specifically, each coherence block consists of two periods, the ISAC period with two time blocks and the pure communication (PC) period. During each time block of the ISAC period, data transmission and user positioning are carried out simultaneously. The estimated user location in the first time block will be used for beamforming design in the second time block. During the PC period, only data transmission is conducted, by invoking the user location estimated in the second time block of the ISAC period for beamforming design. Simulation results show that a millimeter-level positioning accuracy can be achieved by the proposed location sensing scheme, demonstrating the advantage of the proposed IRS-based ISAC framework. Besides, the proposed two beamforming schemes based on the estimated location information achieve similar performance to the benchmark schemes assuming perfect channel state information (CSI), which verifies the effectiveness of beamforming design using sensed location information.

preprint | 2022 | arXiv

Multi-Class 3D Object Detection with Single-Class Supervision

While multi-class 3D detectors are needed in many robotics applications, training them with fully labeled datasets can be expensive in labeling cost. An alternative approach is to have targeted single-class labels on disjoint data samples. In this paper, we are interested in training a multi-class 3D object detection model, while using these single-class labeled data. We begin by detailing the unique stance of our "Single-Class Supervision" (SCS) setting with respect to related concepts such as partial supervision and semi supervision. Then, based on the case study of training the multi-class version of Range Sparse Net (RSN), we adapt a spectrum of algorithms -- from supervised learning to pseudo-labeling -- to fully exploit the properties of our SCS setting, and perform extensive ablation studies to identify the most effective algorithm and practice. Empirical experiments on the Waymo Open Dataset show that proper training under SCS can approach or match full supervision training while saving labeling costs.

preprint | 2022 | arXiv

PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions

Cross-entropy loss and focal loss are the most common choices when training deep neural networks for classification problems. Generally speaking, however, a good loss function can take on much more flexible forms, and should be tailored for different tasks and datasets. Motivated by how functions can be approximated via Taylor expansion, we propose a simple framework, named PolyLoss, to view and design loss functions as a linear combination of polynomial functions. Our PolyLoss allows the importance of different polynomial bases to be easily adjusted depending on the target tasks and datasets, while naturally subsuming the aforementioned cross-entropy loss and focal loss as special cases. Extensive experimental results show that the optimal choice within the PolyLoss is indeed dependent on the task and dataset. Simply by introducing one extra hyperparameter and adding one line of code, our Poly-1 formulation outperforms the cross-entropy loss and focal loss on 2D image classification, instance segmentation, object detection, and 3D object detection tasks, sometimes by a large margin.
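The "one extra hyperparameter and one line of code" idea can be illustrated with a minimal sketch. It assumes the Poly-1 form is cross-entropy plus a term epsilon * (1 - p_t), where p_t is the predicted probability of the target class; the function name `poly1_loss`, its signature, and the default `epsilon=1.0` are illustrative choices, not taken from the catalog record.

```python
import math

def poly1_loss(logits, label, epsilon=1.0):
    """Poly-1 loss for one sample: cross-entropy + epsilon * (1 - p_t).

    Assumed form; `epsilon` is the single extra hyperparameter.
    """
    # Numerically stable log-softmax of the target class.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    log_pt = logits[label] - log_norm
    cross_entropy = -log_pt
    pt = math.exp(log_pt)
    # The single extra term relative to plain cross-entropy.
    return cross_entropy + epsilon * (1.0 - pt)
```

With `epsilon=0.0` the function reduces to ordinary cross-entropy, which is a quick way to sanity-check an implementation before tuning the extra hyperparameter.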

preprint | 2022 | arXiv

Scene Transformer: A unified architecture for predicting multiple agent trajectories

Predicting the motion of multiple agents is necessary for planning in dynamic environments. This task is challenging for autonomous driving since agents (e.g. vehicles and pedestrians) and their associated behaviors may be diverse and influence one another. Most prior work has focused on predicting independent futures for each agent based on all past motion, and planning against these independent predictions. However, planning against independent predictions can make it challenging to represent the future interaction possibilities between different agents, leading to sub-optimal planning. In this work, we formulate a model for predicting the behavior of all agents jointly, producing consistent futures that account for interactions between agents. Inspired by recent language modeling approaches, we use a masking strategy as the query to our model, enabling one to invoke a single model to predict agent behavior in many ways, such as potentially conditioned on the goal or full future trajectory of the autonomous vehicle or the behavior of other agents in the environment. Our model architecture employs attention to combine features across road elements, agent interactions, and time steps. We evaluate our approach on autonomous driving datasets for both marginal and joint motion prediction, and achieve state-of-the-art performance across two popular datasets. Through combining a scene-centric approach, agent permutation equivariant model, and a sequence masking strategy, we show that our model can unify a variety of motion prediction tasks from joint motion predictions to conditioned prediction.

preprint | 2021 | arXiv

Fresh, Fair and Energy-Efficient Content Provision in a Private and Cache-Enabled UAV Network

In this paper, we investigate a private and cache-enabled unmanned aerial vehicle (UAV) network for content provision. Aiming at delivering fresh, fair, and energy-efficient content files to terrestrial users, we formulate a joint UAV caching, UAV trajectory, and UAV transmit power optimization problem. This problem is confirmed to be a sequential decision problem with mixed-integer non-convex constraints, which is intractable directly. To this end, we propose a novel algorithm based on the techniques of subproblem decomposition and convex approximation. Particularly, we first propose to decompose the sequential decision problem into multiple repeated optimization subproblems via a Lyapunov technique. Next, an iterative optimization scheme incorporating a successive convex approximation (SCA) technique is explored to tackle the challenging mixed-integer non-convex subproblems. Besides, we analyze the convergence and computational complexity of the proposed algorithm and derive the theoretical value of the expected peak age of information (PAoI) to estimate the content freshness. Simulation results demonstrate that the proposed algorithm can achieve the expected PAoI close to the theoretical value and is 22.11% and 70.51% more energy-efficient and fairer, respectively, than benchmark algorithms.

preprint | 2020 | arXiv

Are Labels Necessary for Neural Architecture Search?

Existing neural network architectures in computer vision -- whether designed by humans or by machines -- were typically found using both images and their associated labels. In this paper, we ask the question: can we find high-quality neural architectures using only images, but no human-annotated labels? To answer this question, we first define a new setup called Unsupervised Neural Architecture Search (UnNAS). We then conduct two sets of experiments. In sample-based experiments, we train a large number (500) of diverse architectures with either supervised or unsupervised objectives, and find that the architecture rankings produced with and without labels are highly correlated. In search-based experiments, we run a well-established NAS algorithm (DARTS) using various unsupervised objectives, and report that the architectures searched without labels can be competitive with their counterparts searched with labels. Together, these results reveal the potentially surprising finding that labels are not necessary, and the image statistics alone may be sufficient to identify good neural architectures.

preprint | 2020 | arXiv

Enhancing Physical Layer Security of Random Caching in Large-Scale Multi-Antenna Heterogeneous Wireless Networks

In this paper, we propose a novel secure random caching scheme for large-scale multi-antenna heterogeneous wireless networks, where the base stations (BSs) deliver randomly cached confidential contents to the legitimate users in the presence of passive eavesdroppers as well as active jammers. In order to safeguard the content delivery, we consider that the BSs transmit artificial noise together with the useful signals. By using tools from stochastic geometry, we first analyze the average reliable transmission probability (RTP) and the average confidential transmission probability (CTP), which take both the impact of the eavesdroppers and the impact of the jammers into consideration. We further provide tight upper and lower bounds on the average RTP. These analytical results enable us to obtain rich insights into the behaviors of the average RTP and the average CTP with respect to key system parameters. Moreover, we optimize the caching distribution of the files to maximize the average RTP of the system, while satisfying the constraints on the caching size and the average CTP. Through numerical results, we show that our proposed secure random caching scheme can effectively boost the secrecy performance of the system compared to the existing solutions.

preprint | 2020 | arXiv

Micro-Batch Training with Batch-Channel Normalization and Weight Standardization

Batch Normalization (BN) has become an out-of-the-box technique to improve deep network training. However, its effectiveness is limited for micro-batch training, i.e., each GPU typically has only 1-2 images for training, which is inevitable for many computer vision tasks, e.g., object detection and semantic segmentation, constrained by memory consumption. To address this issue, we propose Weight Standardization (WS) and Batch-Channel Normalization (BCN) to bring two success factors of BN into micro-batch training: 1) the smoothing effects on the loss landscape and 2) the ability to avoid harmful elimination singularities along the training trajectory. WS standardizes the weights in convolutional layers to smooth the loss landscape by reducing the Lipschitz constants of the loss and the gradients; BCN combines batch and channel normalizations and leverages estimated statistics of the activations in convolutional layers to keep networks away from elimination singularities. We validate WS and BCN on comprehensive computer vision tasks, including image classification, object detection, instance segmentation, video recognition and semantic segmentation. All experimental results consistently show that WS and BCN improve micro-batch training significantly. Moreover, using WS and BCN with micro-batch training is even able to match or outperform the performance of BN with large-batch training.
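The weight-standardization step described in this abstract can be sketched as follows. This is a minimal illustration, assuming that standardization is applied per output channel over all of that channel's weights and that a small constant `eps` is added to the variance for numerical stability; the function name and the flat-list weight layout are hypothetical simplifications rather than the authors' implementation.

```python
import math

def standardize_weights(weight, eps=1e-5):
    """Standardize each output channel's weights to zero mean, unit variance.

    `weight` is a list of output channels; each channel is a flat list of
    in_channels * kernel_h * kernel_w values (an assumed layout).
    """
    standardized = []
    for channel in weight:
        n = len(channel)
        mean = sum(channel) / n
        var = sum((w - mean) ** 2 for w in channel) / n
        # eps keeps the division stable when a channel is nearly constant.
        scale = math.sqrt(var + eps)
        standardized.append([(w - mean) / scale for w in channel])
    return standardized
```

In a training loop, a transform of this kind would be applied to the convolution weights on every forward pass, so that gradients flow through the standardization rather than around it.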

preprint | 2020 | arXiv

Predicting quantum many-body dynamics with transferable neural networks

Machine learning (ML) architectures such as convolutional neural networks (CNNs) have garnered considerable recent attention in the study of quantum many-body systems. However, advanced ML approaches such as transfer learning have seldom been applied to such contexts. Here we demonstrate that an efficient and transferable sequence learning framework based on a simple recurrent unit (SRU) is capable of learning and accurately predicting the time evolution of the one-dimensional (1D) Ising model with simultaneous transverse and parallel magnetic fields, as quantitatively corroborated by relative entropy measurements and magnetization between the predicted and exact state distributions. At a cost of constant computational complexity, a larger many-body state evolution was predicted in an autoregressive way from just one initial state, without any guidance or knowledge of any Hamiltonian. Our work paves the way for future applications of advanced ML methods in quantum many-body dynamics only with knowledge from a smaller system.

preprint | 2019 | arXiv

Rethinking Normalization and Elimination Singularity in Neural Networks

In this paper, we study normalization methods for neural networks from the perspective of elimination singularity. Elimination singularities correspond to the points on the training trajectory where neurons become consistently deactivated. They cause degenerate manifolds in the loss landscape which will slow down training and harm model performance. We show that channel-based normalizations (e.g. Layer Normalization and Group Normalization) are unable to guarantee a sufficient distance from elimination singularities, in contrast with Batch Normalization, which by design keeps models from getting too close to them. To address this issue, we propose Batch-Channel Normalization (BCN), which uses batch knowledge to avoid the elimination singularities in the training of channel-normalized models. Unlike Batch Normalization, BCN is able to run in both large-batch and micro-batch training settings. The effectiveness of BCN is verified on many tasks, including image classification, object detection, instance segmentation, and semantic segmentation. The code is here: https://github.com/joe-siyuan-qiao/Batch-Channel-Normalization.

preprint | 2016 | arXiv

Artificial-Noise-Aided Transmission in Multi-Antenna Relay Wiretap Channels with Spatially Random Eavesdroppers

We design a new secure transmission scheme in the relay wiretap channel where a source communicates with a destination through a decode-and-forward relay in the presence of spatially random-distributed eavesdroppers. For the sake of practicality, we consider a general antenna configuration in which the source, relay, destination, and eavesdroppers are equipped with multiple antennas. In order to confuse the eavesdroppers, we assume that both the source and the relay transmit artificial noise signals in addition to information signals. We first derive a closed-form expression for the transmission outage probability and an easy-to-compute expression for the secrecy outage probability. Notably, these expressions are valid for an arbitrary number of antennas at the source, relay, and destination. We then derive simple yet valuable expressions for the asymptotic transmission outage probability and the asymptotic secrecy outage probability, which reveal the secrecy performance when the number of antennas at the source grows sufficiently large. Using our expressions, we quantify a practical performance metric, namely the secrecy throughput, under a secrecy outage probability constraint. We further determine the system and channel parameters that maximize the secrecy throughput, leading to analytical security solutions suitable for real-world deployment.

preprint | 2016 | arXiv

Attention Correctness in Neural Image Captioning

Attention mechanisms have recently been introduced in deep learning for various tasks in natural language processing and computer vision. But despite their popularity, the "correctness" of the implicitly-learned attention maps has only been assessed qualitatively by visualization of several examples. In this paper we focus on evaluating and improving the correctness of attention in neural image captioning models. Specifically, we propose a quantitative evaluation metric for the consistency between the generated attention maps and human annotations, using recently released datasets with alignment between regions in images and entities in captions. We then propose novel models with different levels of explicit supervision for learning attention maps during training. The supervision can be strong when alignment between regions and caption entities is available, or weak when only object segments and categories are provided. We show on the popular Flickr30k and COCO datasets that introducing supervision of attention maps during training solidly improves both attention correctness and caption quality, showing the promise of making machine perception more human-like.

preprint | 2016 | arXiv

Location-Based Beamforming and Physical Layer Security in Rician Wiretap Channels

We propose a new location-based beamforming (LBB) scheme for wiretap channels, where a multi-antenna source communicates with a single-antenna legitimate receiver in the presence of a multi-antenna eavesdropper. We assume that all channels are in a Rician fading environment, the channel state information from the legitimate receiver is perfectly known at the source, and that the only information on the eavesdropper available at the source is her location. We first describe how the optimal beamforming vector that minimizes the secrecy outage probability of the system is obtained, illustrating its dependence on the eavesdropper's location. We then derive an easy-to-compute expression for the secrecy outage probability when our proposed LBB scheme is adopted. We also consider the positive impact a friendly jammer can have on our beamforming solution, showing how the path to optimality remains the same. Finally, we investigate the impact of location uncertainty on the secrecy outage probability, showing how our solution can still allow for secrecy even when the source only has a noisy estimate of the eavesdropper's location. Our work demonstrates how a multi-antenna array, operating in the most general channel conditions and most likely system set-up, can be configured rapidly in the field so as to deliver an optimal physical layer security solution.

preprint | 2015 | arXiv

Location-Based Beamforming for Rician Wiretap Channels

We propose a location-based beamforming scheme for wiretap channels, where a source communicates with a legitimate receiver in the presence of an eavesdropper. We assume that the source and the eavesdropper are equipped with multiple antennas, while the legitimate receiver is equipped with a single antenna. We also assume that all channels are in a Rician fading environment, the channel state information from the legitimate receiver is perfectly known at the source, and that the only information on the eavesdropper available at the source is her location. We first describe how the beamforming vector that minimizes the secrecy outage probability of the system is obtained, illustrating its dependence on the eavesdropper's location. We then derive an easy-to-compute expression for the secrecy outage probability when our proposed location-based beamforming is adopted. Finally, we investigate the impact location uncertainty has on the secrecy outage probability, showing how our proposed solution can still allow for secrecy even when the source has limited information on the eavesdropper's location.

preprint | 2015 | arXiv

Secure Transmission for Relay Wiretap Channels in the Presence of Spatially Random Eavesdroppers

We propose a secure transmission scheme for a relay wiretap channel, where a source communicates with a destination via a decode-and-forward relay in the presence of spatially random-distributed eavesdroppers. We assume that the source is equipped with multiple antennas, whereas the relay, the destination, and the eavesdroppers are equipped with a single antenna each. In the proposed scheme, in addition to information signals, the source transmits artificial noise signals in order to confuse the eavesdroppers. With the target of maximizing the secrecy throughput of the relay wiretap channel, we derive a closed-form expression for the transmission outage probability and an easy-to-compute expression for the secrecy outage probability. Using these expressions, we determine the optimal power allocation factor and wiretap code rates that guarantee the maximum secrecy throughput, while satisfying a secrecy outage probability constraint. Furthermore, we examine the impact of source antenna number on the secrecy throughput, showing that adding extra transmit antennas at the source brings about a significant increase in the secrecy throughput.
