Source author record

Jian Lou

Jian Lou appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Computer Science and Game Theory Computer Vision eess.IV Machine Learning Multiagent Systems Cryptography and Security Distributed, Parallel, and Cluster Computing Multimedia

Catalog footprint

What is connected

10works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

Large reasoning models often reach correct answers through flawed intermediate steps, creating a gap between final accuracy and reasoning reliability. Existing alignment strategies address this with external verifiers or massive sampling, limiting scalability. In this work, we introduce CASPO (Confidence-Aware Step-wise Preference Optimization), a framework that aligns token-level confidence with step-wise logical correctness through iterative Direct Preference Optimization, without training a separate reward model. During inference, we propose Confidence-aware Thought (CaT), which leverages this calibrated confidence to dynamically prune uncertain reasoning branches with negligible O(V) latency. Experiments across ten benchmarks and multiple model families show that CASPO consistently improves reasoning reliability and inference efficiency. CASPO scales to Qwen3-8B-Base and surpasses tree-search baselines on AIME'24 and AIME'25 without using reward-model data. We also release a step-wise dataset with confidence annotations to support fine-grained analysis of reasoning reliability. Code is available at https://github.com/Thecommonirin/CASPO.

preprint2026arXiv

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration

Many-shot jailbreaking (MSJ) causes safety-aligned language models to answer harmful queries by preceding them with many harmful question-answer demonstrations. We study why this attack becomes stronger as the number of demonstrations increases. Empirically, we find that MSJ induces a progressive activation drift: the representation of a fixed harmful query moves step by step away from the safety-aligned region as more harmful demonstrations are added. Theoretically, we show that this drift can be interpreted as implicit malicious fine-tuning: conditioning on N harmful demonstrations induces SGD-style updates equivalent to optimizing on the corresponding N harmful samples. This view turns the attack mechanism into a defense principle. We append a fixed one-shot safety demonstration at inference time, which induces a counteracting safety-oriented update and restores refusal behavior. The resulting method improves the model's robustness to MSJ without modifying its parameters or requiring white-box access at deployment. Code is available at https://github.com/Thecommonirin/SafeEnd.

preprint2022arXiv

Backdoor Attacks on Crowd Counting

Crowd counting is a regression task that estimates the number of people in a scene image, which plays a vital role in a range of safety-critical applications, such as video surveillance, traffic monitoring and flow control. In this paper, we investigate the vulnerability of deep learning based crowd counting models to backdoor attacks, a major security threat to deep learning. A backdoor attack implants a backdoor trigger into a target model via data poisoning so as to control the model's predictions at test time. Different from image classification models on which most of existing backdoor attacks have been developed and tested, crowd counting models are regression models that output multi-dimensional density maps, thus requiring different techniques to manipulate. In this paper, we propose two novel Density Manipulation Backdoor Attacks (DMBA$^{-}$ and DMBA$^{+}$) to attack the model to produce arbitrarily large or small density estimations. Experimental results demonstrate the effectiveness of our DMBA attacks on five classic crowd counting models and four types of datasets. We also provide an in-depth analysis of the unique challenges of backdooring crowd counting models and reveal two key elements of effective attacks: 1) full and dense triggers and 2) manipulation of the ground truth counts or density maps. Our work could help evaluate the vulnerability of crowd counting models to potential backdoor attacks.

preprint2022arXiv

Just Noticeable Difference for Deep Machine Vision

As an important perceptual characteristic of the Human Visual System (HVS), the Just Noticeable Difference (JND) has been studied for decades with image and video processing (e.g., perceptual visual signal compression). However, there is little exploration on the existence of JND for the Deep Machine Vision (DMV), although the DMV has made great strides in many machine vision tasks. In this paper, we take an initial attempt, and demonstrate that the DMV has the JND, termed as the DMV-JND. We then propose a JND model for the image classification task in the DMV. It has been discovered that the DMV can tolerate distorted images with average PSNR of only 9.56dB (the lower the better), by generating JND via unsupervised learning with the proposed DMV-JND-NET. In particular, a semantic-guided redundancy assessment strategy is designed to restrain the magnitude and spatial distribution of the DMV-JND. Experimental results on image classification demonstrate that we successfully find the JND for deep machine vision. Our DMV-JND facilitates a possible direction for DMV-oriented image and video compression, watermarking, quality assessment, deep neural network security, and so on.

preprint2022arXiv

MULTIPAR: Supervised Irregular Tensor Factorization with Multi-task Learning

Tensor factorization has received increasing interest due to its intrinsic ability to capture latent factors in multi-dimensional data with many applications such as recommender systems and Electronic Health Records (EHR) mining. PARAFAC2 and its variants have been proposed to address irregular tensors where one of the tensor modes is not aligned, e.g., different users in recommender systems or patients in EHRs may have different length of records. PARAFAC2 has been successfully applied on EHRs for extracting meaningful medical concepts (phenotypes). Despite recent advancements, current models' predictability and interpretability are not satisfactory, which limits its utility for downstream analysis. In this paper, we propose MULTIPAR: a supervised irregular tensor factorization with multi-task learning. MULTIPAR is flexible to incorporate both static (e.g. in-hospital mortality prediction) and continuous or dynamic (e.g. the need for ventilation) tasks. By supervising the tensor factorization with downstream prediction tasks and leveraging information from multiple related predictive tasks, MULTIPAR can yield not only more meaningful phenotypes but also better predictive performance for downstream tasks. We conduct extensive experiments on two real-world temporal EHR datasets to demonstrate that MULTIPAR is scalable and achieves better tensor fit with more meaningful subgroups and stronger predictive performance compared to existing state-of-the-art methods.

preprint2022arXiv

Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data

Despite enormous research interest and rapid application of federated learning (FL) to various areas, existing studies mostly focus on supervised federated learning under the horizontally partitioned local dataset setting. This paper will study the unsupervised FL under the vertically partitioned dataset setting. Accordingly, we propose the federated principal component analysis for vertically partitioned dataset (VFedPCA) method, which reduces the dimensionality across the joint datasets over all the clients and extracts the principal component feature information for downstream data analysis. We further take advantage of the nonlinear dimensionality reduction and propose the vertical federated advanced kernel principal component analysis (VFedAKPCA) method, which can effectively and collaboratively model the nonlinear nature existing in many real datasets. In addition, we study two communication topologies. The first is a server-client topology where a semi-trusted server coordinates the federated training, while the second is the fully-decentralized topology which further eliminates the requirement of the server by allowing clients themselves to communicate with their neighbors. Extensive experiments conducted on five types of real-world datasets corroborate the efficacy of VFedPCA and VFedAKPCA under the vertically partitioned FL setting. Code is available at: https://github.com/juyongjiang/VFedPCA-VFedAKPCA

preprint2021arXiv

An Optimized H.266/VVC Software Decoder On Mobile Platform

As the successor of H.265/HEVC, the new versatile video coding standard (H.266/VVC) can provide up to 50% bitrate saving with the same subjective quality, at the cost of increased decoding complexity. To accelerate the application of the new coding standard, a real-time H.266/VVC software decoder that can support various platforms is implemented, where SIMD technologies, parallelism optimization, and the acceleration strategies based on the characteristics of each coding tool are applied. As the mobile devices have become an essential carrier for video services nowadays, the mentioned optimization efforts are not only implemented for the x86 platform, but more importantly utilized to highly optimize the decoding performance on the ARM platform in this work. The experimental results show that when running on the Apple A14 SoC (iPhone 12pro), the average single-thread decoding speed of the present implementation can achieve 53fps (RA and LB) for full HD (1080p) bitstreams generated by VTM-11.0 reference software using 8bit Common Test Conditions (CTC). When multi-threading is enabled, an average of 32 fps (RA) can be achieved when decoding the 4K bitstreams.

preprint2015arXiv

Multidefender Security Games

Stackelberg security game models and associated computational tools have seen deployment in a number of high-consequence security settings, such as LAX canine patrols and Federal Air Marshal Service. These models focus on isolated systems with only one defender, despite being part of a more complex system with multiple players. Furthermore, many real systems such as transportation networks and the power grid exhibit interdependencies between targets and, consequently, between decision makers jointly charged with protecting them. To understand such multidefender strategic interactions present in security, we investigate game theoretic models of security games with multiple defenders. Unlike most prior analysis, we focus on the situations in which each defender must protect multiple targets, so that even a single defender's best response decision is, in general, highly non-trivial. We start with an analytical investigation of multidefender security games with independent targets, offering an equilibrium and price-of-anarchy analysis of three models with increasing generality. In all models, we find that defenders have the incentive to over-protect targets, at times significantly. Additionally, in the simpler models, we find that the price of anarchy is unbounded, linearly increasing both in the number of defenders and the number of targets per defender. Considering interdependencies among targets, we develop a novel mixed-integer linear programming formulation to compute a defender's best response, and make use of this formulation in approximating Nash equilibria of the game. We apply this approach towards computational strategic analysis of several models of networks representing interdependencies, including real-world power networks. Our analysis shows how network structure and the probability of failure spread determine the propensity of defenders to over- or under-invest in security.

preprint2014arXiv

A Parallel Elicitation-Free Protocol for Allocating Indivisible Goods

We study the problem of allocating a set of indivisible goods to multiple agents. Recent work [Bouveret and Lang, 2011] focused on allocating goods in a sequential way, and studied what is the "best" sequence of agents to pick objects based on utilitarian or egalitarian criterion. In this paper, we propose a parallel elicitation-free protocol for allocating indivisible goods. In every round of the allocation process, some agents will be selected (according to some policy) to report their preferred objects among those that remain, and every reported object will be allocated randomly to an agent reporting it. Empirical comparison between the parallel protocol (applying a simple selection policy) and the sequential protocol (applying the optimal sequence) reveals that our proposed protocol is promising. We also address strategical issues.

preprint2014arXiv

Allocating Indivisible Resources under Price Rigidities in Polynomial Time

In many realistic problems of allocating resources, economy efficiency must be taken into consideration together with social equality, and price rigidities are often made according to some economic and social needs. We study the computational issues of dynamic mechanisms for selling multiple indivisible items under price rigidities. We propose a polynomial algorithm that can be used to find over-demanded sets of items, and then introduce a dynamic mechanism with rationing to discover constrained Walrasian equilibria under price rigidities in polynomial time. We also address the computation of sellers' expected profits and items' expected prices, and discuss strategical issues in the sense of expected profits.

Jian Lou

What is connected

Connect this record

See the researcher in context

Building this map preview

10 published item(s)

Confidence-Aware Alignment Makes Reasoning LLMs More Reliable

Mitigating Many-shot Jailbreak Attacks with One Single Demonstration

Backdoor Attacks on Crowd Counting

Just Noticeable Difference for Deep Machine Vision

MULTIPAR: Supervised Irregular Tensor Factorization with Multi-task Learning

Vertical Federated Principal Component Analysis and Its Kernel Extension on Feature-wise Distributed Data

An Optimized H.266/VVC Software Decoder On Mobile Platform

Multidefender Security Games

A Parallel Elicitation-Free Protocol for Allocating Indivisible Goods

Allocating Indivisible Resources under Price Rigidities in Polynomial Time