Source author record

Junfeng Yang

Junfeng Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher; unclaimed source record

math.OC; Machine Learning; Cryptography and Security; Computer Vision; Artificial Intelligence; Operating Systems; Databases; Distributed, Parallel, and Cluster Computing; physics.comp-ph; physics.flu-dyn; Software Engineering

Catalog footprint

What is connected

17 works

11 topics

4 close collaborators


preprint, 2026, arXiv

Detecting Privilege Escalation in Polyglot Microservices via Agentic Program Analysis

Microservices are widely adopted in modern cloud systems due to their scalability and fault tolerance. However, microservice architectures introduce significant complexity in privilege and permission control, creating risks of privilege escalation where attackers can gain unauthorized access to resources or operations. Detecting such vulnerabilities is challenging due to complex cross-service interactions, polyglot codebases, and diverse privileged operations and permission checks. We present Neo, an agentic program analysis framework that combines large language models (LLMs) with classic program analysis to address these challenges. Neo leverages an LLM-based agent that dynamically generates analysis plans, adapts code search strategies, and validates semantics. We develop code search primitives that enable Neo to perform scalable and flexible code exploration across services and languages. We evaluated Neo on 25 open-source microservice applications spanning 7 programming languages and 6.2 million lines of code. Neo uncovered 24 zero-day privilege escalation vulnerabilities and achieved 81.0% precision and 85.0% recall on a ground-truth dataset. Compared to existing program analysis and agentic solutions, Neo demonstrated significant improvements in both detection accuracy and scalability. We further showcased Neo's extensibility by applying it to other application domains and vulnerability types, uncovering 18 additional zero-day vulnerabilities.

preprint, 2026, arXiv

Hierarchical Multi-Fidelity Learning for Predicting Three-Dimensional Flame Wrinkling and Turbulent Burning Velocity

High-fidelity experimental characterization of turbulent premixed flames remains limited by the cost and complexity of advanced diagnostics, particularly under elevated pressures and intense turbulence where measurements of coupled flame morphology and burning dynamics are sparse. Here, we develop a hierarchical multi-fidelity neural network framework (MuFiNNs) to address this challenge by integrating sparse high-fidelity experimental data with structured low-fidelity representations encoding dominant physical trends. The framework combines hierarchical low-fidelity construction with nonlinear multi-fidelity correction to learn coupled geometric and reactive flame behavior while recovering discrepancies that simplified models alone cannot capture. The methodology is applied to expanding turbulent premixed flames to predict three-dimensional flame wrinkling dynamics and turbulent mass burning velocity across varying fuels, pressures, and turbulence intensities. Using experimentally informed low-fidelity trend models with sparse high-fidelity measurements, MuFiNNs accurately reconstruct observed flame behavior, enable interpolation across unseen operating conditions, and demonstrate robust extrapolation beyond the training domain. Importantly, the framework remains effective in noisy, weakly structured, or experimentally inaccessible regimes where conventional data-driven approaches often fail. These results show that hierarchical multi-fidelity learning provides a scalable and physically grounded strategy for predictive combustion modeling in data-limited regimes. More broadly, this work establishes multi-fidelity scientific machine learning as a practical framework for extracting physically meaningful predictive models from sparse experiments, particularly for instability-dominated and turbulence-sensitive reactive flows where high-fidelity data acquisition is demanding.

preprint, 2023, arXiv

Tight Convergence Rate in Subgradient Norm of the Proximal Point Algorithm

The proximal point algorithm has found many applications, and it has played a fundamental role in the understanding, design, and analysis of many first-order methods. In this paper, we derive the tight convergence rate in subgradient norm of the proximal point algorithm, which was conjectured by Taylor, Hendrickx and Glineur [SIAM J. Optim., 27 (2017), pp. 1283-1313]. This sort of convergence result in terms of the residual (sub)gradient norm is particularly interesting when considering dual methods, where the dual residual gradient norm corresponds to the primal distance to feasibility.
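As a rough illustration of the quantity this abstract refers to, the sketch below runs the proximal point iteration x_{k+1} = prox_{lam f}(x_k) on the toy function f(x) = |x| and records the residual subgradient (x_k - x_{k+1}) / lam, which belongs to the subdifferential of f at x_{k+1}. The test function, step size, and the names prox_abs and proximal_point are assumptions chosen for illustration; the sketch does not reproduce the paper's tight rate or its analysis.

```python
def prox_abs(v, lam):
    """Proximal map of lam * |x|, i.e. soft-thresholding (toy choice of f)."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0


def proximal_point(x0, lam, steps):
    """Run x_{k+1} = prox_{lam f}(x_k) for f(x) = |x|.

    Returns the final iterate and the list of residual subgradient norms
    |x_k - x_{k+1}| / lam, one per iteration.
    """
    x = x0
    residuals = []
    for _ in range(steps):
        x_next = prox_abs(x, lam)
        # (x - x_next) / lam is a subgradient of f at x_next.
        residuals.append(abs(x - x_next) / lam)
        x = x_next
    return x, residuals


x_final, residuals = proximal_point(5.0, 1.0, 8)
print(x_final, residuals)
```

In this toy run the residual norm never increases from one iteration to the next, which is the behavior a rate in subgradient norm quantifies.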

preprint, 2022, arXiv

A Tale of Two Models: Constructing Evasive Attacks on Edge Models

Full-precision deep learning models are typically too large or costly to deploy on edge devices. To accommodate the limited hardware resources, models are adapted to the edge using various edge-adaptation techniques, such as quantization and pruning. While such techniques may have a negligible impact on top-line accuracy, the adapted models exhibit subtle differences in output compared to the original model from which they are derived. In this paper, we introduce a new evasive attack, DIVA, that exploits these differences in edge adaptation by adding adversarial noise to input data that maximizes the output difference between the original and adapted model. Such an attack is particularly dangerous, because the malicious input will trick the adapted model running on the edge, but will be virtually undetectable by the original model, which typically serves as the authoritative model version, used for validation, debugging and retraining. We compare DIVA to a state-of-the-art attack, PGD, and show that DIVA is only 1.7-3.6% worse on attacking the adapted model but 1.9-4.2 times more likely not to be detected by the original model under a whitebox and semi-blackbox setting, compared to PGD.

preprint, 2022, arXiv

Causal Transportability for Visual Recognition

Visual representations underlie object recognition tasks, but they often contain both robust and non-robust features. Our main observation is that image classifiers may perform poorly on out-of-distribution samples because spurious correlations between non-robust features and labels can be changed in a new environment. By analyzing procedures for out-of-distribution generalization with a causal graph, we show that standard classifiers fail because the association between images and labels is not transportable across settings. However, we then show that the causal effect, which severs all sources of confounding, remains invariant across domains. This motivates us to develop an algorithm to estimate the causal effect for image classification, which is transportable (i.e., invariant) across source and target environments. Without observing additional variables, we show that we can derive an estimand for the causal effect under empirical assumptions using representations in deep models as proxies. Theoretical analysis, empirical results, and visualizations show that our approach captures causal invariances and improves overall generalization.

preprint, 2022, arXiv

Using Multiple Self-Supervised Tasks Improves Model Robustness

Deep networks achieve state-of-the-art performance on computer vision tasks, yet they fail under adversarial attacks that are imperceptible to humans. In this paper, we propose a novel defense that can dynamically adapt the input using the intrinsic structure from multiple self-supervised tasks. By simultaneously using many self-supervised tasks, our defense avoids over-fitting the adapted image to one specific self-supervised task and restores more intrinsic structure in the image compared to a single self-supervised task approach. Our approach further improves robustness and clean accuracy significantly compared to the state-of-the-art single task self-supervised defense. Our work is the first to connect multiple self-supervised tasks to robustness, and suggests that we can achieve better robustness with more intrinsic signal from visual data.

preprint, 2021, arXiv

BPF for storage: an exokernel-inspired approach

The overhead of the kernel storage path accounts for half of the access latency for new NVMe storage devices. We explore using BPF to reduce this overhead, by injecting user-defined functions deep in the kernel's I/O processing stack. When issuing a series of dependent I/O requests, this approach can increase IOPS by over 2.5× and cut latency by half, by bypassing kernel layers and avoiding user-kernel boundary crossings. However, we must avoid losing important properties when bypassing the file system and block layer, such as the safety guarantees of the file system and translation between physical block addresses and file offsets. We sketch potential solutions to these problems, inspired by exokernel file systems from the late 90s, whose time, we believe, has finally come!

preprint, 2020, arXiv

A golden ratio primal-dual algorithm for structured convex optimization

We design, analyze and test a golden ratio primal-dual algorithm (GRPDA) for solving structured convex optimization problems, where the objective function is the sum of two closed proper convex functions, one of which involves a composition with a linear transform. GRPDA preserves all the favorable features of the classical primal-dual algorithm (PDA), i.e., the primal and the dual variables are updated in a Gauss-Seidel manner, and the per-iteration cost is dominated by the evaluation of the proximal point mappings of the two component functions and two matrix-vector multiplications. Compared with the classical PDA, which takes an extrapolation step, the novelty of GRPDA is that it is constructed based on a convex combination of essentially the whole iteration trajectory. We show that GRPDA converges within a broader range of parameters than the classical PDA, provided that the reciprocal of the convex combination parameter is bounded above by the golden ratio, which explains the name of the algorithm. An O(1/N) ergodic convergence rate result is also established based on the primal-dual gap function, where N denotes the number of iterations. When either the primal or the dual problem is strongly convex, an accelerated GRPDA is constructed to improve the ergodic convergence rate from O(1/N) to O(1/N^2). Moreover, we show for regularized least-squares and linear equality constrained problems that the reciprocal of the convex combination parameter can be extended from the golden ratio to 2 and meanwhile a relaxation step can be taken. Our preliminary numerical results on LASSO, nonnegative least-squares and minimax matrix game problems, with comparisons to some state-of-the-art relative algorithms, demonstrate the efficiency of the proposed algorithms.

preprint, 2020, arXiv

Live Trojan Attacks on Deep Neural Networks

Like all software systems, the execution of deep learning models is dictated in part by logic represented as data in memory. For decades, attackers have exploited traditional software programs by manipulating this data. We propose a live attack on deep learning systems that patches model parameters in memory to achieve predefined malicious behavior on a certain set of inputs. By minimizing the size and number of these patches, the attacker can reduce the amount of network communication and memory overwrites, with minimal risk of system malfunctions or other detectable side effects. We demonstrate the feasibility of this attack by computing efficient patches on multiple deep learning models. We show that the desired trojan behavior can be induced with a few small patches and with limited access to training data. We describe the details of how this attack is carried out on real systems and provide sample code for patching TensorFlow model parameters in Windows and in Linux. Lastly, we present a technique for effectively manipulating entropy on perturbed inputs to bypass STRIP, a state-of-the-art run-time trojan detection technique.

preprint, 2020, arXiv

Multitask Learning Strengthens Adversarial Robustness

Although deep networks achieve strong accuracy on a range of computer vision benchmarks, they remain vulnerable to adversarial attacks, where imperceptible input perturbations fool the network. We present both theoretical and empirical analyses that connect the adversarial robustness of a model to the number of tasks that it is trained on. Experiments on two datasets show that attack difficulty increases as the number of target tasks increases. Moreover, our results suggest that when models are trained on multiple tasks at once, they become more robust to adversarial attacks on individual tasks. While adversarial defense remains an open challenge, our results suggest that deep networks are vulnerable partly because they are trained on too few tasks.

preprint, 2020, arXiv

On the dual step length of the alternating direction method of multipliers

The alternating direction method of multipliers (ADMM) is one of the most widely used optimization schemes for solving linearly constrained separable convex optimization problems. The convergence of the ADMM can be guaranteed when the dual step length is less than the golden ratio, while plenty of numerical evidence suggests that an even larger dual step length often accelerates the convergence. It has also been proved that the dual step length can be enlarged to less than 2 in some special cases, namely, when one of the separable functions in the objective function is linear, or both are quadratic plus some additional assumptions. However, it remains unclear whether the golden ratio can be exceeded in the general convex setting. In this paper, the performance estimation framework is used to analyze the convergence of the ADMM, and, assisted by numerical and symbolic computations, a counterexample is constructed, which indicates that the conventional measure may lose monotonicity as the dual step length exceeds the golden ratio, ruling out the possibility of breaking the golden ratio within this conventional analytic framework.
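To make the role of the dual step length concrete, here is a minimal sketch of ADMM on a scalar toy problem: minimize 0.5 (x - a)^2 + mu |z| subject to x - z = 0, whose solution is the soft-thresholding of a by mu. The multiplier update is scaled by a factor gamma, set to 1.5, which lies below the golden ratio (about 1.618). The problem instance, the penalty rho, and the function names are assumptions made for illustration and are not taken from the paper.

```python
def soft_threshold(v, t):
    """Soft-thresholding: the proximal map of t * |x| evaluated at v."""
    sign = 1.0 if v >= 0 else -1.0
    return sign * max(abs(v) - t, 0.0)


def admm_scalar(a, mu, rho=1.0, gamma=1.5, iters=100):
    """ADMM for min 0.5*(x - a)^2 + mu*|z| subject to x - z = 0.

    gamma is the dual step length; the dual update is
    lam <- lam + gamma * rho * (x - z).
    """
    x = z = lam = 0.0
    for _ in range(iters):
        # x-update: minimize the augmented Lagrangian in x (closed form).
        x = (a - lam + rho * z) / (1.0 + rho)
        # z-update: proximal step on mu*|z| centered at x + lam/rho.
        z = soft_threshold(x + lam / rho, mu / rho)
        # Dual update with step length gamma.
        lam = lam + gamma * rho * (x - z)
    return x, z, lam


x_opt, z_opt, lam_opt = admm_scalar(a=3.0, mu=1.0)
print(x_opt, z_opt, lam_opt)
```

On this instance both primal variables approach 2, the soft-thresholded value of a, and the multiplier approaches 1.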

preprint, 2020, arXiv

What Does CNN Shift Invariance Look Like? A Visualization Study

Feature extraction with convolutional neural networks (CNNs) is a popular method to represent images for machine learning tasks. These representations seek to capture global image content, and ideally should be independent of geometric transformations. We focus on measuring and visualizing the shift invariance of extracted features from popular off-the-shelf CNN models. We present the results of three experiments comparing representations of millions of images with exhaustively shifted objects, examining both local invariance (within a few pixels) and global invariance (across the image frame). We conclude that features extracted from popular networks are not globally invariant, and that biases and artifacts exist within this variance. Additionally, we determine that anti-aliased models significantly improve local invariance but do not impact global invariance. Finally, we provide a code repository for experiment reproduction, as well as a website to interact with our results at https://jakehlee.github.io/visualize-invariance.

preprint, 2016, arXiv

Applications of gauge duality in robust principal component analysis and semidefinite programming

Gauge duality theory was originated by Freund [Math. Programming, 38(1):47-67, 1987] and was recently further investigated by Friedlander, Macêdo and Pong [SIAM J. Optim., 24(4):1999-2022, 2014]. When solving some matrix optimization problems via the gauge dual, one is usually able to avoid full matrix decompositions such as singular value and/or eigenvalue decompositions. In such an approach, a gauge dual problem is solved in the first stage, and then an optimal solution to the primal problem can be recovered from the dual optimal solution obtained in the first stage. Recently, this theory has been applied to a class of semidefinite programming (SDP) problems with promising numerical results [Friedlander and Macêdo, SIAM J. Sci. Comp., to appear, 2016]. In this paper, we establish some theoretical results on applying the gauge duality theory to robust principal component analysis (PCA) and general SDP. For each problem, we present its gauge dual problem, characterize the optimality conditions for the primal-dual gauge pair, and validate a way to recover a primal optimal solution from a dual one. These results are extensions of [Friedlander and Macêdo, SIAM J. Sci. Comp., to appear, 2016] from nuclear norm regularization to robust PCA, and from a special class of SDP, which requires the coefficient matrix in the linear objective to be positive definite, to SDP problems without this restriction. Our results provide further understanding of the potential advantages and disadvantages of the gauge duality theory.

preprint, 2015, arXiv

EOS: Automatic In-vivo Evolution of Kernel Policies for Better Performance

Today's monolithic kernels often implement a small, fixed set of policies such as disk I/O scheduling policies, while exposing many parameters to let users select a policy or adjust the specific setting of the policy. Ideally, the parameters exposed should be flexible enough for users to tune for good performance, but in practice, users lack domain knowledge of the parameters and are often stuck with bad, default parameter settings. We present EOS, a system that bridges the knowledge gap between kernel developers and users by automatically evolving the policies and parameters in vivo on users' real, production workloads. It provides a simple policy specification API for kernel developers to programmatically describe how the policies and parameters should be tuned, a policy cache to make in-vivo tuning easy and fast by memorizing good parameter settings for past workloads, and a hierarchical search engine to effectively search the parameter space. Evaluation of EOS on four main Linux subsystems shows that it is easy to use and effectively improves each subsystem's performance.

preprint, 2014, arXiv

A general inertial proximal point method for mixed variational inequality problem

In this paper, we first propose a general inertial proximal point method for the mixed variational inequality (VI) problem. To our knowledge, without stronger assumptions, no convergence rate result is known in the literature for inertial-type proximal point methods. Under certain conditions, we are able to establish the global convergence and an $o(1/k)$ convergence rate result (under a certain measure) of the proposed general inertial proximal point method. We then show that the linearized alternating direction method of multipliers (ADMM) for separable convex optimization with linear constraints is an application of a general proximal point method, provided that the algorithmic parameters are properly chosen. As byproducts of this finding, we establish global convergence and $O(1/k)$ convergence rate results of the linearized ADMM in both the ergodic and nonergodic sense. In particular, by applying the proposed inertial proximal point method for mixed VI to linearly constrained separable convex optimization, we obtain an inertial version of the linearized ADMM for which the global convergence is guaranteed. We also demonstrate the effect of the inertial extrapolation step via experimental results on the compressive principal component pursuit problem.
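The inertial extrapolation step mentioned in this abstract can be illustrated in its simplest scalar form: extrapolate along the last displacement, then apply a proximal step at the extrapolated point. The sketch below does this for the toy function f(x) = 0.5 x^2, whose proximal map is v / (1 + lam). The function, the constant inertial weight alpha = 0.3, and the name inertial_proximal_point are assumptions for illustration; the paper's method addresses the more general mixed VI setting and its own parameter conditions.

```python
def inertial_proximal_point(x0, lam=1.0, alpha=0.3, steps=60):
    """Inertial proximal point iteration for the toy function f(x) = 0.5*x^2.

    Each step extrapolates y = x_k + alpha*(x_k - x_{k-1}) and then
    sets x_{k+1} = prox_{lam f}(y) = y / (1 + lam).
    """
    x_prev = x = x0
    for _ in range(steps):
        # Inertial extrapolation along the previous displacement.
        y = x + alpha * (x - x_prev)
        # Proximal step at the extrapolated point.
        x_prev, x = x, y / (1.0 + lam)
    return x


x_inertial = inertial_proximal_point(10.0)
print(x_inertial)
```

With these values the iterates contract toward the minimizer 0; setting alpha to 0 recovers the plain proximal point iteration.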

preprint, 2014, arXiv

Inertial primal-dual algorithms for structured convex optimization

The primal-dual algorithm recently proposed by Chambolle & Pock (abbreviated as CPA) for structured convex optimization is very efficient and popular. It was shown by Chambolle & Pock and also by Shefi & Teboulle that CPA and variants are closely related to preconditioned versions of the popular alternating direction method of multipliers (abbreviated as ADM). In this paper, we further clarify this connection and show that CPAs generate exactly the same sequence of points as the so-called linearized ADM (abbreviated as LADM) applied to either the primal problem or its Lagrangian dual, depending on different updating orders of the primal and the dual variables in CPAs, as long as the initial points for the LADM are properly chosen. The dependence on initial points for LADM can be relaxed by focusing on cyclically equivalent forms of the algorithms. Furthermore, by utilizing the fact that CPAs are applications of a general weighted proximal point method to the mixed variational inequality formulation of the KKT system, where the weighting matrix is positive definite under a parameter condition, we are able to propose and analyze inertial variants of CPAs. Under certain conditions, global point-convergence, nonasymptotic $O(1/k)$ and asymptotic $o(1/k)$ convergence rates of the proposed inertial CPAs can be guaranteed, where $k$ denotes the iteration index. Finally, we demonstrate the benefits gained by introducing the inertial extrapolation step via experimental results on compressive image reconstruction based on total variation minimization.

preprint, 2010, arXiv

A Fast Algorithm for Total Variation Image Reconstruction from Random Projections

Total variation (TV) regularization is popular in image restoration and reconstruction due to its ability to preserve image edges. To date, most research activities on TV models concentrate on image restoration from blurry and noisy observations, while discussions on image reconstruction from random projections are relatively few. In this paper, we propose, analyze, and test a fast alternating minimization algorithm for image reconstruction from random projections via solving a TV regularized least-squares problem. The per-iteration cost of the proposed algorithm involves a linear-time shrinkage operation, two matrix-vector multiplications and two fast Fourier transforms. Convergence, certain finite convergence and $q$-linear convergence results are established, which indicate that the asymptotic convergence speed of the proposed algorithm depends on the spectral radii of certain submatrices. Moreover, to speed up convergence and enhance robustness, we suggest an accelerated scheme based on an inexact alternating direction method. We present experimental results to compare with an existing algorithm, which indicate that the proposed algorithm is stable, efficient and competitive with TwIST, a state-of-the-art algorithm for solving TV regularization problems.

