Source author record

Lijun Zhang

Lijun Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

82 works

28 topics

4 close collaborators

preprint · 2026 · arXiv

Deep But Reliable: Advancing Multi-turn Reasoning for Thinking with Images

Recent advances in large Vision-Language Models (VLMs) have exhibited strong reasoning capabilities on complex visual tasks by thinking with images in their Chain-of-Thought (CoT), which is achieved by actively invoking tools to analyze visual inputs rather than merely perceiving them. However, existing models often struggle to reflect on and correct themselves when attempting incorrect reasoning trajectories. To address this limitation, we propose DRIM, a model that enables deep but reliable multi-turn reasoning when thinking with images in its multimodal CoT. Our pipeline comprises three stages: data construction, cold-start SFT and RL. Based on a high-resolution image dataset, we construct high-difficulty and verifiable visual question-answer pairs, where solving each task requires multi-turn tool calls to reach the correct answer. In the SFT stage, we collect tool trajectories as cold-start data, guiding a multi-turn reasoning pattern. In the RL stage, we introduce redundancy-penalized policy optimization, which incentivizes the model to develop a self-reflective reasoning pattern. The basic idea is to impose judgment on reasoning trajectories and penalize those that produce incorrect answers without sufficient multi-scale exploration. Extensive experiments demonstrate that DRIM achieves superior performance on visual understanding benchmarks.

preprint · 2026 · arXiv

Distributed Online Convex Optimization with Efficient Communication: Improved Algorithm and Lower bounds

We investigate distributed online convex optimization with compressed communication, where $n$ learners connected by a network collaboratively minimize a sequence of global loss functions using only local information and compressed data from neighbors. Prior work has established regret bounds of $O(\max\{ω^{-2}ρ^{-4}n^{1/2},ω^{-4}ρ^{-8}\}n\sqrt{T})$ and $O(\max\{ω^{-2}ρ^{-4}n^{1/2},ω^{-4}ρ^{-8}\}n\ln{T})$ for convex and strongly convex functions, respectively, where $ω\in(0,1]$ is the compression quality factor ($ω=1$ means no compression) and $ρ<1$ is the spectral gap of the communication matrix. However, these regret bounds suffer from a quadratic or even quartic dependence on $ω^{-1}$. Moreover, the super-linear dependence on $n$ is also undesirable. To overcome these limitations, we propose a novel algorithm that achieves improved regret bounds of $\tilde{O}(ω^{-1/2}ρ^{-1}n\sqrt{T})$ and $\tilde{O}(ω^{-1}ρ^{-2}n\ln{T})$ for convex and strongly convex functions, respectively. The primary idea is to design a two-level blocking update framework incorporating two novel ingredients: an online gossip strategy and an error compensation scheme, which collaborate to achieve a better consensus among learners. Furthermore, we establish the first lower bounds for this problem, justifying the optimality of our results with respect to both $ω$ and $T$. Additionally, we consider the bandit feedback scenario, and extend our method with the classic gradient estimators to enhance existing regret bounds.

preprint · 2026 · arXiv

Stable Routing for Mixture-of-Experts in Class-Incremental Learning

Class-incremental learning (CIL) requires models to learn new classes sequentially while preserving prior knowledge. Recently, approaches that combine pre-trained models with mixture-of-experts (MoE) have received increasing attention in CIL: they typically expand experts during learning and employ a router to assign weights across experts. However, existing MoE methods often overlook routing drift induced by expert expansion. Once new experts are introduced, the router may reassign samples from earlier classes to newly added experts, thereby perturbing previously established expert compositions and causing interference even when old experts remain frozen. We argue that expandable MoE in CIL requires two complementary properties: stable old-class routing for knowledge preservation and sufficient capacity utilization for new-class adaptation. To this end, we propose Stable Routing for MoE (StaR-MoE), a routing-level framework for expandable MoE in CIL. By incorporating sensitivity-aware routing alignment, StaR-MoE aligns current old-class routing behavior with historical routing distributions through sensitivity-guided constraints. Complementarily, StaR-MoE introduces asymmetric capacity regularization to encourage effective utilization of the expanded expert pool without compromising class-specific routing specialization. Extensive experiments across four standard CIL benchmarks demonstrate that StaR-MoE consistently improves both average and last accuracy over state-of-the-art methods, highlighting the importance of stable routing.

preprint · 2026 · arXiv

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

Recently, self-play fine-tuning (SPIN) has been proposed to adapt large language models to downstream applications with scarce expert-annotated data, by iteratively generating synthetic responses from the model itself. However, SPIN is designed to optimize the current reward advantages of annotated responses over synthetic responses at hand, which may gradually vanish during iterations, leading to unstable optimization. Moreover, the use of a reference policy induces a misalignment between the reward formulation for training and the metric for generation. To address these limitations, we propose a novel Triplet-based Self-Play fIne-tuNing (T-SPIN) method that integrates two key designs. First, beyond current advantages, T-SPIN additionally incorporates historical advantages between iteratively generated responses and proto-synthetic responses produced by the initial policy. Even if the current advantages diminish, historical advantages remain effective, stabilizing the overall optimization. Second, T-SPIN introduces an entropy constraint into the self-play framework, which is theoretically justified to support reference-free fine-tuning, eliminating the training-generation discrepancy. Empirical results on various tasks demonstrate not only the superior performance of T-SPIN over SPIN, but also its stable evolution during iterations. Remarkably, compared to supervised fine-tuning, T-SPIN achieves comparable or even better performance with only 25% of the samples, highlighting its effectiveness when faced with scarce annotated data.

preprint · 2026 · arXiv

When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds

Sign-based optimization algorithms, such as SignSGD and Muon, have garnered significant attention for their remarkable performance in training large foundation models. Despite this empirical success, we still lack a theoretical understanding of when and why these sign-based methods outperform vanilla SGD. The core obstacle is that under standard smoothness and finite variance conditions, SGD is known to be minimax optimal for finding stationary points measured by $\ell_2$-norms, thereby fundamentally precluding any complexity gains for sign-based methods in standard settings. To overcome this barrier, we analyze sign-based optimizers leveraging $\ell_1$-norm stationarity, $\ell_\infty$-smoothness, and a separable noise model, which can better capture the coordinate-wise nature of signed updates. Under this distinct problem geometry, we derive matched upper and lower bounds for SignSGD and explicitly characterize the problem class in which SignSGD provably dominates SGD. Specifically, we compare the upper bound of SignSGD with the lower bound of SGD, illustrating that SignSGD effectively reduces the complexity by a factor of $d$ under sparse noise, where $d$ is the problem dimension. Furthermore, we elevate this framework to the matrix domain, providing an equivalent optimal lower bound for the Muon optimizer, proving that extending the sign operator to matrices preserves this optimal scaling with dimensionality. Finally, we bridge our theoretical bounds to practice, demonstrating that the theoretical superiority of SignSGD accurately predicts its faster convergence during the pretraining of a 124M parameter GPT-2 model.
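
As a rough illustration of the coordinate-wise signed update this abstract refers to, the sketch below contrasts one plain SGD step with one SignSGD step. The function names, the learning rate and the toy gradient are assumptions made for this example; the noise model, the bounds and the Muon extension discussed in the abstract are not reproduced here.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sgd_step(w, grad, lr):
    """Plain SGD: each coordinate moves in proportion to its gradient entry."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def signsgd_step(w, grad, lr):
    """SignSGD: each coordinate moves by exactly lr against the gradient sign."""
    return [wi - lr * sign(gi) for wi, gi in zip(w, grad)]


if __name__ == "__main__":
    w = [1.0, -2.0, 0.5]
    g = [0.25, -4.0, 0.0]          # hypothetical stochastic gradient
    print(sgd_step(w, g, 0.5))
    print(signsgd_step(w, g, 0.5))
```

Because only the sign of each gradient coordinate is used, a coordinate with a large gradient entry and one with a small entry move by the same amount, and a coordinate whose gradient entry is exactly zero does not move.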

preprint · 2025 · arXiv

Constrained Language Model Policy Optimization via Risk-aware Stepwise Alignment

When fine-tuning pre-trained Language Models (LMs) to exhibit desired behaviors, maintaining control over risk is critical for ensuring both safety and trustworthiness. Most existing safety alignment methods, such as Safe RLHF and SACPO, typically operate under a risk-neutral paradigm that is insufficient to address the risks arising from deviations from the reference policy and offers limited robustness against rare but potentially catastrophic harmful behaviors. To address this limitation, we propose Risk-aware Stepwise Alignment (RSA), a novel alignment method that explicitly incorporates risk awareness into the policy optimization process by leveraging a class of nested risk measures. Specifically, RSA formulates safety alignment as a token-level risk-aware constrained policy optimization problem and solves it through a stepwise alignment procedure that yields token-level policy updates derived from the nested risk measures. This design offers two key benefits: (1) it mitigates risks induced by excessive model shift away from a reference policy, and (2) it explicitly suppresses low-probability yet high-impact harmful behaviors. Moreover, we provide theoretical analysis on policy optimality under mild assumptions. Experimental results demonstrate that our method achieves high levels of helpfulness while ensuring strong safety and significantly suppresses tail risks, namely low-probability yet high-impact unsafe responses.

preprint · 2023 · arXiv

ECSAS: Exploring Critical Scenarios from Action Sequence in Autonomous Driving

Critical scenario generation requires the ability to sample critical combinations from the infinite parameter space in the logic scenario. Existing solutions aim to explore the correlation of action parameters in the initial scenario rather than action sequences. How to model action sequences so that one can further consider the effects of different action parameters in the scenario is the bottleneck of the problem. In this paper, we attack the problem by proposing the ECSAS framework. Specifically, we first propose a description language, BTScenario, allowing us to model action sequences of the scenarios. We then use reinforcement learning to search for combinations of critical action parameters. To increase efficiency, we further propose several optimizations, including action masking and a replay buffer. We have implemented ECSAS, and experimental results show that it is more efficient than native approaches such as random and combination testing in various nontrivial scenarios.

preprint · 2022 · arXiv

A Tree-Structured Multi-Task Model Recommender

Tree-structured multi-task architectures have been employed to jointly tackle multiple vision tasks in the context of multi-task learning (MTL). The major challenge is to determine where to branch out for each task given a backbone model to optimize for both task accuracy and computation efficiency. To address the challenge, this paper proposes a recommender that, given a set of tasks and a convolutional neural network-based backbone model, automatically suggests tree-structured multi-task architectures that could achieve a high task performance while meeting a user-specified computation budget without performing model training. Extensive evaluations on popular MTL benchmarks show that the recommended architectures could achieve competitive task accuracy and computation efficiency compared with state-of-the-art MTL methods. Our tree-structured multi-task model recommender is open-sourced and available at https://github.com/zhanglijun95/TreeMTL.

preprint · 2022 · arXiv

Adaptive Deep Learning for Entity Resolution by Risk Analysis

The state-of-the-art performance on entity resolution (ER) has been achieved by deep learning. However, deep models are usually trained on large quantities of accurately labeled training data, and cannot be easily tuned towards a target workload. Unfortunately, in real scenarios, there may not be sufficient labeled training data, and even worse, its distribution usually differs to some degree from the target workload even when both come from the same domain. To alleviate these limitations, this paper proposes a novel risk-based approach to tune a deep model towards a target workload according to its particular characteristics. Building on recent advances in risk analysis for ER, the proposed approach first trains a deep model on labeled training data, and then fine-tunes it by minimizing its estimated misprediction risk on unlabeled target data. Our theoretical analysis shows that risk-based adaptive training can correct the label status of a mispredicted instance with a fairly good chance. We have also empirically validated the efficacy of the proposed approach on real benchmark data by a comparative study. Our extensive experiments show that it can considerably improve the performance of deep models. Furthermore, in the scenario of distribution misalignment, it can similarly outperform the state-of-the-art alternative of transfer learning by considerable margins. Using ER as a test case, we demonstrate that risk-based adaptive training is a promising approach potentially applicable to various challenging classification tasks.

preprint · 2022 · arXiv

Defensive Design of Saturating Counters Based on Differential Privacy

The saturating counter is the basic module of the dynamic branch predictor, which involves the core technique to improve instruction-level parallelism performance in modern processors. However, most studies focus on the performance improvement and hardware consumption of saturating counters, while ignoring the security problems they may cause. In this paper, we propose to study and design saturating counters from the defense perspective of differential privacy, so that attackers cannot distinguish the states that saturating counters are in and further infer sensitive information. To obtain theoretical guarantees, we use a Markov chain to formalize the attack algorithm applied to the saturating counter, investigate the optimal attack strategy and calculate the probability of a successful attack. Furthermore, we find that the attacker is able to accurately guess the branch execution of the victim's process in the existing saturating counters. To avoid this, we design a new probabilistic saturating counter, which generalizes the existing conventional and probabilistic saturating counters. The guarantee of differential privacy is applied to deduce the parameters of the new saturating counters so that the security requirement can be satisfied. We also theoretically calculate the misprediction rate when the saturating counter reaches the steady state. The experimental results on testing programs show that the calculated theoretical results agree with the experimental performance. Compared with the existing conventional and probabilistic saturating counters, when the parameters of our designed models are selected appropriately, the new saturating counters can not only ensure similar operational performance, but also establish a strict security guarantee.
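
For readers unfamiliar with the building block, here is a minimal sketch of a conventional n-bit saturating counter, with an optional update probability to mimic a generic probabilistic variant. The class name, the `update_prob` parameter and the starting state are assumptions made for this illustration; this is not the differentially private design proposed in the paper, whose parameters are not given in the abstract.

```python
import random


class SaturatingCounter:
    """n-bit saturating counter; update_prob < 1 gives a probabilistic variant."""

    def __init__(self, bits=2, update_prob=1.0, seed=None):
        self.max_state = (1 << bits) - 1
        self.state = self.max_state // 2   # start just below the "taken" half
        self.update_prob = update_prob     # hypothetical knob, not from the paper
        self._rng = random.Random(seed)

    def predict(self):
        """Predict 'taken' when the counter is in its upper half."""
        return self.state > self.max_state // 2

    def update(self, taken):
        """Move one step towards the observed outcome, saturating at both ends."""
        if self._rng.random() >= self.update_prob:
            return  # probabilistic variant: skip this state transition
        if taken:
            self.state = min(self.state + 1, self.max_state)
        else:
            self.state = max(self.state - 1, 0)


if __name__ == "__main__":
    counter = SaturatingCounter(bits=2)
    for outcome in [True, True, True, False]:
        counter.update(outcome)
    print(counter.state, counter.predict())
```

With `update_prob=1.0` the counter behaves deterministically, so an observer who knows the branch history can track its state exactly; lowering `update_prob` makes the state depend on randomness as well, which is the general kind of behaviour a privacy-oriented design can exploit.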

preprint · 2022 · arXiv

Divide-and-Conquer Determinization of Büchi Automata based on SCC Decomposition

The determinization of a nondeterministic Büchi automaton (NBA) is a fundamental construction of automata theory, with applications to probabilistic verification and reactive synthesis. The standard determinization constructions, such as the ones based on the Safra-Piterman approach, work on the whole NBA. In this work we propose a divide-and-conquer determinization approach. To this end, we first classify the strongly connected components (SCCs) of the given NBA as inherently weak, deterministic accepting, and nondeterministic accepting. We then present how to determinize each type of SCC independently from the others; this results in an easier handling of the determinization algorithm that takes advantage of the structure of that SCC. Once all SCCs have been determinized, we show how to compose them so as to obtain the final equivalent deterministic Emerson-Lei automaton, which can be converted into a deterministic Rabin automaton without blow-up of states and transitions. We implement our algorithm in our tool COLA and empirically evaluate COLA against the state-of-the-art tools Spot and OWL on a large set of benchmarks from the literature. The experimental results show that our prototype COLA outperforms Spot and OWL regarding the number of states and transitions.
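
The first step named in this abstract is an SCC decomposition of the automaton's transition graph. The sketch below uses Tarjan's algorithm on a plain successor map to show what that decomposition computes. The graph encoding and function name are assumptions for this example, and it is not taken from COLA; the subsequent classification of SCCs (inherently weak, deterministic accepting, nondeterministic accepting) and the determinization itself are not implemented.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm; graph maps each node to an iterable of successors."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for succ in graph.get(node, ()):
            if succ not in index:
                visit(succ)
                lowlink[node] = min(lowlink[node], lowlink[succ])
            elif succ in on_stack:
                lowlink[node] = min(lowlink[node], index[succ])
        if lowlink[node] == index[node]:
            # node is the root of an SCC: pop its members off the stack
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    for node in graph:
        if node not in index:
            visit(node)
    return components


if __name__ == "__main__":
    # Hypothetical four-state transition graph with two cycles.
    transitions = {"a": ["b"], "b": ["a", "c"], "c": ["d"], "d": ["c"]}
    print(strongly_connected_components(transitions))
```

The recursive formulation is kept for readability; for automata with very long paths an iterative version would be needed to stay within Python's recursion limit.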

preprint · 2022 · arXiv

Inorganic Crystal Structure Prototype Database based on Unsupervised Learning of Local Atomic Environments

Recognition of structure prototypes from the vast number of known inorganic crystal structures has been an important subject, beneficial for materials science research and new materials design. The existing databases of inorganic crystal structure prototypes were mostly constructed by classifying materials in terms of crystallographic space group information. Herein, we employed a distinct strategy to construct the inorganic crystal structure prototype database, relying on the classification of materials in terms of local atomic environments (LAE) combined with an unsupervised machine learning method. Specifically, we applied a hierarchical clustering approach to all experimentally known inorganic crystal structure data to identify structure prototypes. The criterion for hierarchical clustering is the LAE represented by the state-of-the-art structure fingerprints of the improved bond-orientational order parameters and the smooth overlap of atomic positions. This allows us to build an LAE-based Inorganic Crystal Structure Prototype Database (LAE-ICSPD) containing 15,613 structure prototypes with defined stoichiometries. In addition, we have developed a Structure Prototype Generator Infrastructure (SPGI) package, which is a useful toolkit for structure prototype generation. Our SPGI toolkit and LAE-ICSPD are beneficial for investigating inorganic materials in a global way as well as for accelerating the materials discovery process in a data-driven mode.

preprint · 2022 · arXiv

Momentum Accelerates the Convergence of Stochastic AUPRC Maximization

In this paper, we study stochastic optimization of areas under precision-recall curves (AUPRC), which is widely used for combating imbalanced classification tasks. Although a few methods have been proposed for maximizing AUPRC, stochastic optimization of AUPRC with convergence guarantee remains an undeveloped territory. A state-of-the-art complexity is $O(1/ε^5)$ for finding an $ε$-stationary solution. In this paper, we further improve the stochastic optimization of AUPRC by (i) developing novel stochastic momentum methods with a better iteration complexity of $O(1/ε^4)$ for finding an $ε$-stationary solution; and (ii) designing a novel family of stochastic adaptive methods with the same iteration complexity, which enjoy faster convergence in practice. To this end, we propose two innovative techniques that are critical for improving the convergence: (i) the biased estimators for tracking individual ranking scores are updated in a randomized coordinate-wise manner; and (ii) a momentum update is used on top of the stochastic gradient estimator for tracking the gradient of the objective. The novel analysis of Adam-style updates is also one main contribution. Extensive experiments on various data sets demonstrate the effectiveness of the proposed algorithms. Of independent interest, the proposed stochastic momentum and adaptive algorithms are also applicable to a class of two-level stochastic dependent compositional optimization problems.

preprint · 2022 · arXiv

Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization

Variance reduction techniques such as SPIDER/SARAH/STORM have been extensively studied to improve the convergence rates of stochastic non-convex optimization, which usually maintain and update a sequence of estimators for a single function across iterations. What if we need to track multiple functional mappings across iterations but only with access to stochastic samples of $\mathcal{O}(1)$ functional mappings at each iteration? There is an important application in solving an emerging family of coupled compositional optimization problems in the form of $\sum_{i=1}^m f_i(g_i(\mathbf{w}))$, where $g_i$ is accessible through a stochastic oracle. The key issue is to track and estimate a sequence of $\mathbf g(\mathbf{w})=(g_1(\mathbf{w}), \ldots, g_m(\mathbf{w}))$ across iterations, where $\mathbf g(\mathbf{w})$ has $m$ blocks and it is only allowed to probe $\mathcal{O}(1)$ blocks to attain their stochastic values and Jacobians. To improve the complexity for solving these problems, we propose a novel stochastic method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to track the sequence of $\mathbf g(\mathbf{w})$. It is inspired by STORM but introduces a customized error correction term to alleviate the noise not only in stochastic samples for the selected blocks but also in those blocks that are not sampled. With the help of the MSVR estimator, we develop several algorithms for solving the aforementioned compositional problems with improved complexities across a spectrum of settings with non-convex/convex/strongly convex/Polyak-Łojasiewicz (PL) objectives. Our results improve upon prior ones in several aspects, including the order of sample complexities and dependence on the strong convexity parameter. Empirical studies on multi-task deep AUC maximization demonstrate the better performance of using the new estimator.

preprint · 2022 · arXiv

Rethinking Hard-Parameter Sharing in Multi-Domain Learning

Hard parameter sharing in multi-domain learning (MDL) allows domains to share some of the model parameters to reduce storage cost while improving prediction accuracy. One common sharing practice is to share the bottom layers of a deep neural network among domains while using separate top layers for each domain. In this work, we revisit this common practice via an empirical study on image classification tasks from a diverse set of visual domains and make two surprising observations. (1) Using separate bottom-layer parameters could achieve significantly better performance than the common practice and this phenomenon holds with different experimental settings. (2) A multi-domain model with a small proportion of domain-specific parameters from bottom layers can achieve competitive performance with independent models trained on each domain separately. Our observations suggest that people adopt the new strategy of using separate bottom-layer parameters as a stronger baseline for model design in MDL.

preprint · 2022 · arXiv

Towards Practical Robustness Analysis for DNNs based on PAC-Model Learning

To analyse local robustness properties of deep neural networks (DNNs), we present a practical framework from a model learning perspective. Based on black-box model learning with scenario optimisation, we abstract the local behaviour of a DNN via an affine model with the probably approximately correct (PAC) guarantee. From the learned model, we can infer the corresponding PAC-model robustness property. The innovation of our work is the integration of model learning into PAC robustness analysis: that is, we construct a PAC guarantee on the model level instead of sample distribution, which induces a more faithful and accurate robustness evaluation. This is in contrast to existing statistical methods without model learning. We implement our method in a prototypical tool named DeepPAC. As a black-box method, DeepPAC is scalable and efficient, especially when DNNs have complex structures or high-dimensional inputs. We extensively evaluate DeepPAC, with 4 baselines (using formal verification, statistical methods, testing and adversarial attack) and 20 DNN models across 3 datasets, including MNIST, CIFAR-10, and ImageNet. It is shown that DeepPAC outperforms the state-of-the-art statistical method PROVERO, and it achieves more practical robustness analysis than the formal verification tool ERAN. Also, its results are consistent with existing DNN testing work like DeepGini.

preprint · 2022 · arXiv

Weight Expansion: A New Perspective on Dropout and Generalization

While dropout is known to be a successful regularization technique, insights into the mechanisms that lead to this success are still lacking. We introduce the concept of weight expansion, an increase in the signed volume of a parallelotope spanned by the column or row vectors of the weight covariance matrix, and show that weight expansion is an effective means of increasing the generalization in a PAC-Bayesian setting. We provide a theoretical argument that dropout leads to weight expansion and extensive empirical support for the correlation between dropout and weight expansion. To support our hypothesis that weight expansion can be regarded as an indicator of the enhanced generalization capability endowed by dropout, and not just as a mere by-product, we have studied other methods that achieve weight expansion (resp. contraction), and found that they generally lead to an increased (resp. decreased) generalization ability. This suggests that dropout is an attractive regularizer, because it is a computationally cheap method for obtaining weight expansion. This insight justifies the role of dropout as a regularizer, while paving the way for identifying regularizers that promise improved generalization through weight expansion.

preprint · 2021 · arXiv

Analyzing Deep Neural Networks with Symbolic Propagation: Towards Higher Precision and Faster Verification

Deep neural networks (DNNs) have been shown to lack robustness, as their classification is vulnerable to small perturbations of the inputs. This has led to safety concerns about applying DNNs in safety-critical domains. Several verification approaches have been developed to automatically prove or disprove safety properties of DNNs. However, these approaches suffer from either the scalability problem, i.e., only small DNNs can be handled, or the precision problem, i.e., the obtained bounds are loose. This paper improves on a recent proposal of analyzing DNNs through the classic abstract interpretation technique, by a novel symbolic propagation technique. More specifically, the values of neurons are represented symbolically and propagated forward from the input layer to the output layer, on top of abstract domains. We show that our approach can achieve significantly higher precision and thus can prove more properties than using only abstract domains. Moreover, we show that the bounds derived from our approach on the hidden neurons, when applied to a state-of-the-art SMT-based verification tool, can improve its performance. We implement our approach in a software tool and validate it on a few DNNs trained on benchmark datasets such as MNIST.

preprint · 2020 · arXiv

Adaptive and Efficient Algorithms for Tracking the Best Expert

In this paper, we consider the problem of prediction with expert advice in dynamic environments. We choose tracking regret as the performance metric and develop two adaptive and efficient algorithms with data-dependent tracking regret bounds. The first algorithm achieves a second-order tracking regret bound, which improves existing first-order bounds. The second algorithm enjoys a path-length bound, which is generally not comparable to the second-order bound but offers advantages in slowly moving environments. Both algorithms are developed under the online mirror descent framework and draw inspiration from existing algorithms that attain data-dependent bounds of static regret. The key idea is to use a clipped simplex in the updating step of online mirror descent. Finally, we extend our algorithms and analysis to online matrix prediction and provide the first data-dependent tracking regret bound for this problem.

preprint · 2020 · arXiv

Bandit Convex Optimization in Non-stationary Environments

Bandit Convex Optimization (BCO) is a fundamental framework for modeling sequential decision-making with partial information, where the only feedback available to the player is the one-point or two-point function values. In this paper, we investigate BCO in non-stationary environments and choose the dynamic regret as the performance measure, which is defined as the difference between the cumulative loss incurred by the algorithm and that of any feasible comparator sequence. Let $T$ be the time horizon and $P_T$ be the path-length of the comparator sequence that reflects the non-stationarity of environments. We propose a novel algorithm that achieves $O(T^{3/4}(1+P_T)^{1/2})$ and $O(T^{1/2}(1+P_T)^{1/2})$ dynamic regret respectively for the one-point and two-point feedback models. The latter result is optimal, matching the $Ω(T^{1/2}(1+P_T)^{1/2})$ lower bound established in this paper. Notably, our algorithm is more adaptive to non-stationary environments since it does not require prior knowledge of the path-length $P_T$ ahead of time, which is generally unknown.
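
To make the two-point feedback model concrete, the sketch below shows the classic two-point gradient estimator commonly used in bandit convex optimization: the learner queries the loss at two symmetric perturbations of its decision and rescales the difference. The abstract does not spell out the estimator the paper uses, so the function name, the `delta` and `direction` parameters and the uniform sphere sampling are assumptions for this illustration.

```python
import math
import random


def two_point_gradient(f, x, delta, direction=None, rng=random):
    """Estimate the gradient of f at x from f(x + delta*u) and f(x - delta*u)."""
    d = len(x)
    if direction is None:
        # Draw u uniformly from the unit sphere by normalizing a Gaussian vector.
        raw = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(r * r for r in raw))
        direction = [r / norm for r in raw]
    plus = f([xi + delta * ui for xi, ui in zip(x, direction)])
    minus = f([xi - delta * ui for xi, ui in zip(x, direction)])
    scale = d * (plus - minus) / (2.0 * delta)
    return [scale * ui for ui in direction]


if __name__ == "__main__":
    def loss(p):
        # Hypothetical linear loss, used only for the demonstration.
        return 3.0 * p[0] + 4.0 * p[1]

    print(two_point_gradient(loss, [0.0, 0.0], 0.5, direction=[1.0, 0.0]))
```

Only function values are used, never gradients, which is what makes the estimator usable under bandit feedback. For a linear loss the estimate averages to the true gradient over random unit directions, since the expectation of u uᵀ over the unit sphere is the identity divided by the dimension.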

preprint · 2020 · arXiv

Helium Incorporation Stabilized Direct-gap Silicides

The search for direct-gap Si-based semiconductors is of great interest due to their potential application in many technologically relevant fields. This work examines the incorporation of He as a possible route to form a direct band gap in Si. Structure predictions and first-principles calculations have shown that He reacts with Si at high pressure to form the stable compounds Si2He and Si3He. Both compounds have host-guest structures consisting of a channel-like Si host framework filled with He guest atoms. The Si frameworks in the two compounds could be retained at ambient pressure after removal of He, forming two pure Si allotropes. Both Si-He compounds and both Si allotropes exhibit direct or quasi-direct band gaps of 0.84-1.34 eV, close to the optimal value (~1.3 eV) for solar cell applications. Analysis shows that Si2He, with an electric-dipole-transition-allowed band gap, possesses higher absorption capacity than diamond cubic Si, which makes it a promising candidate material for thin-film solar cells.

preprint · 2020 · arXiv

Minimizing Dynamic Regret and Adaptive Regret Simultaneously

Regret minimization is treated as the golden rule in the traditional study of online learning. However, regret minimization algorithms tend to converge to the static optimum, thus being suboptimal for changing environments. To address this limitation, new performance measures, including dynamic regret and adaptive regret have been proposed to guide the design of online algorithms. The former one aims to minimize the global regret with respect to a sequence of changing comparators, and the latter one attempts to minimize every local regret with respect to a fixed comparator. Existing algorithms for dynamic regret and adaptive regret are developed independently, and only target one performance measure. In this paper, we bridge this gap by proposing novel online algorithms that are able to minimize the dynamic regret and adaptive regret simultaneously. In fact, our theoretical guarantee is even stronger in the sense that one algorithm is able to minimize the dynamic regret over any interval.

preprint · 2020 · arXiv

Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs

In this paper, we study the problem of stochastic linear bandits with finite action sets. Most existing work assumes that the payoffs are bounded or sub-Gaussian, which may be violated in some scenarios such as financial markets. To settle this issue, we analyze linear bandits with heavy-tailed payoffs, where the payoffs admit finite $1+ε$ moments for some $ε\in(0,1]$. Through median of means and dynamic truncation, we propose two novel algorithms which enjoy a sublinear regret bound of $\widetilde{O}(d^{\frac{1}{2}}T^{\frac{1}{1+ε}})$, where $d$ is the dimension of contextual information and $T$ is the time horizon. Meanwhile, we provide an $Ω(d^{\frac{ε}{1+ε}}T^{\frac{1}{1+ε}})$ lower bound, which implies that our upper bound matches the lower bound up to polylogarithmic factors in the order of $d$ and $T$ when $ε=1$. Finally, we conduct numerical experiments to demonstrate the effectiveness of our algorithms, and the empirical results strongly support our theoretical guarantees.
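As a rough illustration of the median-of-means idea named in this abstract, the sketch below implements the generic textbook estimator for a scalar mean. It is not the paper's bandit algorithm; the function name `median_of_means` and the equal-size blocking scheme are assumptions made for illustration.

```python
import statistics


def median_of_means(samples, num_groups):
    """Generic median-of-means estimate of a scalar mean (illustrative sketch).

    The samples are split into `num_groups` consecutive blocks of equal size
    (leftover samples at the end are ignored), each block is averaged, and the
    median of the block averages is returned.
    """
    if num_groups < 1 or num_groups > len(samples):
        raise ValueError("num_groups must be between 1 and len(samples)")
    block = len(samples) // num_groups
    means = [
        sum(samples[i * block:(i + 1) * block]) / block
        for i in range(num_groups)
    ]
    return statistics.median(means)


# A single extreme payoff dominates the plain average but not the estimator.
payoffs = [1, 1, 1, 1, 1, 1000]
plain_mean = sum(payoffs) / len(payoffs)
robust_mean = median_of_means(payoffs, 3)
```

With heavy-tailed data, one outlier can only corrupt the block it falls in, so the median of block means stays close to the bulk of the data.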

preprint · 2020 · arXiv

New Polymorphs of Two-Dimensional Indium Selenide with Enhanced Electronic Properties

The two-dimensional (2D) semiconductor indium selenide (InSe) has attracted significant interest due to its unique electronic band structure, high electron mobility and wide tunability of its band gap energy achieved by varying the layer thickness. All these features make 2D InSe a potential candidate for advanced electronic and optoelectronic applications. Here, we report on the discovery of new polymorphs of InSe with enhanced electronic properties. Using a global structure search that combines artificial swarm intelligence with first-principles energetic calculations, we identify polymorphs that consist of a centrosymmetric monolayer belonging to the point group D$_{3d}$, distinct from the well-known polymorphs based on the D$_{3h}$ monolayers that lack inversion symmetry. The new polymorphs are thermodynamically and kinetically stable, and exhibit a wider optical spectral response and larger electron mobilities compared to the known polymorphs. We discuss opportunities to synthesize these newly discovered polymorphs and viable routes to identify them by X-ray diffraction, Raman spectroscopy and second harmonic generation experiments.

preprint · 2020 · arXiv

Proving Non-Inclusion of Büchi Automata based on Monte Carlo Sampling

The search for a proof of correctness and the search for counterexamples (bugs) are complementary aspects of verification. In order to maximize the practical use of verification tools, it is better to pursue them at the same time. While this is well understood in the termination analysis of programs, this is not the case for the language inclusion analysis of Büchi automata, where research has mainly focused on improving algorithms for proving language inclusion, with the search for counterexamples left to the expensive complementation operation. In this paper, we present $\mathsf{IMC}^2$, a specific algorithm for proving Büchi automata non-inclusion $\mathcal{L}(\mathcal{A}) \not\subseteq \mathcal{L}(\mathcal{B})$, based on Grosu and Smolka's algorithm $\mathsf{MC}^2$ developed for Monte Carlo model checking against LTL formulas. The algorithm we propose takes $M = \lceil \ln δ/ \ln (1-ε) \rceil$ random lasso-shaped samples from $\mathcal{A}$ to decide whether to reject the hypothesis $\mathcal{L}(\mathcal{A}) \not\subseteq \mathcal{L}(\mathcal{B})$, for given error probability $ε$ and confidence level $1 - δ$. With such a number of samples, $\mathsf{IMC}^2$ ensures that the probability of witnessing $\mathcal{L}(\mathcal{A}) \not\subseteq \mathcal{L}(\mathcal{B})$ via further sampling is less than $δ$, under the assumption that the probability of finding a lasso counterexample is larger than $ε$. Extensive experimental evaluation shows that $\mathsf{IMC}^2$ is a fast and reliable way to find counterexamples to Büchi automata inclusion.
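To make the sample bound $M = \lceil \ln δ/ \ln (1-ε) \rceil$ concrete, the following minimal sketch evaluates it for given $ε$ and $δ$. Only the formula comes from the abstract; the function name `num_samples` and the input validation are illustrative assumptions, and the lasso sampling itself is not shown.

```python
import math


def num_samples(epsilon, delta):
    """Evaluate M = ceil(ln(delta) / ln(1 - epsilon)) from the abstract.

    `epsilon` is the error probability and `1 - delta` the confidence level.
    The name and the range checks are illustrative assumptions.
    """
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(delta) / math.log(1.0 - epsilon))


# Example: error probability 0.1 and confidence level 0.99 (delta = 0.01).
m = num_samples(0.1, 0.01)
```

The bound follows from requiring $(1-ε)^M \le δ$: if each sample finds a counterexample with probability larger than $ε$, then $M$ independent samples all miss with probability below $δ$.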

preprint · 2020 · arXiv

Stochastic Optimization for Non-convex Inf-Projection Problems

In this paper, we study a family of non-convex and possibly non-smooth inf-projection minimization problems, where the target objective function equals the minimization of a joint function over another variable. This problem class includes difference of convex (DC) functions and a family of bi-convex functions as special cases. We develop stochastic algorithms and establish their first-order convergence for finding a (nearly) stationary solution of the target non-convex function under different conditions on the component functions. To the best of our knowledge, this is the first work that comprehensively studies stochastic optimization of non-convex inf-projection minimization problems with provable convergence guarantees. Our algorithms enable efficient stochastic optimization of a family of non-decomposable DC functions and a family of bi-convex functions. To demonstrate the power of the proposed algorithms, we consider an important application in variance-based regularization. Experiments verify the effectiveness of our inf-projection based formulation and the proposed stochastic algorithm in comparison with previous stochastic algorithms based on the min-max formulation for achieving the same effect.

preprint · 2016 · arXiv

A Simple Homotopy Proximal Mapping for Compressive Sensing

In this paper, we present a novel yet simple homotopy proximal mapping algorithm for compressive sensing. The algorithm adopts a simple proximal mapping of the $\ell_1$ norm at each iteration and gradually reduces the regularization parameter for the $\ell_1$ norm. We prove global linear convergence of the proposed homotopy proximal mapping (HPM) algorithm for solving compressive sensing under three different settings: (i) sparse signal recovery under noiseless measurements, (ii) sparse signal recovery under noisy measurements, and (iii) nearly-sparse signal recovery under sub-Gaussian noisy measurements. In particular, we show that when the measurement matrix satisfies the restricted isometry property (RIP), our theoretical results in settings (i) and (ii) almost recover the best condition on the RIP constants for compressive sensing. In addition, in setting (iii), our results for sparse signal recovery are better than the previous results, and furthermore our analysis explicitly shows that more observations lead to not only more accurate recovery but also faster convergence. Compared with previous studies on linear convergence for sparse signal recovery, our algorithm is simple and efficient, and our results are better and provide more insights. Finally, our empirical studies provide further support for the proposed homotopy proximal mapping algorithm and verify the theoretical results.
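The two ingredients described in this abstract, a proximal mapping of the $\ell_1$ norm (soft-thresholding) and a gradually reduced regularization parameter, can be sketched as follows. This is a simplified illustration, not the paper's HPM algorithm: the single proximal-gradient step per regularization level, the geometric shrink factor, the step size, and all names (`soft_threshold`, `homotopy_prox`, `lam0`, `shrink`, `step`) are assumptions.

```python
import math


def soft_threshold(v, lam):
    """Proximal mapping of lam * ||.||_1: shrink each entry toward zero by lam."""
    return [math.copysign(max(abs(z) - lam, 0.0), z) for z in v]


def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def rmatvec(A, r):
    """Compute A^T r for a matrix stored as a list of rows."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def homotopy_prox(A, y, lam0=1.0, shrink=0.5, step=1.0, iters=20):
    """Illustrative homotopy loop for min 0.5*||Ax - y||^2 + lam*||x||_1.

    Each iteration takes one proximal-gradient step at the current lam and
    then multiplies lam by `shrink` (an assumed schedule).
    """
    x = [0.0] * len(A[0])
    lam = lam0
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        x = soft_threshold([xi - step * g for xi, g in zip(x, grad)], step * lam)
        lam *= shrink
    return x


# Toy check with an identity measurement matrix and a sparse signal.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
recovered = homotopy_prox(identity, [3.0, 0.0, -2.0])
```

Starting with a large regularization parameter keeps early iterates sparse, and shrinking it removes the bias that soft-thresholding introduces on the nonzero entries.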

preprint · 2016 · arXiv

A weighted pair graph representation for reconstructibility of Boolean control networks

A new concept of weighted pair graphs (WPGs) is proposed to represent a new reconstructibility definition for Boolean control networks (BCNs), which is a generalization of the reconstructibility definition given in [Fornasini & Valcher, TAC2013, Def. 4]. Based on the WPG representation, an effective algorithm for determining the new reconstructibility notion for BCNs is designed with the help of the theories of finite automata and formal languages. We prove that a BCN is not reconstructible iff its WPG has a complete subgraph. In addition, we prove that a BCN is reconstructible in the sense of [Fornasini & Valcher, TAC2013, Def. 4] iff its WPG has no cycles, a condition that is simpler to check than the one in [Fornasini & Valcher, TAC2013, Thm. 4].

preprint · 2016 · arXiv

An Efficient Synthesis Algorithm for Parametric Markov Chains Against Linear Time Properties

In this paper, we propose an efficient algorithm for the parameter synthesis of PLTL formulas with respect to parametric Markov chains. The PLTL formula is translated to an almost fully partitioned Büchi automaton, which is then composed with the parametric Markov chain. We then reduce the problem to solving an optimisation problem, which allows us to decide the satisfaction of the formula using an SMT solver. The algorithm also works for interval Markov chains. The complexity is linear in the size of the Markov chain, and exponential in the size of the formula. We provide a prototype and show the efficiency of our approach on a number of benchmarks.

preprint · 2016 · arXiv

Design of ternary alkaline-earth metal Sn(II) oxides with potential good p-type conductivity

Oxides with good p-type conductivity have long been sought after to achieve high-performance all-oxide optoelectronic devices. Divalent Sn(II) based oxides are promising candidates because of their rather dispersive upper valence bands caused by the Sn-5s/O-2p anti-bonding hybridization. So far, few known Sn(II) oxides show p-type conductivity suitable for device applications. Here, we present via first-principles global optimization structure searches a material design study for a hitherto unexplored Sn(II)-based system, ternary alkaline-earth metal Sn(II) oxides in the stoichiometry of MSn2O3 (M = Mg, Ca, Sr, Ba). We identify two stable compounds, SrSn2O3 and BaSn2O3, which can be stabilized by Sn-rich conditions in phase stability diagrams. Their structures follow the Zintl behaviour and consist of basic structural motifs of SnO3 tetrahedra. Unexpectedly, they show distinct electronic properties with band gaps ranging from 1.90 (BaSn2O3) to 3.15 (SrSn2O3) eV, and hole effective masses ranging from 0.87 (BaSn2O3) to above 6.0 (SrSn2O3) m0. Further exploration of metastable phases indicates a wide tunability of electronic properties controlled by the details of the bonding between the basic structural motifs. This suggests further exploration of alkaline-earth metal Sn(II) oxides for potential applications requiring good p-type conductivity such as transparent conductors and photovoltaic absorbers.

preprint · 2016 · arXiv

Efficient Non-oblivious Randomized Reduction for Risk Minimization with Improved Excess Risk Guarantee

In this paper, we address learning problems for high-dimensional data. Previously, oblivious random projection based approaches that project high-dimensional features onto a random subspace have been used in practice for tackling the high-dimensionality challenge in machine learning. Recently, various non-oblivious randomized reduction methods have been developed and deployed for solving many numerical problems such as matrix product approximation, low-rank matrix approximation, etc. However, they are less explored for machine learning tasks, e.g., classification. More seriously, the theoretical analysis of excess risk bounds for risk minimization, an important measure of generalization performance, has not been established for non-oblivious randomized reduction methods. It therefore remains an open question what benefit they offer over previous oblivious random projection based approaches. To tackle these challenges, we propose an algorithmic framework for employing non-oblivious randomized reduction methods for general empirical risk minimization in machine learning tasks, where the original high-dimensional features are projected onto a random subspace that is derived from the data with a small matrix approximation error. We then derive the first excess risk bound for the proposed non-oblivious randomized reduction approach without requiring strong assumptions on the training data. The established excess risk bound shows that the proposed approach provides much better generalization performance, and it also sheds more light on different randomized reduction approaches. Finally, we conduct extensive experiments on both synthetic and real-world benchmark datasets, whose dimension scales to $O(10^7)$, to demonstrate the efficacy of our proposed approach.

preprint · 2016 · arXiv

High-pressure Phase Stability and Superconductivity of Pnictogen Hydrides and Chemical Trends for Compressed Hydrides

Binary hydrides formed by the pnictogens of phosphorus, arsenic and antimony are studied at high pressures using first principles methods. Stable structures are predicted and their electronic, vibrational and superconducting properties are investigated. We predict that SbH$_{4}$ and AsH$_{8}$ will be high-temperature superconductors at megabar pressures, with critical temperatures in excess of 100 K. The highly symmetric hexagonal SbH$_{4}$ phase is predicted to be stabilized above about 150 GPa, which is readily achievable in diamond anvil cell experiments. We find that all phosphorus hydrides are metastable with respect to decomposition into the elements within the pressure range studied. Trends based on our results and literature data reveal a connection between the high-pressure behaviors and ambient-pressure chemical quantities which provides insight into understanding which elements may form hydrogen-rich high-temperature superconducting phases at high pressures.

preprint · 2016 · arXiv

Intrinsic ultralow lattice thermal conductivity of the unfilled skutterudite FeSb$_3$

It has been generally accepted that unfilled skutterudites possess high lattice thermal conductivity ($κ_{l}$) that can be efficiently reduced upon filling. Here, by using first-principles Boltzmann-Peierls transport calculations, we find that the pure skutterudite FeSb$_3$, with no filler, in fact has an intrinsic ultralow $κ_{l}$ smaller than that of CoSb$_3$ by one order of magnitude. The value is even smaller than those of most of the fully filled skutterudites. This finding means that with FeSb$_3$ as a reference, filling does not necessarily lower $κ_{l}$. The ultralow $κ_{l}$ of FeSb$_3$ is a consequence of much softened optical phonon branches associated with the weakly bonded Sb$_4$ rings. They overlap more with heat-carrying acoustic phonons and significantly increase the phase space for three-phonon anharmonic scattering processes. This provides an alternative, non-filling-related mechanism for lowering the $κ_{l}$ of skutterudites.

preprint · 2016 · arXiv

Inverse Design of Inorganic Electrides

Electrides are ionic solids that consist of cationic frameworks and anionic electrons trapped in the voids of lattices. Organic electrides exist in large abundance, but their thermal instability at room temperature and sensitivity to moisture are bottlenecks that limit their practical uses. Known inorganic electrides are rare but appear to have high thermal and chemical stability and exhibit promising applications as electron-emitting materials, superior catalysts and strong reducing agents. Here, we report an inverse-design method we developed that can be used to search for a large variety of inorganic electrides. Our method utilizes the intrinsic property of interstitial electron localization of electrides as the global variable function incorporated into the swarm-intelligence based structure searches. Through screening 99 binary ionic compounds, we have designed 89 new inorganic electrides that are classified into three-, two-, and zero-dimensional species according to the way that the interstitial electrons are localized and the conductive properties of the systems. Our work reveals the rich abundance of inorganic electrides by extending them into more general forms and provides new structure types for electrides that were not previously considered.

preprint · 2016 · arXiv

Optimal Stochastic Strongly Convex Optimization with a Logarithmic Number of Projections

We consider stochastic strongly convex optimization with a complex inequality constraint. This complex inequality constraint may lead to computationally expensive projections in algorithmic iterations of the stochastic gradient descent (SGD) methods. To reduce the computation costs pertaining to the projections, we propose an Epoch-Projection Stochastic Gradient Descent (Epro-SGD) method. The proposed Epro-SGD method consists of a sequence of epochs; it applies SGD to an augmented objective function at each iteration within the epoch, and then performs a projection at the end of each epoch. Given a strongly convex optimization problem and a total of $T$ iterations, Epro-SGD requires only $\log(T)$ projections, and meanwhile attains an optimal convergence rate of $O(1/T)$, both in expectation and with high probability. To exploit the structure of the optimization problem, we propose a proximal variant of Epro-SGD, namely Epro-ORDA, based on the optimal regularized dual averaging method. We apply the proposed methods to real-world applications; the empirical results demonstrate the effectiveness of our methods.

preprint · 2016 · arXiv

Sn(II)-containing phosphates as optoelectronic materials

We theoretically investigate Sn(II) phosphates as optoelectronic materials using first-principles calculations. We focus on known prototype materials Sn$_n$P$_2$O$_{5+n}$ (n=2, 3, 4, 5) and a previously unreported compound, SnP$_2$O$_6$ (n=1), which we find using global optimization structure prediction. The electronic structure calculations indicate that these compounds all have large band gaps above 3.2 eV, implying transparency to visible light. Several of these compounds show relatively low hole effective masses ($\sim$2-3 m$_0$), comparable to the electron masses. This suggests potential bipolar conductivity depending on doping. The dispersive valence band edges underlying the low hole masses originate from the anti-bonding hybridization between the Sn 5s orbitals and the phosphate groups. Analysis of structure-property relationships for the metastable structures generated during structure search shows considerable variation in combinations of band gap and carrier effective masses, implying chemical tunability of these properties. The unusual combinations of relatively high band gap, low carrier masses and high chemical stability suggest possible optoelectronic applications of these Sn(II) phosphates, including p-type transparent conductors. Related to this, calculations for doped material indicate low visible light absorption, combined with high plasma frequencies.

preprint · 2016 · arXiv

Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach

In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer. Under the assumption that there exists an (approximately) sparse solution with high classification accuracy, we argue that the dual solution is also sparse or approximately sparse. The fact that both primal and dual solutions are sparse motivates us to develop a randomized approach for a general convex-concave optimization problem. Specifically, the proposed approach combines the strength of random projection with that of sparse learning: it utilizes random projection to reduce the dimensionality, and introduces $\ell_1$-norm regularization to alleviate the approximation error caused by random projection. Theoretical analysis shows that under favorable conditions, the randomized algorithm can accurately recover the optimal solutions to the convex-concave optimization problem (i.e., recover both the primal and dual solutions).

preprint · 2016 · arXiv

Synthesising Strategy Improvement and Recursive Algorithms for Solving 2.5 Player Parity Games

2.5 player parity games combine the challenges posed by 2.5 player reachability games and the qualitative analysis of parity games. These two types of problems are best approached with different types of algorithms: strategy improvement algorithms for 2.5 player reachability games and recursive algorithms for the qualitative analysis of parity games. We present a method that - in contrast to existing techniques - tackles both aspects with the best suited approach and works exclusively on the 2.5 player game itself. The resulting technique is powerful enough to handle games with several million states.

preprint · 2016 · arXiv

Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient

This work focuses on dynamic regret of online convex optimization that compares the performance of online learning to a clairvoyant who knows the sequence of loss functions in advance and hence selects the minimizer of the loss function at each step. By assuming that the clairvoyant moves slowly (i.e., the minimizers change slowly), we present several improved variation-based upper bounds of the dynamic regret under the true and noisy gradient feedback, which are optimal in light of the presented lower bounds. The key to our analysis is to explore a regularity metric that measures the temporal changes in the clairvoyant's minimizers, to which we refer as path variation. Firstly, we present a general lower bound in terms of the path variation, and then show that under full information or gradient feedback we are able to achieve an optimal dynamic regret. Secondly, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under a stochastic gradient feedback and two-point bandit feedback. Moreover, for a sequence of smooth loss functions that admit a small variation in the gradients, our dynamic regret under the two-point bandit feedback matches what is achieved with full information.
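The abstract describes path variation only as a measure of the temporal changes in the clairvoyant's minimizers. One common way to make that concrete, assumed here rather than taken from the paper, is the sum of Euclidean distances between consecutive minimizers. The function name `path_variation` and the choice of norm are illustrative assumptions.

```python
import math


def path_variation(minimizers):
    """Sum of Euclidean distances between consecutive minimizers.

    `minimizers` is a sequence of equal-length coordinate tuples, one per
    round. The Euclidean norm is an assumption; the abstract does not state
    which norm the paper uses.
    """
    total = 0.0
    for prev, curr in zip(minimizers, minimizers[1:]):
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(prev, curr)))
    return total


# A clairvoyant that moves once and then stays put.
slow_path = [(0.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
variation = path_variation(slow_path)
```

A sequence whose minimizers never move has zero path variation, which corresponds to the static-comparator setting.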

preprint · 2016 · arXiv

Verify LTL with Fairness Assumptions Efficiently

This paper deals with model checking problems with respect to LTL properties under fairness assumptions. We first present an efficient algorithm to deal with a fragment of fairness assumptions and then extend the algorithm to handle arbitrary ones. Notably, by making use of some syntactic transformations, our algorithm avoids constructing the corresponding Büchi automata for the whole fairness assumptions, which can be very large in practice. We implement our algorithm in NuSMV and consider a large selection of formulas. Our experiments show that in many cases our approach outperforms the automata-theoretic approach by up to several orders of magnitude, in both time and memory.

82 published item(s)

Deep But Reliable: Advancing Multi-turn Reasoning for Thinking with Images

Distributed Online Convex Optimization with Efficient Communication: Improved Algorithm and Lower bounds

Stable Routing for Mixture-of-Experts in Class-Incremental Learning

Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs

When and Why SignSGD Outperforms SGD: A Theoretical Study Based on $\ell_1$-norm Lower Bounds

Constrained Language Model Policy Optimization via Risk-aware Stepwise Alignment

ECSAS: Exploring Critical Scenarios from Action Sequence in Autonomous Driving

A Tree-Structured Multi-Task Model Recommender

Adaptive Deep Learning for Entity Resolution by Risk Analysis

Defensive Design of Saturating Counters Based on Differential Privacy

Divide-and-Conquer Determinization of Büchi Automata based on SCC Decomposition

Inorganic Crystal Structure Prototype Database based on Unsupervised Learning of Local Atomic Environments

Momentum Accelerates the Convergence of Stochastic AUPRC Maximization

Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization

Rethinking Hard-Parameter Sharing in Multi-Domain Learning

Towards Practical Robustness Analysis for DNNs based on PAC-Model Learning

Weight Expansion: A New Perspective on Dropout and Generalization

Analyzing Deep Neural Networks with Symbolic Propagation: Towards Higher Precision and Faster Verification

Adaptive and Efficient Algorithms for Tracking the Best Expert

Bandit Convex Optimization in Non-stationary Environments

Helium Incorporation Stabilized Direct-gap Silicides

Minimizing Dynamic Regret and Adaptive Regret Simultaneously

Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs

New Polymorphs of Two-Dimensional Indium Selenide with Enhanced Electronic Properties

Proving Non-Inclusion of Büchi Automata based on Monte Carlo Sampling

Stochastic Optimization for Non-convex Inf-Projection Problems

A Simple Homotopy Proximal Mapping for Compressive Sensing

A weighted pair graph representation for reconstructibility of Boolean control networks

An Efficient Synthesis Algorithm for Parametric Markov Chains Against Linear Time Properties

Design of ternary alkaline-earth metal Sn(II) oxides with potential good p-type conductivity

Efficient Non-oblivious Randomized Reduction for Risk Minimization with Improved Excess Risk Guarantee

High-pressure Phase Stability and Superconductivity of Pnictogen Hydrides and Chemical Trends for Compressed Hydrides

Intrinsic ultralow lattice thermal conductivity of the unfilled skutterudite FeSb$_3$

Inverse Design of Inorganic Electrides

Optimal Stochastic Strongly Convex Optimization with a Logarithmic Number of Projections

Sn(II)-containing phosphates as optoelectronic materials

Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach

Synthesising Strategy Improvement and Recursive Algorithms for Solving 2.5 Player Parity Games

Tracking Slowly Moving Clairvoyant: Optimal Dynamic Regret of Online Learning with True and Noisy Gradient

Verify LTL with Fairness Assumptions Efficiently

A Simple Probabilistic Extension of Modal Mu-calculus

An Explicit Sampling Dependent Spectral Error Bound for Column Subset Selection

ATLAS: A Real-Space Finite-Difference Implementation of Orbital-Free Density Functional Theory

Counterexample-Guided Polynomial Loop Invariant Generation by Lagrange Interpolation

Distribution-based Bisimulation and Bisimulation Metric in Probabilistic Automata

Extending Hybrid CSP with Probability and Stochasticity

Fast Sparse Least-Squares Regression with Non-Asymptotic Guarantees

Intrinsic transparent conductors without doping

Lazy Probabilistic Model Checking without Determinisation

N$_2$H: A Novel Polymeric Hydronitrogen as a High Energy Density Material

Observability of Boolean control networks: A unified approach based on the theories of finite automata

Online Stochastic Linear Optimization under One-bit Feedback

Phase Diagram and High-Temperature Superconductivity of Compressed Selenium Hydrides

Stochastic Proximal Gradient Descent for Nuclear Norm Regularization

Tellurium Hydrides at High Pressures: High-temperature Superconductors

Theory of Dual-sparse Regularized Randomized Reduction

Towards Making High Dimensional Distance Metric Learning Practical

Tuning Optical Properties of Transparent Conducting Barium Stannate by Dimensional Reduction

Beating the Minimax Rate of Active Learning with Prior Knowledge

Binary Excess Risk for Smooth Convex Surrogates

Fast LTL Satisfiability Checking by SAT Solvers

High-order S-Lemma with application to stability of a class of switched nonlinear systems

Late Weak Bisimulation for Markov Automata

LTLf satisfiability checking

Model Checking CSL for Markov Population Models

Probably Safe or Live

Recovering the Optimal Solution by Dual Random Projection

When Equivalence and Bisimulation Join Forces in Probabilistic Automata

Bisimulations and Logical Characterizations on Continuous-time Markov Decision Processes

Bisimulations Meet PCTL Equivalences for Probabilistic Automata

Genetic Design of Enhanced Valley Splitting towards a Spin Qubit in Silicon

Polsat: A Portfolio LTL Satisfiability Solver

Efficient CSL Model Checking Using Stratification

Multiple Kernel Learning from Noisy Labels by Stochastic Programming