Source author record

Bo Jiang

Bo Jiang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

53 works

23 topics

4 close collaborators

preprint, 2026, arXiv

Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

Despite significant progress, RGB-based trackers remain vulnerable to challenging imaging conditions, such as low illumination and fast motion. Event cameras offer a promising alternative by asynchronously capturing pixel-wise brightness changes, providing high dynamic range and high temporal resolution. However, existing event-based trackers often neglect the intrinsic spatial sparsity and temporal density of event data, while relying on a single fixed temporal-window sampling strategy that is suboptimal under varying motion dynamics. In this paper, we propose an event sparsity-aware tracking framework that explicitly models event-density variations across multiple temporal scales. Specifically, the proposed framework progressively injects sparse, medium-density, and dense event search regions into a three-stage Vision Transformer backbone, enabling hierarchical multi-density feature learning. Furthermore, we introduce a sparsity-aware Mixture-of-Experts module to encourage expert specialization under different sparsity patterns, and design a dynamic pondering strategy to adaptively adjust the inference depth according to tracking difficulty. Extensive experiments on FE240hz, COESOT, and EventVOT demonstrate that the proposed approach achieves a favorable trade-off between tracking accuracy and computational efficiency. The source code will be released on https://github.com/Event-AHU/OpenEvTracking.

preprint, 2026, arXiv

Joint Consistency: A Unified Test-Time Aggregation Framework via Energy Minimization

This paper studies test-time aggregation, an approach that generates multiple reasoning traces and aggregates them into a final answer. Most existing methods rely on evaluation signals collected from candidate traces in isolation or answer frequencies, while ignoring comparative interactions among candidates. We propose Joint Consistency (JC), formulated as a constrained Ising-type energy minimization problem, where independent evaluation signals act as external fields and pairwise comparisons act as interactions. JC provides a unified framework for test-time aggregation that subsumes existing voting and weighted aggregation methods as special cases. Our construction of the interaction matrix leverages LLM-as-a-judge comparisons, and admits a theoretical interpretation under answer-level homogeneity assumptions. Moreover, we develop an efficient approximation strategy that makes interaction modeling practical for large-scale test-time aggregation. Experiments on math and code reasoning benchmarks show that JC consistently outperforms existing baselines across tasks, judge models, trace budgets, and trace-generation settings.
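The energy view in this abstract can be illustrated with a toy example. The sketch below is not the paper's formulation, which this page does not give: it assumes binary selection indicators (a lattice-gas form of an Ising model), assumes the constraint that exactly the traces sharing one answer are selected, and uses hypothetical names `energy`, `aggregate`, `h` (independent scores) and `J` (pairwise interaction scores).

```python
from typing import List, Optional, Sequence


def energy(select: Sequence[int], h: Sequence[float],
           J: Sequence[Sequence[float]]) -> float:
    """Toy energy: -(sum_i h_i s_i + sum_{i<j} J_ij s_i s_j), with s_i in {0, 1}."""
    n = len(select)
    field = sum(h[i] * select[i] for i in range(n))
    pair = sum(J[i][j] * select[i] * select[j]
               for i in range(n) for j in range(i + 1, n))
    return -(field + pair)


def aggregate(answers: Sequence[str], h: Sequence[float],
              J: Optional[Sequence[Sequence[float]]] = None) -> str:
    """Pick the answer whose supporting traces give the lowest toy energy."""
    n = len(answers)
    if J is None:
        # No interactions: only the external fields h matter.
        J = [[0.0] * n for _ in range(n)]
    best_answer, best_energy = "", float("inf")
    for candidate in sorted(set(answers)):
        # Assumed constraint: select exactly the traces that give this answer.
        select: List[int] = [1 if a == candidate else 0 for a in answers]
        e = energy(select, h, J)
        if e < best_energy:
            best_answer, best_energy = candidate, e
    return best_answer


if __name__ == "__main__":
    traces = ["42", "42", "17"]
    print(aggregate(traces, [1.0, 1.0, 1.0]))  # uniform fields: majority vote
    print(aggregate(traces, [1.0, 1.0, 5.0]))  # non-uniform fields: weighted vote
```

Under these assumptions, zero interactions with uniform fields reduce to majority voting and non-uniform fields reduce to weighted voting, which mirrors the "special cases" claim in the abstract. A nonzero `J` lets pairwise comparison scores shift the choice.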

preprint, 2026, arXiv

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

Diffusion Language Models (dLLMs) have garnered significant attention for their potential in highly parallel processing. The parallel capabilities of existing dLLMs stem from the assumption of conditional independence at high confidence levels, which ensures negligible discrepancy between the marginal and joint distributions. However, the stringent confidence thresholds required to preserve accuracy severely constrain the scalability of parallelism. Through systematic token-level statistical analysis, we reveal that a substantial proportion of tokens converge to their correct predictions early in the denoising process yet fail to reach standard confidence thresholds, confirming that current confidence-based criteria are overly conservative. In response, we introduce LEAP (Lookahead Early-Convergence Token Detection for Accelerated Parallel Decoding). LEAP is a training-free, plug-and-play method that leverages future context filtering and multi-sequence superposition to detect early-converging tokens. By validating the alignment between early convergence and correctness, we enable reliable early decoding of these tokens. Benchmarking across diverse domains demonstrates that LEAP significantly lowers inference latency and decoding steps. Compared to confidence-based decoding, the average number of denoising steps is reduced by about 30%. On the GSM8K dataset, combining LEAP with dParallel accelerates decoding to 7.2 tokens per step while preserving model precision. LEAP effectively breaks the reliance on high-confidence priors, offering a novel paradigm for parallel decoding.
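For context, the confidence-based criterion that this abstract describes as overly conservative can be sketched as follows. This is only the baseline thresholding rule, not LEAP itself; the function name `select_confident_positions`, the default threshold, and the fallback to the single most confident position are assumptions for illustration.

```python
from typing import Dict, List, Sequence


def select_confident_positions(probs: Sequence[Sequence[float]],
                               masked: Sequence[int],
                               threshold: float = 0.9) -> List[int]:
    """Return masked positions whose top-1 probability meets the threshold.

    If none qualifies, fall back to the single most confident position so
    that every denoising step commits at least one token.
    """
    confidence: Dict[int, float] = {pos: max(probs[pos]) for pos in masked}
    chosen = [pos for pos in masked if confidence[pos] >= threshold]
    if not chosen and confidence:
        chosen = [max(confidence, key=lambda pos: confidence[pos])]
    return chosen


if __name__ == "__main__":
    # Hypothetical per-position distributions over a two-token vocabulary.
    step_probs = [[0.95, 0.05], [0.60, 0.40], [0.50, 0.50]]
    print(select_confident_positions(step_probs, masked=[0, 1, 2]))
```

A stricter threshold commits fewer tokens per step, which is the trade-off between accuracy and parallelism that the abstract discusses.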

preprint, 2026, arXiv

Some new results on determinants and permanents

In this paper we confirm several conjectures on determinants and permanents. For example, we prove that for any prime $p\equiv3\pmod 4$ the number $2\det[a_{jk}]_{0\le j,k\le (p-1)/2}$ is congruent to a square modulo $p$, where $a_{jk}=(\frac{j+k}{p})+(\frac{j^2+k^2}{p})$ with $(\frac{\cdot}{p})$ the Legendre symbol. We also prove that ${\rm per}[j^{k-1}]_{1\leq j,k\leq n-1}\equiv0\pmod n$ for any integer $n>1$ with $n\not\equiv2\pmod 4$.
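The permanent congruence stated above can be spot-checked numerically for small $n$. The sketch below is a brute-force check, not a proof; the helper names `permanent` and `power_matrix` are illustrative and do not come from the paper.

```python
from itertools import permutations
from math import prod
from typing import List


def permanent(matrix: List[List[int]]) -> int:
    """Permanent by summing over all permutations (fine for small sizes)."""
    n = len(matrix)
    return sum(prod(matrix[i][sigma[i]] for i in range(n))
               for sigma in permutations(range(n)))


def power_matrix(n: int) -> List[List[int]]:
    """The (n-1) x (n-1) matrix [j^(k-1)] for 1 <= j, k <= n-1."""
    return [[j ** (k - 1) for k in range(1, n)] for j in range(1, n)]


if __name__ == "__main__":
    for n in (3, 4, 5, 7, 8, 9):  # integers n > 1 with n not congruent to 2 mod 4
        per = permanent(power_matrix(n))
        print(n, per, per % n)
```

For example, $n=3$ gives the matrix $[[1,1],[1,2]]$ with permanent $3$, and $n=4$ gives permanent $48$, both divisible by $n$. For the excluded case $n=2$ the matrix is $[1]$, whose permanent $1$ is not divisible by $2$.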

preprint, 2025, arXiv

Benchmarking LLMs for Fine-Grained Code Review with Enriched Context in Practice

Code review is a cornerstone of software quality assurance, and recent advances in Large Language Models (LLMs) have shown promise in its automation. However, existing benchmarks for LLM-based code review face three major limitations. Lack of semantic context: most benchmarks provide only code diffs without textual information such as issue descriptions, which are crucial for understanding developer intent. Data quality issues: without rigorous validation, many samples are noisy (e.g., reviews on outdated or irrelevant code), reducing evaluation reliability. Coarse granularity: most benchmarks operate at the file or commit level, overlooking the fine-grained, line-level reasoning essential for precise review. We introduce ContextCRBench, a high-quality, context-rich benchmark for fine-grained LLM evaluation in code review. Our construction pipeline comprises: Raw Data Crawling, collecting 153.7K issues and pull requests from top-tier repositories; Comprehensive Context Extraction, linking issue-PR pairs for textual context and extracting the full surrounding function or class for code context; and Multi-stage Data Filtering, combining rule-based and LLM-based validation to remove outdated, malformed, or low-value samples, resulting in 67,910 context-enriched entries. ContextCRBench supports three evaluation scenarios aligned with the review workflow: hunk-level quality assessment, line-level defect localization, and line-level comment generation. Evaluating eight leading LLMs (four closed-source and four open-source) reveals that textual context yields greater performance gains than code context alone, while current LLMs remain far from human-level review ability. Deployed at ByteDance, ContextCRBench drives a self-evolving code review system, improving performance by 61.98% and demonstrating its robustness and industrial utility. https://github.com/kinesiatricssxilm14/ContextCRBench.

preprint, 2024, arXiv

CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras

Existing datasets for RGB-DVS tracking are collected with the DVS346 camera, and their resolution ($346 \times 260$) is low for practical applications. In fact, only visible cameras are deployed in many practical systems, and newly designed neuromorphic cameras may have different resolutions. The latest neuromorphic sensors can output high-definition event streams, but it is very difficult to achieve strict alignment between events and frames in both the spatial and temporal views. Therefore, how to achieve accurate tracking with unaligned neuromorphic and visible sensors is a valuable but unexplored problem. In this work, we formally propose the task of object tracking using unaligned neuromorphic and visible cameras. We build the first unaligned frame-event dataset, CRSOT, collected with a specially built data acquisition system, which contains 1,030 high-definition RGB-Event video pairs and 304,974 video frames. In addition, we propose a novel unaligned object tracking framework that can realize robust tracking even with loosely aligned RGB-Event data. Specifically, we extract the template and search regions of the RGB and Event data and feed them into a unified ViT backbone for feature embedding. Then, we propose uncertainty perception modules to encode the RGB and Event features, respectively, and a modality uncertainty fusion module to aggregate the two modalities. These three branches are jointly optimized in the training phase. Extensive experiments demonstrate that our tracker can combine the two modalities for high-performance tracking even without strict temporal and spatial alignment. The source code, dataset, and pre-trained models will be released at https://github.com/Event-AHU/Cross_Resolution_SOT.

preprint, 2024, arXiv

Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric

Combining Color and Event cameras (also called Dynamic Vision Sensors, DVS) for robust object tracking is a newly emerging research topic. Existing color-event tracking frameworks usually contain multiple scattered modules, including feature extraction, fusion, matching, and interactive learning, which may lead to low efficiency and high computational complexity. In this paper, we propose a single-stage backbone network for Color-Event Unified Tracking (CEUTrack), which achieves the above functions simultaneously. Given the event points and RGB frames, we first transform the points into voxels and crop the template and search regions for both modalities. Then, these regions are projected into tokens and fed in parallel into the unified Transformer backbone network. The output features are fed into a tracking head for target object localization. Our proposed CEUTrack is simple, effective, and efficient, achieving over 75 FPS and new SOTA performance. To better validate the effectiveness of our model and address the data deficiency of this task, we also propose a generic and large-scale benchmark dataset for color-event tracking, termed COESOT, which contains 90 categories and 1354 video sequences. Additionally, a new evaluation metric named BOC is proposed in our evaluation toolkit to evaluate the prominence with respect to the baseline methods. We hope the newly proposed method, dataset, and evaluation metric provide a better platform for color-event-based tracking. The dataset, toolkit, and source code will be released on: https://github.com/Event-AHU/COESOT.

preprint, 2024, arXiv

Understanding Representation Learnability of Nonlinear Self-Supervised Learning

Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks. There are only a few theoretical works on data representation learnability, and many of those focus on the final data representation, treating the nonlinear neural network as a "black box". However, the accurate learning results of neural networks are crucial for describing the data distribution features learned by SSL models. Our paper is the first to analyze the learning results of the nonlinear SSL model accurately. We consider a toy data distribution that contains two features: the label-related feature and the hidden feature. Unlike previous work in the linear setting that depends on closed-form solutions, we use the gradient descent algorithm to train a 1-layer nonlinear SSL model with a certain initialization region and prove that the model converges to a local minimum. Furthermore, in contrast to complex iterative analysis, we propose a new analysis process that uses the exact version of the Inverse Function Theorem to accurately describe the features learned by the local minimum. With this local minimum, we prove that the nonlinear SSL model can capture the label-related feature and the hidden feature at the same time. In contrast, the nonlinear supervised learning (SL) model can only learn the label-related feature. We also present the learning processes and results of the nonlinear SSL and SL models via simulation experiments.

preprint, 2024, arXiv

Unifying Graph Contrastive Learning via Graph Message Augmentation

Graph contrastive learning is usually performed by first conducting Graph Data Augmentation (GDA) and then employing a contrastive learning pipeline to train GNNs. GDA is known to be an important issue for graph contrastive learning. Various GDAs have been developed recently, which mainly involve dropping or perturbing edges, nodes, node attributes and edge attributes. However, to our knowledge, there is still no universal and effective augmentor that is suitable for different types of graph data. To address this issue, in this paper, we first introduce the graph message representation of graph data. Based on it, we then propose a novel Graph Message Augmentation (GMA), a universal scheme for reformulating many existing GDAs. The proposed unified GMA not only gives a new perspective for understanding many existing GDAs but also provides a universal and more effective graph data augmentation for graph self-supervised learning tasks. Moreover, GMA introduces an easy way to implement the mixup augmentor, which is natural for images but usually challenging for graphs. Based on the proposed GMA, we then propose a unified graph contrastive learning method, termed Graph Message Contrastive Learning (GMCL), that employs attribution-guided universal GMA for graph contrastive learning. Experiments on many graph learning tasks demonstrate the effectiveness and benefits of the proposed GMA and GMCL approaches.

preprint, 2023, arXiv

Flexible Alignment Super-Resolution Network for Multi-Contrast MRI

Magnetic resonance imaging plays an essential role in clinical diagnosis by acquiring the structural information of biological tissue. Recently, many multi-contrast MRI super-resolution networks have achieved good results. However, most studies ignore the impact of the inappropriate foreground scale and patch size of multi-contrast MRI, which probably leads to inappropriate feature alignment. To tackle this problem, we propose the Flexible Alignment Super-Resolution Network (FASR-Net) for multi-contrast MRI Super-Resolution. The Flexible Alignment module of FASR-Net consists of two modules for feature alignment. (1) The Single-Multi Pyramid Alignment (S-A) module handles the situation where low-resolution (LR) images and reference (Ref) images have different scales. (2) The Multi-Multi Pyramid Alignment (M-A) module handles the situation where LR and Ref images have the same scale. In addition, we propose the Cross-Hierarchical Progressive Fusion (CHPF) module, which aims to fuse the features effectively and further improve the image quality. Compared with other state-of-the-art methods, FASR-Net achieves the most competitive results on the FastMRI and IXI datasets. Our code will be available at https://github.com/yimingliu123/FASR-Net.

preprint, 2023, arXiv

FuncPipe: A Pipelined Serverless Framework for Fast and Cost-efficient Training of Deep Learning Models

Training deep learning (DL) models in the cloud has become a norm. With the emergence of serverless computing and its benefits of true pay-as-you-go pricing and scalability, systems researchers have recently started to provide support for serverless-based training. However, the ability to train DL models on serverless platforms is hindered by the resource limitations of today's serverless infrastructure and DL models' explosive requirement for memory and bandwidth. This paper describes FuncPipe, a novel pipelined training framework specifically designed for serverless platforms that enables fast and low-cost training of DL models. FuncPipe is designed with the key insight that model partitioning can be leveraged to bridge both memory and bandwidth gaps between the capacity of serverless functions and the requirement of DL training. Although the idea is conceptually simple, we have to answer several design questions, including how to partition the model, configure each serverless function, and exploit each function's uplink/downlink bandwidth. In particular, we tailor a micro-batch scheduling policy for the serverless environment, which serves as the basis for the subsequent optimization. Our Mixed-Integer Quadratic Programming formulation automatically and simultaneously configures serverless resources and partitions models to fit within the resource constraints. Lastly, we improve the bandwidth efficiency of storage-based synchronization with a novel pipelined scatter-reduce algorithm. We implement FuncPipe on two popular cloud serverless platforms and show that it achieves 7%-77% cost savings and 1.3X-2.2X speedup compared to state-of-the-art serverless-based frameworks.

preprint, 2023, arXiv

Understanding the convergence of the preconditioned PDHG method: a view of indefinite proximal ADMM

The primal-dual hybrid gradient (PDHG) algorithm is popular for solving min-max problems, which are widely used in a variety of areas. To improve the applicability and efficiency of PDHG for different application scenarios, we focus on the preconditioned PDHG (PrePDHG) algorithm, which is a framework covering PDHG, the alternating direction method of multipliers (ADMM), and other methods. We give the optimal convergence condition of PrePDHG in the sense that the key parameters in the condition cannot be further improved, which fills the theoretical gap in the state-of-the-art convergence results of PrePDHG, and obtain the ergodic and non-ergodic sublinear convergence rates of PrePDHG. The theoretical analysis is achieved by establishing the equivalence between PrePDHG and indefinite proximal ADMM. In addition, we discuss various choices of the proximal matrices in PrePDHG and derive some interesting results. For example, the convergence condition of diagonal PrePDHG is improved to be tight, the dual stepsize of the balanced augmented Lagrangian method can be enlarged to $4/3$ from $1$, and a balanced augmented Lagrangian method with symmetric Gauss-Seidel iterations is also explored. Numerical results on the matrix game, projection onto the Birkhoff polytope, earth mover's distance, and CT reconstruction verify the effectiveness and superiority of PrePDHG.
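To make the iteration concrete, here is a minimal sketch of plain PDHG with scalar step sizes on a small quadratic saddle-point problem. It is not the preconditioned variant analyzed in the paper, which replaces the scalar steps with proximal matrices. The test problem $\min_x \max_y \tfrac12\|x-a\|^2 + \langle Kx, y\rangle - \tfrac12\|y-b\|^2$, the helper names, and the classical step-size choice $\tau\sigma\|K\|^2 < 1$ are assumptions chosen so that both proximal steps have closed forms.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(K: Matrix, x: Vector) -> Vector:
    """Compute K x."""
    return [sum(k_ij * x_j for k_ij, x_j in zip(row, x)) for row in K]


def rmatvec(K: Matrix, y: Vector) -> Vector:
    """Compute K^T y."""
    return [sum(K[i][j] * y[i] for i in range(len(K)))
            for j in range(len(K[0]))]


def pdhg(K: Matrix, a: Vector, b: Vector, tau: float, sigma: float,
         iters: int = 2000) -> Tuple[Vector, Vector]:
    """Plain PDHG for min_x max_y 0.5||x-a||^2 + <Kx, y> - 0.5||y-b||^2."""
    x = [0.0] * len(a)
    y = [0.0] * len(b)
    x_bar = x[:]
    for _ in range(iters):
        # Dual step: prox of sigma * 0.5||y-b||^2 at y + sigma * K x_bar.
        Kx = matvec(K, x_bar)
        y = [(y_i + sigma * (kx_i + b_i)) / (1.0 + sigma)
             for y_i, kx_i, b_i in zip(y, Kx, b)]
        # Primal step: prox of tau * 0.5||x-a||^2 at x - tau * K^T y.
        KTy = rmatvec(K, y)
        x_new = [(x_i - tau * kty_i + tau * a_i) / (1.0 + tau)
                 for x_i, kty_i, a_i in zip(x, KTy, a)]
        # Extrapolation with parameter 1.
        x_bar = [2.0 * xn - xo for xn, xo in zip(x_new, x)]
        x = x_new
    return x, y


if __name__ == "__main__":
    K = [[1.0, 2.0], [0.0, 1.0]]
    # ||K||^2 is about 5.83 here, so tau * sigma * ||K||^2 is about 0.93 < 1.
    x, y = pdhg(K, a=[1.0, 0.0], b=[0.0, 1.0], tau=0.4, sigma=0.4)
    print(x, y)
```

The saddle point of this toy problem satisfies $x - a + K^\top y = 0$ and $y = Kx + b$, which gives $x^\ast = (1, -0.5)$ and $y^\ast = (0, 0.5)$ for the data above.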

preprint, 2022, arXiv

A Novel Negative $\ell_1$ Penalty Approach for Multiuser One-Bit Massive MIMO Downlink with PSK Signaling

This paper considers the one-bit precoding problem for the multiuser downlink massive multiple-input multiple-output (MIMO) system with phase shift keying (PSK) modulation and focuses on the celebrated constructive interference (CI)-based problem formulation. The existence of the discrete one-bit constraint makes the problem generally hard to solve. In this paper, we propose an efficient negative $\ell_1$ penalty approach for finding a high-quality solution of the considered problem. Specifically, we first propose a novel negative $\ell_1$ penalty model, which penalizes the one-bit constraint into the objective with a negative $\ell_1$-norm term, and show the equivalence between (global and local) solutions of the original problem and the penalty problem when the penalty parameter is sufficiently large. We further transform the penalty model into an equivalent min-max problem and propose an efficient alternating optimization (AO) algorithm for solving it. The AO algorithm enjoys low per-iteration complexity and is guaranteed to converge to the stationary point of the min-max problem. Numerical results show that, compared against the state-of-the-art CI-based algorithms, the proposed algorithm generally achieves better bit-error-rate (BER) performance with lower computational cost.

preprint, 2022, arXiv

A Unified Framework for Generalized Moment Problems: a Novel Primal-Dual Approach

Generalized moment problems optimize functional expectation over a class of distributions with generalized moment constraints, i.e., the function in the moment can be any measurable function. These problems have recently attracted growing interest due to their great flexibility in representing nonstandard moment constraints, such as geometry-mean constraints, entropy constraints, and exponential-type moment constraints. Despite the increasing research interest, analytical solutions are mostly missing for these problems, and researchers have to settle for nontight bounds or numerical approaches that are either suboptimal or only applicable to some special cases. In addition, the techniques used to develop closed-form solutions to the standard moment problems are tailored for specific problem structures. In this paper, we propose a framework that provides a unified treatment for any moment problem. The key ingredient of the framework is a novel primal-dual optimality condition. This optimality condition enables us to reduce the original infinite-dimensional problem to a nonlinear equation system with a finite number of variables. In solving three specific moment problems, the framework demonstrates a clear path for identifying the analytical solution if one is available; otherwise, it produces semi-analytical solutions that lead to efficient numerical algorithms. Finally, through numerical experiments, we provide further evidence regarding the performance of the resulting algorithms by solving a moment problem and a distributionally robust newsvendor problem.

preprint, 2022, arXiv

Accelerating Adaptive Cubic Regularization of Newton's Method via Random Sampling

In this paper, we consider an unconstrained optimization model where the objective is a sum of a large number of possibly nonconvex functions, though overall the objective is assumed to be smooth and convex. Our approach to solving such a model uses the framework of cubic regularization of Newton's method. As is well known, the crux of cubic regularization is its use of Hessian information, which may be computationally expensive for large-scale problems. To tackle this, we resort to approximating the Hessian matrix via sub-sampling. In particular, we propose to compute an approximated Hessian matrix by either uniformly or non-uniformly sub-sampling the components of the objective. Based upon such a sampling strategy, we develop accelerated adaptive cubic regularization approaches and provide theoretical guarantees on a global iteration complexity of $O(\epsilon^{-1/3})$ with high probability, which matches that of the original accelerated cubic regularization methods [Jiang-2017-Unified] using the full Hessian information. Interestingly, we show that in the worst-case scenario our algorithm still achieves an $O\left(\log(\epsilon^{-1})\epsilon^{-5/6}\right)$ iteration complexity bound. The performance of the proposed methods on regularized logistic regression problems shows a clear effect of acceleration in terms of the epoch counts on several real data sets.

preprint, 2022, arXiv

BQA: A High-performance Quantum Circuits Scheduling Strategy Based on Heuristic Search

Currently, quantum computing is developing at a high speed because its high parallelism and high computing power bring new solutions to many fields. However, due to chip process technology, it is difficult to achieve full coupling of all qubits on a quantum chip, so when compiling a quantum circuit onto a physical chip, it is necessary to ensure that each two-qubit gate acts on a pair of coupled qubits by inserting swap gates. Inserting a large number of swap gates incurs a great additional cost and lengthens the execution time of quantum circuits. In this paper, we design BQA (Busy Qubits Avoid), a method that inserts swap gates based on how busy the qubits are. We exploit the imbalance in the number of gates on qubits, trying to hide the overhead of swap gates. At the same time, we also expect swap gates to have as little negative impact on subsequent two-qubit gates as possible. We have designed a heuristic function that takes both of these points into account. Compared with qiskit, the execution time of the circuit optimized by our proposed method is only 0.5 times that of the qiskit compiled circuit. When the number of two-qubit gates is large, the improvement is greater than in the general case. This implies higher execution efficiency and a lower decoherence error rate.

preprint, 2022, arXiv

Complexity and computation for the spectral norm and nuclear norm of order three tensors with one fixed dimension

The recent decade has witnessed a surge of research in modelling and computing from two-way data (matrices) to multiway data (tensors). However, there is a drastic phase transition for most tensor optimization problems when the order of a tensor increases from two (a matrix) to three: most tensor problems are NP-hard while their matrix counterparts are easy. This raises the question of where exactly the transition occurs. The paper aims to study this kind of question for the spectral norm and the nuclear norm. Although computing the spectral norm for a general $\ell\times m\times n$ tensor is NP-hard, we show that it can be computed in polynomial time if $\ell$ is fixed. The same holds for the nuclear norm. While these polynomial-time methods are not implementable in practice, we propose fully polynomial-time approximation schemes (FPTAS) for the spectral norm based on spherical grids and for the nuclear norm with further help of duality theory and semidefinite optimization. Numerical experiments on simulated data show that our FPTAS can compute these tensor norms for small $\ell \le 6$ but large $m, n\ge50$. To the best of our knowledge, this is the first method that can compute the nuclear norm of general asymmetric tensors. Both our polynomial-time algorithms and FPTAS can be extended to higher-order tensors as well.

preprint, 2022, arXiv

Exploring students' backtracking behaviors in digital textbooks and its relationship to learning styles

The purpose of this study is to explore students' backtracking patterns in using a digital textbook and reveal the relationship between backtracking behaviors and academic performance as well as learning styles. The study was carried out for two semesters on 102 university students, who were required to use a digital textbook system called DITeL to review courseware. Students' backtracking behaviors are characterized by seven backtracking features extracted from interaction log data, and their learning styles are measured by the Felder-Silverman learning style model. The results of the study reveal that there is a subgroup of students, called backtrackers, who backtrack more frequently and perform better than the average student. Furthermore, the causal inference analysis reveals that a higher initial ability can directly cause a higher frequency of backtracking, thus affecting the final test score. In addition, most backtrackers are reflective and visual learners, and the seven backtracking features are good predictors in automatically identifying learning styles. Based on the results of qualitative data analysis, recommendations were made on how to provide prompt backtracking assistants and automatically detect learning styles in digital textbooks.

preprint, 2022, arXiv

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

Recently, transformer-based methods have achieved promising progress in object detection, as they can eliminate post-processing steps such as NMS and enrich the deep representations. However, these methods cannot cope well with scene text due to its extreme variance in scales and aspect ratios. In this paper, we present a simple yet effective transformer-based architecture for scene text detection. Different from previous approaches that learn robust deep representations of scene text in a holistic manner, our method performs scene text detection based on a few representative features, which avoids disturbance from the background and reduces the computational cost. Specifically, we first select a few representative features at all scales that are highly relevant to foreground text. Then, we adopt a transformer for modeling the relationship of the sampled features, which effectively divides them into reasonable groups. As each feature group corresponds to a text instance, its bounding box can be easily obtained without any post-processing operation. Using the basic feature pyramid network for feature extraction, our method consistently achieves state-of-the-art results on several popular datasets for scene text detection.

preprint, 2022, arXiv

Few-Shot Learning Meets Transformer: Unified Query-Support Transformers for Few-Shot Classification

Few-shot classification, which aims to recognize unseen classes using very limited samples, has attracted more and more attention. Usually, it is formulated as a metric learning problem. The core issue of few-shot classification is how to learn (1) consistent representations for images in both support and query sets and (2) effective metric learning for images between support and query sets. In this paper, we show that the two challenges can be well modeled simultaneously via a unified Query-Support TransFormer (QSFormer) model. To be specific, the proposed QSFormer involves a global query-support sample Transformer (sampleFormer) branch and a local patch Transformer (patchFormer) learning branch. sampleFormer aims to capture the dependence of samples in support and query sets for image representation. It adopts the Encoder, Decoder and Cross-Attention to respectively model the Support, Query (image) representation and Metric learning for the few-shot classification task. Also, as a complement to the global learning branch, we adopt a local patch Transformer to extract structural representation for each image sample by capturing the long-range dependence of local image patches. In addition, a novel Cross-scale Interactive Feature Extractor (CIFE) is proposed to extract and fuse multi-scale CNN features as an effective backbone module for the proposed few-shot learning method. All modules are integrated into a unified framework and trained in an end-to-end manner. Extensive experiments on four popular datasets demonstrate the effectiveness and superiority of the proposed QSFormer.

preprint, 2022, arXiv

Generalizing Aggregation Functions in GNNs: High-Capacity GNNs via Nonlinear Neighborhood Aggregators

Graph neural networks (GNNs) have achieved great success in many graph learning tasks. The main aspect powering existing GNNs is the multi-layer network architecture that learns nonlinear graph representations for specific learning tasks. The core operation in GNNs is message propagation, in which each node updates its representation by aggregating its neighbors' representations. Existing GNNs mainly adopt either linear neighborhood aggregation (mean, sum) or the max aggregator in their message propagation. (1) For linear aggregators, the overall nonlinearity and capacity of GNNs are generally limited because deeper GNNs usually suffer from the over-smoothing issue. (2) The max aggregator usually fails to capture the detailed information of node representations within a neighborhood. To overcome these issues, we rethink the message propagation mechanism in GNNs and aim to develop general nonlinear aggregators for neighborhood information aggregation in GNNs. One main aspect of our proposed nonlinear aggregators is that they provide optimally balanced aggregators between max and mean/sum aggregations. Thus, our aggregators can inherit both (i) high nonlinearity that increases the network's capacity and (ii) detail-sensitivity that preserves the detailed information of representations in GNNs' message propagation. Promising experiments on several datasets show the effectiveness of the proposed nonlinear aggregators.
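One simple way to see how an aggregator can sit between mean and max is a softmax-weighted average with an inverse-temperature parameter. This is a generic illustration and is not claimed to be one of the paper's aggregators; the function name `softmax_aggregate` and the parameter `beta` are assumptions. With `beta = 0` it equals the mean, and as `beta` grows it approaches the element-wise max.

```python
import math
from typing import List, Sequence


def softmax_aggregate(neighbors: Sequence[Sequence[float]],
                      beta: float) -> List[float]:
    """Aggregate neighbor feature vectors dimension by dimension.

    Each output entry is sum_i w_i * x_i with w = softmax(beta * x), so the
    result interpolates between the mean (beta = 0) and the max (large beta).
    """
    dims = len(neighbors[0])
    out: List[float] = []
    for d in range(dims):
        values = [h[d] for h in neighbors]
        shift = max(beta * v for v in values)  # subtract for numerical stability
        weights = [math.exp(beta * v - shift) for v in values]
        total = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, values)) / total)
    return out


if __name__ == "__main__":
    # Hypothetical two-dimensional features of three neighbors.
    feats = [[1.0, 0.0], [2.0, 3.0], [4.0, -1.0]]
    print(softmax_aggregate(feats, beta=0.0))   # mean per dimension
    print(softmax_aggregate(feats, beta=1.0))   # in between
    print(softmax_aggregate(feats, beta=50.0))  # close to max per dimension
```

Unlike a hard max, every neighbor keeps a nonzero weight for finite `beta`, which is the kind of detail-sensitivity the abstract contrasts with the max aggregator.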

preprint2022arXiv

HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction

In this report, we introduce our solution to the Occupancy and Flow Prediction challenge in the Waymo Open Dataset Challenges at CVPR 2022, which ranks 1st on the leaderboard. We have developed a novel hierarchical spatial-temporal network featured with spatial-temporal encoders, a multi-scale aggregator enriched with latent variables, and a recursive hierarchical 3D decoder. We use multiple losses including focal loss and modified flow trace loss to efficiently guide the training process. Our method achieves a Flow-Grounded Occupancy AUC of 0.8389 and outperforms all the other teams on the leaderboard.

preprint2022arXiv

MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking

Many RGB-T trackers attempt to attain robust feature representation by utilizing an adaptive weighting scheme (or attention mechanism). Different from these works, we propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data by adaptively adjusting the convolutional kernels for various input images in practical tracking. Given the image pairs as input, we first encode their features with the backbone network. Then, we concatenate these feature maps and generate dynamic modality-aware filters with two independent networks. The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively. Inspired by residual connection, both the generated visible and thermal feature maps will be summed with the input feature maps. The augmented feature maps will be fed into the RoI align module to generate instance-level features for subsequent classification. To address issues caused by heavy occlusion, fast motion and out-of-view, we propose to conduct a joint local and global search by exploiting a new direction-aware target driven attention mechanism. The spatial and temporal recurrent neural network is used to capture the direction-aware context for accurate global attention prediction. Extensive experiments on three large-scale RGB-T tracking benchmark datasets validated the effectiveness of our proposed algorithm. The source code of this paper is available at https://github.com/wangxiao5791509/MFG_RGBT_Tracking_PyTorch.

preprint2022arXiv

Unified GCNs: Towards Connecting GCNs with CNNs

Graph Convolutional Networks (GCNs) have widely demonstrated their powerful ability in graph data representation and learning. Existing graph convolution layers are mainly designed based on graph signal processing and transform aspects, and they usually suffer from some limitations, such as over-smoothing, over-squashing and non-robustness. As is well known, Convolutional Neural Networks (CNNs) have achieved great success in many computer vision and machine learning tasks. One main aspect is that CNNs leverage many learnable convolution filters (kernels) to obtain rich feature descriptors and thus can have high capacity to encode complex patterns in visual data analysis. Also, CNNs are flexible in designing their network architecture, such as MobileNet, ResNet, Xception, etc. Therefore, a natural question arises: can we design graph convolutional layers as flexibly as those in CNNs? In this paper, we consider connecting GCNs with CNNs deeply from a general perspective of the depthwise separable convolution operation. Specifically, we show that GCN and GAT indeed perform some specific depthwise separable convolution operations. This novel interpretation enables us to better understand the connections between GCNs (GCN, GAT) and CNNs and further inspires us to design more Unified GCNs (UGCNs). As two showcases, we implement two UGCNs, i.e., Separable UGCN (S-UGCN) and General UGCN (G-UGCN) for graph data representation and learning. Promising experiments on several graph representation benchmarks demonstrate the effectiveness and advantages of the proposed UGCNs.

preprint2022arXiv

Varying Coefficient Model via Adaptive Spline Fitting

The varying coefficient model has received broad attention from researchers as it is a powerful dimension reduction tool for non-parametric modeling. Most existing varying coefficient models fitted with polynomial spline assume equidistant knots and take the number of knots as the hyperparameter. However, imposing equidistant knots appears to be too rigid, and determining the optimal number of knots systematically is also a challenge. In this article, we deal with this challenge by utilizing polynomial splines with adaptively selected and predictor-specific knots to fit the coefficients in varying coefficient models. An efficient dynamic programming algorithm is proposed to find the optimal solution. Numerical results show that the new method can achieve significantly smaller mean squared errors for coefficients compared with the equidistant spline fitting method.

preprint2021arXiv

Tightness and Equivalence of Semidefinite Relaxations for MIMO Detection

The multiple-input multiple-output (MIMO) detection problem, a fundamental problem in modern digital communications, is to detect a vector of transmitted symbols from the noisy outputs of a fading MIMO channel. The maximum likelihood detector can be formulated as a complex least-squares problem with discrete variables, which is NP-hard in general. Various semidefinite relaxation (SDR) methods have been proposed in the literature to solve the problem due to their polynomial-time worst-case complexity and good detection error rate performance. In this paper, we consider two popular classes of SDR-based detectors and study the conditions under which the SDRs are tight and the relationship between different SDR models. For the enhanced complex and real SDRs proposed recently by Lu et al., we refine their analysis and derive the necessary and sufficient condition for the complex SDR to be tight, as well as a necessary condition for the real SDR to be tight. In contrast, we also show that another SDR proposed by Mobasher et al. is not tight with high probability under mild conditions. Moreover, we establish a general theorem that shows the equivalence between two subsets of positive semidefinite matrices in different dimensions by exploiting a special "separable" structure in the constraints. Our theorem recovers two existing equivalence results of SDRs defined in different settings and has the potential to find other applications due to its generality.

preprint2020arXiv

cmSalGAN: RGB-D Salient Object Detection with Cross-View Generative Adversarial Networks

Image salient object detection (SOD) is an active research topic in computer vision and multimedia area. Fusing complementary information of RGB and depth has been demonstrated to be effective for image salient object detection which is known as RGB-D salient object detection problem. The main challenge for RGB-D salient object detection is how to exploit the salient cues of both intra-modality (RGB, depth) and cross-modality simultaneously which is known as cross-modality detection problem. In this paper, we tackle this challenge by designing a novel cross-modality Saliency Generative Adversarial Network (cmSalGAN). cmSalGAN aims to learn an optimal view-invariant and consistent pixel-level representation for RGB and depth images via a novel adversarial learning framework, which thus incorporates both information of intra-view and correlation information of cross-view images simultaneously for RGB-D saliency detection problem. To further improve the detection results, the attention mechanism and edge detection module are also incorporated into cmSalGAN. The entire cmSalGAN can be trained in an end-to-end manner by using the standard deep neural network framework. Experimental results show that cmSalGAN achieves the new state-of-the-art RGB-D saliency detection performance on several benchmark datasets.

preprint2020arXiv

A Unified Adaptive Tensor Approximation Scheme to Accelerate Composite Convex Optimization

In this paper, we propose a unified two-phase scheme to accelerate any high-order regularized tensor approximation approach on the smooth part of a composite convex optimization model. The proposed scheme has the advantage of not needing to assume any prior knowledge of the Lipschitz constants for the gradient, the Hessian and/or high-order derivatives. This is achieved by tuning the parameters used in the algorithm adaptively in its process of progression, which has been successfully incorporated in high-order nonconvex optimization (CartisGouldToint2018, Birgin-Gardenghi-Martinez-Santos-Toint-2017). In general, we show that the adaptive high-order method has an iteration bound of $O\left( 1 / ε^{1/(p+1)} \right)$ if the first $p$-th order derivative information is used in the approximation, which has the same iteration complexity as in that of the nonadaptive version in (Baes-2009, Nesterov-2018) where the Lipschitz constants are assumed to be known and the subproblems are assumed to be solved exactly. Thus, our results partially address the problem of incorporating adaptive strategies into the high-order accelerated methods raised by Nesterov in (Nesterov-2018), although our strategies cannot assure the convexity of the auxiliary problem and such adaptive strategies are already popular in high-order nonconvex optimization (CartisGouldToint2018, Birgin-Gardenghi-Martinez-Santos-Toint-2017). Our numerical experiment results show a clear effect of real acceleration displayed in the adaptive Newton's method with cubic regularization on a set of regularized logistic regression instances.

preprint2020arXiv

An Adaptive High Order Method for Finding Third-Order Critical Points of Nonconvex Optimization

It is well known that finding a global optimum is extremely challenging for nonconvex optimization. There are some recent efforts (anandkumar2016efficient, cartis2018second, cartis2020sharp, chen2019high) regarding the optimization methods for computing higher-order critical points, which can exclude the so-called degenerate saddle points and reach a solution with better quality. Despite theoretical development in (anandkumar2016efficient, cartis2018second, cartis2020sharp, chen2019high), the corresponding numerical experiments are missing. In this paper, we propose an implementable higher-order method, named adaptive high order method (AHOM), that aims to find the third-order critical points. This is achieved by solving an "easier" subproblem and incorporating the adaptive strategy of parameter-tuning in each iteration of the algorithm. The iteration complexity of the proposed method is established. Some preliminary numerical results are provided to show AHOM is able to escape the degenerate saddle points, where the second-order method could possibly get stuck.

preprint2020arXiv

An exact penalty approach for optimization with nonnegative orthogonality constraints

Optimization with nonnegative orthogonality constraints has wide applications in machine learning and data sciences. It is NP-hard due to some combinatorial properties of the constraints. We first propose an equivalent optimization formulation with nonnegative and multiple spherical constraints and an additional single nonlinear constraint. Various constraint qualifications and the first- and second-order optimality conditions of the equivalent formulation are discussed. By establishing a local error bound of the feasible set, we design a class of (smooth) exact penalty models by keeping the nonnegative and multiple spherical constraints. The penalty models are exact if the penalty parameter is sufficiently large, rather than going to infinity. A practical penalty algorithm with postprocessing is then developed. It uses a second-order method to approximately solve a series of subproblems with nonnegative and multiple spherical constraints. We study the asymptotic convergence of the penalty algorithm and establish that any limit point is a weakly stationary point of the original problem and becomes a stationary point under some additional mild conditions. Extensive numerical results on the projection problem, orthogonal nonnegative matrix factorization problems and the K-indicators model show the effectiveness of our proposed approach.

preprint2020arXiv

An Optimal High-Order Tensor Method for Convex Optimization

This paper is concerned with finding an optimal algorithm for minimizing a composite convex objective function. The basic setting is that the objective is the sum of two convex functions: the first function is smooth with up to the d-th order derivative information available, and the second function is possibly non-smooth, but its proximal tensor mappings can be computed approximately in an efficient manner. The problem is to find -- in that setting -- the best possible (optimal) iteration complexity for convex optimization. Along that line, for the smooth case (without the second non-smooth part in the objective), Nesterov (1983) proposed an optimal algorithm for the first-order methods (d=1) with iteration complexity O( 1 / k^2 ). A high-order tensor algorithm with iteration complexity of O( 1 / k^{d+1} ) was proposed by Baes (2009) and Nesterov (2018). In this paper, we propose a new high-order tensor algorithm for the general composite case, with the iteration complexity of O( 1 / k^{(3d+1)/2} ), which matches the lower bound for the d-th order methods as established in Nesterov (2018), and Shamir et al. (2018), and hence is optimal. Our approach is based on the Accelerated Hybrid Proximal Extragradient (A-HPE) framework proposed in Monteiro and Svaiter (2013), where a bisection procedure is installed for each A-HPE iteration. At each bisection step a proximal tensor subproblem is approximately solved, and the total number of bisection steps per A-HPE iteration is bounded by a logarithmic factor in the precision required.
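The rates quoted above can be compared with a few lines of arithmetic. This sketch only evaluates the exponents stated in the abstract, $d+1$ for the earlier tensor methods and $(3d+1)/2$ for the proposed one; the function names are hypothetical.

```python
def earlier_tensor_exponent(d):
    """Exponent in O(1 / k^(d+1)), as attributed to Baes (2009) and Nesterov (2018)."""
    return d + 1

def optimal_exponent(d):
    """Exponent in O(1 / k^((3d+1)/2)), the rate stated for the proposed method."""
    return (3 * d + 1) / 2

# For d = 1 both expressions give 2, matching the O(1 / k^2) first-order rate.
# For d >= 2 the second exponent is strictly larger, i.e. the bound decays faster.
table = [(d, earlier_tensor_exponent(d), optimal_exponent(d)) for d in (1, 2, 3)]
```

For example, with second-order information (d = 2) the exponent improves from 3 to 3.5.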

preprint2020arXiv

Dynamic Graph Learning based on Graph Laplacian

The purpose of this paper is to infer a global (collective) model of time-varying responses of a set of nodes as a dynamic graph, where the individual time series are respectively observed at each of the nodes. The motivation of this work lies in the search for a connectome model which properly captures brain functionality upon observing activities in different regions of the brain and possibly of individual neurons. We formulate the problem as a quadratic objective functional of observed node signals over short time intervals, subjected to the proper regularization reflecting the graph smoothness and other dynamics involving the underlying graph's Laplacian, as well as the time evolution smoothness of the underlying graph. The resulting joint optimization is solved by a continuous relaxation and an introduced novel gradient-projection scheme. We apply our algorithm to a real-world dataset comprising recorded activities of individual brain cells. The resulting model is shown to not only be viable but also efficiently computable.

preprint2020arXiv

EOSFuzzer: Fuzzing EOSIO Smart Contracts for Vulnerability Detection

EOSIO is one typical public blockchain platform. It is scalable in terms of transaction speeds and has a growing ecosystem supporting smart contracts and decentralized applications. However, the vulnerabilities within the EOSIO smart contracts have led to serious attacks, which caused serious financial loss to its end users. In this work, we systematically analyzed three typical EOSIO smart contract vulnerabilities and their related attacks. Then we presented EOSFuzzer, a general black-box fuzzing framework to detect vulnerabilities within EOSIO smart contracts. In particular, EOSFuzzer proposed effective attacking scenarios and test oracles for EOSIO smart contract fuzzing. Our fuzzing experiment on 3963 EOSIO smart contracts shows that EOSFuzzer is both effective and efficient to detect EOSIO smart contract vulnerabilities with high accuracy.

preprint2020arXiv

Revisiting L21-norm Robustness with Vector Outlier Regularization

In many real-world applications, data usually contain outliers. One popular approach is to use the L2,1 norm function as a robust error/loss function. However, the robustness of the L2,1 norm function is not well understood so far. In this paper, we propose a new Vector Outlier Regularization (VOR) framework to understand and analyze the robustness of the L2,1 norm function. Our VOR function defines a data point to be an outlier if it is outside a threshold with respect to a theoretical prediction, and regularizes it, i.e., pulls it back to the threshold line. We then prove that the L2,1 function is the limiting case of this VOR with the usual least square/L2 error function as the threshold shrinks to zero. One interesting property of VOR is that how far an outlier lies away from its theoretically predicted value does not affect the final regularization and analysis results. This VOR property unmasks one of the most peculiar properties of the L2,1 norm function: the effects of outliers seem to be independent of how outlying they are. If an outlier is moved further away from the intrinsic manifold/subspace, the final analysis results do not change. VOR provides a new way to understand and analyze the robustness of the L2,1 norm function. Applying VOR to matrix factorization leads to a new VORPCA model. We give a comprehensive comparison with trace-norm based L21-norm PCA to demonstrate the advantages of VORPCA.
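For readers unfamiliar with the L2,1 norm, the following sketch computes it for a small residual matrix and contrasts it with the usual squared L2 error. It assumes each row is the residual vector of one data point (some papers use columns instead); the function names and the sample values are hypothetical, and this is not the VOR function itself.

```python
import math

def l21_norm(rows):
    """Sum over data points of the Euclidean norm of each residual vector."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in rows)

def squared_l2_error(rows):
    """Usual least-squares error: sum of squared entries."""
    return sum(x * x for row in rows for x in row)

# Two data points; the first has a much larger residual (an outlier).
residuals = [[3.0, 4.0], [0.0, 1.0]]
robust_loss = l21_norm(residuals)           # 5 + 1
squared_loss = squared_l2_error(residuals)  # 25 + 1
```

The outlier contributes its norm once to the L2,1 loss but its squared norm to the least-squares loss, which is the standard intuition for why the L2,1 norm is used as a robust error function.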

preprint2020arXiv

Visual Object Tracking by Segmentation with Graph Convolutional Network

Segmentation-based tracking has been actively studied in computer vision and multimedia. Superpixel based object segmentation and tracking methods are usually developed for this task. However, they independently perform feature representation and learning of superpixels which may lead to sub-optimal results. In this paper, we propose to utilize a graph convolutional network (GCN) model for superpixel based object tracking. The proposed model provides a general end-to-end framework which integrates i) label linear prediction, and ii) structure-aware feature information of each superpixel together to obtain object segmentation and further improves the performance of tracking. The proposed GCN method has two main benefits. First, it provides an effective end-to-end way to exploit both spatial and temporal consistency constraints for target object segmentation. Second, it utilizes a mixed graph convolution module to learn a context-aware and discriminative feature for superpixel representation and labeling. An effective algorithm has been developed to optimize the proposed model. Extensive experiments on five datasets demonstrate that our method obtains better performance against existing alternative methods.

preprint2020arXiv

WANA: Symbolic Execution of Wasm Bytecode for Cross-Platform Smart Contract Vulnerability Detection

Many popular blockchain platforms are supporting smart contracts for building decentralized applications. However, the vulnerabilities within smart contracts have led to serious financial loss to their end users. For the EOSIO blockchain platform, effective vulnerability detectors are still limited. Furthermore, existing vulnerability detection tools can only support one blockchain platform. In this work, we present WANA, a cross-platform smart contract vulnerability detection tool based on the symbolic execution of WebAssembly bytecode. Furthermore, WANA proposes a set of test oracles to detect the vulnerabilities in EOSIO and Ethereum smart contracts based on WebAssembly bytecode analysis. Our experimental analysis shows that WANA can effectively detect vulnerabilities in both EOSIO and Ethereum smart contracts with high efficiency.

preprint2016arXiv

$L_p$-norm regularization algorithms for optimization over permutation matrices

Optimization problems over permutation matrices appear widely in facility layout, chip design, scheduling, pattern recognition, computer vision, graph matching, etc. Since this problem is NP-hard due to the combinatorial nature of permutation matrices, we relax the variable to be the more tractable doubly stochastic matrices and add an $L_p$-norm ($0 < p < 1$) regularization term to the objective function. The optimal solutions of the $L_p$-regularized problem are the same as the original problem if the regularization parameter is sufficiently large. A lower bound estimation of the nonzero entries of the stationary points and some connections between the local minimizers and the permutation matrices are further established. Then we propose an $L_p$ regularization algorithm with local refinements. The algorithm approximately solves a sequence of $L_p$ regularization subproblems by the projected gradient method using a nonmonotone line search with the Barzilai-Borwein step sizes. Its performance can be further improved if it is combined with certain local search methods, the cutting plane techniques as well as a new negative proximal point scheme. Extensive numerical results on QAPLIB and the bandwidth minimization problem show that our proposed algorithms can often find reasonably high quality solutions within a competitive amount of time.
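A small numerical sketch may help show why an $L_p$ term with $0 < p < 1$ favors permutation matrices among doubly stochastic ones. It assumes the regularization term has the form $\sum_{ij} x_{ij}^p$, which the abstract does not spell out; the function name is hypothetical. Since $x^p \ge x$ on $[0, 1]$ with equality only at 0 and 1, this sum is at least $n$ for any $n \times n$ doubly stochastic matrix and equals $n$ exactly at permutation matrices.

```python
def lp_term(matrix, p):
    """Sum of x_ij ** p over all entries (assumed form of the L_p term)."""
    return sum(x ** p for row in matrix for x in row)

n = 3
permutation = [[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0]]
uniform = [[1.0 / n] * n for _ in range(n)]  # doubly stochastic, far from a permutation

perm_value = lp_term(permutation, 0.5)    # equals n
uniform_value = lp_term(uniform, 0.5)     # 9 * sqrt(1/3), larger than n
```

Minimizing such a term therefore pushes entries of a doubly stochastic matrix toward 0 or 1, which is consistent with the abstract's claim that a sufficiently large regularization parameter recovers solutions of the original problem.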

preprint2016arXiv

Characterizing Real-Valued Multivariate Complex Polynomials and Their Symmetric Tensor Representations

In this paper we study multivariate polynomial functions in complex variables and the corresponding associated symmetric tensor representations. The focus is on finding conditions under which such complex polynomials/tensors always take real values. We introduce the notion of symmetric conjugate forms and general conjugate forms, and present characteristic conditions for such complex polynomials to be real-valued. As applications of our results, we discuss the relation between nonnegative polynomials and sums of squares in the context of complex polynomials. Moreover, new notions of eigenvalues/eigenvectors for complex tensors are introduced, extending properties from the Hermitian matrices. Finally, we discuss an important property for symmetric tensors, which states that the largest absolute value of eigenvalue of a symmetric real tensor is equal to its largest singular value; the result is known as Banach's theorem. We show that a similar result holds in the complex case as well.

preprint2016arXiv

Generalized R-squared for Detecting Dependence

Detecting dependence between two random variables is a fundamental problem. Although the Pearson correlation is effective for capturing linear dependency, it can be entirely powerless for detecting nonlinear and/or heteroscedastic patterns. We introduce a new measure, G-squared, to test whether two univariate random variables are independent and to measure the strength of their relationship. The G-squared is almost identical to the square of the Pearson correlation coefficient, R-squared, for linear relationships with constant error variance, and has the intuitive meaning of the piecewise R-squared between the variables. It is particularly effective in handling nonlinearity and heteroscedastic errors. We propose two estimators of G-squared and show their consistency. Simulations demonstrate that G-squared estimates are among the most powerful test statistics compared with several state-of-the-art methods.

preprint2016arXiv

Tensor and Its Tucker Core: the Invariance Relationships

In [13], Hillar and Lim famously demonstrated that "multilinear (tensor) analogues of many efficiently computable problems in numerical linear algebra are NP-hard". Despite many recent advancements, the state-of-the-art methods for computing such "tensor analogues" still suffer severely from the curse of dimensionality. In this paper we show that the Tucker core of a tensor, however, retains many properties of the original tensor, including the CP rank, the border rank, the tensor Schatten quasi norms, and the Z-eigenvalues. When the core tensor is smaller than the original tensor, this property leads to considerable computational advantages as confirmed by our numerical experiments. In our analysis, we in fact work with a generalized Tucker-like decomposition that can accommodate any full column-rank factor matrices.

preprint2016arXiv

The Anion Effect on Li+ Ion Coordination Structure in Ethylene Carbonate Solutions

Rechargeable lithium ion batteries are an attractive alternative power source for a wide variety of applications. To optimize their performances, a complete description of the solvation properties of the ion in the electrolyte is crucial. A comprehensive understanding at the nanoscale of the solvation structure of lithium ions in nonaqueous carbonate electrolytes is, however, still unclear. We have measured by femtosecond vibrational spectroscopy the orientational correlation time of the CO stretching mode of Li+-bound and Li+-unbound ethylene carbonate molecules, in LiBF4, LiPF6, and LiClO4 ethylene carbonate solutions with different concentrations. Surprisingly, we have found that the coordination number of ethylene carbonate in the first solvation shell of Li+ is only two, in all solutions with concentrations higher than 0.5 M. Density functional theory calculations indicate that the presence of anions in the first coordination shell modifies the generally accepted tetrahedral structure of the complex, allowing only two EC molecules to coordinate to Li+ directly. Our results demonstrate for the first time, to the best of our knowledge, the anion influence on the overall structure of the first solvation shell of the Li+ ion. The formation of such a cation/solvent/anion complex provides a rational explanation for the ionic conductivity drop of lithium/carbonate electrolyte solutions at high concentrations.

preprint2015arXiv

Bayesian nonparametric tests via sliced inverse modeling

We study the problem of independence and conditional independence tests between categorical covariates and a continuous response variable, which has an immediate application in genetics. Instead of estimating the conditional distribution of the response given values of covariates, we model the conditional distribution of covariates given the discretized response (aka "slices"). By assigning a prior probability to each possible discretization scheme, we can compute efficiently a Bayes factor (BF)-statistic for the independence (or conditional independence) test using a dynamic programming algorithm. Asymptotic and finite-sample properties such as power and null distribution of the BF statistic are studied, and a stepwise variable selection method based on the BF statistic is further developed. We compare the BF statistic with some existing classical methods and demonstrate its statistical power through extensive simulation studies. We apply the proposed method to a mouse genetics data set aiming to detect quantitative trait loci (QTLs) and obtain promising results.

preprint2015arXiv

Hadoop Scheduling Base On Data Locality

In Hadoop, job scheduling is an independent module, so users can design their own job scheduler based on their actual application requirements and thereby meet their specific business needs. Currently, Hadoop has three schedulers: FIFO, computing capacity scheduling and fair scheduling. All of them adopt task allocation strategies that consider data locality only in a simple way. They neither support data locality well nor apply fully to all job scheduling cases. In this paper, we take the concept of resource prefetching into consideration and propose a job scheduling algorithm based on data locality. By estimating the remaining time to complete a task and comparing it with the time to transfer a resource block, we preselect candidate nodes for task allocation. Then we preselect non-local map tasks from the unfinished job queue as resource-prefetch tasks. Using the information about the resource blocks of a preselected map task, we select the resource block nearest to the candidate node and transfer it to the local node over the network. In this way we ensure sufficiently good data locality. Finally, we design an experiment and show that the resource-prefetch method can guarantee good job data locality and reduce job completion time to a certain extent.

preprint2015arXiv

Reciprocity in Social Networks with Capacity Constraints

Directed links -- representing asymmetric social ties or interactions (e.g., "follower-followee") -- arise naturally in many social networks and other complex networks, giving rise to directed graphs (or digraphs) as basic topological models for these networks. Reciprocity, defined for a digraph as the percentage of edges with a reciprocal edge, is a key metric that has been used in the literature to compare different directed networks and provide "hints" about their structural properties: for example, are reciprocal edges generated randomly by chance or are there other processes driving their generation? In this paper we study the problem of maximizing achievable reciprocity for an ensemble of digraphs with the same prescribed in- and out-degree sequences. We show that the maximum reciprocity hinges crucially on the in- and out-degree sequences, which may be intuitively interpreted as constraints on some "social capacities" of nodes and impose fundamental limits on achievable reciprocity. We show that it is NP-complete to decide the achievability of a simple upper bound on maximum reciprocity, and provide conditions for achieving it. We demonstrate that many real networks exhibit reciprocities surprisingly close to the upper bound, which implies that users in these social networks are in a sense more "social" than suggested by the empirical reciprocity alone in that they are more willing to reciprocate, subject to their "social capacity" constraints. We find some surprising linear relationships between empirical reciprocity and the bound. We also show that a particular type of small network motifs that we call 3-paths are the major source of loss in reciprocity for real networks.
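The reciprocity metric defined above is straightforward to compute. The sketch below follows the abstract's definition (the fraction of edges that have a reciprocal edge) and assumes a simple digraph given as a list of `(source, target)` pairs without self-loops; the function name and the sample graph are hypothetical.

```python
def reciprocity(edges):
    """Fraction of directed edges (u, v) for which (v, u) is also an edge."""
    edge_set = set(edges)  # drop duplicate edges
    if not edge_set:
        return 0.0
    reciprocated = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return reciprocated / len(edge_set)

# Nodes 1 and 2 follow each other; the other two edges are one-way.
follows = [(1, 2), (2, 1), (2, 3), (3, 1)]
r = reciprocity(follows)
```

In this toy graph two of the four edges are reciprocated, so the empirical reciprocity is one half. The paper's question is how close such a value can get to the maximum allowed by the in- and out-degree sequences.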

preprint2015arXiv

To What Extent Is Stress Testing of Android TV Applications Automated in Industrial Environments?

An Android-based smart Television (TV) must reliably run its applications in an embedded program environment under diverse hardware resource conditions. Owing to the diverse hardware components used to build numerous TV models, TV simulators are usually not high enough in fidelity to simulate various TV models, and thus are only regarded as unreliable alternatives when stress testing such applications. Therefore, even though stress testing on real TV sets is tedious, it is the de facto approach to ensure the reliability of these applications in the industry. In this paper, we study to what extent stress testing of smart TV applications can be fully automated in industrial environments. To the best of our knowledge, no previous work has addressed this important question. We summarize the findings collected from 10 industrial test engineers who tested 20 such TV applications in a real production environment. Our study shows that the industry required test automation support for high-level GUI object controls and status checking, setup of resource conditions and the interplay between the two. With such support, 87% of the industrial test specifications of one TV model can be fully automated and 71.4% of them were found to be fully reusable to test a subsequent TV model with major upgrades of hardware, operating system and application. It represents a significant improvement with margins of 28% and 38%, respectively, compared to stress testing without such support.

preprint2015arXiv

Towards a solid solution of real-time fire and flame detection

Although object detection and recognition has received growing attention for decades, robust fire and flame detection is rarely explored. This paper presents an empirical study towards a general and solid approach to quickly detect fire and flame in videos, with applications in video surveillance and event retrieval. Our system consists of three cascaded steps: (1) candidate region proposal by a background model, (2) fire region classification with color-texture features and a dictionary of visual words, and (3) temporal verification. The experimental evaluation and analysis are done for each step. We believe that it is a useful service to both academic research and real-world application. In addition, we release the software of the proposed system with the source code, as well as a public benchmark and data set, including 64 video clips covering both indoor and outdoor scenes under different conditions. We achieve an 82% Recall with 93% Precision on the data set, and greatly improve on the performance of state-of-the-art methods.

preprint2014arXiv

A Framework of Constraint Preserving Update Schemes for Optimization on Stiefel Manifold

This paper considers optimization problems on the Stiefel manifold $X^{\mathsf{T}}X=I_p$, where $X\in \mathbb{R}^{n \times p}$ is the variable and $I_p$ is the $p$-by-$p$ identity matrix. A framework of constraint preserving update schemes is proposed by decomposing each feasible point into the range space of $X$ and the null space of $X^{\mathsf{T}}$. While this general framework can unify many existing schemes, a new update scheme with low complexity cost is also discovered. Then we study a feasible Barzilai-Borwein-like method under the new update scheme. The global convergence of the method is established with an adaptive nonmonotone line search. The numerical tests on the nearest low-rank correlation matrix problem, the Kohn-Sham total energy minimization and a specific problem from statistics demonstrate the efficiency of the new method. In particular, the new method performs remarkably well for the nearest low-rank correlation matrix problem in terms of speed and solution quality and is considerably competitive with the widely used SCF iteration for the Kohn-Sham total energy minimization.
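As a rough illustration of what "constraint preserving" means here, the sketch below takes one descent step and maps the result back onto the Stiefel manifold with a QR-based retraction. This is a standard textbook retraction, not the new update scheme proposed in the paper; the objective $f(X) = -\mathrm{trace}(X^{\mathsf{T}} A X)$, the step size, and the helper names `tangent_projection` and `qr_retraction` are assumptions made for the example.

```python
import numpy as np

def tangent_projection(X, G):
    """Project G onto the tangent space of the Stiefel manifold at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

def qr_retraction(Y):
    """Return the orthonormal QR factor of Y, so the result satisfies X^T X = I_p."""
    Q, R = np.linalg.qr(Y)
    # Flip column signs so that R would have a nonnegative diagonal.
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

rng = np.random.default_rng(0)
n, p = 6, 2

# Hypothetical test problem: f(X) = -trace(X^T A X) with A symmetric.
A = rng.standard_normal((n, n))
A = (A + A.T) / 2.0

X = qr_retraction(rng.standard_normal((n, p)))  # feasible starting point
G = -2.0 * A @ X                                # Euclidean gradient of f at X
step = 0.1                                      # illustrative step size

# Move along the projected negative gradient, then restore feasibility.
X_new = qr_retraction(X - step * tangent_projection(X, G))
feasibility_error = np.linalg.norm(X_new.T @ X_new - np.eye(p))
```

The point of the example is only that `X_new` satisfies $X^{\mathsf{T}}X = I_p$ up to rounding after the update; the paper's framework covers a family of such feasible updates and identifies a cheaper one.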

preprint2014arXiv

Iteration Bounds for Finding the $ε$-Stationary Points for Structured Nonconvex Optimization

In this paper we study proximal conditional-gradient (CG) and proximal gradient-projection type algorithms for a block-structured constrained nonconvex optimization model, which arises naturally from tensor data analysis. First, we introduce a new notion of $ε$-stationarity, which is suitable for the structured problem under consideration. We then propose two types of first-order algorithms for the model based on the proximal conditional-gradient (CG) method and the proximal gradient-projection method respectively. If the nonconvex objective function is in the form of mathematical expectation, we then discuss how to incorporate randomized sampling to avoid computing the expectations exactly. For the general block optimization model, the proximal subroutines are performed for each block according to either the block-coordinate-descent (BCD) or the maximum-block-improvement (MBI) updating rule. If the gradient of the nonconvex part of the objective $f$ satisfies $\| \nabla f(x) - \nabla f(y)\|_q \le M \|x-y\|_p^δ$ where $δ=p/q$ with $1/p+1/q=1$, then we prove that the new algorithms have an overall iteration complexity bound of $O(1/ε^q)$ in finding an $ε$-stationary solution. If $f$ is concave then the iteration complexity reduces to $O(1/ε)$. Our numerical experiments for tensor approximation problems show promising performances of the new solution algorithms.
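To make the exponent relation in the abstract concrete, the following worked instance simply substitutes values into the stated conditions $1/p+1/q=1$ and $δ=p/q$; the two choices of $q$ are picked for illustration and are not taken from the paper.

```latex
\[
\frac{1}{p}+\frac{1}{q}=1,\quad \delta=\frac{p}{q}
\;\Longrightarrow\;
p=\frac{q}{q-1},\quad \delta=\frac{1}{q-1}.
\]
\[
q=2:\quad p=2,\ \delta=1,\quad
\|\nabla f(x)-\nabla f(y)\|_2 \le M\,\|x-y\|_2,\quad
\text{bound } O(1/\epsilon^{2});
\]
\[
q=3:\quad p=\tfrac{3}{2},\ \delta=\tfrac{1}{2},\quad
\|\nabla f(x)-\nabla f(y)\|_3 \le M\,\|x-y\|_{3/2}^{1/2},\quad
\text{bound } O(1/\epsilon^{3}).
\]
```

The case $q=2$ recovers the familiar Lipschitz-gradient assumption in the Euclidean norm, for which the stated bound reads $O(1/ε^2)$.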

preprint2014arXiv

On the Complexity of Optimal Routing and Content Caching in Heterogeneous Networks

We investigate the problem of optimal request routing and content caching in a heterogeneous network supporting in-network content caching with the goal of minimizing average content access delay. Here, content can either be accessed directly from a back-end server (where content resides permanently) or be obtained from one of multiple in-network caches. To access a piece of content, a user must decide whether to route its request to a cache or to the back-end server. Additionally, caches must decide which content to cache. We investigate the problem complexity of two problem formulations, where the direct path to the back-end server is modeled as i) a congestion-sensitive or ii) a congestion-insensitive path, reflecting whether or not the delay of the uncached path to the back-end server depends on the user request load, respectively. We show that the problem is NP-complete in both cases. We prove that under the congestion-insensitive model the problem can be solved optimally in polynomial time if each piece of content is requested by only one user, or when there are at most two caches in the network. We also identify a structural property of the user-cache graph that potentially makes the problem NP-complete. For the congestion-sensitive model, we prove that the problem remains NP-complete even if there is only one cache in the network and each content is requested by only one user. We show that approximate solutions can be found for both models within a (1-1/e) factor of the optimal solution, and demonstrate a greedy algorithm that is found to be within 1% of optimal for small problem sizes. Through trace-driven simulations we evaluate the performance of our greedy algorithms, which show up to a 50% reduction in average delay over solutions based on LRU content caching.

preprint2014arXiv

On the duration and intensity of cumulative advantage competitions

The roles of skill (fitness) and luck (randomness) as driving forces in the dynamics of resource accumulation in a myriad of systems have long puzzled scientists. Fueled by undisputed inequalities that emerge from actual competitions, there is a pressing need to better understand the effects of skill and luck in resource accumulation. When such competitions are driven by externalities such as cumulative advantage (CA), the rich-get-richer effect, little is known about fundamental properties such as their duration and intensity. In this work we provide a mathematical understanding of how CA exacerbates the role of luck to the detriment of skill in simple and well-studied competition models. We show, for instance, that if two agents are competing for resources that arrive sequentially at each time unit, an early stroke of luck can place the less skilled in the lead for an extremely long period of time, a phenomenon we call "struggle of the fittest". In the absence of CA, the more skilled quickly prevails despite any early stroke of luck that the less skilled may have. We prove that the duration of a simple skill and luck competition model exhibits power law tails when CA is present, regardless of skill difference, which is in sharp contrast to exponential tails when CA is absent. Our findings have important implications for competitions not only in complex social systems but also in contexts that leverage such models.

preprint2014arXiv

Variable selection for general index models via sliced inverse regression

Variable selection, also known as feature selection in machine learning, plays an important role in modeling high dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential variables under the general index model, in which the response depends on the predictors through an unknown function of one or more linear combinations of them. Instead of building a predictive model of the response given combinations of predictors, we model the conditional distribution of predictors given the response. This inverse modeling perspective motivates us to propose a stepwise procedure based on likelihood-ratio tests, which is effective and computationally efficient in identifying important variables without specifying a parametric relationship between predictors and the response. For example, the proposed procedure is able to detect variables with pairwise, three-way or even higher-order interactions among $p$ predictors with a computational time of $O(p)$ instead of $O(p^k)$ (with $k$ being the highest order of interactions). Its excellent empirical performance in comparison with existing methods is demonstrated through simulation studies as well as real data examples. Consistency of the variable selection procedure when both the number of predictors and the sample size go to infinity is established.

preprint2013arXiv

On Modeling Economic Default Time: A Reduced-Form Model Approach

In the aftermath of the global financial crisis, much attention has been paid to investigating the appropriateness of the current practice of default risk modeling in the banking, finance and insurance industries. A recent empirical study by Guo et al. (2008) shows that the time difference between the economic and recorded default dates has a significant impact on recovery rate estimates. Guo et al. (2011) develop a theoretical structural firm asset value model for a firm default process that embeds the distinction between these two default times. To be more consistent with practice, in this paper we assume that market participants cannot observe the firm asset value directly and develop a reduced-form model to characterize the economic and recorded default times. We derive the probability distribution of these two default times. The numerical study on the difference between these two shows that our proposed model can both capture the features and fit the empirical data.

preprint2013arXiv

Tensor Principal Component Analysis via Convex Optimization

This paper is concerned with the computation of the principal components for a general tensor, known as the tensor principal component analysis (PCA) problem. We show that the general tensor PCA problem is reducible to its special case where the tensor in question is super-symmetric with an even degree. In that case, the tensor can be embedded into a symmetric matrix. We prove that if the tensor is rank-one, then the embedded matrix must be rank-one too, and vice versa. The tensor PCA problem can thus be solved by means of matrix optimization under a rank-one constraint, for which we propose two solution methods: (1) imposing a nuclear norm penalty in the objective to enforce a low-rank solution; (2) relaxing the rank-one constraint by semidefinite programming. Interestingly, our experiments show that both methods yield a rank-one solution with high probability, thereby solving the original tensor PCA problem to optimality with high probability. To further cope with the size of the resulting convex optimization models, we propose to use the alternating direction method of multipliers, which significantly reduces the computational effort. Various extensions of the model are considered as well.
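The easy direction of the rank-one claim can be checked numerically. The sketch below builds a rank-one super-symmetric fourth-order tensor and unfolds it into an $n^2 \times n^2$ matrix by grouping the first two and last two indices. This particular unfolding is assumed here as one natural embedding and may differ in detail from the paper's construction; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
x = rng.standard_normal(n)

# Rank-one super-symmetric tensor of degree 4: T[i, j, k, l] = x_i x_j x_k x_l.
T = np.einsum("i,j,k,l->ijkl", x, x, x, x)

# Unfold: index pair (i, j) labels the rows, index pair (k, l) the columns.
M = T.reshape(n * n, n * n)

# The unfolding equals v v^T with v = vec(x x^T), hence it is symmetric and rank one.
v = np.outer(x, x).reshape(-1)
is_symmetric = np.allclose(M, M.T)
matches_outer_product = np.allclose(M, np.outer(v, v))
rank_M = np.linalg.matrix_rank(M)
```

This only illustrates that a rank-one tensor yields a rank-one matrix; the converse direction, which is what makes the matrix reformulation useful, is the part established in the paper.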

53 published items

Dynamic Pondering Sparsity-aware Mixture-of-Experts Transformer for Event Stream based Visual Object Tracking

Joint Consistency: A Unified Test-Time Aggregation Framework via Energy Minimization

LEAP: Unlocking dLLM Parallelism via Lookahead Early-Convergence Token Detection

Some new results on determinants and permanents

Benchmarking LLMs for Fine-Grained Code Review with Enriched Context in Practice

CRSOT: Cross-Resolution Object Tracking using Unaligned Frame and Event Cameras

Revisiting Color-Event based Tracking: A Unified Network, Dataset, and Metric

Understanding Representation Learnability of Nonlinear Self-Supervised Learning

Unifying Graph Contrastive Learning via Graph Message Augmentation

Flexible Alignment Super-Resolution Network for Multi-Contrast MRI

FuncPipe: A Pipelined Serverless Framework for Fast and Cost-efficient Training of Deep Learning Models

Understanding the convergence of the preconditioned PDHG method: a view of indefinite proximal ADMM

A Novel Negative $\ell_1$ Penalty Approach for Multiuser One-Bit Massive MIMO Downlink with PSK Signaling

A Unified Framework for Generalized Moment Problems: a Novel Primal-Dual Approach

Accelerating Adaptive Cubic Regularization of Newton's Method via Random Sampling

BQA: A High-performance Quantum Circuits Scheduling Strategy Based on Heuristic Search

Complexity and computation for the spectral norm and nuclear norm of order three tensors with one fixed dimension

Exploring students' backtracking behaviors in digital textbooks and its relationship to learning styles

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection

Few-Shot Learning Meets Transformer: Unified Query-Support Transformers for Few-Shot Classification

Generalizing Aggregation Functions in GNNs: High-Capacity GNNs via Nonlinear Neighborhood Aggregators

HOPE: Hierarchical Spatial-temporal Network for Occupancy Flow Prediction

MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking

Unified GCNs: Towards Connecting GCNs with CNNs

Varying Coefficient Model via Adaptive Spline Fitting

Tightness and Equivalence of Semidefinite Relaxations for MIMO Detection

cmSalGAN: RGB-D Salient Object Detection with Cross-View Generative Adversarial Networks

A Unified Adaptive Tensor Approximation Scheme to Accelerate Composite Convex Optimization

An Adaptive High Order Method for Finding Third-Order Critical Points of Nonconvex Optimization

An exact penalty approach for optimization with nonnegative orthogonality constraints

An Optimal High-Order Tensor Method for Convex Optimization

Dynamic Graph Learning based on Graph Laplacian

EOSFuzzer: Fuzzing EOSIO Smart Contracts for Vulnerability Detection

Revisiting L21-norm Robustness with Vector Outlier Regularization

Visual Object Tracking by Segmentation with Graph Convolutional Network

WANA: Symbolic Execution of Wasm Bytecode for Cross-Platform Smart Contract Vulnerability Detection

$L_p$-norm regularization algorithms for optimization over permutation matrices

Characterizing Real-Valued Multivariate Complex Polynomials and Their Symmetric Tensor Representations

Generalized R-squared for Detecting Dependence

Tensor and Its Tucker Core: the Invariance Relationships

The Anion Effect on Li+ Ion Coordination Structure in Ethylene Carbonate Solutions

Bayesian nonparametric tests via sliced inverse modeling

Hadoop Scheduling Base On Data Locality

Reciprocity in Social Networks with Capacity Constraints

To What Extent Is Stress Testing of Android TV Applications Automated in Industrial Environments?

Towards a solid solution of real-time fire and flame detection

A Framework of Constraint Preserving Update Schemes for Optimization on Stiefel Manifold

Iteration Bounds for Finding the $ε$-Stationary Points for Structured Nonconvex Optimization

On the Complexity of Optimal Routing and Content Caching in Heterogeneous Networks

On the duration and intensity of cumulative advantage competitions

Variable selection for general index models via sliced inverse regression

On Modeling Economic Default Time: A Reduced-Form Model Approach

Tensor Principal Component Analysis via Convex Optimization