Source author record

Wei Lin

Wei Lin appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Catalog footprint

What is connected

38 works

32 topics

4 close collaborators

preprint, 2026, arXiv

RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems

The integration of Large Language Model (LLM) agents is transforming recommender systems from simple query-item matching towards deeply personalized and interactive recommendations. Reinforcement Learning (RL) provides an essential framework for the optimization of these agents in recommendation tasks. However, current methodologies remain limited by a reliance on single-dimensional outcome-based rewards that focus exclusively on final user interactions, overlooking critical intermediate capabilities, such as instruction following and complex intent understanding. Despite the necessity of designing multi-dimensional rewards, the field lacks a standardized benchmark to facilitate this development. To bridge this gap, we introduce RecRM-Bench, the largest and most comprehensive benchmark to date for agentic recommender systems. It comprises over 1 million structured entries across four core evaluation dimensions: instruction following, factual consistency, query-item relevance, and fine-grained user behavior prediction. By supporting comprehensive assessment from syntactic compliance to complex intent grounding and preference modeling, RecRM-Bench provides a foundational dataset for training sophisticated reward models. Furthermore, we propose a systematic framework for the construction of multi-dimensional reward models and the integration of a hybrid reward function, establishing a robust foundation for developing reliable and highly capable agentic recommender systems. The complete RecRM-Bench dataset is publicly available at https://huggingface.co/datasets/wwzeng/RecRM-Bench.
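As a rough illustration of what a hybrid reward over the four evaluation dimensions could look like, the sketch below combines per-dimension scores with a weighted average. The abstract does not specify the combination rule, so the function name `hybrid_reward`, the dictionary keys, and the weights are all assumptions made for this example, not the paper's method.

```python
from typing import Dict

# Dimension names follow the abstract; the keys themselves are hypothetical.
DIMENSIONS = (
    "instruction_following",
    "factual_consistency",
    "query_item_relevance",
    "behavior_prediction",
)


def hybrid_reward(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-dimension scores (assumed to lie in [0, 1]).

    This is an assumed combination rule; the paper's actual hybrid reward
    function is not described in the abstract.
    """
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[d] * scores[d] for d in DIMENSIONS) / total_weight


if __name__ == "__main__":
    equal = {d: 1.0 for d in DIMENSIONS}
    example = {
        "instruction_following": 1.0,
        "factual_consistency": 0.0,
        "query_item_relevance": 1.0,
        "behavior_prediction": 0.0,
    }
    print(hybrid_reward(example, equal))
```

A weighted average keeps the result on the same scale as the inputs, which makes it easy to see how much each intermediate capability contributes next to the final-interaction signal.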

preprint, 2022, arXiv

A meridian lemma for fully alternating links in thickened surfaces

Menasco showed that a closed surface in the complement of a non-split prime alternating link in $S^3$ contains a circle isotopic in the link complement to a meridian of the link. This result is known as the meridian lemma for alternating links. We give a meridian lemma for the class of fully alternating links in thickened orientable surfaces of positive genus.

preprint, 2022, arXiv

AC-Feasible Power Transfer Regions of Virtual Power Plants: Characterization and Application

Distributed energy resources (DERs) in distribution networks can be aggregated as a virtual power plant (VPP) for transmission-level operations. A critical challenge for such coordination is the complexity of the AC-feasible power transfer region between a VPP and the transmission system at their point of common coupling. To overcome this challenge, this paper develops a characterization method for such regions. The proposed method constructs linear constraints to inner-approximate the AC-feasible power transfer regions. To guarantee AC-feasibility, the parameters in these constraints are determined by applying the Brouwer fixed point theorem to the second-order Taylor expansion of the nonlinear Dist-Flow equations. Based on the power transfer regions characterized with our method, a transmission-level operation problem with VPP participation is formulated and solved through big-M linearization. The proposed methods are verified by numerical experiments in the IEEE 33-bus and IEEE 136-bus test systems.

preprint, 2022, arXiv

CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph Similarity Learning

Graph similarity learning refers to calculating the similarity score between two graphs, which is required in many realistic applications, such as visual tracking, graph classification, and collaborative filtering. As most of the existing graph neural networks yield effective graph representations of a single graph, little effort has been made to jointly learn two graph representations and calculate their similarity score. In addition, existing unsupervised graph similarity learning methods are mainly clustering-based, which ignores the valuable information embodied in graph pairs. To this end, we propose a contrastive graph matching network (CGMN) for self-supervised graph similarity learning in order to calculate the similarity between any two input graph objects. Specifically, we generate two augmented views for each graph in a pair. Then, we employ two strategies, namely cross-view interaction and cross-graph interaction, for effective node representation learning. The former is used to strengthen the consistency of node representations in the two views. The latter is used to identify node differences between different graphs. Finally, we transform node representations into graph-level representations via pooling operations for graph similarity computation. We have evaluated CGMN on eight real-world datasets, and the experimental results show that the proposed approach is superior to the state-of-the-art methods in graph similarity learning downstream tasks.

preprint, 2022, arXiv

Continuity scaling: A rigorous framework for detecting and quantifying causality accurately

Data-based detection and quantification of causation in complex, nonlinear dynamical systems is of paramount importance to science, engineering and beyond. Inspired by cross-map-based techniques, a methodology widely used in recent years, we develop a general framework to advance towards a comprehensive understanding of dynamical causal mechanisms, which is consistent with the natural interpretation of causality. In particular, instead of measuring the smoothness of the cross map as conventionally implemented, we define causation by directly measuring the scaling law for the continuity of the investigated dynamical system. The uncovered scaling law enables accurate, reliable, and efficient detection of causation and assessment of its strength in general complex dynamical systems, outperforming existing representative methods. The continuity-scaling-based framework is rigorously established and demonstrated using datasets from model complex systems and the real world.

preprint, 2022, arXiv

Cross-View Cross-Scene Multi-View Crowd Counting

Multi-view crowd counting has been previously proposed to utilize multi-cameras to extend the field-of-view of a single camera, capturing more people in the scene, and improve counting performance for occluded people or those in low resolution. However, the current multi-view paradigm trains and tests on the same single scene and camera-views, which limits its practical application. In this paper, we propose a cross-view cross-scene (CVCS) multi-view crowd counting paradigm, where the training and testing occur on different scenes with arbitrary camera layouts. To dynamically handle the challenge of optimal view fusion under scene and camera layout change and non-correspondence noise due to camera calibration errors or erroneous features, we propose a CVCS model that attentively selects and fuses multiple views together using camera layout geometry, and a noise view regularization method to train the model to handle non-correspondence errors. We also generate a large synthetic multi-camera crowd counting dataset with a large number of scenes and camera views to capture many possible variations, which avoids the difficulty of collecting and annotating such a large real dataset. We then test our trained CVCS model on real multi-view counting datasets, by using unsupervised domain transfer. The proposed CVCS model trained on synthetic data outperforms the same model trained only on real data, and achieves promising performance compared to fully supervised methods that train and test on the same single scene.

preprint, 2022, arXiv

Efficient Pipeline Planning for Expedited Distributed DNN Training

To train modern large DNN models, pipeline parallelism has recently emerged, which distributes the model across GPUs and enables different devices to process different microbatches in a pipeline. Earlier pipeline designs allow multiple versions of model parameters to co-exist (similar to asynchronous training), and cannot ensure the same model convergence and accuracy as training without pipelining. Synchronous pipelining has recently been proposed, which ensures model performance by enforcing a synchronization barrier between training iterations. Nonetheless, the synchronization barrier requires waiting for gradient aggregation from all microbatches and thus delays the training progress. Optimized pipeline planning is needed to minimize this wait and hence the training time, which has not been well studied in the literature. This paper designs efficient, near-optimal algorithms for expediting synchronous pipeline-parallel training of modern large DNNs over arbitrary inter-GPU connectivity. Our algorithm framework comprises two components: a pipeline partition and device mapping algorithm, and a pipeline scheduler that decides the processing order of microbatches over the partitions, which together minimize the per-iteration training time. We conduct thorough theoretical analysis, extensive testbed experiments and trace-driven simulation, and demonstrate that our scheme can accelerate training by up to 157% compared with state-of-the-art designs.

preprint, 2022, arXiv

GDsmith: Detecting Bugs in Graph Database Engines

Graph database engines stand out in the era of big data for their efficiency in modeling and processing linked data. There is a strong need for testing graph database engines. However, random testing, the most practical way of automated test generation, faces the challenges of semantic validity, non-empty results, and behavior diversity when detecting bugs in graph database engines. To address these challenges, in this paper, we propose GDsmith, the first black-box approach for testing graph database engines. It ensures that each randomly generated Cypher query satisfies the semantic requirements via skeleton generation and completion. GDsmith includes our technique to increase the probability of producing Cypher queries that return non-empty results by leveraging three types of structural mutation strategies. GDsmith also includes our technique to improve the behavior diversity of the generated Cypher queries by selecting property keys according to their previous frequencies when generating new queries. Our evaluation results demonstrate that GDsmith is effective and efficient for automated query generation and substantially outperforms the baseline. GDsmith successfully detects 27 previously unknown bugs in the released versions of three popular open-source graph database engines and receives positive feedback from their developers.

preprint, 2022, arXiv

Neural Piecewise-Constant Delay Differential Equations

Continuous-depth neural networks, such as the Neural Ordinary Differential Equations (ODEs), have aroused great interest in the machine learning and data science communities in recent years because they connect deep neural networks with dynamical systems. In this article, we introduce a new sort of continuous-depth neural network, called the Neural Piecewise-Constant Delay Differential Equations (PCDDEs). Here, unlike the recently proposed framework of the Neural Delay Differential Equations (DDEs), we transform the single delay into the piecewise-constant delay(s). The Neural PCDDEs with such a transformation, on the one hand, inherit the universal approximation capability of Neural DDEs. On the other hand, the Neural PCDDEs, leveraging information from multiple previous time steps, further improve the modeling capability without augmenting the network dimension. With this improvement, we show that the Neural PCDDEs outperform several existing continuous-depth neural frameworks on the one-dimensional piecewise-constant delay population dynamics and real-world datasets, including MNIST, CIFAR10, and SVHN.

preprint, 2022, arXiv

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

The development of personalized recommendation has significantly improved the accuracy of information matching and the revenue of e-commerce platforms. Two trends have recently emerged: 1) recommender systems must be trained in a timely manner to cope with ever-growing new products and ever-changing user interests from online marketing and social networks; 2) SOTA recommendation models introduce DNN modules to improve prediction accuracy. Traditional CPU-based recommender systems cannot keep up with these two trends, and GPU-centric training has become a trending approach. However, we observe that GPU devices are underutilized when training recommender systems, and they do not attain the throughput improvement that they have achieved in the CV and NLP areas. This issue can be explained by two characteristics of these recommendation models: first, they contain up to a thousand input feature fields, introducing fragmentary and memory-intensive operations; second, the multiple constituent feature interaction submodules introduce a substantial number of small-sized compute kernels. To remove this roadblock to the development of recommender systems, we propose a novel framework named PICASSO to accelerate the training of recommendation models on commodity hardware. Specifically, we conduct a systematic analysis to reveal the bottlenecks encountered in training recommendation models. We leverage the model structure and data distribution to unleash the potential of hardware through our packing, interleaving, and caching optimization. Experiments show that PICASSO increases the hardware utilization by an order of magnitude relative to SOTA baselines and brings up to 6x throughput improvement for a variety of industrial recommendation models. Using the same hardware budget in production, PICASSO on average shortens the walltime of daily training tasks by 7 hours, significantly reducing the delay of continuous delivery.

preprint, 2022, arXiv

RAW-GNN: RAndom Walk Aggregation based Graph Neural Network

Graph-Convolution-based methods have been successfully applied to representation learning on homophily graphs where nodes with the same label or similar attributes tend to connect with one another. Due to the homophily assumption of Graph Convolutional Networks (GCNs) that these methods use, they are not suitable for heterophily graphs where nodes with different labels or dissimilar attributes tend to be adjacent. Several methods have attempted to address this heterophily problem, but they do not change the fundamental aggregation mechanism of GCNs because they rely on summation operators to aggregate information from neighboring nodes, which is implicitly subject to the homophily assumption. Here, we introduce a novel aggregation mechanism and develop a RAndom Walk Aggregation-based Graph Neural Network (called RAW-GNN) method. The proposed approach integrates the random walk strategy with graph neural networks. The new method utilizes breadth-first random walk search to capture homophily information and depth-first search to collect heterophily information. It replaces the conventional neighborhoods with path-based neighborhoods and introduces a new path-based aggregator based on Recurrent Neural Networks. These designs make RAW-GNN suitable for both homophily and heterophily graphs. Extensive experimental results showed that the new method achieved state-of-the-art performance on a variety of homophily and heterophily graphs.
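To make the idea of a path-based neighborhood concrete, the sketch below samples a few random walks from a node and treats the resulting paths as that node's neighborhood. It uses plain uniform random walks for simplicity; the breadth-first and depth-first biased walks and the RNN-based aggregator described in the abstract are not reproduced here. The function names and the adjacency-list graph format are assumptions made for this example.

```python
import random
from typing import Dict, List


def random_walk(graph: Dict[int, List[int]], start: int, length: int,
                rng: random.Random) -> List[int]:
    """Uniform random walk visiting at most `length` nodes, starting at `start`."""
    path = [start]
    while len(path) < length:
        neighbors = graph[path[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        path.append(rng.choice(neighbors))
    return path


def path_neighborhood(graph: Dict[int, List[int]], node: int,
                      num_walks: int, length: int, seed: int = 0) -> List[List[int]]:
    """Sample `num_walks` paths from `node`; together they stand in for its neighborhood."""
    rng = random.Random(seed)
    return [random_walk(graph, node, length, rng) for _ in range(num_walks)]


if __name__ == "__main__":
    triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    for path in path_neighborhood(triangle, node=0, num_walks=3, length=4):
        print(path)
```

Each sampled path is an ordered sequence, which is what makes a sequence model a natural aggregator over it, in contrast to the order-free summation over immediate neighbors.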

preprint, 2022, arXiv

Tie-line Security Regions in High Dimension for Renewable Accommodations

Tie-line power exchanges among regional power systems facilitate renewable accommodations. Power exchanges can be calculated via a tie-line security region that provides the feasible region of the coupling parameters among regional power systems. However, a tie-line security region is a high-dimension polytope due to the multiple time periods and border buses inherent in power system operations, leading to a considerable computational burden. A fast calculation method for tie-line security regions in high dimension is studied in this paper. The high-dimension polytope across all the time periods is decomposed as a Cartesian product of lower-dimension polytopes at each time period by leveraging dispatch levels of generations. For each lower-dimension polytope, the computational burden brought by multiple border buses is alleviated by aggregating tie-line power. Also, minimum renewable curtailments are preserved by incorporating an additional dimension in the tie-line security region. For the coupling parameters located within our tie-line security region, a feasible decision of the regional power system exists. Finally, the tie-line security region is used to reduce renewable curtailments in an interconnected power system under a decentralized and non-iterative framework. The performance of the presented methods is corroborated in the IEEE 9-bus system, a 661-bus utility system and a five-region system.

preprint, 2022, arXiv

Whale: Efficient Giant Model Training over Heterogeneous GPUs

The scaling up of deep neural networks has been demonstrated to be effective in improving model quality, but also encompasses several training challenges in terms of training efficiency, programmability, and resource adaptability. We present Whale, a general and efficient distributed training framework for giant models. To support various parallel strategies and their hybrids, Whale generalizes the programming interface by defining two new primitives in the form of model annotations, allowing for incorporating user hints. The Whale runtime utilizes those annotations and performs graph optimizations to transform a local deep learning DAG graph for distributed multi-GPU execution. Whale further introduces a novel hardware-aware parallel strategy, which improves the performance of model training on heterogeneous GPUs in a balanced manner. Deployed in a production cluster with 512 GPUs, Whale successfully trains an industry-scale multimodal model with over ten trillion model parameters, named M6, demonstrating great scalability and efficiency.

preprint, 2021, arXiv

AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search

Large pre-trained language models such as BERT have shown their effectiveness in various natural language processing tasks. However, their huge parameter size makes them difficult to deploy in real-time applications that require quick inference with limited resources. Existing methods compress BERT into small models, but such compression is task-independent, i.e., the same compressed BERT is used for all downstream tasks. Motivated by the necessity and benefits of task-oriented BERT compression, we propose a novel compression method, AdaBERT, that leverages differentiable Neural Architecture Search to automatically compress BERT into task-adaptive small models for specific tasks. We incorporate a task-oriented knowledge distillation loss to provide search hints and an efficiency-aware loss as search constraints, which enables a good trade-off between efficiency and effectiveness for task-adaptive BERT compression. We evaluate AdaBERT on several NLP tasks, and the results demonstrate that those task-adaptive compressed models are 12.7x to 29.3x faster than BERT in inference time and 11.5x to 17.0x smaller in terms of parameter size, while comparable performance is maintained.

preprint, 2021, arXiv

Joule-Thomson expansion of the torus-like black hole

In this paper, we study Joule-Thomson effects for the torus-like black hole. The Joule-Thomson coefficients, the inversion curves and the isenthalpic curves are studied. Furthermore, we investigate similarities and differences between the Van der Waals fluid, the torus-like black hole and the charged AdS black holes for the expansion. The isenthalpic curves in the $T-P$ plane are obtained. Moreover, we determine the cooling-heating regions.
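For orientation, the quantities named in this abstract have standard textbook definitions, given here as general thermodynamics rather than as results of the paper. The Joule-Thomson coefficient measures how temperature changes with pressure at fixed enthalpy $H$:

$$\mu = \left(\frac{\partial T}{\partial P}\right)_{H}$$

An isenthalpic curve is a curve of constant $H$ in the $T-P$ plane, and the inversion curve is the locus where $\mu = 0$. During an expansion (pressure decreasing), $\mu > 0$ corresponds to cooling and $\mu < 0$ to heating, which is how the cooling-heating regions are separated.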

preprint, 2021, arXiv

Neural Delay Differential Equations

Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied, showing exceptional efficacy in coping with some representative datasets. Recently, an augmented framework has been successfully developed to overcome some limitations that emerge in applications of the original framework. Here we propose a new class of continuous-depth neural networks with delay, named Neural Delay Differential Equations (NDDEs), and, for computing the corresponding gradients, we use the adjoint sensitivity method to obtain the delayed dynamics of the adjoint. Since differential equations with delays are usually seen as dynamical systems of infinite dimension possessing richer dynamics, the NDDEs, compared to the NODEs, have a stronger capacity for nonlinear representations. Indeed, we analytically validate that the NDDEs are universal approximators, and further articulate an extension of the NDDEs, where the initial function of the NDDEs is supposed to satisfy ODEs. More importantly, we use several illustrative examples to demonstrate the outstanding capacities of the NDDEs and the NDDEs with ODEs' initial value. Specifically, (1) we successfully model the delayed dynamics where the trajectories in the lower-dimensional phase space could be mutually intersected, while the traditional NODEs without any augmentation are not directly applicable for such modeling, and (2) we achieve lower loss and higher accuracy not only for the data produced synthetically by complex models but also for the real-world image datasets, i.e., CIFAR10, MNIST, and SVHN. Our results on the NDDEs reveal that appropriately incorporating the elements of dynamical systems into the network design is truly beneficial to improving the network performance.

preprint, 2020, arXiv

Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads

The last decade has witnessed growth in the computational requirements for training deep neural networks. Current approaches (e.g., data/model parallelism, pipeline parallelism) parallelize training tasks onto multiple devices. However, these approaches always rely on specific deep learning frameworks and require elaborate manual design, which makes them difficult to maintain and share between different types of models. In this paper, we propose Auto-MAP, a framework for exploring distributed execution plans for DNN workloads, which can automatically discover fast parallelization strategies through reinforcement learning at the IR level of deep learning models. Efficient exploration remains a major challenge for reinforcement learning. We leverage DQN with task-specific pruning strategies to help efficiently explore the search space including optimized strategies. Our evaluation shows that Auto-MAP can find the optimal solution in two hours, while achieving better throughput on several NLP and convolution models.

preprint, 2020, arXiv

DAPPLE: A Pipelined Data Parallel Approach for Training Large Models

It is a challenging task to train large DNN models on sophisticated GPU platforms with diversified interconnect capabilities. Recently, pipelined training has been proposed as an effective approach for improving device utilization. However, there are still several tricky issues to address: improving computing efficiency while ensuring convergence, and reducing memory usage without incurring additional computing costs. We propose DAPPLE, a synchronous training framework which combines data parallelism and pipeline parallelism for large DNN models. It features a novel parallelization strategy planner to solve the partition and placement problems, and explores the optimal hybrid strategy of data and pipeline parallelism. We also propose a new runtime scheduling algorithm to reduce device memory usage, which is orthogonal to the re-computation approach and does not come at the expense of training throughput. Experiments show that the DAPPLE planner consistently outperforms strategies generated by PipeDream's planner by up to 3.23x under synchronous training scenarios, and the DAPPLE runtime outperforms GPipe with a 1.6x speedup in training throughput while reducing memory consumption by 12%.

preprint, 2020, arXiv

Graph Structural-topic Neural Network

Graph Convolutional Networks (GCNs) have achieved tremendous success by effectively gathering local features for nodes. However, GCNs commonly focus more on node features and less on graph structures within the neighborhood, especially higher-order structural patterns. Yet such local structural patterns are shown to be indicative of node properties in numerous fields. In addition, it is not just single patterns but the distribution over all these patterns that matters, because networks are complex and the neighborhood of each node consists of a mixture of various nodes and structural patterns. Correspondingly, in this paper, we propose Graph Structural-topic Neural Network, abbreviated GraphSTONE, a GCN model that utilizes topic models of graphs, such that the structural topics capture indicative graph structures broadly from a probabilistic aspect rather than merely a few structures. Specifically, we build topic models upon graphs using anonymous walks and Graph Anchor LDA, an LDA variant that selects significant structural patterns first, so as to alleviate the complexity and generate structural topics efficiently. In addition, we design multi-view GCNs to unify node features and structural topic features and utilize structural topics to guide the aggregation. We evaluate our model through both quantitative and qualitative experiments, where our model exhibits promising performance, high efficiency, and clear interpretability.

preprint, 2020, arXiv

Grasping Detection Network with Uncertainty Estimation for Confidence-Driven Semi-Supervised Domain Adaptation

Data-efficient domain adaptation with only a few labelled data is desired for many robotic applications. For example, in grasping detection, the inference skill learned from a grasping dataset is not universal enough to apply directly to various other daily or industrial applications. This paper presents an approach that enables easy domain adaptation through a novel grasping detection network with confidence-driven semi-supervised learning, where these two components deeply interact with each other. The proposed grasping detection network provides a prediction uncertainty estimation mechanism by leveraging a Feature Pyramid Network (FPN), and the mean-teacher semi-supervised learning utilizes this uncertainty information to emphasize the consistency loss only for those unlabelled data with high confidence, which we refer to as the confidence-driven mean teacher. This approach largely prevents the student model from learning incorrect or harmful information from the consistency loss, which speeds up the learning progress and improves the model accuracy. Our results show that the proposed network can achieve a high success rate on the Cornell grasping dataset, and for domain adaptation with very limited data, the confidence-driven mean teacher outperforms the original mean teacher and direct training by more than 10% in evaluation loss, especially in avoiding overfitting and model divergence.
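The core idea of applying the consistency loss only to confident unlabelled samples can be sketched as a masked mean squared error between student and teacher predictions. This is a minimal illustration under stated assumptions: the function name, the scalar predictions, the hard threshold `max_uncertainty`, and the squared-error form are all hypothetical, since the abstract does not give the paper's actual loss or uncertainty estimator.

```python
from typing import Sequence


def masked_consistency_loss(
    student: Sequence[float],
    teacher: Sequence[float],
    uncertainty: Sequence[float],
    max_uncertainty: float = 0.2,  # hypothetical threshold, not from the paper
) -> float:
    """Mean squared student-teacher gap over samples the teacher is confident about.

    A sample counts as confident when its uncertainty is at most
    `max_uncertainty`; all other samples are ignored. Returns 0.0 when no
    sample passes the filter.
    """
    kept = [
        (s - t) ** 2
        for s, t, u in zip(student, teacher, uncertainty)
        if u <= max_uncertainty
    ]
    return sum(kept) / len(kept) if kept else 0.0


if __name__ == "__main__":
    # Only the first sample is confident, so only it contributes to the loss.
    print(masked_consistency_loss([1.0, 0.0], [0.0, 0.0], [0.1, 0.9]))
```

Dropping uncertain samples keeps the student from being pulled towards teacher predictions that are likely wrong, which is the failure mode the abstract attributes to the unfiltered consistency loss.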

preprint, 2020, arXiv

Impact of intra and inter-cluster coupling balance on the performance of nonlinear networked systems

The dynamical and structural aspects of cluster synchronization (CS) in complex systems have been intensively investigated in recent years. Here, we study CS of dynamical systems with intra and inter-cluster couplings. We propose new metrics that describe the performance of such systems and evaluate them as a function of the strength of the couplings within and between clusters. We obtain analytical results that indicate that spectral differences between the Laplacian matrices associated with the partition between intra and inter-couplings directly affect the proposed metrics of system performance. Our results show that the dynamics of the system might exhibit an optimal balance that optimizes its performance. Our work provides new insights into the way specific symmetry properties relate to collective behavior, and could lead to new forms to increase the controllability of complex systems and to optimize their stability.

preprint, 2020, arXiv

NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization

In the last decade, crowd counting and localization have attracted much attention from researchers due to their wide-spread applications, including crowd monitoring, public safety, space design, etc. Many Convolutional Neural Networks (CNN) are designed for tackling this task. However, currently released datasets are so small-scale that they cannot meet the needs of supervised CNN-based algorithms. To remedy this problem, we construct a large-scale congested crowd counting and localization dataset, NWPU-Crowd, consisting of 5,109 images with a total of 2,133,375 heads annotated with points and boxes. Compared with other real-world datasets, it contains various illumination scenes and has the largest density range (0~20,033). Besides, a benchmark website is developed for impartially evaluating the different methods, which allows researchers to submit the results of the test set. Based on the proposed dataset, we further describe the data characteristics, evaluate the performance of some mainstream state-of-the-art (SOTA) methods, and analyze the new problems that arise on the new data. The benchmark is deployed at https://www.crowdbenchmark.com/, and the dataset, code, models and results are available at https://gjy3035.github.io/NWPU-Crowd-Sample-Code/.

preprint, 2020, arXiv

One-shot Text Field Labeling using Attention and Belief Propagation for Structure Information Extraction

Structured information extraction from document images usually consists of three steps: text detection, text recognition, and text field labeling. While text detection and text recognition have been heavily studied and much improved in the literature, text field labeling is less explored and still faces many challenges. Existing learning-based methods for the text labeling task usually require a large number of labeled examples to train a specific model for each type of document. However, collecting large amounts of document images and labeling them is difficult and sometimes impossible due to privacy issues. Deploying separate models for each type of document also consumes a lot of resources. Facing these challenges, we explore one-shot learning for the text field labeling task. Existing one-shot learning methods for the task are mostly rule-based and have difficulty in labeling fields in crowded regions with few landmarks and fields consisting of multiple separate text regions. To alleviate these problems, we propose a novel deep end-to-end trainable approach for one-shot text field labeling, which makes use of an attention mechanism to transfer the layout information between document images. We further apply a conditional random field to the transferred layout information to refine the field labeling. We collected and annotated a real-world one-shot field labeling dataset with a large variety of document types and conducted extensive experiments to examine the effectiveness of the proposed model. To stimulate research in this direction, the collected dataset and the one-shot model will be released.

preprint2020arXiv

Pixel-wise Crowd Understanding via Synthetic Data

Crowd analysis via computer vision techniques is an important topic in the field of video surveillance, which has wide-spread applications including crowd monitoring, public safety, space design and so on. Pixel-wise crowd understanding is the most fundamental task in crowd analysis because it yields finer results for video sequences or still images than other analysis tasks. Unfortunately, pixel-level understanding needs a large amount of labeled training data. Annotating such data is expensive, which is why current crowd datasets are small. As a result, most algorithms suffer from over-fitting to varying degrees. In this paper, taking crowd counting and segmentation as examples of pixel-wise crowd understanding, we attempt to remedy these problems from two aspects, namely data and methodology. Firstly, we develop a free data collector and labeler to generate synthetic and labeled crowd scenes in a computer game, Grand Theft Auto V. Then we use it to construct a large-scale, diverse synthetic crowd dataset, which is named the "GCC Dataset". Secondly, we propose two simple methods to improve the performance of crowd understanding by exploiting the synthetic data. To be specific, 1) supervised crowd understanding: pre-train a crowd analysis model on the synthetic data, then fine-tune it using the real data and labels, which makes the model perform better in the real world; 2) crowd understanding via domain adaptation: translate the synthetic data to photo-realistic images, then train the model on translated data and labels. As a result, the trained model works well in real crowd scenes.

preprint2020arXiv

RPM-Oriented Query Rewriting Framework for E-commerce Keyword-Based Sponsored Search

Sponsored search optimizes revenue and relevance, which is estimated by Revenue Per Mille (RPM). Existing sponsored search models are all based on traditional statistical models, which have poor RPM performance when queries follow a heavy-tailed distribution. Here, we propose an RPM-oriented Query Rewriting Framework (RQRF) which outputs related bid keywords that can yield high RPM. RQRF embeds both queries and bid keywords as vectors in the same implicit space, converting the rewriting probability between each query and keyword to the distance between the two vectors. For label construction, we propose an RPM-oriented sample construction method, labeling keywords based on whether or not they can lead to high RPM. Extensive experiments are conducted to evaluate the performance of RQRF. On one month of large-scale real-world traffic from an e-commerce sponsored search system, the proposed model significantly outperforms the traditional baseline.

preprint2020arXiv

SwapText: Image Based Texts Transfer in Scenes

Swapping text in scene images while preserving original fonts, colors, sizes and background textures is a challenging task due to the complex interplay between different factors. In this work, we present SwapText, a three-stage framework to transfer texts across scene images. First, a novel text swapping network is proposed to replace text labels only in the foreground image. Second, a background completion network is learned to reconstruct background images. Finally, the generated foreground image and background image are used to generate the word image by the fusion network. Using the proposed framework, we can manipulate the texts of the input images even with severe geometric distortion. Qualitative and quantitative results are presented on several scene text datasets, including regular and irregular text datasets. We conducted extensive experiments to demonstrate the usefulness of our method in applications such as image-based text translation, text image synthesis, etc.

preprint2019arXiv

Supercontinuum generation without residual pump peak through multiple coherent pump seeds

The residual pump peak in fiber-based supercontinuum, as a general phenomenon, limits its practical application. We report a novel supercontinuum generation (SCG) in a conventional highly nonlinear fiber (HNLF) through a multiple coherent pump technique, which eliminates the residual pump peak present in conventional SCG. The multiple coherent pump technique is realized by double bound-state solitons obtained from a homemade mode-locked fiber laser. We further compare the SCGs pumped by a conventional bound-state soliton and a single soliton. The comparison confirms that the effective elimination of the residual pump peak in the supercontinuum is due to the higher efficiency of transferring pump energy to newly generated frequencies in the multiple coherent pump scheme. The use of the multiple coherent pump scheme, i.e., double bound-state solitons, provides a new, simple and promising method to obtain a flat supercontinuum source.

preprint2016arXiv

Large Covariance Estimation for Compositional Data via Composition-Adjusted Thresholding

High-dimensional compositional data arise naturally in many applications such as metagenomic data analysis. The observed data lie in a high-dimensional simplex, and conventional statistical methods often fail to produce sensible results due to the unit-sum constraint. In this article, we address the problem of covariance estimation for high-dimensional compositional data, and introduce a composition-adjusted thresholding (COAT) method under the assumption that the basis covariance matrix is sparse. Our method is based on a decomposition relating the compositional covariance to the basis covariance, which is approximately identifiable as the dimensionality tends to infinity. The resulting procedure can be viewed as thresholding the sample centered log-ratio covariance matrix and hence is scalable for large covariance matrices. We rigorously characterize the identifiability of the covariance parameters, derive rates of convergence under the spectral norm, and provide theoretical guarantees on support recovery. Simulation studies demonstrate that the COAT estimator outperforms some naive thresholding estimators that ignore the unique features of compositional data. We apply the proposed method to the analysis of a microbiome dataset in order to understand the dependence structure among bacterial taxa in the human gut.
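The abstract above describes the estimator as thresholding the sample centered log-ratio (clr) covariance matrix. A minimal sketch of that idea follows, under stated assumptions: it uses a single hard threshold `tau` for every off-diagonal entry, which is a simplification chosen for illustration and not necessarily the thresholding rule of the paper, and the function names are hypothetical.

```python
import math


def clr(row):
    """Centered log-ratio transform of one composition (all parts must be positive)."""
    logs = [math.log(v) for v in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def sample_covariance(rows):
    """Unbiased sample covariance matrix of the rows (observations) of a dataset."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def threshold_clr_covariance(compositions, tau):
    """Hard-threshold the off-diagonal entries of the sample clr covariance.

    Assumption: one universal threshold tau; the diagonal is left untouched.
    """
    cov = sample_covariance([clr(row) for row in compositions])
    p = len(cov)
    return [
        [cov[i][j] if i == j or abs(cov[i][j]) >= tau else 0.0 for j in range(p)]
        for i in range(p)
    ]
```

Calling `threshold_clr_covariance(data, tau)` on a list of compositions returns a sparse, symmetric matrix; in practice the threshold would be tuned, for example by cross-validation.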

preprint2016arXiv

Neural Networks Models for Entity Discovery and Linking

This paper describes the USTC_NELSLIP systems submitted to the Trilingual Entity Detection and Linking (EDL) track in 2016 TAC Knowledge Base Population (KBP) contests. We have built two systems for entity discovery and mention detection (MD): one uses the conditional RNNLM and the other one uses the attention-based encoder-decoder framework. The entity linking (EL) system consists of two modules: a rule based candidate generation and a neural networks probability ranking model. Moreover, some simple string matching rules are used for NIL clustering. At the end, our best system has achieved an F1 score of 0.624 in the end-to-end typed mention ceaf plus metric.

preprint2015arXiv

Consistent Pricing of VIX and Equity Derivatives with the 4/2 Stochastic Volatility Plus Jumps Model

In this paper, we develop a 4/2 stochastic volatility plus jumps model, namely, a new stochastic volatility model including the Heston model and 3/2 model as special cases. Our model is highly tractable by applying the Lie symmetries theory for PDEs, which means that the pricing procedure can be performed efficiently. In fact, we obtain a closed-form solution for the joint Fourier-Laplace transform so that equity and realized-variance derivatives can be priced. We also employ our model to consistently price equity and VIX derivatives. In this process, the quasi-closed-form solutions for future and option prices are derived. Furthermore, through adopting data on daily VIX future and option prices, we investigate our model along with the Heston model and 3/2 model and compare their different performance in practice. Our result illustrates that the 4/2 model with an instantaneous volatility of the form $(a\sqrt{V_t}+b/\sqrt{V_t})$ for some constants $a, b$ presents considerable advantages in pricing VIX derivatives.

preprint2015arXiv

Observable Signatures of a Classical Transition

Eternal inflation arising from a potential landscape predicts that our universe is one realization of many possible cosmological histories. One way to access different cosmological histories is via the nucleation of bubble universes from a metastable false vacuum. Another way to sample different cosmological histories is via classical transitions, the creation of pocket universes through the collision between bubbles. Using relativistic numerical simulations, we examine the possibility of observationally determining if our observable universe resulted from a classical transition. We find that classical transitions produce spatially infinite, approximately open Friedman-Robertson-Walker universes. The leading set of observables in the aftermath of a classical transition are negative spatial curvature and a contribution to the Cosmic Microwave Background temperature quadrupole. The level of curvature and magnitude of the quadrupole are dependent on the position of the observer, and we determine the possible range of observables for two classes of single-scalar field models. For the first class, where the inflationary phase has a lower energy than the vacuum preceding the classical transition, the magnitude of the observed quadrupole generally falls to zero with distance from the collision while the spatial curvature grows to a constant. For the second class, where the inflationary phase has a higher energy than the vacuum preceding the classical transition, the magnitude of the observed quadrupole generically falls to zero with distance from the collision while the spatial curvature grows without bound. We find that the magnitude of the quadrupole and curvature grow with increasing centre of mass energy of the collision, and explore variations of the parameters in the scalar field lagrangian.

preprint2014arXiv

Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics

In genetical genomics studies, it is important to jointly analyze gene expression data and genetic variants in exploring their associations with complex traits, where the dimensionality of gene expressions and genetic variants can both be much larger than the sample size. Motivated by such modern applications, we consider the problem of variable selection and estimation in high-dimensional sparse instrumental variables models. To overcome the difficulty of high dimensionality and unknown optimal instruments, we propose a two-stage regularization framework for identifying and estimating important covariate effects while selecting and estimating optimal instruments. The methodology extends the classical two-stage least squares estimator to high dimensions by exploiting sparsity using sparsity-inducing penalty functions in both stages. The resulting procedure is efficiently implemented by coordinate descent optimization. For the representative $L_1$ regularization and a class of concave regularization methods, we establish estimation, prediction, and model selection properties of the two-stage regularized estimators in the high-dimensional setting where the dimensionality of covariates and instruments are both allowed to grow exponentially with the sample size. The practical performance of the proposed method is evaluated by simulation studies and its usefulness is illustrated by an analysis of mouse obesity data. Supplementary materials for this article are available online.

preprint2013arXiv

Evaluation on the Financial Competitiveness of Chinese Listed Real Estate Companies Based on Entropy Method

Real estate is a pillar industry of China's national economy. Due to changes in policy and market conditions, real estate companies are facing greater pressure to survive in a competitive environment. They must improve their financial competitiveness. Based on the conceptual framework of financial competitiveness, this paper presents a financial competitiveness evaluation index system covering four aspects: profitability, solvency, sustainable development and operational capacity. The entropy value method is applied to determine the index weights. The financial competitiveness of 105 listed real estate companies is evaluated, and the results show that high-scoring companies have strong profitability, sustainable development and operational capacity; low-scoring companies have weak profitability and poor ability for sustainable development; and solvency does not noticeably affect a company's financial competitiveness.
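The entropy value method mentioned above can be sketched as follows. This is the textbook form of entropy weighting, not necessarily the exact variant used in the paper, and it assumes every indicator is positive, already oriented so that larger is better, and that at least one indicator varies across companies. The function name and the sample scores are made up for illustration.

```python
import math


def entropy_weights(matrix):
    """Return one weight per indicator (column) using the entropy weight method.

    matrix[i][j] is the positive score of company i on indicator j.
    """
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        shares = [v / total for v in column]
        # Normalized Shannon entropy in [0, 1]; 1 means all companies score alike.
        entropy = -sum(s * math.log(s) for s in shares if s > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_divergence = sum(divergences)
    return [d / total_divergence for d in divergences]


# Hypothetical scores: 3 companies x 4 indicators
# (profitability, solvency, sustainable development, operational capacity).
scores = [
    [0.9, 0.5, 0.8, 0.7],
    [0.4, 0.5, 0.3, 0.6],
    [0.2, 0.5, 0.1, 0.2],
]
weights = entropy_weights(scores)
```

An indicator on which all companies score the same (the second column here) has maximal entropy and receives a weight of essentially zero, which is the mechanism by which an indicator can turn out to have little influence on the final ranking.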

preprint2012arXiv

A Dynamical Model Reveals Gene Co-Localizations in Nucleus

Co-localization of networks of genes in the nucleus is thought to play an important role in determining gene expression patterns. Based upon experimental data, we built a dynamical model to test whether pure diffusion could account for the observed co-localization of genes within a defined subnuclear region. A simple standard Brownian motion model in two and three dimensions shows that preferential co-localization is possible for co-regulated genes without any direct interaction, and suggests the occurrence may be due to a limitation in the number of available transcription factors. Experimental data of chromatin movements demonstrates that fractional rather than standard Brownian motion is more appropriate to model gene mobilizations, and we tested our dynamical model against recent static experimental data, using a sub-diffusion process by which the genes tend to colocalize more easily. Moreover, in order to compare our model with recently obtained experimental data, we studied the association level between genes and factors, and presented data supporting the validation of this dynamic model. As further applications of our model, we applied it to test against more biological observations. We found that increasing transcription factor number, rather than factory number and nucleus size, might be the reason for decreasing gene co-localization. In the scenario of frequency- or amplitude-modulation of transcription factors, our model predicted that frequency-modulation may increase the co-localization between its targeted genes.

preprint2012arXiv

High-Dimensional Sparse Additive Hazards Regression

High-dimensional sparse modeling with censored survival data is of great practical importance, as exemplified by modern applications in high-throughput genomic data analysis and credit risk analysis. In this article, we propose a class of regularization methods for simultaneous variable selection and estimation in the additive hazards model, by combining the nonconcave penalized likelihood approach and the pseudoscore method. In a high-dimensional setting where the dimensionality can grow fast, polynomially or nonpolynomially, with the sample size, we establish the weak oracle property and oracle property under mild, interpretable conditions, thus providing strong performance guarantees for the proposed methodology. Moreover, we show that the regularity conditions required by the $L_1$ method are substantially relaxed by a certain class of sparsity-inducing concave penalties. As a result, concave penalties such as the smoothly clipped absolute deviation (SCAD), minimax concave penalty (MCP), and smooth integration of counting and absolute deviation (SICA) can significantly improve on the $L_1$ method and yield sparser models with better prediction performance. We present a coordinate descent algorithm for efficient implementation and rigorously investigate its convergence properties. The practical utility and effectiveness of the proposed methods are demonstrated by simulation studies and a real data example.

preprint2011arXiv

A Novel VSWR-Protected and Controllable CMOS Class E Power Amplifier for Bluetooth Applications

This paper describes the design of a differential class-E PA for Bluetooth applications in 0.18 µm CMOS technology with load mismatch protection and power control features. The breakdown induced by load mismatch can be avoided by attenuating the RF power to the final stage during over-voltage conditions. Power control is realized by means of "open loop" techniques to regulate the power supply voltage, and a novel controllable bias network with temperature compensation is proposed, which allows a moderate power control slope (dB/V) to be achieved. Post-layout simulation results show that the level of output power can be controlled in 2 dBm steps; in particular, the output power in every step is quite insensitive to temperature variations.

preprint2011arXiv

Bifurcations of Emergent Bursting in a Neuronal Network

Currently we routinely develop a complex neuronal network to explain observed but often paradoxical phenomena based upon biological recordings. Here we present a general approach to demonstrate how to mathematically tackle such a complex neuronal network so that we can fully understand the underlying mechanism. Using an oxytocin network developed earlier as an example, we show how we can reduce a complex model with many variables to a tractable model with two variables, while retaining all key qualitative features of the model. The approach enables us to uncover how emergent synchronous bursting could arise from a neuronal network which embodies all known biological features. Surprisingly, the discovered mechanisms for bursting are similar to those found in other systems reported in the literature, and illustrate a generic way to exhibit emergent and multi-time scale spikes: at the membrane potential level and the firing rate level.

preprint2010arXiv

Measurements of Transit Timing Variations for WASP-5b

We have observed 7 new transits of the `hot Jupiter' WASP-5b using a 61 cm telescope located in New Zealand, in order to search for transit timing variations (TTVs) which can be induced by additional bodies existing in the system. When combined with other available photometric and radial velocity (RV) data, we find that its transit timings do not match a linear ephemeris; the best-fit χ^2 value is 32.2 with 9 degrees of freedom, which corresponds to a confidence level of 99.982 % or 3.7 σ. This result indicates that excess variations of transit timings have been observed, due either to unknown systematic effects or possibly to real TTVs. The TTV amplitude is as large as 50 s, and if this is real, it cannot be explained by effects other than that due to an additional body or bodies. From the RV data, we put an upper limit on the RV amplitude caused by the possible secondary body (planet) of 21 m s^{-1}, which corresponds to a mass of 22-70 M_{Earth} over the orbital period ratio of the two planets from 0.2 to 5.0. From the TTV data, using numerical simulations, we place more stringent limits down to 2 M_{Earth} near the 1:2 and 2:1 mean motion resonances (MMRs) with WASP-5b at the 3 σ level, assuming that the two planets are coplanar. We also put an upper limit on the excess Trojan mass of 43 M_{Earth} (3 σ) using both RV and photometric data. We also find that if the possible secondary planet has zero or small eccentricity, its orbit would likely be near low-order MMRs. Further follow-up photometric and spectroscopic observations will be required to confirm the reality of the TTV signal, and results such as these will provide important information for the migration mechanisms of planetary systems.
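The quoted confidence level can be checked from the χ^2 value and the degrees of freedom alone. The sketch below uses the closed-form chi-square tail probability that holds for an odd number of degrees of freedom, so only the Python standard library is needed; the function names are hypothetical, and the two-sided Gaussian convention for converting to σ is an assumption about how the 3.7 σ figure was obtained.

```python
import math
from statistics import NormalDist


def chi2_sf_odd_dof(x, dof):
    """Upper-tail probability P(X > x) for a chi-square variable with odd dof.

    Uses Q(n + 1/2, y) = erfc(sqrt(y)) + exp(-y) * sum_{j=1..n} y**(j - 1/2) / Gamma(j + 1/2)
    with y = x / 2 and n = (dof - 1) / 2.
    """
    if dof < 1 or dof % 2 == 0:
        raise ValueError("this closed form only covers odd degrees of freedom")
    y = x / 2.0
    tail = math.erfc(math.sqrt(y))
    for j in range(1, (dof - 1) // 2 + 1):
        tail += math.exp(-y) * y ** (j - 0.5) / math.gamma(j + 0.5)
    return tail


def sigma_equivalent(p_value):
    """Two-sided Gaussian significance corresponding to a tail probability."""
    return NormalDist().inv_cdf(1.0 - p_value / 2.0)


p = chi2_sf_odd_dof(32.2, 9)   # probability of chi-square >= 32.2 by chance
confidence = 1.0 - p           # confidence that a linear ephemeris is rejected
sigma = sigma_equivalent(p)
print(f"p = {p:.3e}, confidence = {100 * confidence:.3f} %, sigma = {sigma:.2f}")
```

With χ^2 = 32.2 and 9 degrees of freedom this gives a confidence of about 99.982 % and a significance between 3.7 and 3.8 σ, consistent with the figures in the abstract.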

38 published item(s)

RecRM-Bench: Benchmarking Multidimensional Reward Modeling for Agentic Recommender Systems

A meridian lemma for fully alternating links in thickened surfaces

AC-Feasible Power Transfer Regions of Virtual Power Plants: Characterization and Application

CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph Similarity Learning

Continuity scaling: A rigorous framework for detecting and quantifying causality accurately

Cross-View Cross-Scene Multi-View Crowd Counting

Efficient Pipeline Planning for Expedited Distributed DNN Training

GDsmith: Detecting Bugs in Graph Database Engines

Neural Piecewise-Constant Delay Differential Equations

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

RAW-GNN: RAndom Walk Aggregation based Graph Neural Network

Tie-line Security Regions in High Dimension for Renewable Accommodations

Whale: Efficient Giant Model Training over Heterogeneous GPUs

AdaBERT: Task-Adaptive BERT Compression with Differentiable Neural Architecture Search

Joule-Thomson expansion of the torus-like black hole

Neural Delay Differential Equations

Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads

DAPPLE: A Pipelined Data Parallel Approach for Training Large Models

Graph Structural-topic Neural Network

Grasping Detection Network with Uncertainty Estimation for Confidence-Driven Semi-Supervised Domain Adaptation

Impact of intra and inter-cluster coupling balance on the performance of nonlinear networked systems

NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization

One-shot Text Field Labeling using Attention and Belief Propagation for Structure Information Extraction

Pixel-wise Crowd Understanding via Synthetic Data

RPM-Oriented Query Rewriting Framework for E-commerce Keyword-Based Sponsored Search

SwapText: Image Based Texts Transfer in Scenes

Supercontinuum generation without residual pump peak through multiple coherent pump seeds

Large Covariance Estimation for Compositional Data via Composition-Adjusted Thresholding

Neural Networks Models for Entity Discovery and Linking

Consistent Pricing of VIX and Equity Derivatives with the 4/2 Stochastic Volatility Plus Jumps Model

Observable Signatures of a Classical Transition

Regularization Methods for High-Dimensional Instrumental Variables Regression With an Application to Genetical Genomics

Evaluation on the Financial Competitiveness of Chinese Listed Real Estate Companies Based on Entropy Method

A Dynamical Model Reveals Gene Co-Localizations in Nucleus

High-Dimensional Sparse Additive Hazards Regression

A Novel VSWR-Protected and Controllable CMOS Class E Power Amplifier for Bluetooth Applications

Bifurcations of Emergent Bursting in a Neuronal Network

Measurements of Transit Timing Variations for WASP-5b