Source author record

Zhao Zhang

Zhao Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

80 works

41 topics

4 close collaborators

preprint 2026 arXiv

LASAR: Latent Adaptive Semantic Aligned Reasoning for Generative Recommendation

Large Language Models (LLMs) have demonstrated powerful reasoning capabilities through Chain-of-Thought (CoT) in various tasks, yet the inefficiency of token-by-token generation hinders real-world deployment in latency-sensitive recommender systems. Latent reasoning has emerged as an effective paradigm in LLMs, performing multi-step inference in a continuous hidden-state space to achieve stronger reasoning at lower cost. However, this paradigm remains underexplored in mainstream generative recommendation. Adapting it reveals three unique challenges: (1) the gap between prior-less Semantic ID (SID) symbols and continuous latent reasoning - SIDs lack pre-trained semantics, hindering joint optimization; (2) representation drift due to a lack of reasoning chain supervision; and (3) the suboptimality of applying a globally fixed reasoning depth. To address these, we propose LASAR (Latent Adaptive Semantic Aligned Reasoning), an SFT-then-RL framework. First, we bridge this gap via two-stage training: Stage 1 grounds SID semantics before Stage 2 introduces latent reasoning, ensuring efficient convergence. Second, we mitigate representation drift through explicit CoT semantic alignment. Step-wise bidirectional KL divergence constrains the latent reasoning trajectory using hidden-state anchors extracted from CoT text, while a Policy Head predicts per-sample reasoning depth. Third, during the GRPO-based RL phase, terminal-only KL alignment accommodates variable-length reasoning, and REINFORCE optimizes the Policy Head to dynamically allocate steps. This nearly halves the average latent step count while simultaneously improving recommendation quality. Experiments on three real-world datasets demonstrate that LASAR outperforms all baselines. It adds marginal inference latency and is roughly 20 times faster than generating explicit CoT text.

preprint 2026 arXiv

Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety

Ensuring that Large Language Models (LLMs) adhere to safety principles without refusing benign requests remains a significant challenge. While OpenAI introduces deliberative alignment (DA) to enhance the safety of its o-series models through reasoning over detailed "code-like" safety rules, the effectiveness of this approach in open-source LLMs, which typically lack advanced reasoning capabilities, is understudied. In this work, we systematically evaluate the impact of explicitly specifying extensive safety codes versus demonstrating them through illustrative cases. We find that referencing explicit codes inconsistently improves harmlessness and systematically degrades helpfulness, whereas training on case-augmented simple codes yields more robust and generalized safety behaviors. By guiding LLMs with case-augmented reasoning instead of extensive code-like safety rules, we avoid rigid adherence to narrowly enumerated rules and enable broader adaptability. Building on these insights, we propose CADA, a case-augmented deliberative alignment method for LLMs utilizing reinforcement learning on self-generated safety reasoning chains. CADA effectively enhances harmlessness, improves robustness against attacks, and reduces over-refusal while preserving utility across diverse benchmarks, offering a practical alternative to rule-only DA for improving safety while maintaining helpfulness.

preprint 2026 arXiv

Route Before Retrieve: Activating Latent Routing Abilities of LLMs for RAG vs. Long-Context Selection

Recent advances in large language models (LLMs) have expanded the context window to beyond 128K tokens, enabling long-document understanding and multi-source reasoning. A key challenge, however, lies in choosing between retrieval-augmented generation (RAG) and long-context (LC) strategies: RAG is efficient but constrained by retrieval quality, while LC supports global reasoning at higher cost and with position sensitivity. Existing methods such as Self-Route adopt failure-driven fallback from RAG to LC, but remain passive, inefficient, and hard to interpret. We propose Pre-Route, a proactive routing framework that performs structured reasoning before answering. Using lightweight metadata (e.g., document type, length, initial snippet), Pre-Route enables task analysis, coverage estimation, and information-need prediction, producing explainable and cost-efficient routing decisions. Our study shows three key findings: (i) LLMs possess latent routing ability that can be reliably elicited with guidelines, allowing single-sample performance to approach that of multi-sample (Best-of-N) results; (ii) linear probes reveal that structured prompts sharpen the separability of the "optimal routing dimension" in representation space; and (iii) distillation transfers this reasoning structure to smaller models for lightweight deployment. Experiments on LaRA (in-domain) and LongBench-v2 (OOD) confirm that Pre-Route outperforms Always-RAG, Always-LC, and Self-Route baselines, achieving superior overall cost-effectiveness.

preprint 2026 arXiv

SynGR: Unleashing the Potential of Cross-Modal Synergy for Generative Recommendation

Generative Recommendation (GR) has emerged as a promising paradigm by formulating item recommendation as a sequence-to-sequence generation task over item identifiers. Recent studies have incorporated multimodal signals to provide richer token-level evidence for generation. However, existing approaches largely rely on alignment-centric fusion and underexplore synergistic information across modalities. In practice, synergistic information plays a critical role in capturing emergent item properties that cannot be inferred from any single modality alone. Such properties encode intrinsic item semantics and guide user preferences, enabling models to move beyond surface-level feature matching. To address this limitation, we propose SynGR, a synergistic generative recommendation framework that explicitly encourages the exploitation of cross-modal dependencies during generation. By constraining overreliance on dominant modalities, SynGR enables the model to capture emergent item semantics beyond shared or modality-specific signals. Extensive experiments across three benchmark datasets demonstrate that SynGR achieves superior performance.

preprint 2025 arXiv

Absolute frequency measurement of a Lu$^+$ $(^{3}\rm D_1)$ optical frequency standard via link to international atomic time

We report on an absolute frequency measurement of the ${\rm Lu}^{+}\,(^{3}\rm D_1)$ standard frequency which is defined as the hyperfine-average of $^{1}\rm S_0$ to $^{3}\rm D_1$ optical clock transitions in $^{176}{\rm Lu}^{+}$. The measurement result of $353\,638\,794\,073\,800.35(33)$Hz with a fractional uncertainty of $9.2 \times 10^{-16}$ was obtained by operating a single-ion $^{176}{\rm Lu}^{+}$ frequency standard intermittently over 3 months with a total uptime of 162 hours. Traceability to the International System of Units (SI) is realized by remote link to International Atomic Time. This is the first reported absolute frequency value for a ${\rm Lu}^{+}\,(^{3}\rm D_1)$ optical frequency standard.

preprint 2025 arXiv

Benchmarking LLMs for Fine-Grained Code Review with Enriched Context in Practice

Code review is a cornerstone of software quality assurance, and recent advances in Large Language Models (LLMs) have shown promise in its automation. However, existing benchmarks for LLM-based code review face three major limitations. Lack of semantic context: most benchmarks provide only code diffs without textual information such as issue descriptions, which are crucial for understanding developer intent. Data quality issues: without rigorous validation, many samples are noisy (e.g., reviews on outdated or irrelevant code), reducing evaluation reliability. Coarse granularity: most benchmarks operate at the file or commit level, overlooking the fine-grained, line-level reasoning essential for precise review. We introduce ContextCRBench, a high-quality, context-rich benchmark for fine-grained LLM evaluation in code review. Our construction pipeline comprises: Raw Data Crawling, collecting 153.7K issues and pull requests from top-tier repositories; Comprehensive Context Extraction, linking issue-PR pairs for textual context and extracting the full surrounding function or class for code context; and Multi-stage Data Filtering, combining rule-based and LLM-based validation to remove outdated, malformed, or low-value samples, resulting in 67,910 context-enriched entries. ContextCRBench supports three evaluation scenarios aligned with the review workflow: hunk-level quality assessment, line-level defect localization, and line-level comment generation. Evaluating eight leading LLMs (four closed-source and four open-source) reveals that textual context yields greater performance gains than code context alone, while current LLMs remain far from human-level review ability. Deployed at ByteDance, ContextCRBench drives a self-evolving code review system, improving performance by 61.98% and demonstrating its robustness and industrial utility. https://github.com/kinesiatricssxilm14/ContextCRBench.

preprint 2025 arXiv

Zeeman Degenerate Sideband Cooling in $^{176}$Lu$^+$

We explore degenerate Raman sideband cooling in which neighboring Zeeman states of a fixed hyperfine level are coupled via a two-photon Raman transition. The degenerate coupling between $|F,m_F\rangle\rightarrow |F,m_F-1\rangle$ facilitates the removal of multiple motional quanta in a single cycle. This method greatly reduces the number of cooling cycles required to reach the ground state compared to traditional sideband cooling. We show that near ground state cooling can be achieved with a pulse number as low as $\bar{n}$ where $\bar{n}$ is the average phonon number in the initial thermal state. We demonstrate proof-of-concept in $^{176}\mathrm{Lu}^+$ by coupling neighboring Zeeman levels on the motional sideband for the $F=7$ hyperfine level in $^3D_1$. Starting from a thermal distribution with an average phonon number of 6, we demonstrate near ground-state cooling with $\sim10$ pulses. A theoretical description is given that applies to any $F$ level and demonstrates how effective this approach can be.

preprint 2022 arXiv

A Hierarchical Interactive Network for Joint Span-based Aspect-Sentiment Analysis

Recently, some span-based methods have achieved encouraging performance for joint aspect-sentiment analysis; they first extract aspects (aspect extraction) by detecting aspect boundaries and then classify the span-level sentiments (sentiment classification). However, most existing approaches either sequentially extract task-specific features, leading to insufficient feature interactions, or encode aspect features and sentiment features in parallel, implying that the feature representations of the two tasks are largely independent of each other except for input sharing. Both ignore the internal correlations between aspect extraction and sentiment classification. To solve this problem, we propose a novel hierarchical interactive network (HI-ASA) to model two-way interactions between the two tasks appropriately, where the hierarchical interactions involve two steps: shallow-level interaction and deep-level interaction. First, we use a cross-stitch mechanism to selectively combine the different task-specific features as the input to ensure proper two-way interactions. Second, a mutual information technique is applied to mutually constrain learning between the two tasks in the output layer, so that the aspect input and the sentiment input are capable of encoding features of the other task via backpropagation. Extensive experiments on three real-world datasets demonstrate HI-ASA's superiority over baselines.

preprint 2022 arXiv

A Multi-Task Learning Model for Super Resolution of Wireless Channel Characteristics

Channel modeling has always been a core part of communication system design and development, especially in the 5G and 6G era. Traditional approaches such as stochastic channel modeling and ray-tracing (RT) based channel modeling depend heavily on measurement data or simulation, which are usually expensive and time-consuming. In this paper, we propose a novel super resolution (SR) model for generating channel characteristics data. The model is based on multi-task learning (MTL) convolutional neural networks (CNN) with residual connections. Experiments demonstrate that the proposed SR model achieves excellent performance in mean absolute error and standard deviation of error. Advantages of the proposed model are demonstrated in comparisons with other state-of-the-art deep learning models. An ablation study also confirms the necessity of multi-task learning and of the techniques used in the model design. The contribution of this paper could be helpful in channel modeling, network optimization, positioning and other work related to wireless channel characteristics by largely reducing the workload of simulation or measurement.

preprint 2022 arXiv

A parallel algorithm for minimum weight set cover with small neighborhood property

This paper studies the minimum weight set cover (MinWSC) problem with a small neighborhood cover (SNC) property proposed by Agarwal et al. A parallel algorithm for MinWSC with the $τ$-SNC property is presented, obtaining approximation ratio $τ(1+3\varepsilon)$ in $O(L\log_{1+\varepsilon}\frac{n^3}{\varepsilon^2}+ 4τ^{3}2^τL^2\log n)$ rounds, where $0< \varepsilon <\frac{1}{2}$ is a constant, $n$ is the number of elements, and $L$ is a parameter related to the SNC property. Our results not only improve the approximation ratio obtained by Agarwal et al., but also answer two questions they proposed.

preprint 2022 arXiv

A Search for Millilensing Gamma-Ray Bursts in the Observations of Fermi GBM

Millilensing of Gamma-Ray Bursts (GRBs) is expected to manifest as multiple emission episodes in a single triggered GRB with similar light-curve patterns and similar spectral properties. Identifying such lensed GRBs could help improve constraints on the abundance of compact dark matter. Here we present a systematic search for millilensing among 3000 GRBs observed by the Fermi GBM up to 2021 April. We find 4 interesting candidates by applying an auto-correlation test, a hardness test, and time-integrated/resolved spectrum tests to the whole sample. GRB 081126A and GRB 090717A are ranked as first-class candidates based on their excellent performance in both temporal and spectral analysis. GRB 081122A and GRB 110517B are ranked as second-class candidates (suspected candidates), mainly because their two emission episodes show clear deviations in part of the time-resolved spectrum or in the time-integrated spectrum. Considering a point mass model for the gravitational lens, our results suggest that the density parameter of lens objects with mass $M_{\rm L}\sim10^{6} M_{\odot}$ is larger than $1.5\times10^{-3}$.

preprint 2022 arXiv

A Survey on Incomplete Multi-view Clustering

Conventional multi-view clustering seeks to partition data into respective groups based on the assumption that all views are fully observed. However, in practical applications such as disease diagnosis, multimedia analysis, and recommendation systems, it is common that not all views of samples are available, which leads to the failure of conventional multi-view clustering methods. Clustering on such incomplete multi-view data is referred to as incomplete multi-view clustering. In view of its promising application prospects, research on incomplete multi-view clustering has made noticeable advances in recent years. However, there is no survey that summarizes the current progress and points out future research directions. To this end, we review recent studies of incomplete multi-view clustering. Importantly, we provide some frameworks to unify the corresponding incomplete multi-view clustering methods, and make an in-depth comparative analysis of some representative methods from theoretical and experimental perspectives. Finally, some open problems in the incomplete multi-view clustering field are offered for researchers.

preprint 2022 arXiv

Approximation Algorithm for Minimum $p$ Union Under a Geometric Setting

In a minimum $p$ union problem (Min$p$U), given a hypergraph $G=(V,E)$ and an integer $p$, the goal is to find a set of $p$ hyperedges $E'\subseteq E$ such that the number of vertices covered by $E'$ (that is $|\bigcup_{e\in E'}e|$) is minimized. It was known that Min$p$U is at least as hard as the densest $k$-subgraph problem. A question is: how about the problem in some geometric settings? In this paper, we consider the unit square Min$p$U problem (Min$p$U-US) in which $V$ is a set of points on the plane, and each hyperedge of $E$ consists of a set of points in a unit square. A $(\frac{1}{1+\varepsilon},4)$-bicriteria approximation algorithm is presented, that is, the algorithm finds at least $\frac{p}{1+\varepsilon}$ unit squares covering at most $4opt$ points, where $opt$ is the optimal value for the Min$p$U-US instance (the minimum number of points that can be covered by $p$ unit squares).
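To make the Min$p$U objective concrete, here is a minimal brute-force sketch in Python. It is only an illustration of the problem definition on a toy hypergraph, not the bicriteria algorithm from the paper; the function name `min_p_union` and the example data are assumptions introduced here.

```python
from itertools import combinations

def min_p_union(hyperedges, p):
    """Brute force: try every choice of p hyperedges and keep the one
    whose union covers the fewest vertices. Exponential in general,
    so this is only suitable for tiny illustrative instances."""
    best = None
    for combo in combinations(range(len(hyperedges)), p):
        covered = set().union(*(hyperedges[i] for i in combo))
        if best is None or len(covered) < best[0]:
            best = (len(covered), combo)
    return best  # (number of covered vertices, indices of chosen hyperedges)

# Hypothetical toy hypergraph: each hyperedge is a set of vertex labels.
edges = [{1, 2, 3}, {2, 3}, {3, 4, 5}, {2, 3, 4}]
size, chosen = min_p_union(edges, 2)
print(size, chosen)
```

On this toy instance the pair of hyperedges at indices 0 and 1 covers only the three vertices {1, 2, 3}, which is the minimum over all pairs.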

preprint 2022 arXiv

Customized Conversational Recommender Systems

Conversational recommender systems (CRS) aim to capture users' current intentions and provide recommendations through real-time multi-turn conversational interactions. As a human-machine interactive system, it is essential for CRS to improve the user experience. However, most CRS methods neglect the importance of user experience. In this paper, we propose two key points for CRS to improve the user experience: (1) speaking like a human, since humans speak with different styles according to the current dialogue context; and (2) identifying fine-grained intentions, since even for the same utterance, different users have diverse fine-grained intentions, which are related to their inherent preferences. Based on these observations, we propose a novel CRS model, coined Customized Conversational Recommender System (CCRS), which customizes the CRS model for users from three perspectives. For human-like dialogue services, we propose a multi-style dialogue response generator which selects a context-aware speaking style for utterance generation. To provide personalized recommendations, we extract the user's current fine-grained intentions from the dialogue context with the guidance of the user's inherent preferences. Finally, to customize the model parameters for each user, we train the model from a meta-learning perspective. Extensive experiments and a series of analyses show the superiority of our CCRS on both the recommendation and dialogue services.

preprint 2022 arXiv

DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling

Incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks. However, previous works commonly have difficulty dealing with large-scale dynamic lexicons which often cause excessive matching noise and problems of frequent updates. In this paper, we propose DyLex, a plug-in lexicon incorporation approach for BERT based sequence labeling tasks. Instead of leveraging embeddings of words in the lexicon as in conventional methods, we adopt word-agnostic tag embeddings to avoid re-training the representation while updating the lexicon. Moreover, we employ an effective supervised lexical knowledge denoising method to smooth out matching noise. Finally, we introduce a col-wise attention based knowledge fusion mechanism to guarantee the pluggability of the proposed framework. Experiments on ten datasets of three tasks show that the proposed framework achieves new SOTA, even with very large scale lexicons.

preprint 2022 arXiv

Glassy crystals with colossal multi-baroresponsivities

As a nontrivial solid state of matter, the glassy-crystal state embraces physical features of both crystalline and amorphous solids, where a long-range ordered periodic structure formed by the mass centers of constituent molecules accommodates orientational glasses. Here, we discover and validate a glassy-crystal state in 2-amino-2-methyl-1,3-propanediol (AMP, C4H11NO2) by neutron scattering and complementary broadband dielectric spectroscopy (BDS) measurements. The freezing process of the dynamic orientational disorder is manifested at relaxation times well described by the Vogel-Fulcher-Tammann (VFT) law and the strongly frequency-dependent freezing temperature ranging from around 225 K at 0.1 Hz to above room temperature in the GHz region. At room temperature, the supercooled state is extremely sensitive to pressure such that a pressure of a few MPa can induce crystallization to the ordered crystal state, eventually leading to a temperature increase of 48 K within 20 s, a significant reduction of visible light transmittance from about 95% to a few percent, and a remarkable decrease of electrical conductivity by three orders of magnitude. These ultrasensitive baroresponsivities might find applications in low-grade waste heat recycling, pressure sensors and non-volatile memory devices. It is expected that glassy crystals will serve as an emerging platform for exploiting exotic states of matter and the associated fantastic applications.

preprint 2022 arXiv

Hidden SU(2)_D vector dark matter with a scalar septuplet

We propose a vector dark matter model from a hidden SU(2)_D gauge symmetry at the TeV scale. A scalar septuplet is introduced to break the SU(2)_D symmetry spontaneously. The septuplet also plays the role of a portal between the standard model and the dark sector. We find that there are two different vacuum configurations corresponding to the sign of the quartic coupling $λ_3$, which yield different mass spectra for the gauge bosons. For $λ_3<0$, the masses of the gauge bosons are split, while for $λ_3\geq0$, the masses are degenerate. We also study the RG evolution of the couplings, and find that perturbativity and vacuum stability can set a stringent bound on the parameter space. On the phenomenological side, we consider experimental constraints including dark matter direct detection, indirect detection, relic density, and Higgs coupling measurements. We find that there are regions of parameter space that survive all the constraints, and they can be tested in future dark matter direct and indirect detection experiments.

preprint 2022 arXiv

Image Harmonization by Matching Regional References

To achieve visual consistency in composite images, recent image harmonization methods typically summarize the appearance pattern of the global background and apply it to the global foreground without location discrepancy. However, for a real image, the appearances (illumination, color temperature, saturation, hue, texture, etc.) of different regions can vary significantly, so previous methods, which transfer the appearance globally, are not optimal. To address this issue, we first match the contents between the foreground and background and then adaptively adjust every foreground location according to the appearance of its content-related background regions. Further, we design a residual reconstruction strategy that uses the predicted residual to adjust the appearance and the composite foreground to preserve the image details. Extensive experiments demonstrate the effectiveness of our method. The source code will be made publicly available.

preprint 2022 arXiv

Improved Parallel Algorithm for Minimum Cost Submodular Cover Problem

In the minimum cost submodular cover problem (MinSMC), we are given a monotone nondecreasing submodular function $f\colon 2^V \rightarrow \mathbb{Z}^+$, a linear cost function $c: V\rightarrow \mathbb R^{+}$, and an integer $k\leq f(V)$, the goal is to find a subset $A\subseteq V$ with the minimum cost such that $f(A)\geq k$. The MinSMC can be found at the heart of many machine learning and data mining applications. In this paper, we design a parallel algorithm for the MinSMC that takes at most $O(\frac{\log km\log k(\log m+\log\log mk)}{\varepsilon^4})$ adaptive rounds, and it achieves an approximation ratio of $\frac{H(\min\{Δ,k\})}{1-5\varepsilon}$ with probability at least $1-3\varepsilon$, where $Δ=\max_{v\in V}f(v)$, $H(\cdot)$ is the Harmonic number, $m=|V|$, and $\varepsilon$ is a constant in $(0,\frac{1}{5})$.
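For orientation, the MinSMC objective can be illustrated with the classical sequential greedy rule, which repeatedly adds the element with the best truncated marginal gain per unit cost. The sketch below is not the parallel algorithm of the paper; it is a plain baseline under the assumption that $f$ is a coverage function, and the names `greedy_submodular_cover`, `sets`, and `cost` are hypothetical.

```python
def greedy_submodular_cover(ground, f, cost, k):
    """Sequential greedy baseline for min-cost submodular cover.
    Adds the element maximizing (truncated marginal gain) / cost
    until f(chosen) >= k. Not the parallel algorithm from the abstract."""
    chosen = set()
    value = f(chosen)
    while value < k:
        best, best_ratio = None, 0.0
        # sorted() makes tie-breaking deterministic for the example.
        for v in sorted(ground - chosen):
            gain = min(f(chosen | {v}), k) - min(value, k)
            ratio = gain / cost[v]
            if ratio > best_ratio:
                best, best_ratio = v, ratio
        if best is None:
            raise ValueError("k is not reachable with the given ground set")
        chosen.add(best)
        value = f(chosen)
    return chosen

# Hypothetical coverage instance: f(A) = number of items covered by A.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}
cost = {"a": 3.0, "b": 1.0, "c": 3.0, "d": 1.5}

def f(A):
    return len(set().union(*(sets[v] for v in A)))

solution = greedy_submodular_cover(set(sets), f, cost, k=5)
print(sorted(solution), sum(cost[v] for v in solution))
```

Here the greedy rule picks "b", then "d", then "a", reaching coverage 5 at total cost 5.5. A coverage function is used only because it is an easy-to-check example of a monotone submodular, integer-valued $f$.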

preprint 2022 arXiv

Interactive Style Transfer: All is Your Palette

Neural style transfer (NST) can create impressive artworks by transferring a reference style to a content image. Current image-to-image NST methods lack fine-grained controls, which are often demanded in artistic editing. To mitigate this limitation, we propose a drawing-like interactive style transfer (IST) method, by which users can interactively create a harmonious-style image. Our IST method can serve as a brush, dip style from anywhere, and then paint onto any region of the target content image. To determine the action scope, we formulate a fluid simulation algorithm, which treats styles as pigments around the position of brush interaction and diffuses them in the style or content images according to the similarity maps. Our IST method expands the creative dimension of NST. By dipping and painting, even a single style image can produce thousands of eye-catching works. The demo video is available in the supplementary files or at http://mmcheng.net/ist.

preprint 2022 arXiv

Macroscopic Traffic Flow Modeling with Physics Regularized Gaussian Process: Generalized Formulations

Despite the success of classical traffic flow (e.g., second-order macroscopic) models and data-driven (e.g., machine learning, ML) approaches in traffic state estimation, these approaches either require great effort for parameter calibration or lack theoretical interpretation. To fill this research gap, this study presents a new modeling framework, named physics regularized Gaussian process (PRGP). This novel approach can encode physics models, i.e., classical traffic flow models, into the Gaussian process architecture so as to regularize the ML training process. In particular, this study discusses how to develop a PRGP model when the original physics model has a discrete formulation. Then, based on the posterior regularization inference framework, an efficient stochastic optimization algorithm is developed to maximize the evidence lower bound of the system likelihood. To demonstrate the effectiveness of the proposed model, this paper conducts empirical studies on a real-world dataset collected from a stretch of the I-15 freeway in Utah. Results show that the new PRGP model can outperform the previous compatible methods, such as calibrated physics models and pure machine learning methods, in estimation precision and input robustness.

preprint 2022 arXiv

On Deep Recurrent Reinforcement Learning for Active Visual Tracking of Space Noncooperative Objects

Active tracking of a space noncooperative object that relies merely on a vision camera is of great significance for autonomous rendezvous and debris removal. Considering its Partially Observable Markov Decision Process (POMDP) property, this paper proposes a novel tracker based on deep recurrent reinforcement learning, named RAMAVT, which drives the chasing spacecraft to follow an arbitrary space noncooperative object with high-frequency and near-optimal velocity control commands. To further improve the active tracking performance, we introduce a Multi-Head Attention (MHA) module and a Squeeze-and-Excitation (SE) layer into RAMAVT, which remarkably improve the representational ability of the neural network with almost no extra computational cost. Extensive experiments and an ablation study on the SNCOAT benchmark show the effectiveness and robustness of our method compared with other state-of-the-art algorithms. The source code is available at https://github.com/Dongzhou-1996/RAMAVT.

preprint 2022 arXiv

Performance Guaranteed Evolutionary Algorithm for Minimum Connected Dominating Set

A connected dominating set is a widely adopted model for the virtual backbone of a wireless sensor network. In this paper, we design an evolutionary algorithm for the minimum connected dominating set problem (MinCDS), whose performance is theoretically guaranteed in terms of both computation time and approximation ratio. Given a connected graph $G=(V,E)$, a connected dominating set (CDS) is a subset $C\subseteq V$ such that every vertex in $V\setminus C$ has a neighbor in $C$, and the subgraph of $G$ induced by $C$ is connected. The goal of MinCDS is to find a CDS of $G$ with the minimum cardinality. We show that our evolutionary algorithm can find a CDS in expected $O(n^3)$ time which approximates the optimal value within factor $(2+\lnΔ)$, where $n$ and $Δ$ are the number of vertices and the maximum degree of graph $G$, respectively.
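The CDS definition above translates directly into a feasibility check. The following Python sketch tests the two conditions (domination and connectivity of the induced subgraph) on an adjacency-set graph; it illustrates the definition only and is not the evolutionary algorithm from the paper. The function name and the example path graph are assumptions introduced here.

```python
from collections import deque

def is_connected_dominating_set(adj, C):
    """Check whether C is a connected dominating set of the graph given
    as a dict mapping each vertex to the set of its neighbors."""
    C = set(C)
    if not C:
        return False
    # Domination: every vertex outside C needs at least one neighbor in C.
    for v in adj:
        if v not in C and not (adj[v] & C):
            return False
    # Connectivity: BFS restricted to C must reach all of C.
    start = next(iter(C))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] & C:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == C

# Hypothetical example: the path graph 1 - 2 - 3 - 4 - 5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
print(is_connected_dominating_set(path, {2, 3, 4}))  # dominating and connected
print(is_connected_dominating_set(path, {2, 4}))     # dominating, not connected
```

On the path graph, {2, 3, 4} passes both conditions, while {2, 4} dominates every vertex but induces a disconnected subgraph and is therefore rejected.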

preprint 2022 arXiv

PointScatter: Point Set Representation for Tubular Structure Extraction

This paper explores the point set representation for tubular structure extraction tasks. Compared with the traditional mask representation, the point set representation offers greater flexibility and representation ability, as it is not restricted to a fixed grid as the mask is. Inspired by this, we propose PointScatter, an alternative to the segmentation models for the tubular structure extraction task. PointScatter splits the image into scatter regions and predicts points for each scatter region in parallel. We further propose a greedy-based region-wise bipartite matching algorithm to train the network end-to-end and efficiently. We benchmark PointScatter on four public tubular datasets, and extensive experiments on tubular structure segmentation and centerline extraction tasks demonstrate the effectiveness of our approach. Code is available at https://github.com/zhangzhao2022/pointscatter.

preprint 2022 arXiv

Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion

Most real-world knowledge graphs (KGs) are far from complete and comprehensive. This problem has motivated efforts in predicting the most plausible missing facts to complete a given KG, i.e., knowledge graph completion (KGC). However, existing KGC methods suffer from two main issues: 1) the false negative issue, i.e., the sampled negative training instances may include potential true facts; and 2) the data sparsity issue, i.e., true facts account for only a tiny part of all possible facts. To this end, we propose positive-unlabeled learning with adversarial data augmentation (PUDA) for KGC. In particular, PUDA tailors a positive-unlabeled risk estimator for the KGC task to deal with the false negative issue. Furthermore, to address the data sparsity issue, PUDA achieves a data augmentation strategy by unifying adversarial training and positive-unlabeled learning under the positive-unlabeled minimax game. Extensive experimental results on real-world benchmark datasets demonstrate the effectiveness and compatibility of our proposed method.

preprint 2022 arXiv

Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting

Multivariate Time Series (MTS) forecasting plays a vital role in a wide range of applications. Recently, Spatial-Temporal Graph Neural Networks (STGNNs) have become increasingly popular MTS forecasting methods. STGNNs jointly model the spatial and temporal patterns of MTS through graph neural networks and sequential models, significantly improving the prediction accuracy. But limited by model complexity, most STGNNs only consider short-term historical MTS data, such as data over the past one hour. However, the patterns of time series and the dependencies between them (i.e., the temporal and spatial patterns) need to be analyzed based on long-term historical MTS data. To address this issue, we propose a novel framework, in which STGNN is Enhanced by a scalable time series Pre-training model (STEP). Specifically, we design a pre-training model to efficiently learn temporal patterns from very long-term history time series (e.g., the past two weeks) and generate segment-level representations. These representations provide contextual information for short-term time series input to STGNNs and facilitate modeling dependencies between time series. Experiments on three public real-world datasets demonstrate that our framework is capable of significantly enhancing downstream STGNNs, and our pre-training model aptly captures temporal patterns.

preprint2022arXiv

Scalable all-optical cold damping of levitated nanoparticles

The field of levitodynamics has made significant progress towards controlling and studying the motion of a levitated nanoparticle. Motional control relies on either autonomous feedback via a cavity or measurement-based feedback via external forces. Recent demonstrations of measurement-based ground-state cooling of a single nanoparticle employ linear velocity feedback, also called cold damping, and require the use of electrostatic forces on charged particles via external electrodes. Here we introduce a novel all-optical cold damping scheme based on spatial modulation of the trap position that is scalable to multiple particles. The scheme relies on using programmable optical tweezers to provide full independent control over trap frequency and position of each tweezer. We show that the technique cools the center-of-mass motion of particles down to $17\,$mK at a pressure of $2 \times 10^{-6}\,$mbar and demonstrate its scalability by simultaneously cooling the motion of two particles. Our work paves the way towards studying quantum interactions between particles, achieving 3D quantum control of particle motion without cavity-based cooling, electrodes or charged particles, and probing multipartite entanglement in levitated optomechanical systems.

preprint2022arXiv

Spatial-Temporal Identity: A Simple yet Effective Baseline for Multivariate Time Series Forecasting

Multivariate Time Series (MTS) forecasting plays a vital role in a wide range of applications. Recently, Spatial-Temporal Graph Neural Networks (STGNNs) have become increasingly popular MTS forecasting methods due to their state-of-the-art performance. However, recent works are becoming more sophisticated with limited performance improvements. This phenomenon motivates us to explore the critical factors of MTS forecasting and design a model that is as powerful as STGNNs, but more concise and efficient. In this paper, we identify the indistinguishability of samples in both spatial and temporal dimensions as a key bottleneck, and propose a simple yet effective baseline for MTS forecasting by attaching Spatial and Temporal IDentity information (STID), which achieves the best performance and efficiency simultaneously based on simple Multi-Layer Perceptrons (MLPs). These results suggest that we can design efficient and effective models as long as they solve the indistinguishability of samples, without being limited to STGNNs.
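The idea of attaching identity information can be sketched as follows. This is a minimal illustration, not the authors' implementation: the table sizes, the choice of time-of-day and day-of-week as temporal identities, and the random values standing in for learned embeddings are all assumptions.

```python
import random

random.seed(0)

def make_table(rows, dim):
    """Random embedding table standing in for learned parameters."""
    return [[random.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]

# Hypothetical sizes chosen only for this example.
NUM_NODES, STEPS_PER_DAY, DAYS_PER_WEEK, DIM = 4, 288, 7, 8

node_emb = make_table(NUM_NODES, DIM)              # spatial identity
time_of_day_emb = make_table(STEPS_PER_DAY, DIM)   # temporal identity (assumed)
day_of_week_emb = make_table(DAYS_PER_WEEK, DIM)   # temporal identity (assumed)

def build_input(history, node_id, time_of_day, day_of_week):
    """Concatenate the raw history window with the three identity embeddings."""
    return (list(history) + node_emb[node_id]
            + time_of_day_emb[time_of_day] + day_of_week_emb[day_of_week])

# Two series with identical recent history but different spatial identities.
history = [0.5, 0.7, 0.6]
x_a = build_input(history, node_id=0, time_of_day=100, day_of_week=2)
x_b = build_input(history, node_id=3, time_of_day=100, day_of_week=2)
print(len(x_a), x_a != x_b)
```

Without the identity embeddings the two inputs would be identical, so a plain MLP could not produce different forecasts for them; with the embeddings attached, the samples become distinguishable.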

preprint2022arXiv

Tunable Non-equilibrium Phase Transitions between Spatial and Temporal Order through Dissipation

We propose an experiment with a driven quantum gas coupled to a dissipative optical cavity that realizes a novel kind of far-from-equilibrium phase transition between spatial and temporal order. The control parameter of the transition is the detuning between the drive frequency and the cavity resonance. For negative detunings, the system features a spatially ordered phase, while positive detunings lead to a phase with both spatial order and persistent oscillations, which we call a dissipative spatio-temporal lattice. We give numerical and analytical evidence for this superradiant phase transition and show that the spatio-temporal lattice originates from cavity dissipation. In both regimes the atoms are subject to accelerated transport, either via a uniform acceleration or via abrupt transitions to higher momentum states. Our work provides perspectives for temporal phases of matter that are not possible at equilibrium.

preprint2021arXiv

A Comprehensive Consistency Check between Synchrotron radiation and the Observed Gamma-ray Burst Spectra

We performed a time-resolved spectral analysis of 53 bright gamma-ray bursts (GRBs) observed by Fermi/GBM. Our sample consists of 908 individual spectra extracted from the finest time slices in each GRB. We fitted them with the synchrotron radiation model by considering the electron distributions in five different cases: mono-energetic, single power-law, Maxwellian, traditional fast cooling, and broken power-law. Our results were further qualified through the Bayesian Information Criterion (BIC) by comparing with the fits by empirical models, namely the so-called Band function and cut-off power-law models. Our study showed that the synchrotron models, except for the fast-cooling case, can successfully fit most observed spectra, with the single power-law case being the most preferred. We also found that the electron distribution indices for the single power-law synchrotron fit in more than half of our spectra exhibit flux-tracking behavior, i.e., the index increases/decreases with the flux increasing/decreasing, implying that the distribution of the radiating electrons becomes increasingly narrow with time before the flux peaks and broadens afterward. Our results indicate that synchrotron radiation is still feasible as a radiation mechanism of the GRB prompt emission phase.
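The BIC comparison mentioned in this abstract follows the standard definition, BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of data points, and L the maximized likelihood. The sketch below shows the comparison step only; the model names and all numeric values are invented for illustration and are not results from the paper.

```python
import math

def bic(log_likelihood, num_params, num_points):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L); lower is preferred."""
    return num_params * math.log(num_points) - 2.0 * log_likelihood

# Hypothetical fit results for a single spectrum (values invented for illustration).
fits = {
    "model_a": {"log_likelihood": -520.0, "num_params": 4},
    "model_b": {"log_likelihood": -518.5, "num_params": 5},
}
num_points = 120

scores = {name: bic(f["log_likelihood"], f["num_params"], num_points)
          for name, f in fits.items()}
preferred = min(scores, key=scores.get)
print(scores, preferred)
```

In this made-up case the extra parameter of `model_b` improves the likelihood slightly, but not enough to offset the ln(n) penalty, so the simpler model is preferred.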

preprint2021arXiv

A Survey on Concept Factorization: From Shallow to Deep Representation Learning

The quality of the features learned by representation learning determines the performance of learning algorithms and the related application tasks (such as high-dimensional data clustering). As a relatively new paradigm for representation learning, Concept Factorization (CF) has attracted a great deal of interest in the areas of machine learning and data mining for over a decade. Many effective CF-based methods have been proposed based on different perspectives and properties, but it remains difficult to grasp the essential connections and figure out the underlying explanatory factors from existing studies. In this paper, we therefore survey the recent advances in CF methodologies and the potential benchmarks by categorizing and summarizing the current methods. Specifically, we first review the root CF method, and then explore the advancement of CF-based representation learning ranging from shallow to deep/multilayer cases. We also introduce the potential application areas of CF-based methods. Finally, we point out some future directions for studying CF-based representation learning. Overall, this survey provides an insightful overview of both the theoretical basis and current developments in the field of CF, which can also help interested researchers understand the current trends of CF and find the most appropriate CF techniques for particular applications.

preprint2021arXiv

Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition

Deep Convolutional Neural Networks (CNNs), such as Dense Convolutional Networks (DenseNet), have achieved great success for image representation by discovering deep hierarchical information. However, most existing networks simply stack the convolutional layers and hence fail to fully discover local and global feature information among layers. In this paper, we mainly explore how to enhance the local and global dense feature flow by fully exploiting hierarchical features from all the convolution layers. Technically, we propose an efficient and effective CNN framework, i.e., Fast Dense Residual Network (FDRN), for text recognition. To construct FDRN, we propose a new fast residual dense block (f-RDB) that retains the local feature fusion and local residual learning abilities of the original RDB while reducing the computing effort. After fully learning local residual dense features, we utilize the sum operation and several f-RDBs to define a new block termed global dense block (GDB), imitating the construction of dense blocks, to learn global dense residual features adaptively in a holistic way. Finally, we use two convolution layers to construct a down-sampling block to reduce the global feature size and extract deeper features. Extensive simulations show that FDRN obtains enhanced recognition results compared with other related models.

preprint2021arXiv

Phase transition gravitational waves from pseudo-Nambu-Goldstone dark matter and two Higgs doublets

We investigate the potential stochastic gravitational waves from first-order electroweak phase transitions in a model with pseudo-Nambu-Goldstone dark matter and two Higgs doublets. The dark matter candidate can naturally evade direct detection bounds, and can achieve the observed relic abundance via the thermal mechanism. Three scalar fields in the model obtain vacuum expectation values, related to phase transitions in the early Universe. We search for the parameter points that can cause first-order phase transitions, taking into account the existing experimental constraints. The resulting gravitational wave spectra are further evaluated. Some parameter points are found to induce strong gravitational wave signals, which could be detected in the future space-based interferometer experiments LISA, Taiji, and TianQin.

preprint2020arXiv

Bilateral Attention Network for RGB-D Salient Object Detection

Most existing RGB-D salient object detection (SOD) methods focus on the foreground region when utilizing the depth images. However, the background also provides important information in traditional SOD methods for promising performance. To better explore salient information in both foreground and background regions, this paper proposes a Bilateral Attention Network (BiANet) for the RGB-D SOD task. Specifically, we introduce a Bilateral Attention Module (BAM) with a complementary attention mechanism: foreground-first (FF) attention and background-first (BF) attention. The FF attention focuses on the foreground region with a gradual refinement style, while the BF one recovers potentially useful salient information in the background region. Benefiting from the proposed BAM, our BiANet can capture more meaningful foreground and background cues, and shift more attention to refining the uncertain details between foreground and background regions. Additionally, we extend our BAM by leveraging multi-scale techniques for better SOD performance. Extensive experiments on six benchmark datasets demonstrate that our BiANet outperforms other state-of-the-art RGB-D SOD methods in terms of objective metrics and subjective visual comparison. Our BiANet can run at up to 80 fps on $224\times224$ RGB-D images with an NVIDIA GeForce RTX 2080Ti GPU. Comprehensive ablation studies also validate our contributions.

preprint2020arXiv

Compressed DenseNet for Lightweight Character Recognition

Convolutional Recurrent Neural Network (CRNN) is a popular network for recognizing texts in images. Advances such as CRNN variants, e.g., the Dense Convolutional Network with Connectionist Temporal Classification, have reduced the running time of the network, but expose the inner computation cost and weight size of the convolutional networks as a bottleneck. Specifically, the DenseNet-based models utilize dense blocks as the core module, but the inner features are combined by concatenation in dense blocks. As such, the number of channels of the combined features delivered as the input to the layers close to the output, and the associated computational cost, grow rapidly as the dense blocks get deeper. This brings heavy computational cost and large weight size, which restrict the depth of dense blocks. In this paper, we propose a compressed convolution block called Lightweight Dense Block (LDB). To reduce the computing cost and weight size, we re-define and re-design the way of combining internal features of the dense blocks. LDB is a convolutional block similar to the dense block, but it can reduce the computation cost and weight size to (1/L, 2/L) compared with the original ones, where L is the number of layers in blocks. Moreover, LDB can be used to replace the original dense block in any DenseNet-based model. Based on the LDBs, we propose a Compressed DenseNet (CDenseNet) for lightweight character recognition. Extensive experiments demonstrate that CDenseNet can effectively reduce the weight size while delivering promising recognition results.
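The channel growth described in this abstract can be made concrete with a little arithmetic for a standard (uncompressed) dense block. The sketch below assumes one 3x3 convolution per layer with a fixed growth rate and ignores biases and bottleneck layers; the function names and the example sizes are hypothetical, and the sketch does not model the LDB itself.

```python
def dense_block_input_channels(in_channels, growth_rate, num_layers):
    """Input channel count seen by each layer when features are concatenated."""
    return [in_channels + i * growth_rate for i in range(num_layers)]

def dense_block_conv_weights(in_channels, growth_rate, num_layers, kernel=3):
    """Total weights for one kernel x kernel conv per layer (bias ignored)."""
    return sum(kernel * kernel * c * growth_rate
               for c in dense_block_input_channels(in_channels, growth_rate, num_layers))

# Hypothetical block: 64 input channels, growth rate 32, increasing depth.
for depth in (4, 8, 16):
    last_layer_inputs = dense_block_input_channels(64, 32, depth)[-1]
    total_weights = dense_block_conv_weights(64, 32, depth)
    print(depth, last_layer_inputs, total_weights)
```

Because each layer's input width grows linearly with depth, the summed weight count grows roughly quadratically in the number of layers, which is the bottleneck the abstract points to.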

preprint2020arXiv

Convolutional Dictionary Pair Learning Network for Image Representation Learning

Both Dictionary Learning (DL) and Convolutional Neural Networks (CNNs) are powerful image representation learning systems based on different mechanisms and principles; however, whether we can seamlessly integrate them to improve performance is worth exploring. To address this issue, we propose a novel generalized end-to-end representation learning architecture, dubbed Convolutional Dictionary Pair Learning Network (CDPL-Net), which integrates the learning schemes of the CNN and dictionary pair learning into a unified framework. Generally, the architecture of CDPL-Net includes two convolutional/pooling layers and two dictionary pair learning (DPL) layers in the representation learning module. Besides, it uses two fully-connected layers as the multi-layer perceptron layer in the nonlinear classification module. In particular, the DPL layer can jointly formulate the discriminative synthesis and analysis representations, driven by minimizing the batch-based reconstruction error over the flattened feature maps from the convolution/pooling layer. Moreover, the DPL layer uses the l1-norm on the analysis dictionary so that a sparse representation can be delivered, and the embedding process is also robust to noise. To speed up the training of the DPL layer, efficient stochastic gradient descent is used. Extensive simulations on real databases show that our CDPL-Net can deliver enhanced performance over other state-of-the-art methods.

preprint2020arXiv

Convolutional Neural Network Training with Distributed K-FAC

Training neural networks with many processors can reduce time-to-solution; however, it is challenging to maintain convergence and efficiency at large scales. The Kronecker-factored Approximate Curvature (K-FAC) was recently proposed as an approximation of the Fisher Information Matrix that can be used in natural gradient optimizers. We investigate here a scalable K-FAC design and its applicability in convolutional neural network (CNN) training at scale. We study optimization techniques such as layer-wise distribution strategies, inverse-free second-order gradient evaluation, and dynamic K-FAC update decoupling to reduce training time while preserving convergence. We use residual neural networks (ResNet) applied to the CIFAR-10 and ImageNet-1k datasets to evaluate the correctness and scalability of our K-FAC gradient preconditioner. With ResNet-50 on the ImageNet-1k dataset, our distributed K-FAC implementation converges to the 75.9% MLPerf baseline in 18-25% less time than does the classic stochastic gradient descent (SGD) optimizer across scales on a GPU cluster.

preprint2020arXiv

Discovery of oscillations above 200 keV in a black hole X-ray binary with Insight-HXMT

Low-frequency quasi-periodic oscillations (LFQPOs) are commonly found in black hole X-ray binaries, and their origin is still under debate. The properties of LFQPOs at high energies (above 30 keV) are closely related to the nature of the accretion flow in the innermost regions, and thus play a crucial role in critically testing various theoretical models. The Hard X-ray Modulation Telescope (Insight-HXMT) is capable of detecting emissions above 30 keV, and is therefore an ideal instrument to do so. Here we report the discovery of LFQPOs above 200 keV in the new black hole MAXI J1820+070 in the X-ray hard state, which allows us to understand the behaviours of LFQPOs at hundreds of kiloelectronvolts. The phase lag of the LFQPO is constant around zero below 30 keV, and becomes a soft lag (that is, the high-energy photons arrive first) above 30 keV. The soft lag gradually increases with energy and reaches ~0.9s in the 150-200 keV band. The detection at energies above 200 keV, the large soft lag and the energy-related behaviors of the LFQPO pose a great challenge for most currently existing models, but suggest that the LFQPO probably originates from the precession of a small-scale jet.

preprint2020arXiv

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition

Deep Convolutional Neural Networks (CNNs) have achieved great success in pattern recognition, such as recognizing texts in images. But existing CNN-based frameworks still have several drawbacks: 1) the traditional pooling operation may lose important feature information and is unlearnable; 2) the traditional convolution operation optimizes slowly and the hierarchical features from different layers are not fully utilized. In this work, we address these problems by developing a novel deep network model called Fully-Convolutional Intensive Feature Flow Neural Network (IntensiveNet). Specifically, we design a further dense block called the intensive block to extract feature information, where the original inputs and two dense blocks are connected tightly. To encode data appropriately, we present the concepts of the dense fusion block and further dense fusion operations for our new intensive block. By adding short connections to different layers, the feature flow and coupling between layers are enhanced. We also replace the traditional convolution with depthwise separable convolution to make the operation efficient. To limit the loss of important feature information, we use a convolution operation with stride 2 to replace the original pooling operation in the customary transition layers. The recognition results on large-scale Chinese string and MNIST datasets show that our IntensiveNet can deliver enhanced recognition results compared with other related deep models.
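The efficiency gain from depthwise separable convolution can be illustrated by counting weights. This is a generic comparison under assumed layer sizes (biases ignored), not a description of IntensiveNet's actual configuration; the function names and channel counts are hypothetical.

```python
def standard_conv_params(c_in, c_out, kernel):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out

def depthwise_separable_params(c_in, c_out, kernel):
    """Weights in a depthwise conv followed by a 1x1 pointwise conv (bias ignored)."""
    depthwise = kernel * kernel * c_in   # one kernel x kernel filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 64 input channels, 128 output channels, 3x3 kernel.
standard = standard_conv_params(64, 128, 3)
separable = depthwise_separable_params(64, 128, 3)
print(standard, separable, round(standard / separable, 2))
```

The ratio approaches the kernel area (9 for a 3x3 kernel) as the number of output channels grows, which is why the substitution makes the operation cheaper.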

preprint2020arXiv

LinSBFT: Linear-Communication One-Step BFT Protocol for Public Blockchains

This paper presents LinSBFT, a Byzantine Fault Tolerance (BFT) protocol with the capacity of processing over 2000 smart contract transactions per second in production. LinSBFT applies to a permissionless, public blockchain system, in which there is no public-key infrastructure, and is based on the classic PBFT with 4 improvements: (i) LinSBFT achieves $O(n)$ worst-case communication volume, in contrast to PBFT's $O(n^4)$; (ii) LinSBFT rotates the leader of the protocol randomly to reduce the risk of denial-of-service attacks on the leader; (iii) each run of LinSBFT finalizes one block, which is robust against participants that are honest in one run of the protocol and dishonest in another, and the set of participants is dynamic and updated periodically; and (iv) LinSBFT helps delayed nodes catch up via a synchronization mechanism to guarantee liveness. Further, in the ordinary case, LinSBFT involves only a single round of voting instead of two in PBFT, which reduces both communication overhead and confirmation time, and employs the proof-of-stake scheme to reward all participants. Extensive experiments using data obtained from Ethereum demonstrate that LinSBFT consistently and significantly outperforms existing in-production BFT protocols for blockchains.

preprint2020arXiv

Macroscopic Traffic Flow Modeling with Physics Regularized Gaussian Process: A New Insight into Machine Learning Applications

Despite the recent wide implementation of machine learning (ML) techniques in traffic flow modeling, those data-driven approaches often fall short of accuracy in cases with a small or noisy dataset. To address this issue, this study presents a new modeling framework, named physics regularized machine learning (PRML), to encode classical traffic flow models (referred to as physical models) into the ML architecture and to regularize the ML training process. More specifically, a stochastic physics regularized Gaussian process (PRGP) model is developed and a Bayesian inference algorithm is used to estimate the mean and kernel of the PRGP. A physical regularizer based on macroscopic traffic flow models is also developed to augment the estimation via a shadow GP, and an enhanced latent force model is used to encode physical knowledge into stochastic processes. Based on the posterior regularization inference framework, an efficient stochastic optimization algorithm is also developed to maximize the evidence lower bound of the system likelihood. To demonstrate the effectiveness of the proposed model, this paper conducts empirical studies on a real-world dataset collected from a stretch of the I-15 freeway, Utah. Results show the new PRGP model can outperform the previous compatible methods, such as calibrated pure physical models and pure machine learning methods, in estimation precision and input robustness.

preprint2020arXiv

Mitochondria in higher plants possess H2 evolving activity which is closely related to complex I

Hydrogenases occupy a central place in the energy metabolism of anaerobic bacteria. Although the structure of mitochondrial complex I is similar to that of hydrogenase, whether it has hydrogen metabolic activity remains unclear. Here, we show that an H2 evolving activity exists in higher plant mitochondria and is closely related to complex I, especially around the ubiquinone binding site. The H2 production could be inhibited by rotenone and ubiquinone. Hypoxia could simultaneously promote H2 evolution and succinate accumulation. Redox properties of the quinone pool, adjusted by NADH or succinate according to oxygen concentration, act as a valve to control the flow of protons and electrons and the production of H2. The coupling of the H2 evolving activity of mitochondrial complex I with metabolic regulation reveals a more effective redox homeostasis regulation mechanism. Considering the ubiquity of mitochondria in eukaryotes, H2 metabolism might be an innate function of higher organisms. This may serve to explain, at least in part, the broad physiological effects of H2.

preprint2020arXiv

Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery

For subspace recovery, most existing low-rank representation (LRR) models operate in the original space in single-layer mode. As such, the deep hierarchical information cannot be learned, which may result in inaccurate recoveries for complex real data. In this paper, we explore the deep multi-subspace recovery problem by designing a multilayer architecture for latent LRR. Technically, we propose a new Multilayer Collaborative Low-Rank Representation Network model termed DeepLRR to discover deep features and deep subspaces. In each layer (>2), DeepLRR bilinearly reconstructs the data matrix by the collaborative representation with low-rank coefficients and projection matrices in the previous layer. The bilinear low-rank reconstruction of the previous layer is directly fed into the next layer as the input and low-rank dictionary for representation learning, and is further decomposed into a deep principal feature part, a deep salient feature part and a deep sparse error. As such, the coherence issue can also be resolved due to the low-rank dictionary, and the robustness against noise can also be enhanced in the feature subspace. To recover the sparse errors in layers accurately, a dynamic growing strategy is used, as the noise level becomes smaller as the number of layers increases. Besides, a neighborhood reconstruction error is also included to encode the locality of deep salient features by deep coefficients adaptively in each layer. Extensive results on public databases show that our DeepLRR outperforms other related models for subspace discovery and clustering.

preprint2020arXiv

Observation of E8 Particles in an Ising Chain Antiferromagnet

Near the transverse-field induced quantum critical point of the Ising chain, an exotic dynamic spectrum consisting of exactly eight particles was predicted, which is uniquely described by an emergent quantum integrable field theory with the symmetry of the $E_8$ Lie algebra, but rarely explored experimentally. Here we use high-resolution terahertz spectroscopy to resolve quantum spin dynamics of the quasi-one-dimensional Ising antiferromagnet BaCo$_2$V$_2$O$_8$ in an applied transverse field. By comparing to an analytical calculation of the dynamical spin correlations, we identify $E_8$ particles as well as their two-particle excitations.

preprint2020arXiv

The Limit of the Batch Size

Large-batch training is an efficient approach for current distributed deep learning systems. It has enabled researchers to reduce the ImageNet/ResNet-50 training from 29 hours to around 1 minute. In this paper, we focus on studying the limit of the batch size. We think it may provide guidance to AI supercomputer and algorithm designers. We provide detailed numerical optimization instructions for step-by-step comparison. Moreover, it is important to understand the generalization and optimization performance of huge batch training. Hoffer et al. introduced "ultra-slow diffusion" theory to large-batch training. However, our experiments show results that contradict the conclusion of Hoffer et al. We provide comprehensive experimental results and detailed analysis to study the limitations of batch size scaling and "ultra-slow diffusion" theory. For the first time we scale the batch size on ImageNet to at least an order of magnitude larger than all previous work, and provide detailed studies on the performance of many state-of-the-art optimization schemes under this setting. We propose an optimization recipe that is able to improve the top-1 test accuracy by 18% compared to the baseline.

preprint2020arXiv

Ultimate precision of multi-parameter quantum magnetometry under the parallel scheme

The precise measurement of a magnetic field is one of the most fundamental and important tasks in quantum metrology. Although extensive studies on quantum magnetometry have been carried out over past decades, the ultimate precision that can be achieved for the estimation of all three components of a magnetic field with entangled probe states under the parallel scheme remains unknown. Here we present the ultimate lower bound for the sum of arbitrarily weighted variances in the estimation of all three components of a magnetic field under the parallel scheme and show that this lower bound can be achieved for sufficiently large N. The optimal entangled probe state that achieves the ultimate precision is also explicitly constructed. The obtained precision sets the ultimate limit for the multi-parameter quantum magnetometry under the parallel scheme, which is of fundamental interest and importance in quantum metrology. Our approach also provides a way to characterize the tradeoff among the precisions of multiple parameters that arise from the constraints on the probe states.

preprint2020arXiv

Unsupervised Vehicle Re-identification with Progressive Adaptation

Vehicle re-identification (reID) aims at identifying vehicles across different non-overlapping camera views. Existing methods rely heavily on well-labeled datasets for ideal performance, which inevitably causes a severe performance drop due to the domain bias between the training domain and real-world scenes; worse still, these approaches require full annotations, which is labor-intensive. To tackle these challenges, we propose a novel progressive adaptation learning method for vehicle reID, named PAL, which learns from abundant data without annotations. For PAL, a data adaptation module is employed for the source domain, which generates images with a data distribution similar to the unlabeled target domain as "pseudo target samples". These pseudo samples are combined with the unlabeled samples that are selected by a dynamic sampling strategy to make training faster. We further propose a weighted label smoothing (WLS) loss, which considers the similarity between samples with different clusters to balance the confidence of pseudo labels. Comprehensive experimental results validate the advantages of PAL on both the VehicleID and VeRi-776 datasets.

preprint2019arXiv

Deep Self-representative Concept Factorization Network for Representation Learning

In this paper, we investigate the unsupervised deep representation learning issue and technically propose a novel framework called Deep Self-representative Concept Factorization Network (DSCF-Net), for clustering deep features. To improve the representation and clustering abilities, DSCF-Net explicitly considers discovering hidden deep semantic features, enhancing the robustness properties of the deep factorization to noise and preserving the local manifold structures of deep features. Specifically, DSCF-Net seamlessly integrates the robust deep concept factorization, deep self-expressive representation and adaptive locality preserving feature learning into a unified framework. To discover hidden deep representations, DSCF-Net designs a hierarchical factorization architecture using multiple layers of linear transformations, where the hierarchical representation is performed by formulating the problem as optimizing the basis concepts in each layer to improve the representation indirectly. DSCF-Net also improves the robustness by first performing subspace recovery for sparse error correction and then performing the deep factorization in the recovered visual subspace. To obtain locality-preserving representations, we also present an adaptive deep self-representative weighting strategy by using the coefficient matrix as the adaptive reconstruction weights to keep the locality of representations. Extensive comparison results with several other related models show that DSCF-Net delivers state-of-the-art performance on several public databases.

preprint2019arXiv

Dynamic 3-D measurement based on fringe-to-fringe transformation using deep learning

Fringe projection profilometry (FPP) has become increasingly important in dynamic 3-D shape measurement. In FPP, it is necessary to retrieve the phase of the measured object before shape profiling. However, traditional phase retrieval techniques often require a large number of fringes, which may generate motion-induced error for dynamic objects. In this paper, a novel phase retrieval technique based on deep learning is proposed, which uses an end-to-end deep convolutional neural network to transform one or two fringes into the fringes required for phase retrieval. When the object's surface is located within a restricted depth, the presented network requires only a single fringe as the input; for an unrestricted depth, it requires two fringes. The proposed phase retrieval technique is first theoretically analyzed, and then numerically and experimentally verified for its applicability to dynamic 3-D measurement.

preprint2019arXiv

Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite

As China's first X-ray astronomical satellite, the Hard X-ray Modulation Telescope (HXMT), which was dubbed Insight-HXMT after its launch on June 15, 2017, is a wide-band (1-250 keV) slat-collimator-based X-ray astronomy satellite with the capability of all-sky monitoring in 0.2-3 MeV. It was designed to perform pointing, scanning and gamma-ray burst (GRB) observations and, based on the Direct Demodulation Method (DDM), the image of the scanned sky region can be reconstructed. Here we give an overview of the mission and its progress, including payload, core sciences, ground calibration/facility, ground segment, data archive, software, in-orbit performance, calibration, background model, observations and some preliminary results.

preprint2019arXiv

Stationary State Degeneracy of Open Quantum Systems with Non-Abelian Symmetries

We study the null space degeneracy of open quantum systems with multiple non-Abelian, strong symmetries. By decomposing the Hilbert space representation of these symmetries into an irreducible representation involving the direct sum of multiple, commuting, invariant subspaces we derive a tight lower bound for the stationary state degeneracy. We apply these results within the context of open quantum many-body systems, presenting three illustrative examples: a fully-connected quantum network, the XXX Heisenberg model and the Hubbard model. We find that the derived bound, which scales at least cubically in the system size in the $SU(2)$ symmetric cases, is often saturated. Moreover, our work provides a theory for the systematic block-decomposition of a Liouvillian with non-Abelian symmetries, reducing the computational difficulty involved in diagonalising these objects and exposing a natural, physical structure to the steady states - which we observe in our examples.

preprint2018arXiv

Holographic rainbow networks for colorful Motzkin and Fredkin spin chains

We present bulk tensor networks that exactly represent the ground states of a continuous family of one-dimensional frustration-free Hamiltonians. These states, which are known as area-deformed Motzkin and Fredkin states, exhibit a novel quantum phase transition. By tuning a single parameter, they go from a phase obeying an area law to a highly entangled rainbow phase, where the half-chain entropy scales with the volume. Using the representation of these ground states as superpositions of random walks, we introduce tensor networks for these ground states where local and global rules of the walker are baked into bulk tensors, thereby providing an efficient description of the ground states (some of which satisfy a volume law scaling of entanglement entropy).

preprint2017arXiv

Dual meson condensates in the Polyakov-loop extended linear sigma model

Dual meson condensates as possible order parameters for deconfinement are investigated in a Polyakov-loop enhanced linear sigma model of QCD at both zero and finite isospin chemical potential $μ_I$. We find that the rapid rise of the dual sigma condensate (corresponding to the dressed Polyakov-loop) with $T$ is driven by the chiral transition, no matter whether the Polyakov-loop dynamics is included or not. For $μ_I>m_π/2$, the dual sigma condensate shows abnormal thermal behavior and even decreases with $T$ below the melting temperature $T_c^{I_3}$ of pion superfluidity. On the other hand, even though the dual pion condensate always increases with $T$, its maximum slope is located exactly at $T_c^{I_3}$ rather than at the deconfinement temperature $T_c^{P}$ determined by the Polyakov-loop. All these are qualitatively consistent with the previous results obtained in the Nambu-Jona-Lasinio type models. The dual vector meson condensate for $μ_I>m_π/2$ is also calculated. This quantity is more sensitive to the chiral transition when taking into account the Dirac-sea contribution. Our study further suggests that one should be cautious in using dual observables to indicate the deconfinement transition, especially in QCD models.

preprint2016arXiv

A simple setup to measure muon lifetime and electron energy spectrum of muon decay and its Monte Carlo simulation

We designed a simple setup to measure the muon lifetime and the electron energy spectra of muon decay. A low-cost coincidence circuit was designed to select the signals of muon decay events detected by a plastic scintillator detector. It triggered a digital oscilloscope to record the signals of muon decay events for measuring the muon lifetime and electron energy spectrum. A Landau-distribution energy loss method was introduced to conduct the energy calibration of the system. The experimental results were well reproduced by the Monte Carlo simulation. The software and hardware of the system are completely open to students, and are thus more helpful for instruction and motivation.

preprint2016arXiv

Combined chiral and diquark fluctuations along QCD critical line and enhanced baryon production with parity doubling

We argue that there should exist large combined fluctuations of chiral and diquark condensates along the phase boundary of QCD at moderately high density and relatively low temperature. Such fluctuations might lead to anomalous production of nucleons and their parity partners, which we propose to detect at NICA.

preprint2016arXiv

Integrating Abstractions to Enhance the Execution of Distributed Applications

One of the factors that limits the scale, performance, and sophistication of distributed applications is the difficulty of concurrently executing them on multiple distributed computing resources. In part, this is due to a poor understanding of the general properties and performance of the coupling between applications and dynamic resources. This paper addresses this issue by integrating abstractions representing distributed applications, resources, and execution processes into a pilot-based middleware. The middleware provides a platform that can specify distributed applications, execute them on multiple resources and for different configurations, and is instrumented to support investigative analysis. We analyzed the execution of distributed applications using experiments that measure the benefits of using multiple resources, the late-binding of scheduling decisions, and the use of backfill scheduling.

preprint2016arXiv

Performance Guaranteed Approximation Algorithm for Minimum $k$-Connected $m$-Fold Dominating Set

To achieve efficient routing in a wireless sensor network, a connected dominating set (CDS) is used as a virtual backbone. A fault-tolerant virtual backbone can be modeled as a $(k,m)$-CDS. For a connected graph $G=(V,E)$ and two fixed integers $k$ and $m$, a node set $C\subseteq V$ is a $(k,m)$-CDS of $G$ if every node in $V\setminus C$ has at least $m$ neighbors in $C$, and the subgraph of $G$ induced by $C$ is $k$-connected. Previous to this work, approximation algorithms with guaranteed performance ratio in a general graph were known only for $k\leq 3$. This paper makes significant progress by presenting a $(2k-1)α_0$ approximation algorithm for general $k$ and $m$ with $m\geq k$, where $α_0$ is the performance ratio for the minimum CDS problem. Using the currently best known ratio for $α_0$, our algorithm has performance ratio $O(\lnΔ)$, where $Δ$ is the maximum degree of the graph.

preprint2016arXiv

Quantum phase transition from bounded to extensive entanglement entropy in a frustration-free spin chain

We introduce a continuous family of frustration-free Hamiltonians with exactly solvable ground states. We prove that the ground state of our model is non-degenerate and exhibits a novel quantum phase transition from bounded entanglement entropy to a massively entangled state with volume entropy scaling. The ground state may be interpreted as a deformation away from the uniform superposition of colored Motzkin paths, shown by Movassagh and Shor to have a large (square-root) but sub-extensive scaling of entanglement, into a state with an extensive entropy.

preprint2016arXiv

Scientific Computing Meets Big Data Technology: An Astronomy Use Case

Scientific analyses commonly compose multiple single-process programs into a dataflow. An end-to-end dataflow of single-process programs is known as a many-task application. Typically, tools from the HPC software stack are used to parallelize these analyses. In this work, we investigate an alternate approach that uses Apache Spark -- a modern big data platform -- to parallelize many-task applications. We present Kira, a flexible and distributed astronomy image processing toolkit using Apache Spark. We then use the Kira toolkit to implement a Source Extractor application for astronomy images, called Kira SE. With Kira SE as the use case, we study the programming flexibility, dataflow richness, scheduling capacity and performance of Apache Spark running on the EC2 cloud. By exploiting data locality, Kira SE achieves a 2.5x speedup over an equivalent C program when analyzing a 1TB dataset using 512 cores on the Amazon EC2 cloud. Furthermore, we show that by leveraging software originally designed for big data infrastructure, Kira SE achieves competitive performance to the C implementation running on the NERSC Edison supercomputer. Our experience with Kira indicates that emerging Big Data platforms such as Apache Spark are a performant alternative for many-task scientific applications.

preprint2015arXiv

Coupled wire model of symmetric Majorana surfaces of topological superconductors

Time reversal symmetric topological superconductors in three spatial dimensions carry gapless surface Majorana fermions. They are robust against any time reversal symmetric single-body perturbation weaker than the bulk energy gap. We mimic the massless surface Majorana fermions by coupled wire models in two spatial dimensions. We introduce explicit many-body interwire interactions that preserve time reversal symmetry and give energy gaps to all low energy degrees of freedom. We show the gapped models generically carry non-trivial topological order and support anyonic excitations.

preprint2015arXiv

Dual condensates at finite isospin chemical potential

The dual observables as order parameters for center symmetry are tested at finite isospin chemical potential $μ_I$ in a Polyakov-loop enhanced chiral model of QCD with physical quark masses. As a counterpart of the dressed Polyakov-loop, the first Fourier moment of pion condensate is introduced for $μ_I>{m_π}/{2}$ under the temporal twisted boundary conditions for quarks. We demonstrate that this dual condensate exhibits a temperature dependence similar to that of the conventional Polyakov-loop. We confirm that its rapid increase with $T$ is driven by the evaporation of pion condensation. On the other hand, the dressed Polyakov-loop shows abnormal thermal behavior and even decreases with $T$ at low temperatures due to the influence of pion condensate. We thus argue that in QCD the critical temperature extracted from a dual observable may have nothing to do with the quark confinement-deconfinement transition if the quark mass is very small.

preprint2014arXiv

The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox

To support complex data-intensive applications such as personalized recommendations, targeted advertising, and intelligent services, the data management community has focused heavily on the design of systems to support training complex models on large datasets. Unfortunately, the design of these systems largely ignores a critical component of the overall analytics process: the deployment and serving of models at scale. In this work, we present Velox, a new component of the Berkeley Data Analytics Stack. Velox is a data management system for facilitating the next steps in real-world, large-scale analytics pipelines: online model management, maintenance, and serving. Velox provides end-user applications and services with a low-latency, intuitive interface to models, transforming the raw statistical models currently trained using existing offline large-scale compute frameworks into full-blown, end-to-end data products capable of recommending products, targeting advertisements, and personalizing web content. To provide up-to-date results for these complex models, Velox also facilitates lightweight online model maintenance and selection (i.e., dynamic weighting). In this paper, we describe the challenges and architectural considerations required to achieve this functionality, including the abilities to span online and offline systems, to adaptively adjust model materialization strategies, and to exploit inherent statistical properties such as model error tolerance, all while operating at "Big Data" scale.

preprint2013arXiv

Conditioning the gamma spectrometer for activity measurement at very high background

The application of a high purity germanium (HPGe) gamma spectrometer in determining the fuel element burnup in a future reactor is studied. The HPGe detector is exposed to a Co60 source with irradiation rates varying from 10 kcps to 150 kcps to simulate the input counting rate in a real reactor environment. A Cs137 and a Eu152 source are positioned at given distances to generate a certain event rate in the detector, with the former being proposed as a labeling nuclide to measure the burnup of the fuel element. It is shown that both the energy resolution, which slightly increases with the irradiation rate, and the passthrough rate at high irradiation level match the requirements of the real application. The influence of the background is studied for different parameter sets used in the specially developed background subtraction procedure. It is demonstrated that with the typical input irradiation rate and Cs137 intensity relevant to the deep burnup situation, the precision of the Cs137 counting rate in the current experiment is consistently below 2.8%, indicating a promising feasibility of utilizing an HPGe detector in the burnup measurement in a future bed-like reactor.

preprint2013arXiv

Fate of separate chiral transitions at finite $μ_I$ under the influence of mismatched vector interactions

The flavor-mixing induced by the mismatched vector-isoscalar and vector-isovector interactions at finite baryon chemical potential $μ$ and isospin chemical potential $μ_I$ is demonstrated in the Nambu-Jona-Lasinio (NJL) type model of QCD. The influence of this non-anomaly flavor-mixing on the possible separate chiral transitions at nonzero $μ_I$ is studied under the assumption of the effective restoration of the $U(1)_A$ symmetry. We find that for the weak isospin asymmetry, the two separate phase boundaries found previously can be converted into one only if the vector-isovector coupling $g_v^v$ is significantly stronger than the vector-isoscalar one $g_v^s$ without the axial anomaly. When the weak Kobayashi-Maskawa-'t Hooft (KMT) interaction is included, we find that the separation of the chiral transition with two critical endpoints for the relatively strong isospin asymmetry can still be removed owing to the vector interactions. In this case, it is not the vector coupling difference but the strength of $g_v^v$ which is crucial for the single phase boundary. We also point out that, in the NJL-type model with mismatched vector interactions, the recently proposed equivalence for chiral transitions at finite $μ$ and $μ_I$ does not hold even in the mean field approximation.

preprint2012arXiv

Correction to the Chiral Magnetic Effect from axial-vector interaction

The recent lattice calculation at finite axial chemical potential suggests that the induced current density of the chiral magnetic effect (CME) is somehow suppressed compared with the standard analytical formula. We show in a NJL-type model of QCD that such a suppression is a natural result when considering the influence of the attractive axial-vector interaction. We point out that the lattice result doesn't need to be quantitatively consistent with the analytical formula due to the chirality density-density correlation. We also investigate the nonperturbative effect of instanton molecules on the CME. Since an unconventional repulsive axial-vector interaction is induced, the CME will be enhanced significantly by the instanton-anti-instanton pairings. Such a prediction needs to be tested by more improved lattice simulations. We further demonstrate that the axial-vector interaction plays an important role in the $T-μ_A$ phase diagram.

preprint2012arXiv

Many-Task Computing and Blue Waters

This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.

preprint2011arXiv

A Molecular Mechanics Study of Morphologic Interaction between Graphene and Si Nanowires on a SiO2 Substrate

In this paper, we study the morphologic interaction between graphene and Si nanowires on a SiO2 substrate, using molecular mechanics simulations. Two cases are considered: 1) a graphene nanoribbon intercalated by a single Si nanowire on a SiO2 substrate and 2) a blanket graphene flake intercalated by an array of Si nanowires evenly patterned in parallel on a SiO2 substrate. Various graphene morphologies emerge from the simulation results of these two cases, which are shown to depend on both geometric parameters (e.g., graphene nanoribbon width, nanowire diameter, and nanowire spacing) and material properties (e.g., graphene-nanowire and graphene-substrate bonding strength). While the quantitative results at the atomistic resolution in this study can be further used to determine the change of electronic properties of graphene under morphologic regulation, the qualitative understanding from this study can be extended to help explore graphene morphology in other material systems.

preprint2011arXiv

Axial Anomaly, Mismatched Fermi Surfaces and Vector Interaction in Dense Neutral Quark Matter

We report on effects of the q-$\bar{\rm q}$ vector interaction and/or the U(1)$_A$-anomaly-induced chiral-diquark coupling on the charge-neutral quark matter in $β$-equilibrium. We show that when the vector coupling is absent, there can appear a cross-over region sandwiched by two critical points in the intermediate temperature ($T$) region, while the phase transition in the low-$T$ region including zero temperature keeps being first order until the strength of the anomaly term is increased to have a critical value. On the other hand, when the vector coupling is also present, there appears a crossover region in the low-$T$ area including zero temperature with a new critical point, as was first demonstrated by Kitazawa et al and the present authors without and with the charge-neutral condition, respectively. We remark that the possible chromomagnetic instability is suppressed and can be even completely absent owing to the enhanced diquark coupling due to the anomaly term and/or by the vector interaction.

preprint2011arXiv

Carbon Nanotube Initiated Formation of Carbon Nanoscrolls

The unique topology and exceptional properties of carbon nanoscrolls (CNSs) have inspired unconventional nano-device concepts, yet the fabrication of CNSs remains rather challenging. Using molecular dynamics simulations, we demonstrate the spontaneous formation of a CNS from graphene on a substrate, initiated by a carbon nanotube (CNT). The rolling of graphene into a CNS is modulated by the CNT size, the carbon-carbon interlayer adhesion, and the graphene-substrate interaction. A phase diagram emerging from the simulations can offer a quantitative guideline toward a feasible and robust physical approach to fabricating CNSs.

preprint2011arXiv

Determining Graphene Adhesion via Substrate-regulated Morphology of Graphene

Understanding the adhesion between graphene and other materials is crucial for achieving more reliable graphene-based applications in electronic devices and nanocomposites. The ultra-thin profile of graphene, however, poses a significant challenge to direct measurement of its adhesion property using conventional approaches. We show that there is a strong correlation between the morphology of graphene on a compliant substrate with patterned surface and the graphene-substrate adhesion. We establish an analytic model to quantitatively determine this correlation. Results show that, depending on the graphene-substrate adhesion, number of graphene layers and substrate stiffness, graphene exhibits two distinct types of morphology: I) graphene remains bonded to the substrate and corrugates to an amplitude up to that of the substrate surface patterns; II) graphene debonds from the substrate and remains flat on top of the substrate surface patterns. The sharp transition between these two types of graphene morphology occurs at a critical adhesion between the graphene and the compliant substrate material. These results potentially open up a feasible pathway to measuring the adhesion property of graphene.

preprint2011arXiv

Dynamic Coherent Acceptability Indices and their Applications to Finance

In this paper we present a theoretical framework for studying coherent acceptability indices in a dynamic setup. We study dynamic coherent acceptability indices and dynamic coherent risk measures, and we establish a duality between them. We derive a representation theorem for dynamic coherent risk measures in terms of a so-called dynamically consistent sequence of sets of probability measures. Based on these results, we give a specific construction of dynamic coherent acceptability indices. We also provide examples of dynamic coherent acceptability indices, both abstract ones and some that generalize selected classical financial measures of portfolio performance.

preprint2011arXiv

Graphene morphology regulated by nanowires patterned in parallel on a substrate surface

The graphene morphology regulated by nanowires patterned in parallel on a substrate surface is quantitatively determined using energy minimization. The regulated graphene morphology is shown to be governed by the nanowire diameter, the nanowire spacing and the interfacial bonding energies between the graphene and the underlying nanowires and substrate. We demonstrate two representative regulated graphene morphologies and determine critical values of the nanowire spacing, nanowire diameter and interfacial bonding energies at which graphene switches between the two representative morphologies. Interestingly, we identify a rule-of-thumb formula that correlates the critical nanowire spacing, the critical interfacial bonding energies and the nanowire diameter in quite good agreement with the full-scale simulation results. Results from the present study offer guidelines in nano-structural design to achieve desired graphene morphology via regulation with a resolution approaching the atomic feature size of graphene.

preprint2011arXiv

Multiple critical point structure for chiral phase transition induced by charge neutrality and vector interaction

The combined effect of the repulsive vector interaction and the positive electric chemical potential on the chiral phase transition is investigated by considering neutral color superconductivity. Under the charge-neutrality constraint, the chiral condensate, diquark condensate and quark number densities are obtained in the two-plus-one-flavor Nambu-Jona-Lasinio model with the so-called Kobayashi-Maskawa-'t Hooft term. We demonstrate that multiple chiral critical-point structures always exist in the Nambu-Jona-Lasinio model within the self-consistent mean-field approximation, and that the number of chiral critical points can vary from zero to four, depending on the magnitudes of the vector interaction and the diquark coupling.

preprint2011arXiv

Roles of axial anomaly on neutral quark matter with color superconducting phase

We investigate effects of the axial anomaly term with a chiral-diquark coupling on the phase diagram within a two-plus-one-flavor Nambu-Jona-Lasinio (NJL) model under the charge-neutrality and $β$-equilibrium constraints. We find that when such constraints are imposed, the new anomaly term plays a role on the phase diagram quite similar to that of the vector interaction, which the present authors clarified in a previous work. Thus, there appear several types of phase structures with multiple critical points at low temperature $T$, although the phase diagrams with intermediate-$T$ critical point(s) are never realized without these constraints even within the same model Lagrangian. This drastic change is attributed to an enhanced interplay between the chiral and diquark condensates due to the anomaly term at finite temperature; the u-d diquark coupling is strengthened by the relatively large chiral condensate of the strange quark through the anomaly term, which in turn definitely leads to the abnormal behavior of the diquark condensate at finite $T$, inherent to the asymmetric quark matter. We note that the critical point from which the crossover region extends to zero temperature appears only when the strength of the vector interaction is larger than a critical value. We also show that the chromomagnetic instability of the neutral asymmetric homogeneous two-flavor color superconducting (2CSC) phase is suppressed and can be even completely cured by the enhanced diquark coupling due to the anomaly term and/or by the vector interaction.

preprint2011arXiv

Ultrafast nano-oscillators based on interlayer-bridged carbon nanoscrolls

We demonstrate a viable approach to fabricating ultrafast axial nano-oscillators based on carbon nanoscrolls (CNSs) using molecular dynamics simulations. Initiated by a single-walled carbon nanotube (CNT), a monolayer graphene can continuously scroll into a CNS with the CNT housed inside. The CNT inside the CNS can oscillate along the axial direction at a natural frequency of tens of gigahertz (GHz). We demonstrate an effective strategy to reduce the dissipation of the CNS-based nano-oscillator by covalently bridging the carbon layers in the CNS. We further demonstrate that such a CNS-based nano-oscillator can be excited and driven by an external AC electric field, and oscillate at more than 100 GHz. The CNS-based nano-oscillators not only offer a feasible pathway toward ultrafast nano-devices, but also hold promise to enable nano-scale energy transduction, harnessing and storage (e.g., from electric to mechanical).

preprint2010arXiv

Substrate-regulated morphology of graphene

We delineate a general theoretical framework to determine the substrate-regulated graphene morphology through energy minimization. We then apply such a framework to study the graphene morphology on a substrate with periodic surface grooves. Depending on the substrate surface roughness and the graphene-substrate interfacial bonding energy, the equilibrium morphology of graphene ranges from 1) closely conforming to the substrate, to 2) remaining flat on the substrate. Interestingly, in certain cases, the graphene morphology snaps between the above two limiting states. Our quantitative results envision a promising strategy to precisely control the graphene morphology over large areas. The rich features of the substrate-regulated graphene morphology (e.g., the snap-through instability) can potentially lead to new design concepts of functional graphene device components.

preprint2009arXiv

Snap-through instability of graphene on substrates

We determine the graphene morphology regulated by substrates with herringbone and checkerboard surface corrugations. As the graphene/substrate interfacial bonding energy and the substrate surface roughness vary, the graphene morphology snaps between two distinct states: 1) closely conforming to the substrate and 2) remaining nearly flat on the substrate. Such a snap-through instability of graphene can potentially lead to desirable electronic properties to enable graphene-based devices.

preprint2008arXiv

Another chiral critical end-point induced by neutral color superconductivity

This paper has been withdrawn by the authors, due to its incompleteness.

preprint2008arXiv

Design and Evaluation of a Collective IO Model for Loosely Coupled Petascale Programming

Loosely coupled programming is a powerful paradigm for rapidly creating higher-level applications from scientific programs on petascale systems, typically using scripting languages. This paradigm is a form of many-task computing (MTC) which focuses on the passing of data between programs as ordinary files rather than messages. While it has the significant benefits of decoupling producer and consumer and allowing existing application programs to be executed in parallel with no recoding, its typical implementation using shared file systems places a high performance burden on the overall system and on the user who will analyze and consume the downstream data. Previous efforts have achieved great speedups with loosely coupled programs, but have done so with careful manual tuning of all shared file system access. In this work, we evaluate a prototype collective IO model for file-based MTC. The model enables efficient and easy distribution of input data files to computing nodes and gathering of output results from them. It eliminates the need for such manual tuning and makes the programming of large-scale clusters using a loosely coupled model easier. Our approach, inspired by in-memory approaches to collective operations for parallel programming, builds on fast local file systems to provide high-speed local file caches for parallel scripts, uses a broadcast approach to handle distribution of common input data, and uses efficient scatter/gather and caching techniques for input and output. We describe the design of the prototype model, its implementation on the Blue Gene/P supercomputer, and present preliminary measurements of its performance on synthetic benchmarks and on a large-scale molecular dynamics application.

preprint2008arXiv

Towards Loosely-Coupled Programming on Petascale Systems

We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new, and potentially far larger, class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.

80 published item(s)

LASAR: Latent Adaptive Semantic Aligned Reasoning for Generative Recommendation

Reasoning over Precedents Alongside Statutes: Case-Augmented Deliberative Alignment for LLM Safety

Route Before Retrieve: Activating Latent Routing Abilities of LLMs for RAG vs. Long-Context Selection

SynGR: Unleashing the Potential of Cross-Modal Synergy for Generative Recommendation

Absolute frequency measurement of a Lu$^+$ $(^{3}\rm D_1)$ optical frequency standard via link to international atomic time

Benchmarking LLMs for Fine-Grained Code Review with Enriched Context in Practice

Zeeman Degenerate Sideband Cooling in $^{176}$Lu$^+$

A Hierarchical Interactive Network for Joint Span-based Aspect-Sentiment Analysis

A Multi-Task Learning Model for Super Resolution of Wireless Channel Characteristics

A parallel algorithm for minimum weight set cover with small neighborhood property

A Search for Millilensing Gamma-Ray Bursts in the Observations of Fermi GBM

A Survey on Incomplete Multi-view Clustering

Approximation Algorithm for Minimum $p$ Union Under a Geometric Setting

Customized Conversational Recommender Systems

DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling

Glassy crystals with colossal multi-baroresponsivities

Hidden SU(2)_D vector dark matter with a scalar septuplet

Image Harmonization by Matching Regional References

Improved Parallel Algorithm for Minimum Cost Submodular Cover Problem

Interactive Style Transfer: All is Your Palette

Macroscopic Traffic Flow Modeling with Physics Regularized Gaussian Process: Generalized Formulations

On Deep Recurrent Reinforcement Learning for Active Visual Tracking of Space Noncooperative Objects

Performance Guaranteed Evolutionary Algorithm for Minimum Connected Dominating Set

PointScatter: Point Set Representation for Tubular Structure Extraction

Positive-Unlabeled Learning with Adversarial Data Augmentation for Knowledge Graph Completion

Pre-training Enhanced Spatial-temporal Graph Neural Network for Multivariate Time Series Forecasting

Scalable all-optical cold damping of levitated nanoparticles

Spatial-Temporal Identity: A Simple yet Effective Baseline for Multivariate Time Series Forecasting

Tunable Non-equilibrium Phase Transitions between Spatial and Temporal Order through Dissipation

A Comprehensive Consistency Check between Synchrotron radiation and the Observed Gamma-ray Burst Spectra

A Survey on Concept Factorization: From Shallow to Deep Representation Learning

Dense Residual Network: Enhancing Global Dense Feature Flow for Character Recognition

Phase transition gravitational waves from pseudo-Nambu-Goldstone dark matter and two Higgs doublets

Bilateral Attention Network for RGB-D Salient Object Detection

Compressed DenseNet for Lightweight Character Recognition

Convolutional Dictionary Pair Learning Network for Image Representation Learning

Convolutional Neural Network Training with Distributed K-FAC

Discovery of oscillations above 200 keV in a black hole X-ray binary with Insight-HXMT

Fully-Convolutional Intensive Feature Flow Neural Network for Text Recognition

LinSBFT: Linear-Communication One-Step BFT Protocol for Public Blockchains

Macroscopic Traffic Flow Modeling with Physics Regularized Gaussian Process: A New Insight into Machine Learning Applications

Mitochondria in higher plants possess H2 evolving activity which is closely related to complex I

Multilayer Collaborative Low-Rank Coding Network for Robust Deep Subspace Discovery

Observation of E8 Particles in an Ising Chain Antiferromagnet

The Limit of the Batch Size

Ultimate precision of multi-parameter quantum magnetometry under the parallel scheme

Unsupervised Vehicle Re-identification with Progressive Adaptation

Deep Self-representative Concept Factorization Network for Representation Learning

Dynamic 3-D measurement based on fringe-to-fringe transformation using deep learning

Overview to the Hard X-ray Modulation Telescope (Insight-HXMT) Satellite

Stationary State Degeneracy of Open Quantum Systems with Non-Abelian Symmetries

Holographic rainbow networks for colorful Motzkin and Fredkin spin chains

Dual meson condensates in the Polyakov-loop extended linear sigma model

A simple setup to measure muon lifetime and electron energy spectrum of muon decay and its Monte Carlo simulation

Combined chiral and diquark fluctuations along QCD critical line and enhanced baryon production with parity doubling

Integrating Abstractions to Enhance the Execution of Distributed Applications

Performance Guaranteed Approximation Algorithm for Minimum $k$-Connected $m$-Fold Dominating Set

Quantum phase transition from bounded to extensive entanglement entropy in a frustration-free spin chain

Scientific Computing Meets Big Data Technology: An Astronomy Use Case

Coupled wire model of symmetric Majorana surfaces of topological superconductors

Dual condensates at finite isospin chemical potential

The Missing Piece in Complex Analytics: Low Latency, Scalable Model Management and Serving with Velox

Conditioning the gamma spectrometer for activity measurement at very high background

Fate of separate chiral transitions at finite $μ_I$ under the influence of mismatched vector interactions

Correction to the Chiral Magnetic Effect from axial-vector interaction

Many-Task Computing and Blue Waters

A Molecular Mechanics Study of Morphologic Interaction between Graphene and Si Nanowires on a SiO2 Substrate

Axial Anomaly, Mismatched Fermi Surfaces and Vector Interaction in Dense Neutral Quark Matter

Carbon Nanotube Initiated Formation of Carbon Nanoscrolls

Determining Graphene Adhesion via Substrate-regulated Morphology of Graphene

Dynamic Coherent Acceptability Indices and their Applications to Finance

Graphene morphology regulated by nanowires patterned in parallel on a substrate surface

Multiple critical point structure for chiral phase transition induced by charge neutrality and vector interaction

Roles of axial anomaly on neutral quark matter with color superconducting phase