Source author record

Hong Chang

Hong Chang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Computer Vision, math.CO, quant-ph, physics.atom-ph, cond-mat.quant-gas, eess.AS, Machine Learning, math.GT, Neural and Evolutionary Computing, physics.optics, Sound

Catalog footprint

What is connected

15 works

11 topics

4 close collaborators

Preprint, 2026, arXiv

MATS: An Audio Language Model under Text-only Supervision

Large audio-language models (LALMs), built upon powerful large language models (LLMs), have exhibited remarkable audio comprehension and reasoning capabilities. However, training LALMs demands a large corpus of audio-language pairs, which incurs substantial costs in both data collection and training resources. In this paper, we propose MATS, an audio-language multimodal LLM designed to handle Multiple Audio tasks using solely Text-only Supervision. By leveraging pre-trained audio-language alignment models such as CLAP, we develop a text-only training strategy that projects the shared audio-language latent space into the LLM latent space, endowing the LLM with audio comprehension capabilities without relying on audio data during training. To further bridge the modality gap between audio and language embeddings within CLAP, we propose the Strongly-related noisy text with audio (Santa) mechanism. Santa maps audio embeddings into the CLAP language embedding space while preserving essential information from the audio input. Extensive experiments demonstrate that MATS, despite being trained exclusively on text data, achieves competitive performance compared to recent LALMs trained on large-scale audio-language pairs. The code is publicly available at https://github.com/wangwen-banban/MATS.

Preprint, 2022, arXiv

Clothes-Changing Person Re-identification with RGB Modality Only

The key to addressing clothes-changing person re-identification (re-id) is to extract clothes-irrelevant features, e.g., face, hairstyle, body shape, and gait. Most current works focus on modeling body shape from multi-modality information (e.g., silhouettes and sketches), but do not make full use of the clothes-irrelevant information in the original RGB images. In this paper, we propose a Clothes-based Adversarial Loss (CAL) to mine clothes-irrelevant features from the original RGB images by penalizing the predictive power of the re-id model w.r.t. clothes. Extensive experiments demonstrate that, using RGB images only, CAL outperforms all state-of-the-art methods on widely used clothes-changing person re-id benchmarks. Besides, compared with images, videos contain richer appearance and additional temporal information, which can be used to model proper spatiotemporal patterns to assist clothes-changing re-id. Since there is no publicly available clothes-changing video re-id dataset, we contribute a new dataset named CCVID and show that there is much room for improvement in modeling spatiotemporal information. The code and new dataset are available at: https://github.com/guxinqian/Simple-CCReID.

Preprint, 2022, arXiv

Efficient geodesics in the curve complex and their dot graphs

For the complex of curves of a closed orientable surface of genus $g$, $\mathcal{C}(S_{g>1})$, the notion of an efficient geodesic was introduced in arXiv:1408.4133. There it was established that there always exist (finitely many) efficient geodesics between any two vertices, $ v_α , v_β \in \mathcal{C}(S_g)$, representing homotopy classes of simple closed curves, $α, β\subset S_g$. The main tool used in establishing the existence of efficient geodesics was a dot graph, a booking scheme for recording the intersection pattern of a reference arc, $γ\subset S_g$, with the simple closed curves associated with the vertices of a geodesic path in the zero skeleton, $\mathcal{C}^0(S_g)$. In particular, for an efficient geodesic between $v_α$ and $v_β$ of length $d \geq 3$, it was shown that any curve corresponding to the vertex that is distance one from $v_α$ intersects any $γ$ at most $d -2$ times. In this note we make a more expansive study of the characterizing "shape" of the dot graphs over the entire set of vertices in an efficient geodesic edge-path. The key takeaway of this study is that the shape of a dot graph for any efficient geodesic is contained within a spindle-shaped region. Since the Nielsen-Thurston coordinates of any curve on $S_g$ are directly derived from its intersection number with finitely many reference arcs, spindle-shaped dot graphs control the coordinate behavior of curves associated with the vertices of an efficient geodesic.

Preprint, 2022, arXiv

Floquet engineering Hz-Level Rabi Spectra in Shallow Optical Lattice Clock

Quantum metrology with ultra-high precision usually requires atoms prepared in an ultra-stable environment with well-defined quantum states. Thus, in optical lattice clock systems, deep lattice potentials are used to trap ultra-cold atoms. However, decoherence induced by Raman scattering and higher-order light shifts can be significantly reduced if atomic clocks are realized in shallow optical lattices. On the other hand, in such lattices, tunneling among different sites can cause additional dephasing and strong broadening of the Rabi spectrum. Here, in our experiment, we periodically drive a shallow $^{87}$Sr optical lattice clock. Counterintuitively, shaking the system can deform the broad spectral line into a sharp peak with a 5.4 Hz linewidth. With careful comparison between theory and experiment, we demonstrate that the Rabi frequency and the Bloch bands can be tuned simultaneously and independently. Our work not only provides a different idea for quantum metrology, such as building a shallow optical lattice clock in outer space, but also paves the way for quantum simulation of new phases of matter by engineering exotic spin-orbit couplings.

Preprint, 2022, arXiv

Theoretical Calculation of the Quadratic Zeeman Shift Coefficient of the 3P0 clock state for Strontium Optical Lattice Clock

The quadratic Zeeman shift coefficient of the 3P0 clock state of strontium is determined in theory and experiment. In theory, we derive the expression for the quadratic Zeeman shift of the 3P0 clock state for 88Sr and 87Sr in the weak-magnetic-field approximation. The quadratic Zeeman shift coefficients are calculated using multi-configuration Dirac-Hartree-Fock theory. To verify the calculated results, the quadratic Zeeman shift coefficient of the 3P0, F=9/2, MF=±9/2 clock state was measured in our 87Sr optical lattice clock. The calculated results C2 = -23.38(5) MHz/T^2 for 88Sr and the 3P0, F=9/2, MF=±9/2 clock state for 87Sr agree well with other experimental and theoretical values, especially the most accurate recent measurement. Because the 1S0, F=9/2, MF=±5/2 to 3P0, F=9/2, MF=±3/2 transitions have been used as an alternative clock transition that is less sensitive to magnetic field noise, we also calculated the quadratic Zeeman shift coefficients for the other magnetic states.
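
To give a sense of scale for the quoted coefficient, the following is a minimal sketch that evaluates the shift for one field value. It assumes the usual second-order form, shift = C2 * B^2, which the abstract does not spell out, and the 1 gauss field and the function name are hypothetical choices made only for illustration.

```python
# Quoted coefficient from the abstract, -23.38 MHz/T^2, converted to Hz/T^2.
C2_HZ_PER_T2 = -23.38e6


def quadratic_zeeman_shift_hz(b_tesla, c2=C2_HZ_PER_T2):
    """Second-order Zeeman shift in Hz, assuming shift = C2 * B^2."""
    return c2 * b_tesla ** 2


# Hypothetical bias field of 1 gauss (1e-4 T), chosen only for illustration.
shift = quadratic_zeeman_shift_hz(1e-4)
print(f"{shift:.4f} Hz")
```

Under these assumptions a 1 gauss field shifts the clock state by roughly a quarter of a hertz, and the shift grows with the square of the field.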

Preprint, 2020, arXiv

Appearance-Preserving 3D Convolution for Video-based Person Re-identification

Due to imperfect person detection results and posture changes, temporal appearance misalignment is unavoidable in video-based person re-identification (ReID). In this case, 3D convolution may destroy the appearance representation of person video clips, and is thus harmful to ReID. To address this problem, we propose Appearance-Preserving 3D Convolution (AP3D), which is composed of two components: an Appearance-Preserving Module (APM) and a 3D convolution kernel. With APM aligning the adjacent feature maps at the pixel level, the following 3D convolution can model temporal information while maintaining the quality of the appearance representation. It is easy to combine AP3D with existing 3D ConvNets by simply replacing the original 3D convolution kernels with AP3Ds. Extensive experiments demonstrate the effectiveness of AP3D for video-based ReID, and the results on three widely used datasets surpass the state of the art. Code is available at: https://github.com/guxinqian/AP3D.

Preprint, 2020, arXiv

Dynamic R-CNN: Towards High Quality Object Detection via Dynamic Training

Although two-stage object detectors have continuously advanced the state-of-the-art performance in recent years, the training process itself is far from crystal clear. In this work, we first point out the inconsistency between the fixed network settings and the dynamic training procedure, which greatly affects performance. For example, the fixed label assignment strategy and regression loss function cannot fit the distribution change of proposals and thus are harmful to training high-quality detectors. Consequently, we propose Dynamic R-CNN to adjust the label assignment criteria (IoU threshold) and the shape of the regression loss function (parameters of SmoothL1 Loss) automatically based on the statistics of proposals during training. This dynamic design makes better use of the training samples and pushes the detector to fit more high-quality samples. Specifically, our method improves upon the ResNet-50-FPN baseline by 1.9% AP and 5.5% AP$_{90}$ on the MS COCO dataset with no extra overhead. Code and models are available at https://github.com/hkzhang95/DynamicRCNN.
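
The idea of moving the IoU threshold with proposal statistics can be illustrated with a small sketch. The abstract does not give the update rule, so this version assumes one plausible statistic: the threshold is periodically reset to the mean of the k-th largest proposal IoU seen over recent iterations. The class name, its parameters, and the sample IoU values are all hypothetical.

```python
from statistics import mean


class DynamicIoUThreshold:
    """Illustrative sketch: raise the positive-label IoU threshold as proposals improve."""

    def __init__(self, initial=0.5, top_k=2, update_every=2):
        self.threshold = initial
        self.top_k = top_k                # which ranked IoU to record per iteration
        self.update_every = update_every  # iterations between threshold updates
        self._history = []

    def observe(self, ious):
        """Record the k-th largest IoU of this iteration and update when due."""
        if not ious:
            return self.threshold
        ranked = sorted(ious, reverse=True)
        self._history.append(ranked[min(self.top_k, len(ranked)) - 1])
        if len(self._history) == self.update_every:
            # Assumed rule: new threshold is the mean of the recorded statistics.
            self.threshold = mean(self._history)
            self._history.clear()
        return self.threshold

    def assign_labels(self, ious):
        """Label a proposal positive (1) when its IoU meets the current threshold."""
        return [1 if iou >= self.threshold else 0 for iou in ious]


scheduler = DynamicIoUThreshold(initial=0.5, top_k=2, update_every=2)
scheduler.observe([0.9, 0.6, 0.3])   # threshold unchanged, one statistic stored
scheduler.observe([0.8, 0.7, 0.2])   # threshold moves to about 0.65
labels = scheduler.assign_labels([0.7, 0.6, 0.66])
```

As proposal quality improves, the recorded statistic rises and fewer low-IoU proposals are treated as positives, which is the behaviour the abstract attributes to dynamic label assignment.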

Preprint, 2020, arXiv

IAUnet: Global Context-Aware Feature Learning for Person Re-Identification

Person re-identification (reID) with CNN-based networks has achieved favorable performance in recent years. However, most existing CNN-based methods do not take full advantage of spatial-temporal context modeling. In fact, the global spatial-temporal context can greatly clarify local distractions to enhance the target feature representation. To comprehensively leverage the spatial-temporal context information, in this work we present a novel block, Interaction-Aggregation-Update (IAU), for high-performance person reID. First, a Spatial-Temporal IAU (STIAU) module is introduced. STIAU jointly incorporates two types of contextual interactions into a CNN framework for target feature learning: the spatial interactions learn to compute the contextual dependencies between different body parts within a single frame, while the temporal interactions capture the contextual dependencies between the same body parts across all frames. Furthermore, a Channel IAU (CIAU) module is designed to model the semantic contextual interactions between channel features to enhance the feature representation, especially for small-scale visual cues and body parts. The IAU block therefore enables the feature to incorporate global spatial, temporal, and channel context. It is lightweight, end-to-end trainable, and can be easily plugged into existing CNNs to form IAUnet. The experiments show that IAUnet performs favorably against the state of the art on both image and video reID tasks and achieves compelling results on a general object categorization task. The source code is available at https://github.com/blue-blue272/ImgReID-IAnet.

Preprint, 2020, arXiv

Temporal Complementary Learning for Video Person Re-Identification

This paper proposes a Temporal Complementary Learning Network that extracts complementary features of consecutive video frames for video person re-identification. First, we introduce a Temporal Saliency Erasing (TSE) module comprising a saliency erasing operation and a series of ordered learners. Specifically, for a given frame of a video, the saliency erasing operation drives the corresponding learner to mine new and complementary parts by erasing the parts activated by previous frames. In this way, diverse visual features can be discovered for consecutive frames and finally form an integral characteristic of the target identity. Furthermore, a Temporal Saliency Boosting (TSB) module is designed to propagate the salient information among video frames to enhance the salient feature. It complements TSE by effectively alleviating the information loss caused by the erasing operation of TSE. Extensive experiments show that our method performs favorably against state-of-the-art methods. The source code is available at https://github.com/blue-blue272/VideoReID-TCLNet.

Preprint, 2016, arXiv

Degree sum conditions for graphs to have proper connection number 2

A path $P$ in an edge-colored graph $G$ is a proper path if no two adjacent edges of $P$ are colored with the same color. The graph $G$ is proper connected if, between every pair of vertices, there exists a proper path in $G$. The proper connection number $pc(G)$ of a connected graph $G$ is defined as the minimum number of colors needed to make $G$ proper connected. In this paper, we study the degree sum condition for a general graph or a bipartite graph to have proper connection number 2. First, we show that if $G$ is a connected noncomplete graph of order $n\geq 5$ such that $d(x)+d(y)\geq \frac{n}{2}$ for every pair of nonadjacent vertices $x,y\in V(G)$, then $pc(G)=2$ except for three small graphs on 6, 7 and 8 vertices. In addition, we obtain that if $G$ is a connected bipartite graph of order $n\geq 4$ such that $d(x)+d(y)\geq \frac{n+6}{4}$ for every pair of nonadjacent vertices $x,y\in V(G)$, then $pc(G)=2$. Examples are given to show that the above conditions are best possible.
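
The two definitions in this abstract, a proper path and the degree sum hypothesis, can be made concrete with a short sketch. The function names, the dictionary-based graph representation, and the 5-cycle example are illustrative assumptions and are not taken from the paper. The check only tests the hypothesis $d(x)+d(y)\geq \frac{n}{2}$; it does not compute $pc(G)$ and does not detect the three exceptional graphs.

```python
from itertools import combinations


def is_proper_path(path, edge_color):
    """True if no two consecutive edges along the vertex list share a color."""
    colors = [edge_color[frozenset((u, v))] for u, v in zip(path, path[1:])]
    return all(a != b for a, b in zip(colors, colors[1:]))


def satisfies_degree_sum(adj, bound):
    """True if d(x) + d(y) >= bound for every pair of nonadjacent vertices."""
    return all(
        len(adj[x]) + len(adj[y]) >= bound
        for x, y in combinations(adj, 2)
        if y not in adj[x]
    )


# Hypothetical example: a 5-cycle given as vertex -> set of neighbours.
cycle = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
coloring = {
    frozenset((0, 1)): "red",
    frozenset((1, 2)): "blue",
    frozenset((2, 3)): "red",
    frozenset((3, 4)): "blue",
    frozenset((4, 0)): "red",
}

print(is_proper_path([0, 1, 2, 3], coloring))       # colors alternate along this path
print(is_proper_path([4, 0, 1], coloring))          # both edges are red
print(satisfies_degree_sum(cycle, len(cycle) / 2))  # every degree sum is 4
```

Edges are keyed by `frozenset` so that the color lookup does not depend on the direction in which a path traverses an edge.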

Preprint, 2016, arXiv

Some upper bounds for the $3$-proper index of graphs

A tree $T$ in an edge-colored graph is a proper tree if no two adjacent edges of $T$ receive the same color. Let $G$ be a connected graph of order $n$ and $k$ be a fixed integer with $2\le k\le n$. For a vertex subset $S \subseteq V(G)$ with $\left|S\right| \ge 2$, a tree containing all the vertices of $S$ in $G$ is called an $S$-tree. An edge-coloring of $G$ is called a $k$-proper coloring if for every $k$-subset $S$ of $V(G)$, there exists a proper $S$-tree in $G$. For a connected graph $G$, the $k$-proper index of $G$, denoted by $px_k(G)$, is the smallest number of colors that are needed in a $k$-proper coloring of $G$. In this paper, we show that for every connected graph $G$ of order $n$ and minimum degree $δ\geq 3$, $px_{3}(G)\le n\frac{\ln(δ+1)}{δ+1}(1+o_δ(1))+2$. We also prove that for every connected graph $G$ with minimum degree at least $3$, $px_{3}(G) \le px_{3}(G[D])+3$ when $D$ is a connected $3$-way dominating set of $G$ and $px_{3}(G) \le px_{3}(G[D])+1$ when $D$ is a connected $3$-dominating set of $G$. In addition, we obtain tight upper bounds on the 3-proper index for two special graph classes: threshold graphs and chain graphs. Finally, we prove that $px_3(G) \le \lfloor\frac{n}{2}\rfloor$ for any 2-connected graph with at least four vertices.

Preprint, 2016, arXiv

The $(k,\ell)$-proper index of graphs

A tree $T$ in an edge-colored graph is called a proper tree if no two adjacent edges of $T$ receive the same color. Let $G$ be a connected graph of order $n$ and $k$ be an integer with $2\leq k \leq n$. For $S\subseteq V(G)$ and $|S| \ge 2$, an $S$-tree is a tree containing the vertices of $S$ in $G$. A set $\{T_1,T_2,\ldots,T_\ell\}$ of $S$-trees is called internally disjoint if $E(T_i)\cap E(T_j)=\emptyset$ and $V(T_i)\cap V(T_j)=S$ for $1\leq i\neq j\leq \ell$. For a set $S$ of $k$ vertices of $G$, the maximum number of internally disjoint $S$-trees in $G$ is denoted by $κ(S)$. The $κ$-connectivity $κ_k(G)$ of $G$ is defined by $κ_k(G)=\min\{κ(S)\mid S$ is a $k$-subset of $V(G)\}$. For a connected graph $G$ of order $n$ and for two integers $k$ and $\ell$ with $2\le k\le n$ and $1\leq \ell \leq κ_k(G)$, the $(k,\ell)$-proper index $px_{k,\ell}(G)$ of $G$ is the minimum number of colors that are needed in an edge-coloring of $G$ such that for every $k$-subset $S$ of $V(G)$, there exist $\ell$ internally disjoint proper $S$-trees connecting them. In this paper, we show that for every pair of positive integers $k$ and $\ell$ with $k \ge 3$, there exists a positive integer $N_1=N_1(k,\ell)$ such that $px_{k,\ell}(K_n) = 2$ for every integer $n \ge N_1$, and also there exists a positive integer $N_2=N_2(k,\ell)$ such that $px_{k,\ell}(K_{m,n}) = 2$ for every integer $n \ge N_2$ and $m=O(n^r) (r \ge 1)$. In addition, we show that for every $p \ge c\sqrt[k]{\frac{\log_a n}{n}}$ ($c \ge 5$), $px_{k,\ell}(G_{n,p})\le 2$ holds almost surely, where $G_{n,p}$ is the Erdős-Rényi random graph model.

Preprint, 2014, arXiv

Deeply Coupled Auto-encoder Networks for Cross-view Classification

The comparison of heterogeneous samples arises in many applications, especially in the task of image classification. In this paper, we propose a simple but effective coupled neural network, called Deeply Coupled Auto-encoder Networks (DCAN), which seeks to build two deep neural networks, coupled with each other at every corresponding layer. In DCAN, each deep structure is developed by stacking multiple discriminative coupled auto-encoders, each a denoising auto-encoder trained with a maximum margin criterion consisting of intra-class compactness and inter-class penalty. This single-layer component makes our model simultaneously preserve local consistency and enhance its discriminative capability. As the number of layers increases, the coupled networks gradually narrow the gap between the two views. Extensive experiments on cross-view image classification tasks demonstrate the superiority of our method over state-of-the-art methods.

Preprint, 2014, arXiv

Precision measurement of transverse velocity distribution of a strontium atomic beam

We measure the transverse velocity distribution in a thermal Sr atomic beam precisely by velocity-selective saturated fluorescence spectroscopy. The use of an ultrastable laser system and the narrow intercombination transition line of Sr atoms mean that the resolution of the measured velocity can reach 0.13 m/s, corresponding to 90 $\mu$K in energy units. The experimental results are in very good agreement with the results of theoretical calculations. Based on the spectroscopic techniques used here, the absolute frequency of the intercombination transition of $^{88}$Sr is measured using an optical-frequency comb generator referenced to the SI second through an H maser, and is given as 434 829 121 318(10) kHz.
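
The quoted correspondence between 0.13 m/s and about 90 microkelvin can be checked with a short calculation. This sketch assumes the conversion equates the kinetic energy of one atom, m v^2 / 2, with k_B T, and that the atom is 88Sr. The abstract states neither the convention nor the isotope used for the conversion, so both are assumptions, as is the function name.

```python
K_B = 1.380649e-23        # Boltzmann constant in J/K (exact SI value)
AMU = 1.66053906660e-27   # atomic mass unit in kg
M_SR88 = 87.9056 * AMU    # approximate mass of one 88Sr atom (assumed isotope)


def velocity_to_temperature(v, mass=M_SR88):
    """Temperature in kelvin, assuming (1/2) m v^2 = k_B T."""
    return mass * v ** 2 / (2.0 * K_B)


t_kelvin = velocity_to_temperature(0.13)
print(f"{t_kelvin * 1e6:.1f} microkelvin")
```

Under these assumptions the result is roughly 89 microkelvin, consistent with the rounded figure of 90 in the abstract.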

Preprint, 2006, arXiv

Producing and Detecting Correlated atoms

We discuss experiments to produce and detect atom correlations in a degenerate or nearly degenerate gas of neutral atoms. First we treat the atomic analog of the celebrated Hanbury Brown Twiss experiment, in which atom correlations result simply from interference effects without any atom interactions. We have performed this experiment for both bosons and fermions. Next we show how atom interactions produce correlated atoms using the atomic analog of spontaneous four-wave mixing. Finally, we briefly mention experiments on a one-dimensional gas on an atom chip in which correlation effects due to both interference and interactions have been observed.
