Researcher profile

Cong Zhou

Cong Zhou contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
12 topics
4 close collaborators

Published work

10 published items

Preprint, 2025, arXiv

Frequency-switching Array Enhanced Physical-Layer Security in Terahertz Bands: A Movable Antenna Perspective

In this paper, we propose a new frequency-switching array (FSA) to enhance the physical-layer security (PLS) in the presence of multiple eavesdroppers (Eves), where the carrier frequency can be flexibly switched and small frequency offsets can be imposed on each antenna at the secrecy transmitter (Alice). First, we analytically show that by flexibly controlling the carrier frequency parameters, FSAs can effectively form uniform/non-uniform sparse arrays, hence resembling existing mechanically controlled movable antennas (MAs) via the control of inter-antenna spacing and providing additional degrees of freedom in the beam manipulation. Although the proposed FSA suffers from additional path-gain attenuation in the received signals, it can overcome several hardware and signal processing issues incurred by MAs, such as limited positioning accuracy and extra hardware and energy cost. Then, a secrecy-rate maximization problem is formulated under the constraints on the frequency control. To gain useful insights, we first consider a secrecy-guaranteed problem with a null-steering constraint, for which a maximum ratio transmission beamformer is considered at Alice and the frequency offsets are set as a uniform frequency increment. Interestingly, it is shown that the proposed FSA can flexibly realize null-steering over Eve in both the angular domain and range domain, thereby achieving improved PLS performance. Then, for the general case, we propose an efficient algorithm to solve the formulated non-convex optimization problem by using the block coordinate descent and projected gradient ascent techniques. Finally, numerical results demonstrate that the proposed FSA achieves superior secrecy rate performance over a conventional fixed-position array, while suffering only a slight secrecy rate loss relative to the existing mechanically controlled MA.

Preprint, 2022, arXiv

"Is Whole Word Masking Always Better for Chinese BERT?": Probing on Chinese Grammatical Error Correction

Whole word masking (WWM), which masks all subwords corresponding to a word at once, makes a better English BERT model. For the Chinese language, however, there is no subword because each token is an atomic character. The meaning of a word in Chinese is different in that a word is a compositional unit consisting of multiple characters. This difference motivates us to investigate whether WWM leads to better context understanding ability for Chinese BERT. To achieve this, we introduce two probing tasks related to grammatical error correction and ask pretrained models to revise or insert tokens in a masked language modeling manner. We construct a dataset including labels for 19,075 tokens in 10,448 sentences. We train three Chinese BERT models with standard character-level masking (CLM), WWM, and a combination of CLM and WWM, respectively. Our major findings are as follows: First, when one character needs to be inserted or replaced, the model trained with CLM performs the best. Second, when more than one character needs to be handled, WWM is the key to better performance. Finally, when fine-tuned on sentence-level downstream tasks, models trained with different masking strategies perform comparably.
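
As an illustration of the two masking strategies named in this abstract, here is a minimal Python sketch. It is not the authors' pretraining code: the example sentence, its word segmentation, and the function names are hypothetical, and it masks a single unit per call rather than sampling a fraction of tokens as real pretraining would.

```python
import random

MASK = "[MASK]"

def char_level_mask(words, rng):
    """CLM-style: mask one randomly chosen character, ignoring word boundaries."""
    chars = [c for w in words for c in w]
    chars[rng.randrange(len(chars))] = MASK
    return chars

def whole_word_mask(words, rng):
    """WWM-style: mask every character of one randomly chosen word."""
    target = rng.randrange(len(words))
    out = []
    for idx, w in enumerate(words):
        # All characters of the chosen word are masked together.
        out.extend([MASK] * len(w) if idx == target else list(w))
    return out

# Hypothetical pre-segmented Chinese sentence (segmentation is an assumption).
words = ["我", "喜欢", "自然", "语言"]
rng = random.Random(0)
clm_tokens = char_level_mask(words, rng)
wwm_tokens = whole_word_mask(words, rng)
print(clm_tokens)
print(wwm_tokens)
```

Both functions return one token per character, so the model input length is the same; only the choice of which positions are hidden together differs.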

Preprint, 2022, arXiv

Effidit: Your AI Writing Assistant

In this technical report, we introduce Effidit (Efficient and Intelligent Editing), a digital writing assistant that helps users write higher-quality text more efficiently by using artificial intelligence (AI) technologies. Previous writing assistants typically provide the function of error checking (to detect and correct spelling and grammatical errors) and limited text-rewriting functionality. With the emergence of large-scale neural language models, some systems support automatically completing a sentence or a paragraph. In Effidit, we significantly expand the capabilities of a writing assistant by providing functions in five categories: text completion, error checking, text polishing, keywords to sentences (K2S), and cloud input methods (cloud IME). In the text completion category, Effidit supports generation-based sentence completion, retrieval-based sentence completion, and phrase completion. In contrast, many other writing assistants so far only provide one or two of the three functions. For text polishing, we have three functions: (context-aware) phrase polishing, sentence paraphrasing, and sentence expansion, whereas many other writing assistants often support one or two functions in this category. The main contents of this report include major modules of Effidit, methods for implementing these modules, and evaluation results of some key methods.

Preprint, 2022, arXiv

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code

People perceive the world with multiple senses (e.g., through hearing sounds, reading words and seeing objects). However, most existing AI systems only process an individual modality. This paper presents an approach that excels at handling multiple modalities of information with a single model. In our SkillNet model, different parts of the parameters are specialized for processing different modalities. Unlike traditional dense models that always activate all the model parameters, our model sparsely activates parts of the parameters whose skills are relevant to the task. This design enables SkillNet to learn skills in a more interpretable way. We develop our model for five modalities including text, image, sound, video and code. Results show that SkillNet performs comparably to five modality-specific fine-tuned models. Moreover, our model supports self-supervised pretraining in the same sparsely activated way, resulting in better initialized parameters for different modalities. We find that pretraining significantly improves the performance of SkillNet on five modalities, on par with or even better than baselines with modality-specific pretraining. On the task of Chinese text-to-image retrieval, our final system achieves higher accuracy than existing leading systems including Wukong (ViT-B) and Wenlan 2.0 while using fewer activated parameters.

Preprint, 2022, arXiv

Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors

Chinese BERT models achieve remarkable progress in dealing with grammatical errors of word substitution. However, they fail to handle word insertion and deletion because BERT assumes the existence of a word at each position. To address this, we present a simple and effective Chinese pretrained model. The basic idea is to enable the model to determine whether a word exists at a particular position. We achieve this by introducing a special token [null], the prediction of which stands for the non-existence of a word. In the training stage, we design pretraining tasks such that the model learns to predict [null] and real words jointly given the surrounding context. In the inference stage, the model readily detects whether a word should be inserted or deleted with the standard masked language modeling function. We further create an evaluation dataset to foster research on word insertion and deletion. It includes human-annotated corrections for 7,726 erroneous sentences. Results show that existing Chinese BERT performs poorly on detecting insertion and deletion errors. Our approach significantly improves the F1 scores from 24.1% to 78.1% for word insertion and from 26.5% to 68.5% for word deletion, respectively.

Preprint, 2022, arXiv

Pretraining without Wordpieces: Learning Over a Vocabulary of Millions of Words

The standard BERT adopts subword-based tokenization, which may break a word into two or more wordpieces (e.g., converting "lossless" to "loss" and "less"). This causes difficulties in the following situations: (1) what is the best way to obtain the contextual vector of a word that is divided into multiple wordpieces? (2) how to predict a word via cloze test without knowing the number of wordpieces in advance? In this work, we explore the possibility of developing a BERT-style pretrained model over a vocabulary of words instead of wordpieces. We call this word-level BERT model WordBERT. We train models with different vocabulary sizes, initialization configurations and languages. Results show that, compared to standard wordpiece-based BERT, WordBERT makes significant improvements on cloze test and machine reading comprehension. On many other natural language understanding tasks, including POS tagging, chunking and NER, WordBERT consistently performs better than BERT. Model analysis indicates that the major advantage of WordBERT over BERT lies in the understanding of low-frequency words and rare words. Furthermore, since the pipeline is language-independent, we train WordBERT for the Chinese language and obtain significant gains on five natural language understanding datasets. Lastly, an analysis of inference speed shows that WordBERT has a time cost comparable to BERT in natural language understanding tasks.
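
To make the "lossless" example in this abstract concrete, the sketch below contrasts a greedy longest-match-first wordpiece split with a plain word-level lookup. It is a simplified illustration, not the WordBERT implementation: both toy vocabularies are hypothetical, and the "##" continuation prefix follows the convention of the original BERT tokenizer.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of one word into wordpieces."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        while end > start:
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand  # non-initial pieces carry a continuation prefix
            if cand in vocab:
                piece = cand
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def word_tokenize(word, vocab, unk="[UNK]"):
    """Word-level lookup: one token per word, no splitting."""
    return [word] if word in vocab else [unk]

# Hypothetical toy vocabularies, for illustration only.
subword_vocab = {"loss", "less", "##less"}
word_vocab = {"loss", "less", "lossless"}

subword_tokens = wordpiece_tokenize("lossless", subword_vocab)
word_tokens = word_tokenize("lossless", word_vocab)
print(subword_tokens)  # ['loss', '##less']
print(word_tokens)     # ['lossless']
```

With the word-level vocabulary, a cloze prediction for "lossless" is a single-token prediction, which is the situation the abstract's second question refers to.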

Preprint, 2022, arXiv

SkillNet-NLU: A Sparsely Activated Model for General-Purpose Natural Language Understanding

Prevailing deep models are single-purpose and overspecialize in individual tasks. However, when extended to new tasks, they typically forget previously learned skills and learn from scratch. We address this issue by introducing SkillNet-NLU, a general-purpose model that stitches together existing skills to learn new tasks more effectively. The key feature of our approach is that it is sparsely activated, guided by predefined skills. Unlike traditional dense models that always activate all the model parameters, SkillNet-NLU only activates parts of the model parameters whose skills are relevant to the target task. When learning a new task, our approach precisely activates required skills and also provides an option to add new skills. We evaluate on natural language understanding tasks and have the following findings. First, with only one model checkpoint, SkillNet-NLU performs better than task-specific fine-tuning and two multi-task learning baselines (i.e., dense model and Mixture-of-Experts model) on six tasks. Second, sparsely activated pre-training further improves the overall performance. Third, SkillNet-NLU significantly outperforms baseline systems when extended to new tasks.
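
The idea of activating only task-relevant skills can be sketched in a few lines of Python. This is a toy under stated assumptions, not the SkillNet-NLU architecture: the skill names, the scalar "skill modules", and the averaging of skill outputs are all hypothetical choices made for illustration.

```python
from typing import Callable, Dict, List

class SparseSkillLayer:
    """Toy layer: only the skill modules listed as active are evaluated."""

    def __init__(self, skills: Dict[str, Callable[[float], float]]):
        self.skills = dict(skills)
        self.call_counts = {name: 0 for name in self.skills}

    def add_skill(self, name: str, fn: Callable[[float], float]) -> None:
        # A new task may register a new skill without touching existing ones.
        self.skills[name] = fn
        self.call_counts[name] = 0

    def forward(self, x: float, active: List[str]) -> float:
        outputs = []
        for name in active:
            self.call_counts[name] += 1
            outputs.append(self.skills[name](x))
        # Averaging the active outputs is an assumption of this sketch.
        return sum(outputs) / len(outputs)

# Hypothetical skills, each a stand-in for a block of model parameters.
layer = SparseSkillLayer({
    "general": lambda x: x + 1.0,
    "semantic": lambda x: 2.0 * x,
    "sentiment": lambda x: -x,
})

y = layer.forward(2.0, active=["general", "semantic"])  # "sentiment" stays idle
layer.add_skill("question_answering", lambda x: x * x)
z = layer.forward(3.0, active=["general", "question_answering"])
print(y, z, layer.call_counts)
```

A dense model corresponds to passing every skill name in `active` on every call; the sparse variant leaves the unrelated modules unevaluated.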

Preprint, 2020, arXiv

Model-independent test of the parity symmetry of gravity with gravitational waves

Gravitational wave (GW) data can be used to test the parity symmetry of gravity by investigating the difference between left-handed and right-handed circular polarization modes. In this article, we develop a method to decompose the circular polarizations of GWs produced during the inspiralling stage of compact binaries, with the help of the stationary phase approximation. The foremost advantage is that this method is simple, clean, independent of GW waveform, and applicable to the existing detector network. Applying it to mock data, we test the parity symmetry of gravity by constraining the velocity birefringence of GWs. If a nearly edge-on binary neutron star with observed electromagnetic counterparts at 40 Mpc is detected by the second-generation detector network, one could derive a model-independent test of the parity symmetry in gravity: the lower limit of the energy scale of parity violation can be constrained within $\mathcal{O}(10^4\,{\rm eV})$.

Preprint, 2020, arXiv

On the Space of $C^1$ Regular Curves on Sphere with Constrained Curvature

Let $\mathcal{P}_{\kappa_1}^{\kappa_2}(\boldsymbol{P}, \boldsymbol{Q})$ denote the set of $C^1$ regular curves in the $2$-sphere $\mathbb{S}^2$ that start and end at given points with the corresponding Frenet frames $\boldsymbol{P}$ and $\boldsymbol{Q}$, whose tangent vectors are Lipschitz continuous, and whose a.e. existing geodesic curvatures have essential bounds in $(\kappa_1, \kappa_2)$, $-\infty<\kappa_1<\kappa_2<\infty$. In this article, we first study the geometric properties of the curves in $\mathcal{P}_{\kappa_1}^{\kappa_2}(\boldsymbol{P}, \boldsymbol{Q})$. We introduce the concepts of the lower and upper curvatures at any point of a $C^1$ regular curve and prove that a $C^1$ regular curve is in $\mathcal{P}_{\kappa_1}^{\kappa_2}(\boldsymbol{P}, \boldsymbol{Q})$ if and only if the infimum of its lower curvature and the supremum of its upper curvature are constrained in $(\kappa_1,\kappa_2)$. Secondly, we prove that the $C^0$ and $C^1$ topologies on $\mathcal{P}_{\kappa_1}^{\kappa_2}(\boldsymbol{P}, \boldsymbol{Q})$ are the same. Further, we show that a curve in $\mathcal{P}_{\kappa_1}^{\kappa_2}(\boldsymbol{P}, \boldsymbol{Q})$ can be determined by the solutions of the differential equation $\Phi'(t) = \Phi(t)\Lambda(t)$ with $\Phi(t)\in \textrm{SO}_3(\mathbb{R})$, with special constraints on $\Lambda(t)\in\mathfrak{so}_3(\mathbb{R})$, and give a complete metric on $\mathcal{P}_{\kappa_1}^{\kappa_2}(\boldsymbol{P}, \boldsymbol{Q})$ such that it becomes a (trivial) Banach manifold.

Preprint, 2020, arXiv

Source coding of audio signals with a generative model

We consider source coding of audio signals with the help of a generative model. We use a construction where a waveform is first quantized, yielding a finite bitrate representation. The waveform is then reconstructed by random sampling from a model conditioned on the quantized waveform. The proposed coding scheme is theoretically analyzed. Using SampleRNN as the generative model, we demonstrate that the proposed coding structure provides performance competitive with state-of-the-art source coding tools for specific categories of audio signals.
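
The quantize-then-sample structure described in this abstract can be sketched as follows. This is a toy under loud assumptions: the uniform scalar quantizer, the step size, and the test waveform are hypothetical, and the "generative model" here is a trivial stand-in that samples uniformly inside each quantization cell, not SampleRNN.

```python
import math
import random

def quantize(waveform, step):
    """Uniform scalar quantizer: map each sample to an integer index."""
    return [round(v / step) for v in waveform]

def sample_reconstruction(indices, step, rng):
    """Stand-in for a conditional generative model (assumption of this sketch):
    draw each output sample uniformly from the cell of its quantized value."""
    return [(i + rng.uniform(-0.5, 0.5)) * step for i in indices]

# Hypothetical input: 100 samples of a sinusoid with amplitude 1.
waveform = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
step = 0.25

indices = quantize(waveform, step)  # the finite-bitrate representation
levels = len(set(indices))
bits_per_sample = math.ceil(math.log2(levels))  # fixed-length code, no entropy coding

rng = random.Random(0)
reconstruction = sample_reconstruction(indices, step, rng)
max_error = max(abs(a - b) for a, b in zip(waveform, reconstruction))
print(bits_per_sample, max_error)
```

Because the decoder samples rather than returning the cell center, two decodings of the same bitstream generally differ; in this toy both stay within one quantization step of the input.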