Source author record

Minghui Yang

Minghui Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Topics: Information Theory, math.IT, Artificial Intelligence, Computation and Language, math.NT, Computer Vision, eess.IV, Machine Learning, Multimedia, quant-ph

Catalog footprint

What is connected: 9 works, 10 topics, 4 close collaborators


preprint, 2026, arXiv

CMTA: Leveraging Cross-Modal Temporal Artifacts for Generalizable AI-Generated Video Detection

The proliferation of advanced AI video synthesis techniques poses an unprecedented challenge to digital video authenticity. Existing AI-generated video (AIGV) detection methods primarily focus on uni-modal or spatiotemporal artifacts, but they overlook the rich cues within the visual-textual cross-modal space, especially the temporal stability of semantic alignment. In this work, we identify a distinctive fingerprint in AIGVs, termed cross-modal temporal artifact (CMTA). Unlike real videos that exhibit natural temporal fluctuations in cross-modal alignment due to semantic variations, AIGVs display unnaturally stable semantic trajectories governed by given input prompts. To bridge this gap, we propose the CMTA framework, a cross-modal detection approach that captures these unique temporal artifacts through joint cross-modal embedding and multi-grained temporal modeling. Specifically, CMTA leverages BLIP to generate frame-level image captions and utilizes CLIP to extract corresponding visual-textual representations. A coarse-grained temporal modeling branch is then designed to characterize temporal fluctuations in cross-modal alignment with a GRU. In parallel, a fine-grained branch is constructed to capture intricate inter-frame variations from integrated visual-textual features with a Transformer encoder. Extensive experiments on 40 subsets across four large-scale datasets, including GenVideo, EvalCrafter, VideoPhy, and VidProM, validate that our approach sets a new state-of-the-art while exhibiting superior cross-generator generalization. Code and models of CMTA will be released at https://github.com/hwang-cs-ime/CMTA
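The cue described in this abstract, the temporal stability of visual-textual alignment, can be illustrated with a minimal sketch. It assumes per-frame visual and caption embeddings are already available as plain lists of floats; in the paper they come from CLIP applied to frames and BLIP-generated captions, and the trajectory is modeled with a GRU and a Transformer encoder, none of which is reproduced here. The function names `alignment_trajectory` and `temporal_fluctuation` are hypothetical, and the standard deviation is only a stand-in summary statistic, not the authors' detector.

```python
import math
import statistics


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def alignment_trajectory(visual_embeddings, text_embeddings):
    """Per-frame similarity between a frame embedding and its caption embedding."""
    return [cosine_similarity(v, t) for v, t in zip(visual_embeddings, text_embeddings)]


def temporal_fluctuation(trajectory):
    """Population standard deviation of the trajectory (illustrative statistic only)."""
    return statistics.pstdev(trajectory)


# Toy embeddings (made up for illustration, not model outputs).
stable = alignment_trajectory([[1.0, 0.0], [2.0, 0.0]], [[1.0, 0.0], [1.0, 0.0]])
varying = alignment_trajectory([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [1.0, 0.0]])
print(temporal_fluctuation(stable), temporal_fluctuation(varying))
```

Under the abstract's hypothesis, a generated video would tend to look like the `stable` case, while a real video would show more variation across frames.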

preprint, 2026, arXiv

MLB: A Scenario-Driven Benchmark for Evaluating Large Language Models in Clinical Applications

The proliferation of Large Language Models (LLMs) presents transformative potential for healthcare, yet practical deployment is hindered by the absence of frameworks that assess real-world clinical utility. Existing benchmarks test static knowledge, failing to capture the dynamic, application-oriented capabilities required in clinical practice. To bridge this gap, we introduce the Medical LLM Benchmark (MLB), a comprehensive benchmark evaluating LLMs on both foundational knowledge and scenario-based reasoning. MLB is structured around five core dimensions: Medical Knowledge (MedKQA), Safety and Ethics (MedSE), Medical Record Understanding (MedRU), Smart Services (SmartServ), and Smart Healthcare (SmartCare). The benchmark integrates 22 datasets (17 newly curated) from diverse Chinese clinical sources, covering 64 clinical specialties. Its design features a rigorous curation pipeline involving 300 licensed physicians. We also provide a scalable evaluation methodology, centered on a specialized judge model trained via Supervised Fine-Tuning (SFT) on expert annotations. Our comprehensive evaluation of 10 leading models reveals a critical translational gap: while the top-ranked model, Kimi-K2-Instruct (77.3% accuracy overall), excels in structured tasks like information extraction (87.8% accuracy in MedRU), performance plummets in patient-facing scenarios (61.3% in SmartServ). Moreover, the exceptional safety score (90.6% in MedSE) of the much smaller Baichuan-M2-32B highlights that targeted training is equally critical. Our specialized judge model, trained via SFT on a 19k expert-annotated medical dataset, achieves 92.1% accuracy, an F1-score of 94.37%, and a Cohen's Kappa of 81.3% for human-AI consistency, validating a reproducible and expert-aligned evaluation protocol. MLB thus provides a rigorous framework to guide the development of clinically viable LLMs.
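The abstract reports Cohen's Kappa as its measure of human-AI consistency. As a reminder of what that statistic computes, here is a minimal sketch using the textbook formula kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The function name `cohen_kappa` and the toy labels are assumptions for illustration; nothing here reflects the MLB data or the authors' evaluation code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items on which the raters agree.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        # Both raters used a single identical label; treated as perfect agreement here.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Toy labels: 1 = "response acceptable", 0 = "not acceptable" (hypothetical).
human = [1, 1, 0, 0]
judge = [1, 0, 1, 0]
print(cohen_kappa(human, judge))
```

Kappa corrects raw accuracy for chance agreement, which is why an abstract may report it alongside accuracy and F1.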

preprint, 2022, arXiv

AdaCoach: A Virtual Coach for Training Customer Service Agents

With the development of online business, customer service agents increasingly play a crucial role as the interface between companies and their customers. Most companies spend a lot of time and effort on hiring and training customer service agents. To this end, we propose AdaCoach: A Virtual Coach for Training Customer Service Agents, to improve the skills of newly hired service agents before they start work. AdaCoach is designed to simulate real customers who seek help and actively initiate the dialogue with the customer service agents. In addition, AdaCoach uses an automated dialogue evaluation model to score the performance of the customer service agent in the training process, which can provide necessary assistance when the newly hired customer service agent encounters problems. We apply recent NLP technologies to ensure efficient run-time performance in the deployed system. To the best of our knowledge, this is the first system that trains customer service agents through human-computer interaction. To date, the system has supported more than 500,000 simulation training sessions and trained over 1,000 qualified customer service agents.

preprint, 2020, arXiv

A kind of quaternary sequences of period $2p^mq^n$ and their linear complexity

Sequences with high linear complexity have wide applications in cryptography. In this paper, a new class of quaternary sequences over $\mathbb{F}_4$ with period $2p^mq^n$ is constructed using generalized cyclotomic classes. Results show that the linear complexity of these sequences attains the maximum.
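For readers unfamiliar with the term, the linear complexity of a sequence is the length of the shortest linear feedback shift register that generates it. The sketch below computes it with the Berlekamp-Massey algorithm over GF(2). This is a deliberate simplification: the paper works with quaternary sequences over $\mathbb{F}_4$, which would need arithmetic in that field, so the code illustrates the concept only and is not the paper's method. The function name `linear_complexity_gf2` is hypothetical.

```python
def linear_complexity_gf2(bits):
    """Length of the shortest binary LFSR generating `bits` (Berlekamp-Massey)."""
    n = len(bits)
    current = [0] * n   # current connection polynomial C(x)
    previous = [0] * n  # connection polynomial before the last length change
    if n:
        current[0] = previous[0] = 1
    length, last_change = 0, -1
    for i in range(n):
        # Discrepancy between the LFSR prediction and the actual bit.
        discrepancy = bits[i]
        for j in range(1, length + 1):
            discrepancy ^= current[j] & bits[i - j]
        if discrepancy:
            saved = current[:]
            shift = i - last_change
            for j in range(n - shift):
                current[j + shift] ^= previous[j]
            if 2 * length <= i:
                length = i + 1 - length
                last_change = i
                previous = saved
    return length


print(linear_complexity_gf2([1, 0, 1, 0, 1, 0]))  # satisfies s_i = s_{i-2}
```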

preprint, 2020, arXiv

Determination of 2-Adic Complexity of Generalized Binary Sequences of Order 2

The generalized binary sequences of order 2 have been used to construct good binary cyclic codes [4]. The linear complexity of these sequences has been computed in [2]. The autocorrelation values of such sequences have been determined in [1] and [3]. Some lower bounds of 2-adic complexity for such sequences have been presented in [5] and [7]. In this paper we determine the exact value of 2-adic complexity for such sequences. In particular, we improve the lower bounds presented in [5] and [7] and the condition under which the 2-adic complexity reaches its maximum value.
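As background, the 2-adic complexity of a binary sequence $s$ with period $N$ is commonly defined as $\log_2\frac{2^N-1}{\gcd(2^N-1,\,S(2))}$, where $S(2)=\sum_{i=0}^{N-1}s_i2^i$. The abstract does not state which normalization the paper uses (some authors take a floor or add 1 inside the logarithm), so the sketch below assumes this common form. The function name `two_adic_complexity` is hypothetical.

```python
from math import gcd, log2


def two_adic_complexity(period_bits):
    """2-adic complexity of the periodic binary sequence with one period `period_bits`.

    Assumes the definition log2((2^N - 1) / gcd(2^N - 1, S(2))).
    """
    n = len(period_bits)
    s2 = sum(bit << i for i, bit in enumerate(period_bits))  # S(2) as an integer
    modulus = (1 << n) - 1                                   # 2^N - 1
    return log2(modulus // gcd(modulus, s2))


print(two_adic_complexity([1, 0, 0]))  # gcd(7, 1) = 1, so this is log2(7)
print(two_adic_complexity([1, 1, 1]))  # S(2) = 7 divides 7, so this is 0.0
```

The value is largest when $\gcd(2^N-1, S(2)) = 1$, which is the "maximum value" case the abstract refers to.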

preprint, 2016, arXiv

Construction of Cyclic and Constacyclic Codes for b-symbol Read Channels Meeting the Plotkin-like Bound

Symbol-pair codes over finite fields were introduced for symbol-pair read channels, motivated by applications in high-density data storage technologies [1, 2]. Their generalization is the code for b-symbol read channels (b > 2). Many MDS codes for b-symbol read channels have been constructed which meet the Singleton-like bound ([3, 4, 10] for b = 2 and [11] for b > 2). In this paper we establish the Plotkin-like bound and present a construction of irreducible cyclic codes and constacyclic codes meeting the Plotkin-like bound.
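To make the channel model concrete: in a b-symbol read channel, a word $x=(x_0,\ldots,x_{n-1})$ is read as the $n$ cyclically overlapping windows $(x_i, x_{i+1}, \ldots, x_{i+b-1})$ with indices taken mod $n$, and the b-symbol distance between two words counts the windows in which they differ. The sketch below computes this distance and the corresponding weight; the function names are hypothetical, and the case b = 2 corresponds to the symbol-pair setting.

```python
def b_symbol_distance(x, y, b):
    """Number of cyclic length-b windows in which words x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    n = len(x)
    return sum(
        1
        for i in range(n)
        if any(x[(i + j) % n] != y[(i + j) % n] for j in range(b))
    )


def b_symbol_weight(x, b):
    """b-symbol weight: distance from the all-zero word."""
    return b_symbol_distance(x, [0] * len(x), b)


word = [1, 0, 0, 0, 0]
print([b_symbol_weight(word, b) for b in (1, 2, 3)])  # b = 1 is the Hamming weight
```

A single nonzero symbol is seen by b consecutive windows, which is why the b-symbol weight of a sparse word grows with b.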

preprint, 2016, arXiv

Mutually unbiased maximally entangled bases in $\mathbb{C}^d\otimes\mathbb{C}^d$

We study mutually unbiased maximally entangled bases (MUMEB's) in the bipartite system $\mathbb{C}^d\otimes\mathbb{C}^d$ $(d \geq 3)$. We generalize the method to construct MUMEB's given in [16], by using any commutative ring $R$ with $d$ elements and a generic character of $(R,+)$ instead of $\mathbb{Z}_d=\mathbb{Z}/d\mathbb{Z}$. In particular, if $d=p_1^{a_1}p_2^{a_2}\ldots p_s^{a_s}$ where $p_1, \ldots, p_s$ are distinct primes and $3\leq p_1^{a_1}\leq\cdots\leq p_s^{a_s}$, we present $p_1^{a_1}-1$ MUMEB's in $\mathbb{C}^d\otimes\mathbb{C}^d$ by taking $R=\mathbb{F}_{p_1^{a_1}}\oplus\cdots\oplus\mathbb{F}_{p_s^{a_s}}$, a direct sum of finite fields (Theorem 3.3).

preprint, 2016, arXiv

The $(n,m,k,λ)$-Strong External Difference Family with $m \geq 5$ Exists

The notion of strong external difference family (SEDF) in a finite abelian group $(G,+)$ was introduced by M. B. Paterson and D. R. Stinson [5] in 2016, motivated by its application in communication theory to construct $R$-optimal regular algebraic manipulation detection codes. A series of $(n,m,k,λ)$-SEDF's have been constructed in [5, 4, 2, 1] with $m=2$. In this note we present an example of a (243, 11, 22, 20)-SEDF in the finite field $\mathbb{F}_q$ $(q=3^5=243)$. This answers the following problem raised in [5] and repeatedly asked in [4, 2, 1]: whether there exists an $(n,m,k,λ)$-SEDF for $m\geq 5$.
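For orientation, an $(n,m,k,λ)$-SEDF is a family of $m$ disjoint $k$-subsets $A_1,\ldots,A_m$ of a group of order $n$ such that, for every $i$, the differences $a-b$ with $a\in A_i$ and $b\in A_j$, $j\neq i$, cover each nonzero group element exactly $λ$ times. The sketch below checks this condition in the cyclic group $\mathbb{Z}_n$ only; the paper's example lives in the additive group of $\mathbb{F}_{243}$, which is not cyclic, so this checker does not apply to it directly. The function name `sedf_lambda` and the small test family are illustrative assumptions.

```python
from collections import Counter


def sedf_lambda(n, sets):
    """Return lambda if `sets` form an SEDF in Z_n, otherwise None."""
    sets = [sorted(set(a)) for a in sets]
    k = len(sets[0])
    if any(len(a) != k for a in sets):
        return None  # all sets must have the same size k
    elements = [x for a in sets for x in a]
    if len(set(elements)) != len(elements):
        return None  # sets must be pairwise disjoint
    lam = None
    for i, a_i in enumerate(sets):
        # Multiset of external differences a - b with a in A_i, b in A_j, j != i.
        diffs = Counter(
            (a - b) % n
            for j, a_j in enumerate(sets)
            if j != i
            for a in a_i
            for b in a_j
        )
        counts = {diffs.get(g, 0) for g in range(1, n)}
        if len(counts) != 1:
            return None
        (count,) = counts
        if lam is None:
            lam = count
        elif count != lam:
            return None
    return lam


print(sedf_lambda(5, [{0, 1}, {2, 4}]))  # a small family with m = 2 in Z_5
```

Counting differences gives the necessary condition $k^2(m-1)=λ(n-1)$; the parameters in the abstract satisfy it, since $22^2\cdot 10 = 4840 = 20\cdot 242$.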

preprint, 2014, arXiv

Generalized Hamming Weights of Irreducible Cyclic Codes

The generalized Hamming weight (GHW) $d_r(C)$ of a linear code $C$ is a natural generalization of the minimum Hamming distance $d(C)(=d_1(C))$ and has become one of the important research objects in coding theory since Wei's original work [23] in 1991. In this paper two general formulas on $d_r(C)$ for irreducible cyclic codes are presented by using Gauss sums, and the weight hierarchy $\{d_1(C), d_2(C), \ldots, d_k(C)\}$ $(k=\dim C)$ is completely determined for several cases.
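As an illustration of the definition, $d_r(C)$ is the smallest support size among the $r$-dimensional subcodes of $C$, where the support of a subcode is the set of coordinates in which some codeword is nonzero. The sketch below computes the weight hierarchy of a small binary code by brute force. It is exponential in the dimension and bears no relation to the Gauss-sum formulas of the paper; the function names and the example generator matrix are assumptions.

```python
from itertools import combinations, product


def gf2_rank(words):
    """Rank over GF(2) of codewords encoded as integer bitmasks."""
    basis = []
    for w in words:
        for b in basis:
            w = min(w, w ^ b)  # clears the leading bit of b in w if it is set
        if w:
            basis.append(w)
    return len(basis)


def weight_hierarchy(generator_rows):
    """Brute-force [d_1, ..., d_k] for a binary code.

    Assumes the rows of `generator_rows` are linearly independent over GF(2).
    """
    k = len(generator_rows)
    rows = [int("".join(map(str, row)), 2) for row in generator_rows]
    codewords = set()
    for coeffs in product((0, 1), repeat=k):
        word = 0
        for c, row in zip(coeffs, rows):
            if c:
                word ^= row
        if word:
            codewords.add(word)
    hierarchy = []
    for r in range(1, k + 1):
        best = None
        for subset in combinations(sorted(codewords), r):
            if gf2_rank(subset) == r:
                support = 0
                for w in subset:
                    support |= w  # support of a span = union of basis supports
                size = bin(support).count("1")
                best = size if best is None else min(best, size)
        hierarchy.append(best)
    return hierarchy


# The [4, 3] binary even-weight code (illustrative choice).
print(weight_hierarchy([[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1]]))
```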
