Source author record

Mingyang Zhang

Mingyang Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Topics: Computation and Language; eess.AS; Artificial Intelligence; Machine Learning; Sound; cond-mat.mtrl-sci; Neural and Evolutionary Computing; physics.optics; Social and Information Networks

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators

preprint, 2026, arXiv

High-Ti induced planar-fault transformation toward superlattice extrinsic stacking faults and microtwins in crept CoNi-based superalloys

Controlling planar fault shearing mechanisms is key to improving the high-temperature creep performance of gamma prime-strengthened superalloys. This work examines how the Ti concentration in L1₂-strengthened CoNi-based alloys affects planar fault formation during creep. Interrupted compressive creep tests were conducted at 1223 K in air with a constant load stress of 241 MPa. We found, for the first time, that high Ti additions shift the dominant gamma prime shearing mode from antiphase boundaries (APBs) in Ti-free and low-Ti alloys to superlattice extrinsic stacking faults (SESFs). Systematic ab initio calculations show that in high-Ti alloys, the elevated APB energy renders the APB-shearing mode unfavorable. Meanwhile, the SESF energy decreases relative to that in low-Ti compositions, and an increased ratio of complex intrinsic stacking fault (CISF) energy to SESF energy promotes the transformation of high-energy CISFs into lower-energy SESFs. Chemical analysis using scanning transmission electron microscopy combined with energy-dispersive X-ray spectroscopy further reveals that SESFs in high-Ti alloys are enriched in Ti, Mo and W, yet no grid-like ordering is observed. Taken together with the ab initio calculations, this suggests that Mo and W additions in high-Ti alloys could facilitate the transformation from the L1₂ structure to the low-energy D0₂₄ structure, indicating that Mo and W segregation along SESFs is energetically favorable. Furthermore, successive SESF thickening facilitates microtwinning in the absence of D0₂₄ ordering along SESFs, providing an additional major carrier of creep strain. These new findings clarify the role of Ti in controlling planar fault shearing mechanisms, providing new insights for optimizing the creep performance of next-generation CoNi-based superalloys.

preprint, 2026, arXiv

Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention

Recent progress in large language models (LLMs) has focused on test-time scaling to improve reasoning via increased inference computation, but often at the cost of efficiency. We revisit test-time behavior and uncover a simple yet underexplored phenomenon: reasoning uncertainty is highly localized, and only a small subset of high-entropy tokens dominantly affects output correctness. Motivated by this, we propose Minimal Test-Time Intervention (MTI), a training-free framework that enhances reasoning accuracy and stability with minimal overhead. MTI includes: (i) selective CFG intervention, applying classifier-free guidance only at uncertain positions; and (ii) lightweight negative-prompt guidance, reusing the main model's KV cache to approximate unconditional decoding efficiently. MTI yields consistent gains across general, coding, and STEM tasks (e.g., +9.28% average improvement on six benchmarks for DeepSeek-R1-7B and +11.25% on AIME2024 using Ling-mini-2.0) while remaining highly efficient.
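The idea of intervening only at uncertain positions can be illustrated with a small sketch. This is not the authors' implementation: the function names, the entropy threshold, and the guidance scale are hypothetical, and the negative-prompt logits are simply passed in rather than computed from a reused KV cache. The guided logits use the standard classifier-free guidance combination, negative + scale * (conditional - negative).

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def selective_cfg(cond_logits, neg_logits, threshold=1.0, scale=1.5):
    """Apply guidance only when the next-token distribution is uncertain.

    threshold and scale are illustrative values, not taken from the paper.
    Returns (logits, intervened).
    """
    if entropy(softmax(cond_logits)) <= threshold:
        # Confident position: leave the model's logits untouched.
        return list(cond_logits), False
    # Uncertain position: push away from the negative-prompt prediction.
    guided = [n + scale * (c - n) for c, n in zip(cond_logits, neg_logits)]
    return guided, True
```

In a decoding loop, such a gate would run once per generated token, so the extra forward pass for the negative prompt is only needed at the positions where the gate fires.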

preprint, 2024, arXiv

Transfer the linguistic representations from TTS to accent conversion with non-parallel data

Accent conversion aims to convert the accent of source speech to a target accent while preserving the speaker's identity. This paper introduces a novel non-autoregressive framework for accent conversion that learns accent-agnostic linguistic representations and employs them to convert the accent in the source speech. Specifically, the proposed system aligns speech representations with linguistic representations obtained from Text-to-Speech (TTS) systems, enabling training of the accent voice conversion model on non-parallel data. Furthermore, we investigate the effectiveness of a pretraining strategy on native data and different acoustic features within our proposed framework. We conduct a comprehensive evaluation using both subjective and objective metrics to assess the performance of our approach. The evaluation results highlight the benefits of the pretraining strategy and the incorporation of richer semantic features, resulting in significantly enhanced audio quality and intelligibility.

preprint, 2022, arXiv

Efficient Re-parameterization Operations Search for Easy-to-Deploy Network Based on Directional Evolutionary Strategy

Structural re-parameterization (Rep) methods have achieved significant performance improvements on traditional convolutional networks. Most current Rep methods rely on prior knowledge to select the re-parameterization operations. However, the performance of the resulting architecture is limited by the types of operations and by that prior knowledge. To break this restriction, this work designs an improved re-parameterization search space that includes more types of re-parameterization operations. Concretely, the search space can further improve the performance of convolutional networks. To explore this search space effectively, an automatic re-parameterization enhancement strategy is designed based on neural architecture search (NAS), which can search for an excellent re-parameterization architecture. In addition, we visualize the output features of the architecture to analyze why the re-parameterization architecture takes the form it does. On public datasets, we achieve better results. Under the same training conditions as ResNet, we improve the accuracy of ResNet-50 by 1.82% on ImageNet-1k.

preprint, 2022, arXiv

Real transmission and reflection zeros of periodic structures with a bound state in the continuum

For lossless periodic structures with a proper symmetry, the transmission and reflection spectra often have peaks and dips that are truly $100\%$ and $0\%$, respectively. The full peaks and zero dips typically appear near resonant frequencies, and they are robust with respect to structural perturbations that preserve the required symmetry. However, current theories on the existence of full peaks and zero dips are incomplete and difficult to use. For periodic structures with a bound state in the continuum (BIC), we present a new theory on the existence of real transmission and reflection zeros that correspond to the zero dips in the transmission and reflection spectra. Our theory is relatively simple, complete, and easy to use. Numerical examples are presented to validate the new theory.

preprint, 2022, arXiv

RepNAS: Searching for Efficient Re-parameterizing Blocks

In recent years, neural architecture search (NAS) has improved significantly. However, searching for efficient networks remains challenging because of the gap between the search constraint and real inference time. To search for a high-performance network with low inference time, several previous works set a computational complexity constraint for the search algorithm. However, many factors affect inference speed (e.g., FLOPs, MACs), and the correlation between any single indicator and latency is not strong. Recently, some re-parameterization (Rep) techniques have been proposed to convert a multi-branch architecture into a single-path architecture that is inference-friendly. Nevertheless, multi-branch architectures are still human-defined and inefficient. In this work, we propose a new search space that is suitable for structural re-parameterization techniques. RepNAS, a one-stage NAS approach, is presented to efficiently search for the optimal diverse branch block (ODBB) for each layer under a branch number constraint. Our experimental results show that the searched ODBB can easily surpass the manually designed diverse branch block (DBB) with efficient training.
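The multi-branch to single-path conversion mentioned above rests on the linearity of convolution. The sketch below illustrates only that general principle, not the RepNAS search itself: it assumes a single-channel 1-D signal, a 3-tap branch, a 1-tap branch, and an identity shortcut, and it omits batch normalization. All function names are hypothetical.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 'same' cross-correlation with an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def multi_branch(signal, k3, k1):
    """Training-time form: 3-tap branch + 1-tap branch + identity shortcut."""
    a = conv1d_same(signal, k3)
    b = conv1d_same(signal, [k1])
    return [x + y + z for x, y, z in zip(a, b, signal)]


def merge_branches(k3, k1):
    """Fold the 1-tap branch and the identity into the centre tap of k3."""
    return [k3[0], k3[1] + k1 + 1.0, k3[2]]


def single_path(signal, k3, k1):
    """Inference-time form: one convolution with the merged kernel."""
    return conv1d_same(signal, merge_branches(k3, k1))
```

For any input, `multi_branch` and `single_path` produce the same output up to floating-point rounding, which is why the extra branches add no cost at inference time once they are merged.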

preprint, 2022, arXiv

VisualTTS: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over

In this paper, we formulate a novel task, denoted automatic voice over (AVO): synthesizing speech in sync with a silent pre-recorded video. Unlike traditional speech synthesis, AVO seeks to generate not only human-sounding speech, but also perfect lip-speech synchronization. A natural solution to AVO is to condition the speech rendering on the temporal progression of the lip sequence in the video. We propose a novel text-to-speech model that is conditioned on visual input, named VisualTTS, for accurate lip-speech synchronization. The proposed VisualTTS adopts two novel mechanisms: 1) textual-visual attention, and 2) a visual fusion strategy during acoustic decoding. Both contribute to forming an accurate alignment between the input text content and the lip motion in the input lip sequence. Experimental results show that VisualTTS achieves accurate lip-speech synchronization and outperforms all baseline systems.

preprint, 2021, arXiv

Transfer Learning from Speech Synthesis to Voice Conversion with Non-Parallel Training Data

This paper presents a novel framework for building a voice conversion (VC) system by learning from a text-to-speech (TTS) synthesis system, which we call TTS-VC transfer learning. We first develop a multi-speaker speech synthesis system with a sequence-to-sequence encoder-decoder architecture, where the encoder extracts robust linguistic representations of text, and the decoder, conditioned on the target speaker embedding, takes the context vectors and the attention recurrent network cell output to generate target acoustic features. We take advantage of the fact that a TTS system maps input text to speaker-independent context vectors, and reuse such a mapping to supervise the training of latent representations of an encoder-decoder voice conversion system. In the voice conversion system, the encoder takes speech instead of text as input, while the decoder is functionally similar to the TTS decoder. As we condition the decoder on the speaker embedding, the system can be trained on non-parallel data for any-to-any voice conversion. During voice conversion training, we present text and speech to the speech synthesis and voice conversion networks, respectively. At run-time, the voice conversion network uses its own encoder-decoder architecture. Experiments show that the proposed approach consistently outperforms two competitive voice conversion baselines, namely phonetic posteriorgram and variational autoencoder methods, in terms of speech quality, naturalness, and speaker similarity.

preprint, 2020, arXiv

Urban Anomaly Analytics: Description, Detection, and Prediction

Urban anomalies may result in loss of life or property if not handled properly. Automatically alerting on anomalies at an early stage, or even predicting them before they happen, is of great value to populations. Recently, data-driven urban anomaly analysis frameworks have been taking shape, which utilize urban big data and machine learning algorithms to detect and predict urban anomalies automatically. In this survey, we make a comprehensive review of the state-of-the-art research on urban anomaly analytics. We first give an overview of four main types of urban anomalies: traffic anomalies, unexpected crowds, environment anomalies, and individual anomalies. Next, we summarize various types of urban datasets obtained from diverse devices, i.e., trajectories, trip records, CDRs, urban sensors, event records, environment data, social media and surveillance cameras. Subsequently, a comprehensive survey of issues in detection and prediction techniques for urban anomalies is presented. Finally, research challenges and open problems are discussed.
