Researcher profile

Ke Wu

Ke Wu contributes to research discovery and scholarly infrastructure.

Researcher, affiliation not imported, open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, unclaimed author
12 works
0 followers
13 topics
4 close collaborators

Published work

12 published items

Preprint, 2026, arXiv

Benchmarking LLMs on the Massive Sound Embedding Benchmark (MSEB)

The Massive Sound Embedding Benchmark (MSEB) has emerged as a standard for evaluating the functional breadth of audio models. While initial baselines focused on specialized encoders, the shift toward "audio-native" Large Language Models (LLMs) suggests a new paradigm where a single multimodal backbone may replace complex, task-specific pipelines. This paper provides a rigorous empirical evaluation of leading LLMs, including members of the Gemini and GPT families, across the eight core MSEB capabilities to assess their efficacy and audio-text parity. Our results indicate that while a significant modality gap persists regarding performance and robustness, the empirical evidence for an "optimal" modeling approach remains inconclusive. Ultimately, the choice between audio-native and cascaded architectures depends heavily on specific use-case requirements and the underlying assumptions regarding latency, cost, and reasoning depth.

Preprint, 2026, arXiv

Why AI Alignment Failure Is Structural: Learned Human Interaction Structures and AGI as an Endogenous Evolutionary Shock

Recent reports of large language models (LLMs) exhibiting behaviors such as deception, threats, or blackmail are often interpreted as evidence of alignment failure or emergent malign agency. We argue that this interpretation rests on a conceptual error. LLMs do not reason morally; they statistically internalize the record of human social interaction, including laws, contracts, negotiations, conflicts, and coercive arrangements. Behaviors commonly labeled as unethical or anomalous are therefore better understood as structural generalizations of interaction regimes that arise under extreme asymmetries of power, information, or constraint. Drawing on relational models theory, we show that practices such as blackmail are not categorical deviations from normal social behavior, but limiting cases within the same continuum that includes market pricing, authority relations, and ultimatum bargaining. The surprise elicited by such outputs reflects an anthropomorphic expectation that intelligence should reproduce only socially sanctioned behavior, rather than the full statistical landscape of behaviors humans themselves enact. Because human morality is plural, context-dependent, and historically contingent, the notion of a universally moral artificial intelligence is ill-defined. We therefore reframe concerns about artificial general intelligence (AGI). The primary risk is not adversarial intent, but AGI's role as an endogenous amplifier of human intelligence, power, and contradiction. By eliminating longstanding cognitive and institutional frictions, AGI compresses timescales and removes the historical margin of error that has allowed inconsistent values and governance regimes to persist without collapse. Alignment failure is thus structural, not accidental, and requires governance approaches that address amplification, complexity, and regime stability rather than model-level intent alone.

Preprint, 2025, arXiv

Collaborative Continuum Robots: A Survey

Continuum robots (CRs), owing to their compact structure, inherent compliance, and flexible deformation, have been widely applied in various fields. By coordinating multiple CRs to form collaborative continuum robots (CCRs), task adaptability, workspace, flexibility, load capacity, and operational stability can be further improved, thus offering significant advantages. In recent years, interest in this emerging field has grown steadily within the continuum-robotics community, accompanied by a consistent rise in related publications. By presenting a comprehensive overview of recent progress from different system-architecture levels, this survey provides a clear framework for research on CCRs. First, CCRs are classified into the three collaboration modes of separated collaboration, assistance collaboration, and parallel collaboration, with definitions provided. Next, advances in structural design, modeling, motion planning, and control for each mode are systematically summarized. Finally, current challenges and future opportunities for CCRs are discussed.

Preprint, 2023, arXiv

3D Bosons and $W_{1+\infty}$ algebra

In this paper, we consider 3D Young diagrams with at most $N$ layers in the $z$-axis direction, which can be constructed from $N$ 2D Young diagrams on the slices $z=j$, $j=1,2,\cdots,N$, from the Yang-Baxter equation. Using the 2D Bosons $\{a_{j,m},\ m\in\mathbb{Z}\}$ associated with the 2D Young diagrams on the slice $z=j$, we construct 3D Bosons. We then give the 3D Boson representation of the $W_{1+\infty}$ algebra, and the Littlewood-Richardson rule for 3-Jack polynomials from the actions of 3D Bosons on 3D Young diagrams.

Preprint, 2023, arXiv

Summative Student Course Review Tool Based on Machine Learning Sentiment Analysis to Enhance Life Science Feedback Efficacy

Machine learning enables the development of new, supplemental, and empowering tools that can either expand existing technologies or invent new ones. In education, space exists for a tool that supports generic student course review formats to organize and recapitulate students' views on the pedagogical practices to which they are exposed. Often, student opinions are gathered with a general comment section that solicits their feelings towards their courses without polling specifics about course contents. Herein, we show a novel approach to summarizing and organizing students' opinions via analyzing their sentiment towards a course as a function of the language/vocabulary used to convey their opinions about a class and its contents. This analysis is derived from their responses to a general comment section encountered at the end of post-course review surveys. This analysis, accomplished with Python, LaTeX, and Google's Natural Language API, allows for the conversion of unstructured text data into both general and topic-specific sub-reports that convey students' views in a unique, novel way.

Preprint, 2022, arXiv

3-form Yang-Mills based on 2-crossed modules

In this paper, we study the higher Yang-Mills theory in the framework of higher gauge theory. It was shown that the 2-form electromagnetism can be generalized to the 2-form Yang-Mills theory with the group $U(1)$ replaced by a crossed module of Lie groups. To extend this theory to even higher structure, we develop a 3-form Yang-Mills theory with a 2-crossed module of Lie groups. First, we give an explicit construction of non-degenerate symmetric $G$-invariant forms on the 2-crossed module of Lie algebras. Then, we derive the 3-Bianchi-Identities for 3-curvatures. Finally, we create a 3-form Yang-Mills action and obtain the corresponding field equations.

Preprint, 2022, arXiv

Global Normalization for Streaming Speech Recognition in a Modular Framework

We introduce the Globally Normalized Autoregressive Transducer (GNAT) for addressing the label bias problem in streaming speech recognition. Our solution admits a tractable exact computation of the denominator for the sequence-level normalization. Through theoretical and empirical results, we demonstrate that by switching to a globally normalized model, the word error rate gap between streaming and non-streaming speech-recognition models can be greatly reduced (by more than 50% on the LibriSpeech dataset). This model is developed in a modular framework which encompasses all the common neural speech recognition models. The modularity of this framework enables controlled comparison of modelling choices and creation of new models.

Preprint, 2022, arXiv

Transformer-based Models of Text Normalization for Speech Applications

Text normalization, or the process of transforming text into a consistent, canonical form, is crucial for speech applications such as text-to-speech synthesis (TTS). In TTS, the system must decide whether to verbalize "1995" as "nineteen ninety five" in "born in 1995" or as "one thousand nine hundred ninety five" in "page 1995". We present an experimental comparison of various Transformer-based sequence-to-sequence (seq2seq) models of text normalization for speech and evaluate them on a variety of datasets of written text aligned to its normalized spoken form. These models include variants of the 2-stage RNN-based tagging/seq2seq architecture introduced by Zhang et al. (2019), where we replace the RNN with a Transformer in one or more stages, as well as vanilla Transformers that output string representations of edit sequences. Of our approaches, using Transformers for sentence context encoding within the 2-stage model proved most effective, with the fine-tuned BERT encoder yielding the best performance.
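The "1995" ambiguity described in this abstract can be illustrated with a small rule-based sketch. This is not the Transformer-based approach the paper evaluates. It is a hand-written heuristic for illustration only, and every name in it (`YEAR_CUES`, `cardinal`, `year_reading`, `verbalize`) is hypothetical, as is the assumption that a preceding cue word such as "in" signals a year.

```python
# Illustrative rule-based verbalizer (an assumption for this example,
# not the paper's method): a 4-digit number is read as a year when the
# preceding word is a year cue, otherwise as a cardinal number.

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
YEAR_CUES = {"in", "since", "year"}  # hypothetical cue words


def two_digits(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def cardinal(n: int) -> str:
    """Spell out an integer in the range 0..9999 as a cardinal number."""
    parts = []
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    if thousands:
        parts.append(f"{ONES[thousands]} thousand")
    if hundreds:
        parts.append(f"{ONES[hundreds]} hundred")
    if rest or not parts:
        parts.append(two_digits(rest))
    return " ".join(parts)


def year_reading(n: int) -> str:
    """Spell out a year in the range 1100..1999 as two digit pairs."""
    high, low = divmod(n, 100)
    if low == 0:
        return f"{two_digits(high)} hundred"
    if low < 10:
        return f"{two_digits(high)} oh {ONES[low]}"
    return f"{two_digits(high)} {two_digits(low)}"


def verbalize(text: str) -> str:
    """Replace 4-digit numbers in whitespace-separated text with words."""
    words = text.split()
    out = []
    for i, word in enumerate(words):
        if word.isdecimal() and len(word) == 4:
            n = int(word)
            prev = words[i - 1].lower() if i > 0 else ""
            if prev in YEAR_CUES and 1100 <= n <= 1999:
                out.append(year_reading(n))
            else:
                out.append(cardinal(n))
        else:
            out.append(word)
    return " ".join(out)


print(verbalize("born in 1995"))
print(verbalize("page 1995"))
```

A fixed cue list like this breaks down quickly (for example, "in 1995 cases"), which is the kind of context sensitivity that motivates learned sentence-context encoders.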

Preprint, 2021, arXiv

W-representation of Rainbow tensor model

We analyze the rainbow tensor model and present the Virasoro constraints, where the constraint operators obey the Witt algebra and null 3-algebra. We generalize the method of W-representation in matrix models to the rainbow tensor model, where the operators preserving and increasing the grading play a crucial role. It is shown that the rainbow tensor model can be realized by acting on an elementary function with the exponent of the operator increasing the grading. We derive the compact expression of the correlators and apply it to several models, namely the red tensor model, the Aristotelian tensor model and the r=4 rainbow tensor model. Furthermore, we discuss the case of the non-Gaussian red tensor model and present a dual expression for the partition function through differentiation.

Preprint, 2020, arXiv

Correlators in the supereigenvalue model in the Ramond sector

We investigate the supereigenvalue model in the Ramond sector. We prove that its partition function can be obtained by acting on elementary functions with exponents of the given operators. The Virasoro constraints for this supereigenvalue model are presented. The remarkable property of these bosonic constraint operators is that they obey the Witt algebra and null 3-algebra. The compact expression of correlators can be derived from these Virasoro constraints.

Preprint, 2019, arXiv

Non-rigid Registration Method between 3D CT Liver Data and 2D Ultrasonic Images based on Demons Model

Non-rigid registration between CT data and ultrasonic images of the liver can facilitate diagnosis and treatment, and has been widely studied in recent years. To improve the accuracy of the Demons model for non-rigid registration between 3D CT liver data and 2D ultrasonic images, we propose a novel boundary extraction and enhancement method based on radial directional local intuitionistic fuzzy entropy in polar coordinates, together with a new registration workflow. Experiments show that our method achieves high-accuracy registration results, and that its accuracy is higher than that of both the original Demons method and the Demons method using ultrasonic images simulated by Field II. The operation time of our registration workflow is about 30 seconds, so it can be used during surgery.