Source author record

Xiaohui Zhang

Xiaohui Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

31 works

22 topics

4 close collaborators

preprint · 2026 · arXiv

Adaptive Robust Control for Uncertain Systems with Ellipsoid-Set Learning

Despite the celebrated success of stochastic control approaches for uncertain systems, such approaches are limited in the ability to handle non-Gaussian uncertainties. This work presents an adaptive robust control for linear uncertain systems, whose process noise, observation noise, and system states are depicted by ellipsoid sets rather than Gaussian distributions. We design an ellipsoid-set learning method to estimate the boundaries of state sets, and incorporate the learned sets into the control law derivation to reduce conservativeness in robust control. Further, we consider the parametric uncertainties in state-space matrices. Particularly, we assign finite candidates for the uncertain parameters, and construct a bank of candidate-conditional robust control problems for each candidate. We derive the final control law by aggregating the candidate-conditional control laws. In this way, we separate the control scheme into parallel robust controls, decoupling the learning and control, which otherwise renders the control unattainable. We demonstrate the effectiveness of the proposed control in numerical simulations in the cases of linear quadratic regulation and tracking control.

preprint · 2026 · arXiv

WavFlow: Audio Generation in Waveform Space

Modern audio generation predominantly relies on latent-space compression, introducing additional complexity and potential information loss. In this work, we challenge this paradigm with WavFlow, a framework that generates high-fidelity audio directly in raw waveform space without intermediate representations. To overcome the inherent difficulties of modeling high-dimensional and low-energy signals, we reshape audio into 2D token grids through waveform patchify and introduce amplitude lifting to align signal scales, enabling stable optimization via direct x-prediction in flow matching. To capture complex semantic alignment and temporal synchronization, we leverage an automated data pipeline to curate 5 million high-quality video-text-audio triplets, allowing the model to learn fine-grained acoustic patterns from scratch. Experimental results show that WavFlow achieves competitive performance on the video-to-audio benchmark VGGSound (FD_PaSST: 59.98, IS_PANNs: 17.40, DeSync: 0.44) and the text-to-audio benchmark AudioCaps (FD_PANNs: 10.63, IS_PANNs: 12.62), matching or exceeding the performance of established latent-based methods. Our work demonstrates that intermediate compression is not a prerequisite for high-quality synthesis, offering a simpler and more scalable alternative for multimodal audio generation.

preprint · 2022 · arXiv

A Consensus Algorithm Based on Risk Assessment Model for Permissioned Blockchain

Blockchain technology enables stakeholders to conduct trusted data sharing and exchange without a trusted centralized institution. These features make blockchain applications attractive for enhancing trustworthiness in very different contexts. Due to its unique design concepts and outstanding performance, blockchain has become a popular research topic in industry and academia in recent years. Every participant is anonymous in a permissionless blockchain, represented by cryptocurrency applications such as Bitcoin. In this situation, special incentive mechanisms, such as mined native cryptocurrency, are applied to solve the trust issues of permissionless blockchains. In many use cases, permissionless blockchains have bottlenecks in transaction throughput, which restricts further application in the real world. A permissioned blockchain can reach a consensus among a group of entities that do not fully trust one another. Unlike in permissionless blockchains, the participants must be identified in permissioned blockchains. By relying on traditional crash fault-tolerant consensus protocols, permissioned blockchains can achieve high transaction throughput and low latency without sacrificing security. However, how to balance security and consensus efficiency is still an issue that urgently needs to be solved in permissioned blockchains. As the core module of blockchain technology, the consensus algorithm plays a vital role in the performance of the blockchain system. Thus, this paper proposes a new consensus algorithm for permissioned blockchains, the Risk Assessment-based Consensus protocol (RAC), which combines the decentralized design concept with a risk-node assessment mechanism to address the imbalance among speed, scalability, and security.

preprint · 2022 · arXiv

A novel adversarial learning strategy for medical image classification

Deep learning (DL) techniques have been extensively utilized for medical image classification. Most DL-based classification networks are generally structured hierarchically and optimized through the minimization of a single loss function measured at the end of the networks. However, such a single loss design could potentially lead to optimization of one specific value of interest but fail to leverage informative features from intermediate layers that might benefit classification performance and reduce the risk of overfitting. Recently, auxiliary convolutional neural networks (AuxCNNs) have been employed on top of traditional classification networks to facilitate the training of intermediate layers to improve classification performance and robustness. In this study, we proposed an adversarial learning-based AuxCNN to support the training of deep neural networks for medical image classification. Two main innovations were adopted in our AuxCNN classification framework. First, the proposed AuxCNN architecture includes an image generator and an image discriminator for extracting more informative image features for medical image classification, motivated by the concept of generative adversarial network (GAN) and its impressive ability in approximating target data distribution. Second, a hybrid loss function is designed to guide the model training by incorporating different objectives of the classification network and AuxCNN to reduce overfitting. Comprehensive experimental studies demonstrated the superior classification performance of the proposed model. The effect of the network-related factors on classification performance was investigated.

preprint · 2022 · arXiv

Omni-sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR via Supernet

From wearables to powerful smart devices, modern automatic speech recognition (ASR) models run on a variety of edge devices with different computational budgets. To navigate the Pareto front of model accuracy vs model size, researchers are trapped in a dilemma of optimizing model accuracy by training and fine-tuning models for each individual edge device while keeping the training GPU-hours tractable. In this paper, we propose Omni-sparsity DNN, where a single neural network can be pruned to generate optimized models for a large range of model sizes. We develop training strategies for Omni-sparsity DNN that allow it to find models along the Pareto front of word-error-rate (WER) vs model size while keeping the training GPU-hours to no more than that of training a single model. We demonstrate the Omni-sparsity DNN with streaming E2E ASR models. Our results show great savings in training time and resources with similar or better accuracy on LibriSpeech compared to individually pruned sparse models: 2%-6.6% better WER on Test-other.

preprint · 2022 · arXiv

Parameterized Colorings And Labellings Of Graphs In Topological Coding

The coming of quantum computation is forcing us to reexamine the cryptosystems people use. We apply graph colorings of topological coding to modern information security and to future cryptography against supercomputer and quantum computer attacks in the near future. Many of the techniques introduced here are associated with mathematical conjectures and NP-problems. We introduce a group of W-constraint (k,d)-total colorings, and algorithms for realizing these colorings in some kinds of graphs, which are used to quickly make public keys and private keys that resist quantum computing. These (k,d)-total colorings are: graceful (k,d)-total colorings, harmonious (k,d)-total colorings, (k,d)-edge-magic total colorings, (k,d)-graceful-difference total colorings and (k,d)-felicitous-difference total colorings. One of the useful tools we use is the Topcode-matrix, whose elements can be all sorts of things, for example, sets, graphs, and number-based strings. Most of the parameterized graphic colorings/labelings are defined here by Topcode-matrix algebra. From the application point of view, many of our coloring techniques are given by algorithms and are easily converted into programs.

preprint · 2021 · arXiv

On BiHom-analogue of generalized Lie algebras

In this paper, we introduce the definition of generalized BiHom-Lie algebras and generalized BiHom-Lie admissible algebras in the category ${}_H{\mathcal M}$ of left modules for any quasitriangular Hopf algebra $(H, R) $. Also, we describe the BiHom-Lie ideal structures of the BiHom-associative algebras.

preprint · 2020 · arXiv

A Gromov Hyperbolic metric and Möbius transformations

We compare a Gromov hyperbolic metric with the hyperbolic metric in the unit ball or in the upper half space, and prove sharp comparison inequalities between the Gromov hyperbolic metric and some hyperbolic type metrics. We also obtain several sharp distortion inequalities for the Gromov hyperbolic metric under some families of Möbius transformations.

preprint · 2020 · arXiv

Deja-vu: Double Feature Presentation and Iterated Loss in Deep Transformer Networks

Deep acoustic models typically receive features in the first layer of the network, and process increasingly abstract representations in the subsequent layers. Here, we propose to feed the input features at multiple depths in the acoustic model. As our motivation is to allow acoustic models to re-examine their input features in light of partial hypotheses, we introduce intermediate model heads and loss functions. We study this architecture in the context of deep Transformer networks, and we use an attention mechanism over both the previous layer activations and the input features. To train this model's intermediate output hypothesis, we apply the objective function at each layer right before feature re-use. We find that the use of such iterated loss significantly improves performance by itself, as well as enabling input feature re-use. We present results on both Librispeech and a large-scale video dataset, with relative improvements of 10-20% for Librispeech and 3.2-13% for videos.

preprint · 2020 · arXiv

Faster, Simpler and More Accurate Hybrid ASR Systems Using Wordpieces

In this work, we first show that on the widely used LibriSpeech benchmark, our transformer-based context-dependent connectionist temporal classification (CTC) system produces state-of-the-art results. We then show that using wordpieces as modeling units combined with CTC training, we can greatly simplify the engineering pipeline compared to conventional frame-based cross-entropy training by excluding all the GMM bootstrapping, decision tree building and force alignment steps, while still achieving very competitive word-error-rate. Additionally, using wordpieces as modeling units can significantly improve runtime efficiency since we can use larger stride without losing accuracy. We further confirm these findings on two internal VideoASR datasets: German, which is similar to English as a fusional language, and Turkish, which is an agglutinative language.

preprint · 2020 · arXiv

Multilingual Graphemic Hybrid ASR with Massive Data Augmentation

Towards developing high-performing ASR for low-resource languages, two approaches to address the lack of resources are to make use of data from multiple languages, and to augment the training data by creating acoustic variations. In this work we present a single grapheme-based ASR model learned on 7 geographically proximal languages, using standard hybrid BLSTM-HMM acoustic models with a lattice-free MMI objective. We build the single ASR grapheme set by taking the union over each language-specific grapheme set, and we find that such a multilingual graphemic hybrid ASR model can perform language-independent recognition on all 7 languages, and substantially outperform each monolingual ASR model. Secondly, we evaluate the efficacy of multiple data augmentation alternatives within a language, as well as their complementarity with multilingual modeling. Overall, we show that the proposed multilingual graphemic hybrid ASR with various data augmentation can not only recognize any language within the training set, but also provide large ASR performance improvements.

preprint · 2020 · arXiv

On cyclic quadrilaterals in euclidean and hyperbolic geometries

Four points ordered in the positive order on the unit circle determine the vertices of a quadrilateral, which is considered either as a euclidean or as a hyperbolic quadrilateral depending on whether the lines connecting the vertices are euclidean or hyperbolic lines. In the case of hyperbolic lines, quadrilaterals of this type are called ideal quadrilaterals. Our main result gives a euclidean counterpart of an earlier result on the hyperbolic distances between the opposite sides of ideal quadrilaterals. The proof is based on computations involving hyperbolic geometry. We also found a new formula for the hyperbolic midpoint of a hyperbolic geodesic segment in the unit disk. As an application of some geometric properties, we provided a euclidean construction of the symmetrization of four random points on the unit circle with respect to a diameter which preserves the absolute cross ratio of quadruples.

preprint · 2020 · arXiv

On split regular Hom-Leibniz-Rinehart algebras

In this paper, we introduce the notion of the Hom-Leibniz-Rinehart algebra as an algebraic analogue of the Hom-Leibniz algebroid, and prove that an arbitrary split regular Hom-Leibniz-Rinehart algebra $L$ is of the form $L=U+\sum_{\gamma} I_{\gamma}$ with $U$ a subspace of a maximal abelian subalgebra $H$ and any $I_{\gamma}$ a well described ideal of $L$, satisfying $[I_{\gamma}, I_{\delta}]= 0$ if $[\gamma]\neq [\delta]$. In the sequel, we develop techniques of connections of roots and weights for split Hom-Leibniz-Rinehart algebras respectively. Finally, we study the structures of tight split regular Hom-Leibniz-Rinehart algebras.

preprint · 2020 · arXiv

The Hom-Long dimodule category and nonlinear equations

In this paper, we construct a new kind of braided monoidal category over two Hom-Hopf algebras $(H,\alpha)$ and $(B,\beta)$ and associate it with two nonlinear equations. We first introduce the notion of an $(H,B)$-Hom-Long dimodule and show that the Hom-Long dimodule category $^{B}_{H} \mathbb{L}$ is an autonomous category. Second, we prove that the category $^{B}_{H} \mathbb{L}$ is a braided monoidal category if $(H,\alpha)$ is quasitriangular and $(B,\beta)$ is coquasitriangular, and get a solution of the quantum Yang-Baxter equation. Also, we show that the category $^{B}_{H} \mathbb{L}$ can be viewed as a subcategory of the Hom-Yetter-Drinfeld category $^{H\otimes B}_{H\otimes B} \mathbb{HYD}$. Finally, we obtain a solution of the Hom-Long equation from the Hom-Long dimodules.

preprint · 2020 · arXiv

Transformer-based Acoustic Modeling for Hybrid Speech Recognition

We propose and evaluate transformer-based acoustic models (AMs) for hybrid speech recognition. Several modeling choices are discussed in this work, including various positional embedding methods and an iterated loss to enable training deep transformers. We also present a preliminary study of using limited right context in transformer models, which makes streaming applications possible. We demonstrate that on the widely used Librispeech benchmark, our transformer-based AM outperforms the best published hybrid result by 19% to 26% relative when the standard n-gram language model (LM) is used. Combined with a neural network LM for rescoring, our proposed approach achieves state-of-the-art results on Librispeech. Our findings are also confirmed on a much larger internal dataset.

preprint · 2016 · arXiv

Pivotal and Ribbon Entwining Datums

Let $(C,A,φ)$ be an entwining structure over $k$. In this paper, we introduce the notions of the pivotal entwined datums and ribbon entwined datums to generalize (co)pivotal Hopf algebras and (co)ribbon Hopf algebras. These notions give necessary and sufficient conditions for the category of entwined modules to be a pivotal category and ribbon category.

preprint · 2016 · arXiv

Smash coproducts of bicomonads and Hom-entwining structures

Let $F,G$ be bicomonads on a monoidal category $\mathcal{C}$. The aim of this paper is to discuss the smash coproducts of $F$ and $G$. As an application, the smash coproduct of Hom-bialgebras is discussed. Further, the Hom-entwining structure and Hom-entwined modules are investigated.

preprint · 2015 · arXiv

Constructing New Braided $T$-Categories via Weak Monoidal Hom-Hopf Algebras

In this paper, we define and study weak monoidal Hom-Hopf algebras, which generalize both weak Hopf algebras and monoidal Hom-Hopf algebras. Let $H$ be a weak monoidal Hom-Hopf algebra with bijective antipode and let $Aut_{wmHH}(H)$ be the set of all automorphisms of $H$. We then introduce a category ${_{H}\mathcal{WMHYD}^{H}}(\alpha,\beta)$ with $\alpha,\beta\in Aut_{wmHH}(H)$ and construct a braided $T$-category $\mathcal{WMHYD}(H)$ having all the categories ${_{H}\mathcal{WMHYD}^{H}}(\alpha,\beta)$ as components.

preprint · 2015 · arXiv

Geometry of the Cassinian metric and its inner metric

The Cassinian metric and its inner metric have been studied for subdomains of the $n$-dimensional Euclidean space $\mathbb{R}^n$ ($n\ge 2$) by the first named author. In this paper we obtain various inequalities between the Cassinian metric and other related metrics in some specific subdomains of $\mathbb{R}^n$. Also, a sharp distortion property of the Cassinian metric under Möbius transformations of the unit ball is obtained.

preprint · 2015 · arXiv

On isometries of conformally invariant metric

We prove that isometries in a conformally invariant metric of a general domain are quasiconformal. In the particular case of the punctured space, we prove that isometries in this metric are Möbius, thus resolving a conjecture of Ferrand, Martin and Vuorinen [FMV, p. 200] in this particular case.

preprint · 2015 · arXiv

On the structure theorem and the Maschke type theorem of Doi Hom-Hopf modules

We give necessary and sufficient conditions for the functor that forgets the $(C, \gamma)$-coaction to be separable. This leads to a generalized notion of integrals. Finally, the applications of our results are considered.

preprint · 2015 · arXiv

Parallel training of DNNs with Natural Gradient and Parameter Averaging

We describe the neural-network training framework used in the Kaldi speech recognition toolkit, which is geared towards training DNNs with large amounts of training data using multiple GPU-equipped or multi-core machines. In order to be as hardware-agnostic as possible, we needed a way to use multiple machines without generating excessive network traffic. Our method is to average the neural network parameters periodically (typically every minute or two), and redistribute the averaged parameters to the machines for further training. Each machine sees different data. By itself, this method does not work very well. However, we have another method, an approximate and efficient implementation of Natural Gradient for Stochastic Gradient Descent (NG-SGD), which seems to allow our periodic-averaging method to work well, as well as substantially improving the convergence of SGD on a single machine.

preprint · 2014 · arXiv

Braided monoidal categories and Doi Hopf modules for monoidal Hom-Hopf algebras

We first introduce the notion of Doi Hom-Hopf modules and find a sufficient condition for the category of Doi Hom-Hopf modules to be monoidal. Also we obtain the condition for the monoidal Hom-algebra and monoidal Hom-coalgebra to be monoidal Hom-bialgebras. Second, we show that maps between the underlying monoidal Hom-Hopf algebras, Hom-comodule algebras and Hom-module coalgebras give rise to functors between the categories of Doi Hom-Hopf modules, and we study tensor identities for monoidal categories of Doi Hom-Hopf modules. Furthermore, we construct a braiding on the category of Doi Hom-Hopf modules. Finally, as an application of our theory, we consider the braiding on the category of Hom-modules, the category of Hom-comodules and the category of Hom-Yetter-Drinfeld modules respectively.

preprint · 2014 · arXiv

Drinfeld twists for monoidal Hom-bialgebras

The aim of this paper is to define and study Drinfeld twists for monoidal Hom-bialgebras. We show that a new Hom-bialgebra can be constructed by changing the coproduct of a monoidal Hom-bialgebra via a Drinfeld twist, and that this construction preserves $R$-matrices if one exists. Moreover, their representation categories are monoidally isomorphic.

preprint · 2014 · arXiv

Relative Hom-Hopf modules and total integrals

Let $(H, \alpha)$ be a monoidal Hom-Hopf algebra and $(A, \beta)$ a right $(H, \alpha)$-Hom-comodule algebra. We first investigate the criterion for the existence of a total integral of $(A, \beta)$ in the setting of monoidal Hom-Hopf algebras. Also we prove that there exists a total integral $\phi: (H, \alpha)\rightarrow (A, \beta)$ if and only if any representation of the pair $(H,A)$ is injective in a functorial way, as a corepresentation of $(H, \alpha)$, which generalizes Doi's result. Finally, we define a total quantum integral $\gamma: H\rightarrow Hom(H, A)$ and prove the following affineness criterion: if there exists a total quantum integral $\gamma$ and the canonical map $\psi: A\otimes_{B}A\rightarrow A\otimes H,\ \ a\otimes_{B}b\mapsto \beta^{-1}(a)b_{[0]}\otimes\alpha(b_{[1]})$ is surjective, then the induction functor $A\otimes_B-: \widetilde{\mathscr{H}}(\mathscr{M}_k)_{B}\rightarrow \widetilde{\mathscr{H}}(\mathscr{M}_k)^{H}_{A}$ is an equivalence of categories.

preprint · 2013 · arXiv

A Power Mean Inequality involving the complete elliptic integrals

In this paper the authors investigate a power mean inequality for a special function which is defined by the complete elliptic integrals.

preprint · 2013 · arXiv

Distortion of quasiconformal mappings with identity boundary values

Teichmüller's classical mapping problem for plane domains concerns finding a lower bound for the maximal dilatation of a quasiconformal homeomorphism which holds the boundary pointwise fixed, maps the domain onto itself, and maps a given point of the domain to another given point of the domain. For a domain $D \subset {\mathbb R}^n$, $n\ge 2$, we consider the class of all $K$-quasiconformal maps of $D$ onto itself with identity boundary values and Teichmüller's problem in this context. Given a map $f$ of this class and a point $x\in D$, we show that the maximal dilatation of $f$ has a lower bound in terms of the distance between $x$ and $f(x)$ in the distance ratio metric. For instance, convex domains, bounded domains and domains with uniformly perfect boundaries are studied.

preprint · 2013 · arXiv

On exterior moduli of quadrilaterals and special functions

In this paper two identities involving a function defined by the complete elliptic integrals of the first and second kinds are proved. Some functional inequalities and elementary estimates for this function are also derived from the properties of monotonicity and convexity of this function. As applications, some functional inequalities and the growth of the exterior modulus of a rectangle are studied.

preprint · 2013 · arXiv

Quasihyperbolic metric and Möbius transformations

An improved version of quasiinvariance property of the quasihyperbolic metric under Möbius transformations of the unit ball in ${\mathbb R}^n, n \ge 2,$ is given. Next, a quasiinvariance property, sharp in a local sense, of the quasihyperbolic metric under quasiconformal mappings is proved. Finally, several inequalities between the quasihyperbolic metric and other commonly used metrics such as the hyperbolic metric of the unit ball and the chordal metric are established.

preprint · 2013 · arXiv

Topics in special functions III

The authors survey recent results in special functions of classical analysis and geometric function theory, in particular the circular and hyperbolic functions, the gamma function, the elliptic integrals, the Gaussian hypergeometric function, power series, and mean values.

preprint · 2012 · arXiv

Inequalities for the generalized trigonometric and hyperbolic functions

The generalized trigonometric functions occur as eigenfunctions of the Dirichlet problem for the one-dimensional $p$-Laplacian. The generalized hyperbolic functions are defined similarly. Some classical inequalities for trigonometric and hyperbolic functions, such as the Mitrinović-Adamović inequality, Lazarević's inequality, Huygens-type inequalities, Wilker-type inequalities, and Cusa-Huygens-type inequalities, are generalized to the case of generalized functions.
