Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
24 works
0 followers
21 topics
4 close collaborators

Published work

24 published item(s)

preprint 2026 arXiv

CAGS: Color-Adaptive Volumetric Video Streaming with Dynamic 3D Gaussian Splatting

Volumetric video (VV) streaming enables real-time, immersive access to remote 3D environments, powering telepresence, ecological monitoring, and robotic teleoperation. These applications turn VV streaming into a real-time interface to remote physical environments, imposing new system-level demands for photorealistic scene representation, low-latency interaction, and robust performance under heterogeneous networks. 3D Gaussian Splatting (3DGS) has been widely used for real-time photorealistic rendering, offering superior visual quality and rendering performance, but it faces challenges due to bandwidth consumption. Furthermore, as the foundation of adaptive VV streaming, existing Levels of Detail (LoD) methods based on density are not well-suited to Gaussian representations, leading to visible gaps and severe quality degradation. Recent studies have also explored attribute compression techniques to reduce bandwidth consumption. Our preliminary studies reveal that aggressive attribute compression primarily causes color distortion, which can be effectively corrected in the rendered image using a reference image. Motivated by these findings, we propose a novel Color-Adaptive scheme for adaptive VV streaming that uses vector quantization (VQ) to establish LoDs and correct color distortions with low-resolution reference images. We further present CAGS, an adaptive VV streaming system compatible with diverse Gaussian representations, which integrates the Color-Adaptive scheme by rendering reference images on the streaming server and performing color restoration on the client. Extensive experiments on our prototype system demonstrate that CAGS outperforms the existing adaptive streaming systems in PSNR by 5$\sim$20 dB under fluctuating bandwidth, operates significantly faster than existing scalable Gaussian compression methods, and generalizes across different Gaussian representations.

preprint 2026 arXiv

MesonGS++: Post-training Compression of 3D Gaussian Splatting with Hyperparameter Searching

3D Gaussian Splatting (3DGS) achieves high-quality novel view synthesis with real-time rendering, but its storage cost remains prohibitive for practical deployment. Existing post-training compression methods still rely on many coupled hyperparameters across pruning, transformation, quantization, and entropy coding, making it difficult to control the final compressed size and fully exploit the rate-distortion trade-off. We propose MesonGS++, a size-aware post-training codec for 3D Gaussian compression. On the codec side, MesonGS++ combines joint importance-based pruning, octree geometry coding, attribute transformation, selective vector quantization for higher-degree spherical harmonics, and group-wise mixed-precision quantization with entropy coding. On the configuration side, it treats the reserve ratio and bit-width allocation as the dominant rate-distortion knobs and jointly optimizes them under a target storage budget via discrete sampling and 0--1 integer linear programming. We further propose a linear size estimator and a CUDA parallel quantization operator to accelerate the hyperparameter searching process. Extensive experiments show that MesonGS++ achieves over 34$\times$ compression while preserving rendering fidelity, outperforming state-of-the-art post-training methods and accurately meeting target size budgets. Remarkably, without any training, MesonGS++ can even surpass the PSNR of vanilla 3DGS at a 20$\times$ compression rate on the Stump scene. Our code is available at https://github.com/mmlab-sigs/mesongs_plus

preprint 2023 arXiv

Weighted EF1 Allocations for Indivisible Chores

We study how to fairly allocate a set of indivisible chores to a group of agents, where each agent $i$ has a non-negative weight $w_i$ that represents its obligation for undertaking the chores. We consider the fairness notion of weighted envy-freeness up to one item (WEF1) and propose an efficient picking sequence algorithm for computing WEF1 allocations. Our analysis is based on a natural and powerful continuous interpretation for the picking sequence algorithms in the weighted setting, which might be of independent interest. Using this interpretation, we establish the necessary and sufficient conditions under which picking sequence algorithms can guarantee other fairness notions in the weighted setting. We also study the existence of fair and efficient allocations and propose efficient algorithms for the computation of WEF1 and PO allocations for the bi-valued instances. Our result generalizes that of Garg et al. (AAAI 2022) and Ebadian et al. (AAMAS 2022) to the weighted setting. Our work also studies the price of fairness for WEF1, and the implications of WEF1 to other fairness notions.
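
The WEF1 condition named in this abstract can be made concrete with a small checker. This is a minimal sketch, not the paper's picking sequence algorithm: it assumes the usual chores formulation in which agent i does not envy agent j if, after removing one chore e from i's bundle X_i, c_i(X_i \ {e}) / w_i <= c_i(X_j) / w_j. The function name `is_wef1_chores` and the data layout are hypothetical.

```python
from typing import List, Sequence


def is_wef1_chores(
    cost: Sequence[Sequence[float]],
    weights: Sequence[float],
    alloc: Sequence[List[int]],
) -> bool:
    """Check weighted envy-freeness up to one item for chores.

    cost[i][e] is agent i's cost for chore e, weights[i] is agent i's
    weight (assumed positive here), and alloc[i] lists agent i's chores.
    Assumed condition: for all i, j there is a chore e in alloc[i] with
    c_i(alloc[i] minus e) / w_i <= c_i(alloc[j]) / w_j.
    """
    n = len(weights)
    for i in range(n):
        if not alloc[i]:
            continue  # an agent with no chores cannot envy anyone
        own = sum(cost[i][e] for e in alloc[i])
        # Removing the most costly chore is the best case for agent i.
        own_minus_worst = own - max(cost[i][e] for e in alloc[i])
        for j in range(n):
            if j == i:
                continue
            other = sum(cost[i][e] for e in alloc[j])
            # Cross-multiplied form of the ratio test (avoids division).
            if own_minus_worst * weights[j] > other * weights[i]:
                return False
    return True


# Two agents with weights 1 and 2, three chores that everyone costs at 3.
costs = [[3, 3, 3], [3, 3, 3]]
balanced = is_wef1_chores(costs, [1, 2], [[0], [1, 2]])
lopsided = is_wef1_chores(costs, [1, 2], [[0, 1, 2], []])
```

In the example, giving the heavier-weighted agent two of the three chores passes the check, while giving all chores to the lighter-weighted agent fails it.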

preprint 2022 arXiv

Applying Feature Underspecified Lexicon Phonological Features in Multilingual Text-to-Speech

This study investigates whether the phonological features derived from the Featurally Underspecified Lexicon model can be applied in text-to-speech systems to generate native and non-native speech in English and Mandarin. We present a mapping of ARPABET/pinyin to SAMPA/SAMPA-SC and then to phonological features. This mapping was tested for whether it could lead to the successful generation of native, non-native, and code-switched speech in the two languages. We ran two experiments, one with a small dataset and one with a larger dataset. The results supported the claim that phonological features could be used as a feasible input system for languages in or not in the training data, although further investigation is needed to improve model performance. The results lend support to FUL by presenting successfully synthesised output, and by having the output carry a source-language accent when synthesising a language not in the training data. The TTS process simulated the human second language acquisition process and thus also confirms FUL's ability to account for acquisition.

preprint 2022 arXiv

ByT5 model for massively multilingual grapheme-to-phoneme conversion

In this study, we tackle massively multilingual grapheme-to-phoneme conversion through implementing G2P models based on ByT5. We have curated a G2P dataset from various sources that covers around 100 languages and trained large-scale multilingual G2P models based on ByT5. We found that ByT5 operating on byte-level inputs significantly outperformed the token-based mT5 model in terms of multilingual G2P. Pairwise comparison with monolingual models in these languages suggests that multilingual ByT5 models generally lower the phone error rate by jointly learning from a variety of languages. The pretrained model can further benefit low resource G2P through zero-shot prediction on unseen languages or provides pretrained weights for finetuning, which helps the model converge to a lower phone error rate than randomly initialized weights. To facilitate future research on multilingual G2P, we make available our code and pretrained multilingual G2P models at: https://github.com/lingjzhu/CharsiuG2P.
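
To illustrate what "operating on byte-level inputs" means, here is a minimal sketch of a byte-level encoding in plain Python. It does not use the authors' code or any library tokenizer. The offset of 3 and the end-of-sequence id of 1 are assumptions that mirror the common ByT5 convention of reserving the first ids for special tokens; the function names are hypothetical.

```python
from typing import List

# Assumption: ids 0, 1, 2 are reserved for pad, end-of-sequence, and unknown,
# so raw byte values are shifted by 3.
BYTE_OFFSET = 3
EOS_ID = 1


def encode_bytes(text: str) -> List[int]:
    """Map text to one id per UTF-8 byte, followed by an end-of-sequence id."""
    return [b + BYTE_OFFSET for b in text.encode("utf-8")] + [EOS_ID]


def decode_bytes(ids: List[int]) -> str:
    """Invert encode_bytes, skipping the reserved special ids."""
    raw = bytes(i - BYTE_OFFSET for i in ids if i >= BYTE_OFFSET)
    return raw.decode("utf-8", errors="ignore")


latin_ids = encode_bytes("a")
hanzi_ids = encode_bytes("你")
round_trip = decode_bytes(encode_bytes("你好"))
```

Because every script reduces to the same 256 byte values, no language-specific vocabulary is needed, at the cost of longer sequences: a single Chinese character takes three byte ids here.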

preprint 2022 arXiv

Fermion coupling to loop quantum gravity: canonical formulation

In the model of a fermion field coupled to loop quantum gravity, we consider the Gauss and the Hamiltonian constraints. According to the explicit solutions to the Gauss constraint, the fermion spins and the gravitational spin networks intertwine with each other so that the fermion spins contribute to the volume of the spin network vertices. For the Hamiltonian constraint, the regularization and quantization procedures are presented in detail. By introducing an adapted vertex Hilbert space to remove the regulator, we propose a diffeomorphism covariant graph-changing Hamiltonian constraint operator of the fermion field. This operator shows how fermions move in the loop quantum gravity spacetime and simultaneously influence the background quantum geometry.

preprint 2022 arXiv

Fermions on Quantum Geometry and Resolution of Doubling Problem

The fermion doubling problem has an important impact on quantum gravity, by revealing the tension between fermion and the fundamental discreteness of quantum spacetime. In this work, we discover that in Loop Quantum Gravity, the quantum geometry involving superposition of states associated with lattice refinements provides a resolution to the fermion doubling problem. We construct and analyze the fermion propagator on the quantum geometry, and we show that all fermion doubler modes are suppressed in the propagator. Our result suggests that the superposition nature of quantum geometry should resolve the tension between fermion and the fundamental discreteness, and relate to the continuum limit of quantum gravity.

preprint 2022 arXiv

First-Order Quantum Correction in Coherent State Expectation Value of Loop-Quantum-Gravity Hamiltonian

Given the non-graph-changing Hamiltonian $\widehat{H[N]}$ in Loop Quantum Gravity (LQG), $\langle\widehat{H[N]}\rangle$, the coherent state expectation value of $\widehat{H[N]}$, admits a semiclassical expansion in $\ell^2_{\rm p}$. In this paper, presenting the detailed derivations of our previous work arXiv:2012.14242, we explicitly compute the expansion of $\langle\widehat{H[N]}\rangle$ to the linear order in $\ell^2_{\rm p}$ on the cubic graph with respect to the coherent state peaked at the homogeneous and isotropic data of cosmology. In our computation, a powerful algorithm is developed, supported by rigorous proofs and several theorems, to overcome the complexity in the computation of $\langle \widehat{H[N]} \rangle$. In particular, some key innovations in our algorithm substantially reduce the complexity in computing the Lorentzian part of $\langle\widehat{H[N]}\rangle$. Additionally, some quantum correction effects resulting from $\langle\widehat{H[N]}\rangle$ in cosmology are discussed at the end of this paper.

preprint 2022 arXiv

First-Order Quantum Correction in Coherent State Expectation Value of Loop-Quantum-Gravity Hamiltonian: Overview and Results

Given the Loop-Quantum-Gravity (LQG) non-graph-changing Hamiltonian $\widehat{H[N]}$, the coherent state expectation value $\langle\widehat{H[N]}\rangle$ admits a semiclassical expansion in $\ell^2_{\rm p}$. In this paper, we compute explicitly the expansion of $\langle\widehat{H[N]}\rangle$ on the cubic graph to the linear order in $\ell^2_{\rm p}$, when the coherent state is peaked at the homogeneous and isotropic data of cosmology. In our computation, a powerful algorithm is developed to overcome the complexity in computing $\langle \widehat{H[N]} \rangle$. In particular, some key innovations in our algorithm substantially reduce the computational complexity in the Lorentzian part of $\langle\widehat{H[N]}\rangle$. Moreover, the algorithm developed in the present work makes it possible to compute the expectation value of an arbitrary monomial of holonomies and fluxes on one edge up to arbitrary order of $\ell_{\rm p}^2$.

preprint 2022 arXiv

Learning to Solve Multiple-TSP with Time Window and Rejections via Deep Reinforcement Learning

We propose a manager-worker framework based on deep reinforcement learning to tackle a hard yet nontrivial variant of the Travelling Salesman Problem (TSP), i.e., the multiple-vehicle TSP with time window and rejections (mTSPTWR), where customers who cannot be served before the deadline are subject to rejections. In particular, in the proposed framework, a manager agent learns to divide mTSPTWR into sub-routing tasks by assigning customers to each vehicle via a Graph Isomorphism Network (GIN) based policy network. A worker agent learns to solve sub-routing tasks by minimizing the cost in terms of both tour length and rejection rate for each vehicle, the maximum of which is then fed back to the manager agent to learn better assignments. Experimental results demonstrate that the proposed framework outperforms strong baselines in terms of higher solution quality and shorter computation time. More importantly, the trained agents also achieve competitive performance for solving unseen larger instances.

preprint 2022 arXiv

Phone-to-audio alignment without text: A Semi-supervised Approach

The task of phone-to-audio alignment has many applications in speech research. Here we introduce two Wav2Vec2-based models for both text-dependent and text-independent phone-to-audio alignment. The proposed Wav2Vec2-FS, a semi-supervised model, directly learns phone-to-audio alignment through contrastive learning and a forward sum loss, and can be coupled with a pretrained phone recognizer to achieve text-independent alignment. The other model, Wav2Vec2-FC, is a frame classification model trained on forced aligned labels that can perform both forced alignment and text-independent segmentation. Evaluation results suggest that both proposed methods, even when transcriptions are not available, generate results very close to those of existing forced alignment tools. Our work presents a neural pipeline of fully automated phone-to-audio alignment. Code and pretrained models are available at https://github.com/lingjzhu/charsiu.

preprint 2022 arXiv

Polarization measurement for the dileptonic channel of $W^+ W^-$ scattering using generative adversarial network

Measuring the polarization fractions of the $W^+W^-$ scattering reveals the interactions of the Higgs boson as well as new neutral states that are related to the standard model electroweak symmetry breaking. The dileptonic channel has a relatively lower background rate, but the kinematics of its final states cannot be fully reconstructed due to the presence of two neutrinos. We propose neural networks to establish maps between the distributions of measurable quantities and the distributions of the lepton angles in $W$ boson rest frames. New physics contributions and collision energy can largely affect the kinematic properties of the $W^+W^-$ scattering besides the lepton angles. To make the network ignorant of that information, the loss function is modified in two different ways. We show that the networks are promising in reproducing the lepton angle distributions, and the precision of the fitted polarization fractions obtained from network predictions is comparable to that obtained with the truth lepton angle. Although the best-fit values of polarization fractions do not change much after including the background uncertainty, the precision is substantially reduced. Our trained models are available at GitHub.

preprint 2022 arXiv

Reduced Phase Space Quantization of Black Holes: Path Integrals, and Effective Dynamics

We consider the loop quantum theory of the spherically symmetric model of gravity coupled to Gaussian dust fields, where the Gaussian dust fields provide a material reference frame of the space and time to deparameterize gravity. This theory, used to study the quantum features of the spherically symmetric black hole, is constructed based on a 1-dimensional lattice $γ\subset\mathbb R$. Taking advantage of the path integral formulation, we investigate the quantum dynamics and obtain an effective action. With this action, we get an effective continuous description of this quantum lattice system which is not the same as the one described by the effective Hamiltonian used in arXiv:2012.05729, i.e. the classical Hamiltonian with the holonomy correction. It turns out that the Hamiltonian derived in this paper returns that used in arXiv:2012.05729 only for macro black holes since the lattice $γ$ is required to be sufficiently fine. Indeed, it is necessary to propose this fine-grained lattice structure in order to well describe the underlying lattice theory by the continuous description.

preprint 2022 arXiv

Sampling Efficient Deep Reinforcement Learning through Preference-Guided Stochastic Exploration

Extensive practical work with the Deep Q-network (DQN) algorithm has indicated that stochastic policy, despite its simplicity, is the most frequently used exploration approach. However, most existing stochastic exploration approaches either explore new actions heuristically regardless of Q-values or inevitably introduce bias into the learning process to couple the sampling with Q-values. In this paper, we propose a novel preference-guided $ε$-greedy exploration algorithm that can efficiently learn the action distribution in line with the landscape of Q-values for DQN without introducing additional bias. Specifically, we design a dual architecture consisting of two branches, one of which is a copy of DQN, namely the Q-branch. The other branch, which we call the preference branch, learns the action preference that the DQN implicitly follows. We theoretically prove that the policy improvement theorem holds for the preference-guided $ε$-greedy policy and experimentally show that the inferred action preference distribution aligns with the landscape of corresponding Q-values. Consequently, preference-guided $ε$-greedy exploration motivates the DQN agent to take diverse actions, i.e., actions with larger Q-values can be sampled more frequently whereas actions with smaller Q-values still have a chance to be explored, thus encouraging exploration. We assess the proposed method with four well-known DQN variants in nine different environments. Extensive results confirm the superiority of our proposed method in terms of performance and convergence speed. Index Terms: preference-guided exploration, stochastic policy, data efficiency, deep reinforcement learning, deep Q-learning.
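
The action-selection rule described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract's preference branch is a learned network, whereas here a softmax over the Q-values stands in for the learned preference distribution. The names `softmax` and `select_action` and the temperature parameter are hypothetical.

```python
import math
import random
from typing import List, Sequence


def softmax(values: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax, used as a stand-in preference distribution."""
    peak = max(values)
    exps = [math.exp((v - peak) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(
    q_values: Sequence[float],
    preference: Sequence[float],
    epsilon: float,
    rng: random.Random,
) -> int:
    """With probability epsilon, sample from the preference distribution;
    otherwise act greedily with respect to the Q-values."""
    if rng.random() < epsilon:
        return rng.choices(range(len(q_values)), weights=preference, k=1)[0]
    return max(range(len(q_values)), key=lambda a: q_values[a])


q = [0.1, 0.9, 0.3]
pref = softmax(q)  # assumption: the learned preference follows the Q-landscape
greedy_action = select_action(q, pref, epsilon=0.0, rng=random.Random(0))
explored = [select_action(q, pref, epsilon=1.0, rng=random.Random(s)) for s in range(50)]
```

Unlike uniform ε-greedy, the exploratory branch here favors actions with larger Q-values while still giving every action a nonzero chance of being tried.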

preprint 2022 arXiv

Spherical Convolution empowered FoV Prediction in 360-degree Video Multicast with Limited FoV Feedback

Field of view (FoV) prediction is critical in 360-degree video multicast, which is a key component of the emerging Virtual Reality (VR) and Augmented Reality (AR) applications. Most of the current prediction methods combining saliency detection and FoV information neither take into account that the distortion of projected 360-degree videos can invalidate the weight sharing of traditional convolutional networks, nor do they adequately consider the difficulty of obtaining complete multi-user FoV information, which degrades the prediction performance. This paper proposes a spherical convolution-empowered FoV prediction method, which is a multi-source prediction framework combining salient features extracted from 360-degree video with limited FoV feedback information. A spherical convolution neural network (CNN) is used instead of a traditional two-dimensional CNN to eliminate the problem of weight sharing failure caused by video projection distortion. Specifically, salient spatial-temporal features are extracted through a spherical convolution-based saliency detection model, after which the limited feedback FoV information is represented as a time-series model based on a spherical convolution-empowered gated recurrent unit network. Finally, the extracted salient video features are combined to predict future user FoVs. The experimental results show that the performance of the proposed method is better than other prediction methods.

preprint 2022 arXiv

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Speech enhancement is an essential task of improving speech quality in noisy scenarios. Several state-of-the-art approaches have introduced visual information for speech enhancement, since the visual aspect of speech is essentially unaffected by the acoustic environment. This paper proposes a novel framework that involves visual information for speech enhancement, by incorporating a Generative Adversarial Network (GAN). In particular, the proposed visual speech enhancement GAN consists of two networks trained in an adversarial manner, i) a generator that adopts a multi-layer feature fusion convolution network to enhance input noisy speech, and ii) a discriminator that attempts to minimize the discrepancy between the distributions of the clean speech signal and the enhanced speech signal. Experiment results demonstrated superior performance of the proposed model against several state-of-the-art

preprint 2021 arXiv

DeepFake-o-meter: An Open Platform for DeepFake Detection

In recent years, the advent of deep learning-based techniques and the significant reduction in the cost of computation have made it feasible to create realistic videos of human faces, commonly known as DeepFakes. The availability of open-source tools to create DeepFakes poses a threat to the trustworthiness of online media. In this work, we develop an open-source online platform, known as DeepFake-o-meter, that integrates state-of-the-art DeepFake detection methods and provides a convenient interface for users. We describe the design and function of DeepFake-o-meter in this work.

preprint 2021 arXiv

Loop quantum deparametrized Schwarzschild interior and discrete black hole mass

We present the detailed analyses of a model of loop quantum Schwarzschild interior coupled to a massless scalar field and extend the results in our previous rapid communication arXiv:2006.08313 to more general schemes. It is shown that the spectrum of the black hole mass is discrete and does not contain zero. This indicates the existence of a black hole remnant after Hawking evaporation due to loop quantum gravity effects. Besides showing the existence of a stable black hole remnant in the vacuum case, we also solve the quantum dynamics for the non-vacuum case and compare it with the effective one.

preprint 2021 arXiv

Twisted geometry coherent states in all dimensional loop quantum gravity: I. Construction and Peakedness properties

A new family of coherent states for all dimensional loop quantum gravity is proposed, based on the generalized twisted geometry parametrization of the phase space of $SO(D+1)$ connection theory. We prove that this family of coherent states provides an over-complete basis of the Hilbert space in which the edge simplicity constraint is solved. Moreover, according to our explicit calculation, the expectation values of holonomy and flux operators with respect to this family of coherent states coincide with the corresponding classical values given by the labels of the coherent states, up to some gauge degrees of freedom. In addition, we study the peakedness properties of this family of coherent states, including the peakedness of their wave functions in holonomy, momentum and phase space representations. It turns out that the peakedness in these various representations and the (relative) uncertainty of the expectation values of the operators are well controlled by the semi-classical parameter $t$. Therefore, this family of coherent states provides a candidate for the semi-classical analysis of all dimensional loop quantum gravity.

preprint 2020 arXiv

Alternative dynamics in loop quantum Brans-Dicke cosmology

To inherit more features of full loop quantum Brans-Dicke theory, the Euclidean and Lorentzian terms of the Hamiltonian constraint are quantized independently in loop quantum Brans-Dicke cosmology. An alternative Hamiltonian constraint operator and its effective expression are obtained in the cosmological model. A residual quantum correction term is found in the effective Hamiltonian constraint, which has no analog in the effective Hamiltonian of the loop quantum cosmology from general relativity. The dynamics driven by this effective Hamiltonian constraint is analyzed in detail. For the physically interesting case of $ω\gg 1$, this effective Hamiltonian drives a bouncing evolution which evolves from a de Sitter universe to a classical Brans-Dicke solution.

preprint 2020 arXiv

DSP: A Differential Spatial Prediction Scheme for Comprehensive real industrial datasets

Inverse Distance Weighted models (IDW) have been widely used for predicting and modeling multidimensional space in multimodal industrial processes. However, the more complex the structure of multidimensional space, the lower the performance of IDW models, and real industrial datasets tend to have more complex spatial structure. To solve this problem, a new framework for spatial prediction and modeling based on deep reinforcement learning network is proposed. In the proposed framework, the internal relationship between state and action is enhanced by reusing the state values in the Q network, and the convergence rate and stability of the deep reinforcement learning network are improved. The improved deep reinforcement learning network is then used to search for and learn the hyperparameters of each sample point in the inverse distance weighted model. These hyperparameters can reflect the spatial structure of the current industrial dataset to some extent. Then a spatial distribution of hyperparameters is constructed based on the learned hyperparameters. Each interpolation point obtains corresponding hyperparameters from the hyperparametric spatial distribution and brings them into the classical IDW models for prediction, thus achieving differential spatial prediction and modeling. The simulation results show that the proposed framework is suitable for real industrial datasets with complex spatial structure characteristics and is more accurate than current IDW models in spatial prediction.
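
For reference, the classical IDW predictor that this abstract builds on can be sketched in a few lines. This is a generic textbook form, not the paper's framework: the reinforcement learning search is omitted, and the single `power` argument stands in for the hyperparameter that a differential scheme would supply separately for each interpolation point. The function name `idw_predict` is hypothetical.

```python
import math
from typing import Sequence


def idw_predict(
    query: Sequence[float],
    points: Sequence[Sequence[float]],
    values: Sequence[float],
    power: float = 2.0,
) -> float:
    """Predict the value at `query` as a weighted mean of sample values,
    with weights proportional to 1 / distance**power."""
    numerator = 0.0
    denominator = 0.0
    for point, value in zip(points, values):
        distance = math.dist(query, point)
        if distance == 0.0:
            return value  # the query coincides with a sample point
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


samples = [(0.0, 0.0), (1.0, 0.0)]
measured = [2.0, 4.0]
midpoint = idw_predict((0.5, 0.0), samples, measured, power=2.0)
near_first = idw_predict((0.1, 0.0), samples, measured, power=2.0)
sharper = idw_predict((0.1, 0.0), samples, measured, power=4.0)
```

Raising `power` makes the nearest sample dominate the prediction, which is why letting the exponent vary across space can adapt the model to local structure.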

preprint 2020 arXiv

Joint Communication and Computational Resource Allocation for QoE-driven Point Cloud Video Streaming

Point cloud video is the most popular representation of holograms, which are the medium to present natural content in VR/AR/MR and are expected to be the next generation of video. A point cloud video system provides users with an immersive viewing experience with six degrees of freedom and has wide applications in many fields such as online education and entertainment. To further enhance these applications, point cloud video streaming is in critical demand. The inherent challenges lie in the large data size caused by the necessity of recording the three-dimensional coordinates in addition to color information, and the associated high computational complexity of encoding. To this end, this paper proposes a communication and computation resource allocation scheme for QoE-driven point cloud video streaming. In particular, we maximize system resource utilization by selecting different quantities, transmission forms and quality level tiles to maximize the quality of experience. Extensive simulations are conducted and the simulation results show superior performance over the existing schemes.

preprint 2020 arXiv

Koopman Operator and Phase Space Partition of Chaotic Maps

The Koopman operator describes the evolution of observables in the phase space, which could be used to extract characteristic dynamical features of a nonlinear system. Here, we show that it is possible to carry out interesting symbolic partitions based on properly constructed eigenfunctions of the operator for chaotic maps. The partition boundaries are the extrema of these eigenfunctions, the accuracy of which is improved by including more basis functions in the numerical computation. The validity of this scheme is demonstrated in well-known 1-d and 2-d maps. There seems to be no obstacle to extending the computation to nonlinear systems of high dimensions, which provides a possible way of dissecting complex dynamics.

preprint 2020 arXiv

Quantum geometry and effective dynamics of Janis-Newman-Winicour singularities

Inspired by the recent proposal for the quantum effective dynamics of the Schwarzschild spacetime given in \cite{AOS1}, we investigate the effective dynamics of the loop quantized Janis-Newman-Winicour (JNW) spacetime, which is an extension of the Schwarzschild spacetime with an extra minimally coupled massless scalar field. Two parameters are introduced in order to regularize the Hamiltonian constraint in the quantum effective dynamics. These two parameters are assumed to be Dirac observables when the effective dynamics is solved. By carefully choosing appropriate conditions for these two parameters, we completely determine them, and the resulting new effective description of the JNW spacetime leads to well-behaved quantum dynamics which, on the one hand, resolves the classical singularities and, on the other hand, agrees with the classical dynamics in the low curvature region.