Source author record

Jing Peng

Jing Peng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Computation and Language, cs.CY, hep-ex, hep-ph, Human-Computer Interaction, Information Retrieval, math.AP, Multimedia, physics.ed-ph, q-fin.MF, q-fin.RM, Sound

Catalog footprint

What is connected

7 works

13 topics

4 close collaborators

preprint, 2026, arXiv

UniSRCodec: Unified and Low-Bitrate Single Codebook Codec with Sub-Band Reconstruction

Neural Audio Codecs (NACs) can reduce transmission overhead by performing compact compression and reconstruction, and they also aim to bridge the gap between continuous and discrete signals. Existing NACs can be divided into two categories: multi-codebook and single-codebook codecs. Multi-codebook codecs face challenges such as structural complexity and difficulty in adapting to downstream tasks, while single-codebook codecs, though structurally simpler, suffer from low fidelity, ineffective modeling of unified audio, and an inability to model high-frequency audio. We propose UniSRCodec, a single-codebook codec that supports high sampling rates, low bandwidth, high fidelity, and unified audio modeling. We analyze the inefficiency of waveform-based compression, introduce a time and frequency compression method based on the Mel-spectrogram, and pair it with a vocoder to recover the phase information of the original audio. Moreover, we propose a sub-band reconstruction technique to achieve high-quality compression across both low and high frequency bands. Subjective and objective experimental results demonstrate that UniSRCodec achieves state-of-the-art (SOTA) performance among cross-domain single-codebook codecs with a token rate of only 40, and its reconstruction quality is comparable to that of certain multi-codebook methods. Our demo page is available at https://wxzyd123.github.io/unisrcodec.
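To put the token rate in context, the bitrate of a single-codebook stream can be estimated as the token rate times the number of bits per token. The sketch below assumes the rate of 40 is in tokens per second and uses a hypothetical codebook size of 8192; the abstract states neither the unit nor the codebook size, so the resulting number is illustrative only.

```python
import math


def bitrate_bps(tokens_per_second: float, codebook_size: int) -> float:
    """Bits per second for a single-codebook token stream.

    Each token indexes one of `codebook_size` entries, so it carries
    log2(codebook_size) bits.
    """
    return tokens_per_second * math.log2(codebook_size)


# 40 is the token rate quoted in the abstract (unit assumed: tokens/second).
# 8192 is a hypothetical codebook size chosen only for illustration.
EXAMPLE_BPS = bitrate_bps(40, 8192)
print(f"{EXAMPLE_BPS:.0f} bits per second")
```

Under these assumptions each token carries 13 bits, so the stream stays well under 1 kbps. A different codebook size changes the figure only logarithmically.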

preprint, 2022, arXiv

Animating collider processes with Event-time-frame Format

High Energy Physics processes, such as hard scattering, parton shower, and hadronization, occur at colliders around the world, e.g., the Large Hadron Collider in Europe. The various steps are also components within corresponding Monte-Carlo simulations. They are usually considered to occur in an instant and displayed in MC simulations as intricate paths hard-coded with the HepMC format. We recently developed a framework to convert HEP event records into online 3D animations, aiming for visual Monte-Carlo studies and science popularization, where the most difficult parts are designing the event timeline and the particles' movement. As a by-product, we propose here an event-time-frame format for animation data exchange and persistence, which is potentially helpful in other visualization work. The code is maintained at https://github.com/lyazj/hepani, and the web service is available at https://ppnp.pku.edu.cn/hepani/index.html.

preprint, 2022, arXiv

Searching for PETs: Using Distributional and Sentiment-Based Methods to Find Potentially Euphemistic Terms

This paper presents a linguistically driven proof of concept for finding potentially euphemistic terms, or PETs. Acknowledging that PETs tend to be commonly used expressions for a certain range of sensitive topics, we make use of distributional similarities to select and filter phrase candidates from a sentence and rank them using a set of simple sentiment-based metrics. We present the results of our approach tested on a corpus of sentences containing euphemisms, demonstrating its efficacy for detecting single and multi-word PETs from a broad range of topics. We also discuss future potential for sentiment-based methods on this task.

preprint, 2020, arXiv

A free boundary problem arising from a multi-state regime-switching stock trading model

In this paper, we study a free boundary problem, which arises from an optimal trading problem of a stock that is driven by an uncertain market status process. The free boundary problem is a variational inequality system of three functions with a degenerate operator. The main contribution of this paper is that we not only prove that all four switching free boundaries are non-overlapping, monotonic and $C^{\infty}$-smooth, but also completely determine their relative locations and provide the optimal trading strategies for the stock trading problem.

preprint, 2020, arXiv

Linguistic Fingerprints of Internet Censorship: the Case of SinaWeibo

This paper studies how the linguistic components of blogposts collected from Sina Weibo, a Chinese microblogging platform, might affect the blogposts' likelihood of being censored. Our results are consistent with the Collective Action Potential (CAP) theory of King et al. (2013), which states that a blogpost's potential to cause a riot or assembly in real life is the key determinant of whether it gets censored. Although there is no definitive measure of this construct, the linguistic features that we identify as discriminatory are consistent with the CAP theory. We build a classifier that significantly outperforms non-expert humans in predicting whether a blogpost will be censored. The crowdsourcing results suggest that while humans tend to see censored blogposts as more controversial and more likely to trigger action in real life than their uncensored counterparts, they in general cannot make a better guess than our model when it comes to 'reading the mind' of the censors in deciding whether a blogpost should be censored. We do not claim that censorship is determined only by linguistic features. There are many other factors contributing to censorship decisions. The focus of the present paper is on the linguistic form of blogposts. Our work suggests that it is possible to use linguistic properties of social media posts to automatically predict whether they are going to be censored.

preprint, 2013, arXiv

A Random Walk Model for Item Recommendation in Folksonomies

Social tagging, as a novel approach to information organization and discovery, has been widely adopted in many Web 2.0 applications. The tags provide a new type of information that can be exploited by recommender systems. Nevertheless, the sparsity of ternary <user, tag, item> interaction data limits the performance of tag-based collaborative filtering. This paper proposes a random-walk-based algorithm to deal with the sparsity problem in social tagging data, which captures the potential transitive associations between users and items through their interactions with tags. In particular, two smoothing strategies are presented from both the user-centric and item-centric perspectives. Experiments on real-world data sets empirically demonstrate the efficacy of the proposed algorithm.
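The idea of transitive user-item associations through tags can be illustrated with a two-step walk (user to tag, then tag to item). This is a minimal sketch and not the paper's algorithm: the abstract does not specify the walk or the two smoothing strategies, so the function names, the row normalization, and the toy data below are all assumptions made for illustration.

```python
from collections import defaultdict


def row_normalize(counts):
    """Turn nested count dicts into per-row transition probabilities."""
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: c / total for dst, c in row.items()} if total else {}
    return probs


def two_step_scores(triples):
    """Score unseen items for each user via a user -> tag -> item walk."""
    user_tag = defaultdict(lambda: defaultdict(int))
    tag_item = defaultdict(lambda: defaultdict(int))
    seen = defaultdict(set)
    for user, tag, item in triples:
        user_tag[user][tag] += 1
        tag_item[tag][item] += 1
        seen[user].add(item)

    p_user_tag = row_normalize(user_tag)
    p_tag_item = row_normalize(tag_item)

    scores = {}
    for user, tags in p_user_tag.items():
        acc = defaultdict(float)
        for tag, p_tag in tags.items():
            for item, p_item in p_tag_item[tag].items():
                # Only recommend items the user has not already tagged.
                if item not in seen[user]:
                    acc[item] += p_tag * p_item
        scores[user] = dict(acc)
    return scores


# Hypothetical <user, tag, item> triples, not taken from the paper's data.
TRIPLES = [
    ("alice", "jazz", "album_a"),
    ("bob", "jazz", "album_b"),
    ("bob", "rock", "album_c"),
]
SCORES = two_step_scores(TRIPLES)
print(SCORES)
```

Here alice never interacted with album_b, yet it receives a nonzero score because she and bob share the tag "jazz". That shared-tag path is the kind of transitive association that can help when direct user-item data is sparse.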

preprint, 2012, arXiv

Dynamic Shared Context Processing in an E-Collaborative Learning Environment

In this paper, we propose a dynamic shared context processing method based on the DSC (Dynamic Shared Context) model, applied in an e-collaborative learning environment. First, we present the model, which provides a way to measure the relevance between events and roles in collaborative environments. With this method, we can share the most appropriate event information with each role instead of sharing all information with all roles in a collaborative work environment. Then, we apply and verify this method in our project, a Google App-supported collaborative e-learning environment. In this experiment, we compared the relevance of events and roles measured by the DSC method with manually measured relevance. We describe the favorable points from this comparison and our findings. Finally, we discuss our future research on a hybrid DSC method to make dynamic information sharing more effective in a collaborative work environment.