Source author record

Yongliang Wang

Yongliang Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, Computation and Language, Computer Vision, eess.SP, Human-Computer Interaction, Information Retrieval, Information Theory, math.IT

Catalog footprint

What is connected

6 works

8 topics

4 close collaborators

preprint, 2025, arXiv

From Events to Trending: A Multi-Stage Hotspots Detection Method Based on Generative Query Indexing

LLM-based conversational systems have become a popular gateway for information access, yet most existing chatbots struggle to handle news-related trending queries effectively. To improve user experience, an effective trending query detection method is urgently needed so that such target traffic can be processed differently. However, trending detection tailored to dialogue systems remains largely unexplored, and methods designed for traditional search engines often underperform in conversational contexts because query distributions and expression patterns differ radically. To fill this gap, we propose a multi-stage framework for trending detection that is optimized systematically from both the offline generation and the online identification perspectives. Specifically, our framework first exploits selected hot events to generate index queries, establishing a key bridge between static events and dynamic user queries. It then employs a retrieval matching mechanism for real-time online detection of trending queries, where we introduce a cascaded recall and ranking architecture to balance detection efficiency and accuracy. Furthermore, to better adapt to the practical application scenario, our framework adopts a single-recall module as a cold-start strategy to collect online data for fine-tuning the reranker. Extensive experiments demonstrate that our framework significantly outperforms baseline methods in both offline evaluations and online A/B tests, with a 27% relative improvement in user satisfaction as measured by the positive-negative feedback ratio.
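The cascaded recall and ranking idea in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the function names, the token-overlap recall, the character-level similarity standing in for a fine-tuned reranker, and the 0.6 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def recall(query, index_queries, k=3):
    """Cheap stage: keep the top-k index queries by Jaccard token overlap."""
    q = tokens(query)
    scored = []
    for cand in index_queries:
        c = tokens(cand)
        union = q | c
        score = len(q & c) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, cand))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored[:k]]


def rerank(query, candidates):
    """Costlier stage: character-level similarity (a stand-in for a learned reranker)."""
    if not candidates:
        return None, 0.0
    scored = [(SequenceMatcher(None, query.lower(), c.lower()).ratio(), c)
              for c in candidates]
    score, best = max(scored, key=lambda pair: pair[0])
    return best, score


def is_trending(query, index_queries, threshold=0.6):
    """Flag the query as trending if its best reranked match clears the threshold."""
    _, score = rerank(query, recall(query, index_queries))
    return score >= threshold


# Hypothetical index queries, as if generated offline from selected hot events.
INDEX_QUERIES = [
    "who won the world cup final",
    "earthquake in city today",
    "new phone launch price",
]
```

The design point is that the expensive scorer only sees the few candidates that survive the cheap recall stage, which is how a cascade trades a little accuracy for latency.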

preprint, 2022, arXiv

GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection

The task of Human-Object Interaction (HOI) detection can be divided into two core problems, i.e., human-object association and interaction understanding. In this paper, we reveal and address the disadvantages of conventional query-driven HOI detectors from these two aspects. For association, previous two-branch methods suffer from complex and costly post-matching, while single-branch methods ignore the feature distinctions between different tasks. We propose the Guided-Embedding Network (GEN) to attain a two-branch pipeline without post-matching. In GEN, we design an instance decoder to detect humans and objects with two independent query sets, and a position Guided Embedding (p-GE) to mark the human and object in the same position as a pair. In addition, we design an interaction decoder to classify interactions, where the interaction queries are made of instance Guided Embeddings (i-GE) generated from the outputs of each instance decoder layer. For interaction understanding, previous methods suffer from long-tailed distributions and struggle with zero-shot discovery. This paper proposes a Visual-Linguistic Knowledge Transfer (VLKT) training strategy to enhance interaction understanding by transferring knowledge from the visual-linguistic pre-trained model CLIP. Specifically, we extract text embeddings for all labels with CLIP to initialize the classifier and adopt a mimic loss to minimize the visual feature distance between GEN and CLIP. As a result, GEN-VLKT outperforms the state of the art by large margins on multiple datasets, e.g., +5.05 mAP on HICO-Det. The source code is available at https://github.com/YueLiao/gen-vlkt.
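The two VLKT ingredients named in this abstract, a classifier initialized from label text embeddings and a mimic loss on visual features, can be sketched in a few lines. This is a toy illustration under stated assumptions: the embeddings are hand-written placeholders rather than CLIP outputs, the function names are hypothetical, and the use of a mean absolute (L1) distance for the mimic loss is an assumption, since the abstract only says the visual feature distance is minimized.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def classify(feature, text_embeddings):
    """Cosine scores of a visual feature against label text embeddings,
    which play the role of the classifier weights."""
    f = l2_normalize(feature)
    return {
        label: sum(a * b for a, b in zip(f, l2_normalize(emb)))
        for label, emb in text_embeddings.items()
    }


def mimic_loss(student_feature, teacher_feature):
    """Mean absolute difference between student and teacher features
    (the L1 form is an assumption made for this sketch)."""
    diffs = [abs(s - t) for s, t in zip(student_feature, teacher_feature)]
    return sum(diffs) / len(diffs)


# Placeholder "text embeddings" for two interaction labels (not real CLIP vectors).
TEXT_EMBEDDINGS = {
    "ride bicycle": [1.0, 0.0, 0.0],
    "hold cup": [0.0, 1.0, 0.0],
}

scores = classify([0.9, 0.1, 0.0], TEXT_EMBEDDINGS)
predicted = max(scores, key=scores.get)
```

Because the classifier weights come from text, a label with no training images still has a usable weight vector, which is the intuition behind the zero-shot benefit the abstract describes.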

preprint, 2022, arXiv

Measuring Uncertainty in Signal Fingerprinting with Gaussian Processes Going Deep

In indoor positioning, signal fluctuation is highly location-dependent. However, signal uncertainty is one critical yet commonly overlooked dimension of the radio signal to be fingerprinted. This paper reviews the commonly used Gaussian Processes (GP) for probabilistic positioning and points out the pitfall of using GP to model signal fingerprint uncertainty. This paper also proposes Deep Gaussian Processes (DGP) as a more informative alternative to address the issue. How DGP better measures uncertainty in signal fingerprinting is evaluated via simulated and realistically collected datasets.

preprint, 2022, arXiv

Precognition in Task-oriented Dialogue Understanding: Posterior Regularization by Future Context

Task-oriented dialogue systems have become overwhelmingly popular in recent research. Dialogue understanding is widely used to comprehend users' intent, emotion and dialogue state in task-oriented dialogue systems. Most previous works on such discriminative tasks model only the current query or historical conversations. Even where some work models the entire dialogue flow, it is not suitable for real-world task-oriented conversations, because future contexts are not visible in such cases. In this paper, we propose to jointly model historical and future information through the posterior regularization method. More specifically, by modeling the current utterance and past contexts as the prior, and the entire dialogue flow as the posterior, we optimize the KL divergence between these distributions to regularize our model during training. Only historical information is used for inference. Extensive experiments on two dialogue datasets validate the effectiveness of our proposed method, which achieves superior results compared with all baseline models.
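The regularization described in this abstract can be sketched as a loss over two predicted distributions: a prior computed from the current utterance and history, and a posterior computed from the full dialogue. The sketch below is an assumption-laden illustration, not the paper's objective: the function names, the direction of the KL term, the way the terms are summed, and the `weight` parameter are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the gold label index."""
    return -math.log(probs[label] + eps)


def training_loss(prior_logits, posterior_logits, label, weight=1.0):
    """Supervised loss on both branches plus a KL term that pulls the
    history-only prior toward the future-aware posterior (assumed form)."""
    prior = softmax(prior_logits)
    posterior = softmax(posterior_logits)
    return (
        cross_entropy(prior, label)
        + cross_entropy(posterior, label)
        + weight * kl_divergence(posterior, prior)
    )


# At inference time only the prior branch would be used, since the
# future context is not available.
loss = training_loss([0.2, 0.1, 0.0], [2.0, 0.1, -1.0], label=0)
```

The posterior branch sees future turns only during training, so the KL term is the channel through which that extra information can shape the branch that is actually deployed.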

preprint, 2022, arXiv

Towards Generalized Models for Task-oriented Dialogue Modeling on Spoken Conversations

Building robust and general dialogue models for spoken conversations is challenging due to the gap between the distributions of spoken and written data. This paper presents our approach to building generalized models for the Knowledge-grounded Task-oriented Dialogue Modeling on Spoken Conversations Challenge of DSTC-10. To mitigate the discrepancies between spoken and written text, we mainly employ extensive data augmentation strategies on written data, including artificial error injection and round-trip text-speech transformation. To train robust models for spoken conversations, we improve pre-trained language models and apply ensemble algorithms for each sub-task. Specifically, for the detection task, we fine-tune RoBERTa and ELECTRA, and run an error-fixing ensemble algorithm. For the selection task, we adopt a two-stage framework that consists of entity tracking and knowledge ranking, and propose a multi-task learning method to learn multi-level semantic information through domain classification and entity selection. For the generation task, we adopt a cross-validation data process to improve pre-trained generative language models, followed by a consensus decoding algorithm, which can add arbitrary features such as a relative ROUGE metric and tune the associated feature weights toward BLEU directly. Our approach ranks third on the objective evaluation and second on the final official human evaluation.

preprint, 2010, arXiv

Direct Data Domain STAP using Sparse Representation of Clutter Spectrum

Space-time adaptive processing (STAP) is an effective tool for detecting a moving target in an airborne radar system. Due to the fast-changing clutter scenario and/or a non side-looking configuration, the stationarity of the training data is destroyed, so statistics-based methods suffer performance degradation. Direct data domain (D3) methods avoid non-stationary training data and can effectively suppress the clutter within the test cell. However, this benefit comes at the cost of a reduced system degree of freedom (DOF), which results in performance loss. In this paper, by exploiting the intrinsic sparsity of the spectral distribution, a new direct data domain approach using sparse representation (D3SR) is proposed, which seeks to estimate the high-resolution space-time spectrum with only the test cell. Simulations of both side-looking and non side-looking cases illustrate the effectiveness of the D3SR spectrum estimation using focal underdetermined system solution (FOCUSS) and norm minimization. The clutter covariance matrix (CCM) and the corresponding adaptive filter can then be obtained effectively. Since D3SR maintains the full system DOF, it can achieve better output signal-clutter-ratio (SCR) and minimum detectable velocity (MDV) than current D3 methods, e.g., direct data domain least squares (D3LS). Thus D3SR is more effective against range-dependent clutter and interference in the non-stationary clutter scenario.
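For readers unfamiliar with FOCUSS, the basic noise-free iteration from the general sparse-recovery literature is reproduced below as background. It is not taken from this paper, and the symbols are assumptions introduced here: $y$ stands for the test-cell snapshot, $A$ for an overcomplete dictionary of space-time steering vectors, $x$ for the sparse spectrum coefficients, and $p \le 1$ for the sparsity-promoting exponent. Regularized variants that account for noise also exist.

```latex
\Pi_k = \operatorname{diag}\!\left( \lvert x_{k-1,i} \rvert^{\,2-p} \right),
\qquad
x_k = \Pi_k A^{H} \left( A \, \Pi_k A^{H} \right)^{-1} y,
\qquad p \le 1 .
```

Each step is a reweighted minimum-norm solve of $A x = y$: entries that were small in $x_{k-1}$ receive small weights and are driven toward zero, which is why the iterates concentrate on a few dictionary columns.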
