Researcher profile

Tian Tan

Tian Tan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
6works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

6 published item(s)

preprint2026arXiv

Qihe: A General-Purpose Static Analysis Framework for Verilog

In the past decades, static analysis has thrived in software, facilitating applications in bug detection, security, and program understanding. These advanced analyses are largely underpinned by general-purpose static analysis frameworks, which offer essential infrastructure to streamline their development. Conversely, hardware lacks such a framework, which overshadows the promising opportunities for sophisticated static analysis in hardware, hindering achievements akin to those witnessed in software. We thus introduce Qihe, the first general-purpose static analysis framework for Verilog -- a highly challenging endeavor given the absence of precedents in hardware. Qihe features an analysis-oriented front end, a Verilog-specific IR, and a suite of diverse fundamental analyses that capture essential hardware-specific characteristics -- such as bit-vector arithmetic, register synchronization, and digital component concurrency -- and enable the examination of intricate hardware data and control flows. These fundamental analyses are designed to support a wide array of hardware analysis clients. To validate Qihe's utility, we further developed a set of clients spanning bug detection, security, and program understanding. Our preliminary experimental results are highly promising; for example, Qihe uncovered 9 previously unknown bugs in popular real-world hardware projects (averaging 1.5K+ GitHub stars), all of which were confirmed by developers; moreover, Qihe successfully identified 18 bugs beyond the capabilities of existing static analyses for Verilog bug detection (i.e., linters), and detected 16 vulnerabilities in real-world hardware programs. By open-sourcing Qihe, which comprises over 100K lines of code, we aim to inspire further innovation and applications of sophisticated static analysis for hardware, aspiring to foster a similarly vibrant ecosystem that software analysis enjoys.

preprint2026arXiv

WavCube: Unifying Speech Representation for Understanding and Generation via Semantic-Acoustic Joint Modeling

Integrating speech understanding and generation is a pivotal step toward building unified speech models. However, the different representations required for these two tasks currently pose significant compatibility challenges. Typically, semantics-oriented features are learned from self-supervised learning (SSL), and acoustic-oriented features from reconstruction. Such fragmented representations hinder the realization of truly unified speech systems. We present WavCube, a compact continuous latent derived from an SSL speech encoder that simultaneously supports speech understanding, reconstruction, and generation. WavCube employs a two-stage training scheme. Stage 1 trains a semantic bottleneck to filter off-manifold redundancy that makes raw SSL features intractable for diffusion. Stage 2 injects fine-grained acoustic details via end-to-end reconstruction, while a semantic anchoring loss ensures the representation remains grounded within its original semantic manifold. Comprehensive experiments show that WavCube closely approaches WavLM performance on SUPERB despite an 8x dimensional compression, attains reconstruction quality on par with existing acoustic representations, delivers state-of-the-art zero-shot TTS performance with markedly faster training convergence, and excels in speech enhancement, separation, and voice conversion tasks on the SUPERB-SG benchmark. Systematic ablations reveal that WavCube's two-stage recipe resolves two intrinsic flaws of SSL features for generative modeling, paving the way for future unified speech systems. Codes and checkpoints are available at https://github.com/yanghaha0908/WavCube.

preprint2022arXiv

Adjacency constraint for efficient hierarchical reinforcement learning

Goal-conditioned Hierarchical Reinforcement Learning (HRL) is a promising approach for scaling up reinforcement learning (RL) techniques. However, it often suffers from training inefficiency as the action space of the high-level, i.e., the goal space, is large. Searching in a large goal space poses difficulty for both high-level subgoal generation and low-level policy learning. In this paper, we show that this problem can be effectively alleviated by restricting the high-level action space from the whole goal space to a $k$-step adjacent region of the current state using an adjacency constraint. We theoretically prove that in a deterministic Markov Decision Process (MDP), the proposed adjacency constraint preserves the optimal hierarchical policy, while in a stochastic MDP the adjacency constraint induces a bounded state-value suboptimality determined by the MDP's transition structure. We further show that this constraint can be practically implemented by training an adjacency network that can discriminate between adjacent and non-adjacent subgoals. Experimental results on discrete and continuous control tasks including challenging simulated robot locomotion and manipulation tasks show that incorporating the adjacency constraint significantly boosts the performance of state-of-the-art goal-conditioned HRL approaches.

preprint2022arXiv

Tai-e: A Static Analysis Framework for Java by Harnessing the Best Designs of Classics

Static analysis is a mature field with applications to bug detection, security analysis, and code optimization, etc. To facilitate these applications, static analysis frameworks play an essential role by providing a series of fundamental services such as program abstraction, control flow graph construction, and points-to/alias information computation, etc. However, despite impressive progress of static analysis, and this field has seen several popular frameworks in the last decades, it is still not clear how a static analysis framework should be designed in a way that analysis developers could benefit more: for example, what a good IR (for analysis) ought to look like? What functionalities should the module of fundamental analyses provide to ease client analyses? How to develop and integrate new analysis conveniently? How to manage multiple analyses? To answer these questions, in this work, we discuss the design trade-offs for the crucial components of a static analysis framework, and argue for the most appropriate design by following the HBDC (Harnessing the Best Designs of Classics) principle: for each crucial component, we compare the design choices made for it (possibly) by different classic frameworks such as Soot, WALA, SpotBugs and Doop, and choose arguably the best one, but if none is good enough, we then propose a better design. These selected or newly proposed designs finally constitute Tai-e, a new static analysis framework for Java. Specifically, Tai-e is novel in the designs of several aspects like IR, pointer analysis and development of new analyses, etc., leading to an easy-to-learn, easy-to-use and efficient system. To our knowledge, this is the first work that systematically explores the designs and implementations of various static analysis frameworks, and we believe it provides useful materials and viewpoints for building better static analysis infrastructures.

preprint2020arXiv

An Investigation on Deep Learning with Beta Stabilizer

Artificial neural networks (ANN) have been used in many applications such like handwriting recognition and speech recognition. It is well-known that learning rate is a crucial value in the training procedure for artificial neural networks. It is shown that the initial value of learning rate can confoundedly affect the final result and this value is always set manually in practice. A new parameter called beta stabilizer has been introduced to reduce the sensitivity of the initial learning rate. But this method has only been proposed for deep neural network (DNN) with sigmoid activation function. In this paper we extended beta stabilizer to long short-term memory (LSTM) and investigated the effects of beta stabilizer parameters on different models, including LSTM and DNN with relu activation function. It is concluded that beta stabilizer parameters can reduce the sensitivity of learning rate with almost the same performance on DNN with relu activation function and LSTM. However, it is shown that the effects of beta stabilizer on DNN with relu activation function and LSTM are fewer than the effects on DNN with sigmoid activation function.

preprint2020arXiv

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

It is well known that quantifying uncertainty in the action-value estimates is crucial for efficient exploration in reinforcement learning. Ensemble sampling offers a relatively computationally tractable way of doing this using randomized value functions. However, it still requires a huge amount of computational resources for complex problems. In this paper, we present an alternative, computationally efficient way to induce exploration using index sampling. We use an indexed value function to represent uncertainty in our action-value estimates. We first present an algorithm to learn parameterized indexed value function through a distributional version of temporal difference in a tabular setting and prove its regret bound. Then, in a computational point of view, we propose a dual-network architecture, Parameterized Indexed Networks (PINs), comprising one mean network and one uncertainty network to learn the indexed value function. Finally, we show the efficacy of PINs through computational experiments.