Source author record

Tianyu Yang

Tianyu Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Information Theory, math.IT, math.AP, cond-mat.mtrl-sci, Machine Learning, physics.med-ph, physics.optics

Catalog footprint

What is connected

15 works

8 topics

4 close collaborators

preprint, 2026, arXiv

A Novel Computational Framework for Causal Inference: Tree-Based Discretization with ILP-Based Matching

Causal inference is essential for data-driven decision-making, as it aims to uncover causal relationships from observational data. However, identifying causality remains challenging due to the potential for confounding and the distinction between correlation and causation. While recent advances in causal machine learning and matching algorithms have improved estimation accuracy, these methods often face trade-offs between interpretability and computational efficiency. This paper proposes a novel approach that combines a tree-based discretization technique, tailored for causal inference, with an integer linear programming-based matching algorithm. The discretization ensures approximately linear relationships for control datasets within strata, enabling effective matching, while the optimization framework optimizes for global balance. The resulting algorithm yields computational efficiency and less biased ATT estimates compared to state-of-the-art algorithms. Empirical evaluations demonstrate the proposed method's practical advantages over existing techniques in causal inference scenarios.

preprint, 2022, arXiv

Bone tumor suppression in rabbits by hyperthermia below the clinical safety limit using aligned magnetic bone cement

Demonstrating highly efficient alternating current (AC) magnetic field heating of nanoparticles in physiological environments under clinically safe field parameters has remained a great challenge, hindering clinical applications of magnetic hyperthermia. In this work, we report exceptionally high loss power of magnetic bone cement under the clinical safety limit of AC field parameters, incorporating DC field-aligned soft magnetic Zn0.3Fe2.7O4 nanoparticles at low concentration. Under an AC field of 4 kA/m at 430 kHz, the aligned bone cement with 0.2 wt% nanoparticles achieved a temperature increase of 30 °C in 180 s. This amounts to a specific loss power of 327 W/g_metal and an intrinsic loss power of 47 nH m^2/kg, which is enhanced 50-fold compared to randomly oriented samples. The high-performance magnetic bone cement allows for the demonstration of effective hyperthermia suppression of tumor growth in the bone marrow cavity of New Zealand White rabbits, which is subject to rapid cooling due to blood circulation, and significant enhancement of survival rate.
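The two loss-power figures quoted in this abstract can be cross-checked with a short calculation. The sketch below assumes the commonly used definition of intrinsic loss power, ILP = SLP / (H^2 f), which the abstract itself does not spell out; the function name `intrinsic_loss_power` is hypothetical and only serves the illustration.

```python
def intrinsic_loss_power(slp_w_per_g: float, field_a_per_m: float, freq_hz: float) -> float:
    """Return ILP in H m^2/kg, assuming ILP = SLP / (H^2 * f)."""
    slp_w_per_kg = slp_w_per_g * 1e3  # convert W/g to W/kg
    return slp_w_per_kg / (field_a_per_m ** 2 * freq_hz)


# Values quoted in the abstract: SLP = 327 W/g, H = 4 kA/m, f = 430 kHz.
ilp = intrinsic_loss_power(327.0, 4e3, 430e3)
ilp_nhm2_per_kg = ilp * 1e9  # convert H m^2/kg to nH m^2/kg
print(f"{ilp_nhm2_per_kg:.1f} nH m^2/kg")
```

Under that assumed definition the result is about 47.5 nH m^2/kg, in line with the roughly 47 nH m^2/kg reported.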

preprint, 2022, arXiv

Channel State Acquisition in FDD Massive MIMO: Rate-Distortion Bound and Effectiveness of "Analog" Feedback

We consider the problem of estimating channel fading coefficients (modeled as a correlated Gaussian vector) via Downlink (DL) training and Uplink (UL) feedback in wideband FDD massive MIMO systems. Using rate-distortion theory, we derive optimal bounds on the achievable channel state estimation error in terms of the number of training pilots in DL ($β_{tr}$) and feedback dimension in UL ($β_{fb}$), with random, spatially isotropic pilots. It is shown that when the number of training pilots exceeds the channel covariance rank ($r$), the optimal rate-distortion feedback strategy achieves an estimation error decay of $Θ(SNR^{-α})$ in estimating the channel state, where $α= min (β_{fb}/r , 1)$ is the so-called quality scaling exponent. We also discuss an "analog" feedback strategy, showing that it can achieve the optimal quality scaling exponent for a wide range of training and feedback dimensions with no channel covariance knowledge and simple signal processing at the user side. Our findings are supported by numerical simulations comparing various strategies in terms of channel state mean squared error and achievable ergodic sum-rate in DL with zero-forcing precoding.

preprint, 2022, arXiv

FDD Massive MIMO Channel Training Optimal Rate Distortion Bounds and the Efficiency of one-shot Schemes

We study the problem of providing channel state information (CSI) at the transmitter in multi-user massive MIMO systems operating in frequency division duplexing (FDD). The wideband MIMO channel is a vector-valued random process correlated in time, space (antennas), and frequency (subcarriers). The base station (BS) periodically broadcasts $β_{tr}$ pilot symbols from its $M$ antenna ports to $K$ single-antenna users (UEs). Correspondingly, the $K$ UEs send feedback messages about their channel state using $β_{fb}$ symbols in the uplink (UL). Using results from remote rate-distortion theory, we show that, as $snr → ∞$, the optimal feedback strategy achieves a channel state estimation mean squared error (MSE) that behaves as $Θ(1)$ if $β_{tr} < r$ and as $Θ(snr^{-α})$ if $β_{tr} ≥ r$, where $α = \min(β_{fb}/r, 1)$ and $r$ is the rank of the channel covariance matrix. The MSE-optimal rate-distortion strategy implies encoding of long sequences of channel states, which would yield completely stale CSI and therefore poor multiuser precoding performance. Hence, we consider three practical one-shot CSI strategies with minimum one-slot delay and analyze their large-SNR channel estimation MSE behavior. These are: (1) digital feedback via entropy-coded scalar quantization (ECSQ), (2) analog feedback (AF), and (3) local channel estimation at the UEs and digital feedback. These schemes have different requirements in terms of knowledge of the channel statistics at the UE and at the BS. In particular, the last strategy requires no statistical knowledge and is closely inspired by a CSI feedback scheme currently proposed in 3GPP standardization.
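The piecewise large-SNR behavior stated in this abstract can be written as a small helper. This is only an illustrative sketch of the stated scaling law, not the authors' code; the function name `quality_scaling_exponent` is hypothetical, and the Theta(1) regime is represented here as an exponent of 0.

```python
def quality_scaling_exponent(beta_tr: int, beta_fb: int, r: int) -> float:
    """Exponent alpha such that the optimal MSE scales as snr^(-alpha)."""
    if r <= 0:
        raise ValueError("covariance rank r must be positive")
    if beta_tr < r:
        # Too few training pilots: the MSE stays Theta(1), i.e. alpha = 0.
        return 0.0
    # Enough pilots: the feedback dimension limits the decay rate.
    return min(beta_fb / r, 1.0)


# Example with an assumed covariance rank of r = 8.
for beta_tr, beta_fb in [(4, 8), (8, 4), (16, 16)]:
    alpha = quality_scaling_exponent(beta_tr, beta_fb, r=8)
    print(f"beta_tr={beta_tr}, beta_fb={beta_fb}: alpha={alpha}")
```

The helper makes the two bottlenecks explicit: training below the covariance rank prevents any decay of the error, and feedback below the rank caps the decay rate under 1.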

preprint, 2022, arXiv

LocVTP: Video-Text Pre-training for Temporal Localization

Video-Text Pre-training (VTP) aims to learn transferable representations for various downstream tasks from large-scale web videos. To date, almost all existing VTP methods are limited to retrieval-based downstream tasks, e.g., video retrieval, whereas their transfer potentials on localization-based tasks, e.g., temporal grounding, are under-explored. In this paper, we experimentally analyze and demonstrate the incompatibility of current VTP methods with localization tasks, and propose a novel Localization-oriented Video-Text Pre-training framework, dubbed as LocVTP. Specifically, we perform the fine-grained contrastive alignment as a complement to the coarse-grained one by a clip-word correspondence discovery scheme. To further enhance the temporal reasoning ability of the learned feature, we propose a context projection head and a temporal aware contrastive loss to perceive the contextual relationships. Extensive experiments on four downstream tasks across six datasets demonstrate that our LocVTP achieves state-of-the-art performance on both retrieval-based and localization-based tasks. Furthermore, we conduct comprehensive ablation studies and thorough analyses to explore the optimum model designs and training strategies.

preprint, 2022, arXiv

Motion-aware Contrastive Video Representation Learning via Foreground-background Merging

In light of the success of contrastive learning in the image domain, current self-supervised video representation learning methods usually employ contrastive loss to facilitate video representation learning. When naively pulling two augmented views of a video closer, the model however tends to learn the common static background as a shortcut but fails to capture the motion information, a phenomenon dubbed as background bias. Such bias makes the model suffer from weak generalization ability, leading to worse performance on downstream tasks such as action recognition. To alleviate such bias, we propose Foreground-background Merging (FAME) to deliberately compose the moving foreground region of the selected video onto the static background of others. Specifically, without any off-the-shelf detector, we extract the moving foreground out of background regions via the frame difference and color statistics, and shuffle the background regions among the videos. By leveraging the semantic consistency between the original clips and the fused ones, the model focuses more on the motion patterns and is debiased from the background shortcut. Extensive experiments demonstrate that FAME can effectively resist background cheating and thus achieve the state-of-the-art performance on downstream tasks across UCF101, HMDB51, and Diving48 datasets. The code and configurations are released at https://github.com/Mark12Ding/FAME.

preprint, 2022, arXiv

Structured Channel Covariance Estimation from Limited Samples for Large Antenna Arrays

In massive multiple-input multiple-output (MIMO) systems, knowledge of the users' channel covariance matrix is crucial for minimum mean square error (MMSE) channel estimation in the uplink and plays an important role in several multiuser beamforming schemes in the downlink. Due to the large number of base station antennas in massive MIMO, accurate covariance estimation is challenging, especially when the number of samples is limited and thus comparable to the channel vector dimension. As a result, the standard sample covariance estimator may yield too large an estimation error, which in turn may cause significant system performance degradation with respect to the ideal channel covariance knowledge case. To address this problem, we propose a method based on a parametric representation of the channel angular scattering function. The proposed parametric representation includes a discrete specular component, which is addressed using the well-known MUltiple SIgnal Classification (MUSIC) method, and a diffuse scattering component, which is modeled as the superposition of suitable dictionary functions. To obtain the representation parameters we propose two methods, where the first solves a non-negative least-squares problem and the second maximizes the likelihood function using expectation-maximization. Our simulation results show that the proposed methods outperform the state of the art with respect to various estimation quality metrics and different sample sizes.

preprint, 2022, arXiv

SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization

Matching-based methods, especially those based on space-time memory, are significantly ahead of other solutions in semi-supervised video object segmentation (VOS). However, continuously growing and redundant template features lead to an inefficient inference. To alleviate this, we propose a novel Sequential Weighted Expectation-Maximization (SWEM) network to greatly reduce the redundancy of memory features. Different from the previous methods which only detect feature redundancy between frames, SWEM merges both intra-frame and inter-frame similar features by leveraging the sequential weighted EM algorithm. Further, adaptive weights for frame features endow SWEM with the flexibility to represent hard samples, improving the discrimination of templates. Besides, the proposed method maintains a fixed number of template features in memory, which ensures the stable inference complexity of the VOS system. Extensive experiments on commonly used DAVIS and YouTube-VOS datasets verify the high efficiency (36 FPS) and high performance (84.3% $\mathcal{J}\&\mathcal{F}$ on DAVIS 2017 validation dataset) of SWEM. Code is available at: https://github.com/lmm077/SWEM.

preprint, 2022, arXiv

Unsupervised Pre-training for Temporal Action Localization Tasks

Unsupervised video representation learning has made remarkable achievements in recent years. However, most existing methods are designed and optimized for video classification. These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization. To bridge this gap, we make the first attempt to propose a self-supervised pretext task, coined as Pseudo Action Localization (PAL) to Unsupervisedly Pre-train feature encoders for Temporal Action Localization tasks (UP-TAL). Specifically, we first randomly select temporal regions, each of which contains multiple clips, from one video as pseudo actions and then paste them onto different temporal positions of the other two videos. The pretext task is to align the features of pasted pseudo action regions from two synthetic videos and maximize the agreement between them. Compared to the existing unsupervised video representation learning approaches, our PAL adapts better to downstream TAL tasks by introducing a temporal equivariant contrastive learning paradigm in a temporally dense and scale-aware manner. Extensive experiments show that PAL can utilize large-scale unlabeled video data to significantly boost the performance of existing TAL methods. Our codes and models will be made publicly available at https://github.com/zhang-can/UP-TAL.

preprint, 2020, arXiv

A Non-Iterative Reconstruction Algorithm for the Acoustic Inverse Boundary Value Problem

We present a non-iterative algorithm to reconstruct the isotropic acoustic wave speed from the measurement of the Neumann-to-Dirichlet map. The algorithm is designed based on the boundary control method and involves only computations that are stable. We prove the convergence of the algorithm and present its numerical implementation. The effectiveness of the algorithm is validated on both constant speed and variable speed, with full and partial boundary measurement as well as different levels of noise.

preprint, 2020, arXiv

Dual-Polarized FDD Massive MIMO: A Comprehensive Framework

We propose a comprehensive scheme for realizing a massive multiple-input multiple-output (MIMO) system with dual-polarized antennas in frequency division duplexing (FDD) mode. Employing dual-polarized elements in a massive MIMO array has been common practice recently and can, in principle, double the number of spatial degrees of freedom with a less-than-proportional increase in array size. However, processing a dual-polarized channel is demanding due to the high channel dimension and the lack of Uplink-Downlink (UL-DL) channel reciprocity in FDD mode. In particular, the difficulty arises in channel covariance acquisition for both UL and DL transmissions and in common training of DL channels in a multi-user setup. To overcome these challenges, we develop a unified framework consisting of three steps: (1) a covariance estimation method to efficiently estimate the UL covariance from noisy, orthogonal UL pilots; (2) a UL-DL covariance transformation method that obtains the DL covariance from the estimated UL covariance in the previous step; (3) a multi-user common DL channel training with limited DL pilot dimension method, which enables the BS to estimate effective user DL channels and use them for interference-free DL beamforming and data transmission. We provide extensive empirical results to prove the applicability and merits of our scheme.

preprint, 2020, arXiv

ROAM: Recurrently Optimizing Tracking Model

In this paper, we design a tracking model consisting of response generation and bounding box regression, where the first component produces a heat map to indicate the presence of the object at different positions and the second part regresses the relative bounding box shifts to anchors mounted on sliding-window locations. Thanks to the resizable convolutional filters used in both components to adapt to the shape changes of objects, our tracking model does not need to enumerate different sized anchors, thus saving model parameters. To effectively adapt the model to appearance variations, we propose to offline train a recurrent neural optimizer to update the tracking model in a meta-learning setting, which can converge the model in a few gradient steps. This improves the convergence speed of updating the tracking model while achieving better performance. We extensively evaluate our trackers, ROAM and ROAM++, on the OTB, VOT, LaSOT, GOT-10K and TrackingNet benchmarks, and our methods perform favorably against state-of-the-art algorithms.

preprint, 2020, arXiv

Ultrasound Modulated Bioluminescence Tomography With A Single Optical Measurement

Ultrasound modulated bioluminescence tomography (UMBLT) is an imaging method which can be formulated as a hybrid inverse source problem. In the regime where light propagation is modeled by a radiative transfer equation, previous approaches to this problem require large numbers of optical measurements [10]. Here we propose an alternative solution for this inverse problem which requires only a single optical measurement in order to reconstruct the isotropic source. Specifically, we derive two inversion formulae based on Neumann series and Fredholm theory respectively, and prove their convergence under sufficient conditions. The resulting numerical algorithms are implemented and tested on reconstructing both continuous and discontinuous sources in the presence of noise.

preprint, 2016, arXiv

An Upper Bound on the Sum Capacity of the Downlink Multicell Processing with Finite Backhaul Capacity

In this paper, we study upper bounds on the sum capacity of the downlink multicell processing model with finite backhaul capacity for the simple case of 2 base stations and 2 mobile users. It is modelled as a two-user multiple access diamond channel. It consists of a first hop from the central processor to the base stations via orthogonal links of finite capacity, and the second hop from the base stations to the mobile users via a Gaussian interference channel. The converse is derived using the converse tools of the multiple access diamond channel and that of the Gaussian MIMO broadcast channel. Through numerical results, it is shown that our upper bound improves upon the existing upper bound greatly in the medium backhaul capacity range, and as a result, the gap between the upper bounds and the sum rate of the time-sharing of the known achievable schemes is significantly reduced.

preprint, 2015, arXiv

Dirac-Point Solitons in Nonlinear Optical Lattices

The discovery of a new type of soliton occurring in periodic systems without photonic bandgaps is reported. Solitons are nonlinear self-trapped wave packets. They have been extensively studied in many branches of physics. Solitons in periodic systems, which have become the mainstream of soliton research in the past decade, are localized states supported by photonic bandgaps. In this Letter, we report the discovery of a new type of soliton located at the Dirac point beyond photonic bandgaps. The Dirac point is a conical singularity of a photonic band structure where wave motion obeys the famous Dirac equation. These new solitons are sustained by the Dirac point rather than photonic bandgaps, and thus provide an advance in conceptual understanding over the traditional gap solitons. Apart from their theoretical impact within soliton theory, they have many potential uses because such solitons have dramatic stability characteristics and are possible in both Kerr material and photorefractive crystals that possess self-focusing and self-defocusing nonlinearities. The new results elegantly reveal that traditional photonic bandgaps are not required when Dirac points are accessible. The findings enrich the soliton family and provide valuable information for studies of nonlinear waves in many branches of physics, including hydrodynamics, plasma physics, and Bose-Einstein condensates.
