Researcher profile

Tsung-Hui Chang

Tsung-Hui Chang contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
21 works
0 followers
11 topics
4 close collaborators

Published work

21 published items

preprint · 2026 · arXiv

DeepFP: Deep-Unfolded Fractional Programming for MIMO Beamforming

This work proposes a mixed learning-based and optimization-based approach to the weighted-sum-rates beamforming problem in a multiple-input multiple-output (MIMO) wireless network. The conventional methods, i.e., the fractional programming (FP) method and the weighted minimum mean square error (WMMSE) algorithm, can be computationally demanding for two reasons: (i) they require inverting a sequence of matrices whose sizes are proportional to the number of antennas; (ii) they require tuning a set of Lagrange multipliers to account for the power constraints. The recently proposed method called the reduced WMMSE addresses the above two issues for a single cell. In contrast, for the multicell case, another recent method called the FastFP eliminates the large matrix inversion and the Lagrange multipliers by using an improved FP technique, but the update stepsize in the FastFP can be difficult to decide. As such, we propose integrating the deep unfolding network into the FastFP for the stepsize optimization. Numerical experiments show that the proposed method is much more efficient than the learning method based on the WMMSE algorithm.

preprint · 2026 · arXiv

PEMNet: Towards Autonomous and Enhanced Environment-Aware Mobile Networks

With 5G deployment and the evolution toward 6G, mobile networks must make decisions in highly dynamic environments under strict latency, energy, and spectrum constraints. Achieving this goal, however, depends on prior knowledge of spatial-temporal variations in wireless channels and traffic demands. This motivates a joint, site-specific representation of radio propagation and user demand that is queryable at low online overhead. In this work, we propose the perception embedding map (PEM), a localized framework that embeds fine-grained channel statistics together with grid-level spatial-temporal traffic patterns over a base station's coverage. PEM is built from standard-compliant measurements (such as measurement reports and scheduling/quality-of-service logs), so it can be deployed and maintained at scale with low cost. Integrated into PEM, this joint knowledge supports enhanced environment-aware optimization across PHY, MAC, and network layers while substantially reducing training overhead and signaling. Compared with existing site-specific channel maps and digital-twin replicas, PEM distinctively emphasizes (i) joint channel-traffic embedding, which is essential for network optimization, and (ii) practical construction using standard measurements, enabling network autonomy while striking a favorable fidelity-cost balance.

preprint · 2023 · arXiv

Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated Learning Framework

As a novel distributed learning paradigm, federated learning (FL) faces serious challenges in dealing with massive clients with heterogeneous data distributions and heterogeneous computation and communication resources. Various client-variance-reduction schemes and client sampling strategies have been introduced to improve the robustness of FL. Among others, primal-dual algorithms such as the alternating direction method of multipliers (ADMM) have been found to be resilient to data distribution and to outperform most of the primal-only FL algorithms. However, the reason behind this remains a mystery. In this paper, we first reveal that the federated ADMM is essentially a client-variance-reduced algorithm. While this explains the inherent robustness of federated ADMM, the vanilla version lacks the ability to adapt to the degree of client heterogeneity. Besides, the global model at the server under client sampling is biased, which slows down practical convergence. To go beyond ADMM, we propose a novel primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and the bias of the global model. In addition, FedVRA unifies several representative FL algorithms in the sense that they are either special instances of FedVRA or are close to it. Extensions of FedVRA to semi-supervised and unsupervised learning are also presented. Experiments based on (semi-)supervised image classification tasks demonstrate the superiority of FedVRA over the existing schemes in learning scenarios with massive heterogeneous clients and client sampling.

preprint · 2022 · arXiv

A Simple yet Effective Relation Information Guided Approach for Few-Shot Relation Extraction

Few-Shot Relation Extraction aims at predicting the relation for a pair of entities in a sentence by training with a few labelled examples in each relation. Some recent works have introduced relation information (i.e., relation labels or descriptions) to assist model learning based on the Prototype Network. However, most of them constrain the prototypes of each relation class implicitly with relation information, generally through designing complex network structures, such as generating hybrid features or combining with contrastive learning or attention networks. We argue that relation information can be introduced more explicitly and effectively into the model. Thus, this paper proposes a direct addition approach to introduce relation information. Specifically, for each relation class, the relation representation is first generated by concatenating two views of relations (i.e., the [CLS] token embedding and the mean value of the embeddings of all tokens) and then directly added to the original prototype for both training and prediction. Experimental results on the benchmark dataset FewRel 1.0 show significant improvements and achieve comparable results to the state-of-the-art, which demonstrates the effectiveness of our proposed approach. Besides, further analyses verify that the direct addition is a much more effective way to integrate the relation representations and the original prototypes.
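
The direct-addition step described in this abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the toy embeddings, and the assumption that the prototype already has the same dimension as the concatenated relation representation are all hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def relation_representation(cls_embedding, token_embeddings):
    """Concatenate the two views: the [CLS] embedding and the mean token embedding."""
    return list(cls_embedding) + mean_vector(token_embeddings)


def enhanced_prototype(prototype, cls_embedding, token_embeddings):
    """Add the relation representation directly to the original prototype.

    Assumption: the prototype has the same length as the concatenated
    relation representation (twice the hidden size in this toy setup).
    """
    relation = relation_representation(cls_embedding, token_embeddings)
    if len(relation) != len(prototype):
        raise ValueError("prototype and relation representation must have the same size")
    return [p + r for p, r in zip(prototype, relation)]


# Toy example with made-up numbers (hidden size 2, so the prototype has size 4).
prototype = [0.5, 0.5, 1.0, 1.0]
cls_embedding = [1.0, 2.0]
token_embeddings = [[0.0, 2.0], [2.0, 4.0]]
result = enhanced_prototype(prototype, cls_embedding, token_embeddings)
print(result)
```

In a real model the embeddings would come from a text encoder applied to the relation label or description; here they are hard-coded so that the arithmetic is easy to follow.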

preprint · 2022 · arXiv

Decentralized Non-Convex Learning with Linearly Coupled Constraints

Motivated by the need for decentralized learning, this paper aims at designing a distributed algorithm for solving nonconvex problems with general linear constraints over a multi-agent network. In the considered problem, each agent owns some local information and a local variable for jointly minimizing a cost function, but local variables are coupled by linear constraints. Most of the existing methods for such problems are only applicable to convex problems or problems with specific linear constraints. A distributed algorithm for such problems with general linear constraints in the nonconvex setting is still lacking. In this paper, to tackle this problem, we propose a new algorithm, called the "proximal dual consensus" (PDC) algorithm, which combines a proximal technique and a dual consensus method. We establish theoretical convergence conditions and show that the proposed PDC algorithm can converge to an $ε$-Karush-Kuhn-Tucker solution within $\mathcal{O}(1/ε)$ iterations. For computation reduction, the PDC algorithm can choose to perform cheap gradient descent per iteration while preserving the same order of $\mathcal{O}(1/ε)$ iteration complexity. Numerical results are presented to demonstrate the good performance of the proposed algorithms for solving a regression problem and a classification problem over a network where agents have only partial observations of data features.

preprint · 2022 · arXiv

Federated Stochastic Primal-dual Learning with Differential Privacy

Federated learning (FL) is a new paradigm that enables many clients to jointly train a machine learning (ML) model under the orchestration of a parameter server while keeping the local data from being exposed to any third party. However, the training of FL is an interactive process between local clients and the parameter server. Such a process can cause privacy leakage, since adversaries may retrieve sensitive information by analyzing the overheard messages. In this paper, we propose a new federated stochastic primal-dual algorithm with differential privacy (FedSPD-DP). Compared to the existing methods, the proposed FedSPD-DP incorporates local stochastic gradient descent (local SGD) and partial client participation (PCP) for addressing the issues of communication efficiency and straggler effects due to randomly accessed clients. Our analysis shows that the data sampling strategy and PCP can enhance the data privacy, whereas a larger number of local SGD steps could increase privacy leakage, revealing a non-trivial tradeoff between algorithm communication efficiency and privacy protection. Specifically, we show that, by guaranteeing $(ε, δ)$-DP for each client per communication round, the proposed algorithm guarantees $(\mathcal{O}(qε\sqrt{p T}), δ)$-DP after $T$ communication rounds while maintaining an $\mathcal{O}(1/\sqrt{pTQ})$ convergence rate for a convex and non-smooth learning problem, where $Q$ is the number of local SGD steps, $p$ is the client sampling probability, $q=\max_{i} q_i/\sqrt{1-q_i}$ and $q_i$ is the data sampling probability of each client under PCP. Experimental results are presented to evaluate the practical performance of the proposed algorithm and to compare it with state-of-the-art methods.
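
To get a feel for how the stated privacy bound scales, the quantities named in this abstract can be evaluated numerically. The sketch below is illustrative only: the function names are hypothetical, and the constant hidden inside the $\mathcal{O}(\cdot)$ notation is assumed to be 1, so the returned number shows the growth trend rather than an actual privacy guarantee.

```python
import math


def worst_case_sampling_factor(data_sampling_probs):
    """Compute q = max_i q_i / sqrt(1 - q_i), as defined in the abstract."""
    for q_i in data_sampling_probs:
        if not 0.0 <= q_i < 1.0:
            raise ValueError("each data sampling probability must lie in [0, 1)")
    return max(q_i / math.sqrt(1.0 - q_i) for q_i in data_sampling_probs)


def epsilon_growth(per_round_epsilon, data_sampling_probs, client_prob, rounds):
    """Evaluate q * eps * sqrt(p * T), with the hidden constant assumed to be 1."""
    q = worst_case_sampling_factor(data_sampling_probs)
    return q * per_round_epsilon * math.sqrt(client_prob * rounds)


# Made-up example: two clients, per-round epsilon 0.1, p = 0.25, T = 100 rounds.
q = worst_case_sampling_factor([0.1, 0.75])
trend = epsilon_growth(0.1, [0.1, 0.75], 0.25, 100)
print(q, trend)
```

The expression makes the tradeoff visible: lowering the client sampling probability $p$ or the data sampling probabilities $q_i$ shrinks the bound, while more communication rounds $T$ enlarge it at a square-root rate.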

preprint · 2022 · arXiv

Generating Radiology Reports via Memory-driven Transformer

Medical imaging is frequently used in clinical practice and trials for diagnosis and treatment. Writing imaging reports is time-consuming and can be error-prone for inexperienced radiologists. Therefore, automatically generating radiology reports is highly desired to lighten the workload of radiologists and accordingly promote clinical automation, which is an essential task in applying artificial intelligence to the medical domain. In this paper, we propose to generate radiology reports with a memory-driven Transformer, where a relational memory is designed to record key information of the generation process and a memory-driven conditional layer normalization is applied to incorporate the memory into the decoder of the Transformer. Experimental results on two prevailing radiology report datasets, IU X-Ray and MIMIC-CXR, show that our proposed approach outperforms previous models with respect to both language generation metrics and clinical evaluations. Particularly, this is the first work reporting the generation results on MIMIC-CXR to the best of our knowledge. Further analyses also demonstrate that our approach is able to generate long reports with necessary medical terms as well as meaningful image-text attention mappings.

preprint · 2022 · arXiv

Graph Enhanced Contrastive Learning for Radiology Findings Summarization

The impression section of a radiology report summarizes the most prominent observation from the findings section and is the most important section for radiologists to communicate to physicians. Summarizing findings is time-consuming and can be prone to error for inexperienced radiologists, and thus automatic impression generation has attracted substantial attention. With the encoder-decoder framework, most previous studies explore incorporating extra knowledge (e.g., static pre-defined clinical ontologies or extra background information). Yet, they encode such knowledge with a separate encoder and treat it as an extra input to their models, which limits their ability to leverage its relations with the original findings. To address the limitation, we propose a unified framework for exploiting both extra knowledge and the original findings in an integrated way so that the critical information (i.e., key words and their relations) can be extracted in an appropriate way to facilitate impression generation. In detail, each input findings section is encoded by a text encoder, and a graph is constructed through its entities and dependency tree. Then, a graph encoder (e.g., graph neural networks (GNNs)) is adopted to model relation information in the constructed graph. Finally, to emphasize the key words in the findings, contrastive learning is introduced to map positive samples (constructed by masking non-key words) closer and push apart negative ones (constructed by masking key words). The experimental results on OpenI and MIMIC-CXR confirm the effectiveness of our proposed method.
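
The construction of positive and negative samples mentioned in this abstract can be illustrated with a small token-level sketch. Everything specific here is an assumption made for illustration: the `[MASK]` placeholder, the function names, the example sentence, and the hand-picked key-word positions (in the paper the key words come from the constructed graph, and masking may operate on representations rather than raw tokens).

```python
MASK = "[MASK]"  # hypothetical placeholder token


def mask_tokens(tokens, positions):
    """Return a copy of tokens with the given positions replaced by MASK."""
    positions = set(positions)
    return [MASK if i in positions else tok for i, tok in enumerate(tokens)]


def build_contrastive_views(tokens, key_positions):
    """Build one positive and one negative view of a findings sentence.

    Positive view: non-key words are masked, so the key words survive.
    Negative view: key words are masked, so the critical content is lost.
    """
    key_positions = set(key_positions)
    non_key_positions = set(range(len(tokens))) - key_positions
    positive = mask_tokens(tokens, non_key_positions)
    negative = mask_tokens(tokens, key_positions)
    return positive, negative


# Made-up findings sentence with hand-picked key-word positions.
tokens = ["mild", "cardiomegaly", "without", "pleural", "effusion"]
key_positions = [1, 3, 4]
positive, negative = build_contrastive_views(tokens, key_positions)
print(positive)
print(negative)
```

A contrastive objective would then pull the encoding of the original sentence toward the positive view and push it away from the negative one, which is what encourages the model to focus on the key words.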

preprint · 2022 · arXiv

Hero-Gang Neural Model For Named Entity Recognition

Named entity recognition (NER) is a fundamental and important task in NLP, aiming at identifying named entities (NEs) from free text. Recently, since the multi-head attention mechanism applied in the Transformer model can effectively capture longer contextual information, Transformer-based models have become the mainstream methods and have achieved significant performance in this task. Unfortunately, although these models can capture effective global context information, they are still limited in the local feature and position information extraction, which is critical in NER. In this paper, to address this limitation, we propose a novel Hero-Gang Neural structure (HGN), including the Hero and Gang module, to leverage both global and local information to promote NER. Specifically, the Hero module is composed of a Transformer-based encoder to maintain the advantage of the self-attention mechanism, and the Gang module utilizes a multi-window recurrent module to extract local features and position information under the guidance of the Hero module. Afterward, the proposed multi-window attention effectively combines global information and multiple local features for predicting entity labels. Experimental results on several benchmark datasets demonstrate the effectiveness of our proposed model.

preprint · 2022 · arXiv

Quantized Federated Learning under Transmission Delay and Outage Constraints

Federated learning (FL) has been recognized as a viable distributed learning paradigm which trains a machine learning model collaboratively with massive mobile devices in the wireless edge while protecting user privacy. Although various communication schemes have been proposed to expedite the FL process, most of them have assumed ideal wireless channels which provide reliable and lossless communication links between the server and mobile clients. Unfortunately, in practical systems with limited radio resources such as constraint on the training latency and constraints on the transmission power and bandwidth, transmission of a large number of model parameters inevitably suffers from quantization errors (QE) and transmission outage (TO). In this paper, we consider such non-ideal wireless channels, and carry out the first analysis showing that the FL convergence can be severely jeopardized by TO and QE, but intriguingly can be alleviated if the clients have uniform outage probabilities. These insightful results motivate us to propose a robust FL scheme, named FedTOE, which performs joint allocation of wireless resources and quantization bits across the clients to minimize the QE while making the clients have the same TO probability. Extensive experimental results are presented to show the superior performance of FedTOE for deep learning-based classification tasks with transmission latency constraints.

preprint · 2021 · arXiv

Learning to Continuously Optimize Wireless Resource in a Dynamic Environment: A Bilevel Optimization Perspective

There has been a growing interest in developing data-driven, and in particular deep neural network (DNN) based, methods for modern communication tasks. For a few popular tasks such as power control, beamforming, and MIMO detection, these methods achieve state-of-the-art performance while requiring less computational effort, fewer resources for acquiring channel state information (CSI), etc. However, it is often challenging for these approaches to learn in a dynamic environment. This work develops a new approach that enables data-driven methods to continuously learn and optimize resource allocation strategies in a dynamic environment. Specifically, we consider an "episodically dynamic" setting where the environment statistics change in "episodes", and in each episode the environment is stationary. We propose to build the notion of continual learning (CL) into wireless system design, so that the learning model can incrementally adapt to the new episodes without forgetting knowledge learned from the previous episodes. Our design is based on a novel bilevel optimization formulation which ensures certain "fairness" across different data samples. We demonstrate the effectiveness of the CL approach by integrating it with two popular DNN based models for power control and beamforming, respectively, and testing using both synthetic and ray-tracing based data sets. These numerical results show that the proposed CL approach is not only able to adapt to the new scenarios quickly and seamlessly, but importantly, it also maintains high performance over the previously encountered scenarios.

preprint · 2020 · arXiv

Distributed Learning in the Non-Convex World: From Batch to Streaming Data, and Beyond

Distributed learning has become a critical enabler of the massively connected world envisioned by many. This article discusses four key elements of scalable distributed processing and real-time intelligence: problems, data, communication and computation. Our aim is to provide a fresh and unique perspective about how these elements should work together in an effective and coherent manner. In particular, we provide a selective review about the recent techniques developed for optimizing non-convex models (i.e., problem classes), processing batch and streaming data (i.e., data types), over the networks in a distributed manner (i.e., communication and computation paradigm). We describe the intuitions and connections behind a core set of popular distributed algorithms, emphasizing how to trade off between computation and communication costs. Practical issues and future research directions will also be discussed.

preprint · 2020 · arXiv

Learning Structured Communication for Multi-agent Reinforcement Learning

This work explores the large-scale multi-agent communication mechanism under a multi-agent reinforcement learning (MARL) setting. We summarize the general categories of topology for communication structures in the MARL literature, which are often manually specified. Then we propose a novel framework, termed Learning Structured Communication (LSC), that uses a more flexible and efficient communication topology. Our framework allows for adaptive agent grouping to form different hierarchical formations over episodes, which is generated by an auxiliary task combined with a hierarchical routing protocol. Given each formed topology, a hierarchical graph neural network is learned to enable effective message information generation and propagation among inter- and intra-group communications. In contrast to existing communication mechanisms, our method has an explicit yet learnable design for hierarchical communication. Experiments on challenging tasks show the proposed LSC enjoys high communication efficiency, scalability, and global cooperation capability.

preprint · 2013 · arXiv

Optimal Real-time Spectrum Sharing between Cooperative Relay and Ad-hoc Networks

Optimization based spectrum sharing strategies have been widely studied. However, these strategies usually require a great amount of real-time computation and incur significant signaling delay, and thus are hard to realize in practical scenarios. This paper investigates optimal real-time spectrum sharing between a cooperative relay network (CRN) and a nearby ad-hoc network. Specifically, we optimize the spectrum access and resource allocation strategies of the CRN so that the average traffic collision time between the two networks can be minimized while maintaining a required throughput for the CRN. The development is first for a frame-level setting, and then is extended to an ergodic setting. For the latter setting, we propose an appealing optimal real-time spectrum sharing strategy via Lagrangian dual optimization. The proposed method only involves a small amount of real-time computation and negligible control delay, and thus is suitable for practical implementations. Simulation results are presented to demonstrate the efficiency of the proposed strategies.

preprint · 2013 · arXiv

Real-Time Power Balancing via Decentralized Coordinated Home Energy Scheduling

It is anticipated that an uncoordinated operation of individual home energy management (HEM) systems in a neighborhood would have a rebound effect on the aggregate demand profile. To address this issue, this paper proposes a coordinated home energy management (CoHEM) architecture in which distributed HEM units collaborate with each other in order to keep the demand and supply balanced in their neighborhood. Assuming the energy requests by customers are random in time, we formulate the proposed CoHEM design as a multi-stage stochastic optimization problem. We propose novel models to describe the deferrable appliance load (e.g., Plug-in (Hybrid) Electric Vehicles (PHEV)), and apply approximation and decomposition techniques to handle the considered design problem in a decentralized fashion. The developed decentralized CoHEM algorithm allows the customers to locally compute their scheduling solutions using domestic user information and with message exchange with their neighbors only. Extensive simulation results demonstrate that the proposed CoHEM architecture can effectively improve real-time power balancing. Extensions to joint power procurement and real-time CoHEM scheduling are also presented.

preprint · 2013 · arXiv

Simultaneous Information and Energy Transfer: A Two-User MISO Interference Channel Case

This paper considers the sum rate maximization problem of a two-user multiple-input single-output interference channel with receivers that can scavenge energy from the radio signals transmitted by the transmitters. We first study the optimal transmission strategy for an ideal scenario where the two receivers can simultaneously decode the information signal and harvest energy. Then, considering the limitations of the current circuit technology, we propose two practical schemes based on TDMA, where, at each time slot, the receiver either operates in the energy harvesting mode or in the information detection mode. Optimal transmission strategies for the two practical schemes are respectively investigated. Simulation results show that the three schemes exhibit interesting tradeoff between achievable sum rate and energy harvesting requirement, and do not dominate each other in terms of maximum achievable sum rate.

preprint · 2012 · arXiv

Coordinated Home Energy Management for Real-Time Power Balancing

This paper proposes a coordinated home energy management system (HEMS) architecture where the distributed residential units cooperate with each other to achieve real-time power balancing. The economic benefits for the retailer and incentives for the customers to participate in the proposed coordinated HEMS program are given. We formulate the coordinated HEMS design problem as a dynamic programming (DP) problem and use approximate DP approaches to efficiently handle the design problem. A distributed implementation algorithm based on the convex optimization based dual decomposition technique is also presented. Our focus in the current paper is on the deferrable appliances, such as Plug-in (Hybrid) Electric Vehicles (PHEV), in view of their higher impact on the grid stability. Simulation results show that the proposed coordinated HEMS architecture can efficiently improve the real-time power balancing.

preprint · 2011 · arXiv

Outage Constrained Robust Transmit Optimization for Multiuser MISO Downlinks: Tractable Approximations by Conic Optimization

In this paper we consider a probabilistic signal-to-interference-and-noise ratio (SINR) constrained problem for transmit beamforming design in the presence of imperfect channel state information (CSI), under a multiuser multiple-input single-output (MISO) downlink scenario. In particular, we deal with outage-based quality-of-service constraints, where the probability of each user's SINR not satisfying a service requirement must not exceed a given outage probability specification. The study of solution approaches to the probabilistic SINR constrained problem is important because CSI errors are often present in practical systems and they may cause substantial SINR outages if not handled properly. However, a major technical challenge is how to process the probabilistic SINR constraints. To tackle this, we propose a novel relaxation-restriction (RAR) approach, which consists of two key ingredients: semidefinite relaxation (SDR), and analytic tools for conservatively approximating probabilistic constraints. The underlying goal is to establish approximate probabilistic SINR constrained formulations in the form of convex conic optimization problems, so that they can be readily implemented by available solvers. Using either an intuitive worst-case argument or specialized probabilistic results, we develop various conservative approximation schemes for processing probabilistic constraints with quadratic uncertainties. Consequently, we obtain several RAR alternatives for handling the probabilistic SINR constrained problem. Our techniques apply to both complex Gaussian CSI errors and i.i.d. bounded CSI errors with unknown distribution. Moreover, results obtained from our extensive simulations show that the proposed RAR methods significantly improve upon existing ones, both in terms of solution quality and computational complexity.

preprint · 2011 · arXiv

Worst-Case SINR Constrained Robust Coordinated Beamforming for Multicell Wireless Systems

Multicell coordinated beamforming (MCBF) has been recognized as a promising approach to enhancing the system throughput and spectrum efficiency of wireless cellular systems. In contrast to the conventional single-cell beamforming (SBF) design, MCBF jointly optimizes the beamforming vectors of cooperative base stations (BSs) (via a central processing unit (CPU)) in order to mitigate the intercell interference. While most of the existing designs assume that the CPU has perfect knowledge of the channel state information (CSI) of mobile stations (MSs), this paper takes into account the inevitable CSI errors at the CPU, and studies the robust MCBF design problem. Specifically, we consider the worst-case robust design formulation that minimizes the weighted sum transmission power of BSs subject to worst-case signal-to-interference-plus-noise ratio (SINR) constraints on MSs. The associated optimization problem is challenging because it involves infinitely many nonconvex SINR constraints. In this paper, we show that the worst-case SINR constraints can be reformulated as linear matrix inequalities, and the approximation method known as semidefinite relaxation can be used to efficiently handle the worst-case robust MCBF problem. Simulation results show that the proposed robust MCBF design can provide guaranteed SINR performance for the MSs and outperforms the robust SBF design.

preprint · 2010 · arXiv

A convex approximation approach to Weighted Sum Rate Maximization of Multiuser MISO Interference Channel under outage constraints

This paper considers weighted sum rate maximization of the multiuser multiple-input single-output interference channel (MISO-IFC) under outage constraints. The outage-constrained weighted sum rate maximization problem is a nonconvex optimization problem and is difficult to solve. While it is possible to optimally deal with this problem in an exhaustive search manner by finding all the Pareto-optimal rate tuples in the (discretized) outage-constrained achievable rate region, this approach suffers from a prohibitive computational complexity and is feasible only when the number of transmitter-receiver pairs is small. In this paper, we propose a convex optimization based approximation method for efficiently handling the outage-constrained weighted sum rate maximization problem. The proposed approximation method consists of solving a sequence of convex optimization problems, and thus can be efficiently implemented by interior-point methods. Simulation results show that the proposed method can yield near-optimal solutions.

preprint · 2010 · arXiv

Probabilistic SINR Constrained Robust Transmit Beamforming: A Bernstein-Type Inequality Based Conservative Approach

Recently, robust transmit beamforming has drawn considerable attention because it can provide guaranteed receiver performance in the presence of channel state information (CSI) errors. Assuming complex Gaussian distributed CSI errors, this paper investigates the robust beamforming design problem that minimizes the transmission power subject to probabilistic signal-to-interference-plus-noise ratio (SINR) constraints. The probabilistic SINR constraints in general have no closed-form expression and are difficult to handle. Based on a Bernstein-type inequality of complex Gaussian random variables, we propose a conservative formulation to the robust beamforming design problem. The semidefinite relaxation technique can be applied to efficiently handle the proposed conservative formulation. Simulation results show that, in comparison with the existing methods, the proposed method is more power efficient and is able to support higher target SINR values for receivers.