Source author record

Fan Cheng

Fan Cheng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Information Theory, math.IT, Computer Vision, Machine Learning, Applications, Computation, Cryptography and Security, eess.SY, Information Retrieval, math-ph, math.AG, math.AP, math.MP, Systems and Control

Catalog footprint

What is connected

14 works

14 topics

4 close collaborators

preprint, 2026, arXiv

Coding for Fading Channels with Imperfect CSI at the Transmitter and Quantized Feedback

The classical Schalkwijk-Kailath (SK) scheme for the additive Gaussian noise channel with noiseless feedback is highly efficient: its coding complexity is extremely low and the decoding error decays doubly exponentially as the coding blocklength tends to infinity. However, how to extend the SK scheme to channel models with memory has yet to be solved. In this paper, we first investigate how to design an SK-type scheme for the 2-path quasi-static fading channel with noiseless feedback. By viewing the signal of the second path as a relay and adopting an amplify-and-forward (AF) relay strategy, we show that the interference path signal can help to enhance the transmission rate. In addition, for an arbitrary multi-path fading channel with feedback, we present an SK-type scheme that transforms the time-domain channel into a frequency-domain MIMO channel.
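As a point of reference for the abstract above, the following is a minimal sketch of the classical SK recursion on a memoryless Gaussian channel, not the fading-channel scheme proposed in the paper. The function name `sk_error_variances`, the uniform message point on [-0.5, 0.5], and all parameter values are illustrative assumptions.

```python
import numpy as np

def sk_error_variances(power, noise_var, rounds, trials=20000, seed=0):
    """Simulate the classical SK recursion (AWGN channel, noiseless feedback).

    Returns (empirical, predicted) mean-square estimation errors per round.
    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = rng.uniform(-0.5, 0.5, size=trials)  # message points
    var_theta = 1.0 / 12.0                       # variance of Uniform(-0.5, 0.5)
    noise_std = np.sqrt(noise_var)

    # Round 1: send the scaled message point; the receiver inverts the scaling.
    gain = np.sqrt(power / var_theta)
    y = gain * theta + rng.normal(0.0, noise_std, size=trials)
    estimate = y / gain
    alpha = noise_var / gain**2                  # error variance after round 1

    empirical = [np.mean((estimate - theta) ** 2)]
    predicted = [alpha]
    for _ in range(rounds - 1):
        # Feedback lets the transmitter compute the receiver's current error.
        error = estimate - theta
        x = np.sqrt(power / alpha) * error       # rescaled to average power `power`
        y = x + rng.normal(0.0, noise_std, size=trials)
        # Linear MMSE estimate of the error given the channel output.
        error_hat = np.sqrt(power * alpha) / (power + noise_var) * y
        estimate = estimate - error_hat
        alpha = alpha * noise_var / (power + noise_var)
        empirical.append(np.mean((estimate - theta) ** 2))
        predicted.append(alpha)
    return np.array(empirical), np.array(predicted)

empirical, predicted = sk_error_variances(power=1.0, noise_var=1.0, rounds=6)
for n, (e, p) in enumerate(zip(empirical, predicted), start=1):
    print(f"round {n}: empirical {e:.5f}, predicted {p:.5f}")
```

Each round shrinks the error variance by the factor noise_var / (power + noise_var), which is why the error probability for a fixed rate below capacity falls off doubly exponentially in the number of rounds.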

preprint, 2026, arXiv

Energy-variational solutions for geodynamical two-phase flows -- From logarithmic to double-obstacle potentials by variational convergence

In [Cheng, Lasarzik, Thomas 2025 ARXIV-Preprint 2509.25508], we studied a Cahn--Hilliard two-phase model describing the flow of two viscoelastoplastic fluids in the framework of dissipative solutions using a logarithmic potential for the phase-field variable. This choice of potential has the effect that the fluid mixture cannot fully separate into two pure phases. The notion of dissipative solutions is based on a relative energy-dissipation inequality featuring a suitable regularity weight. In this way, this is a very weak solution concept. In the present work, we study the well-posedness of the geodynamical two-phase flow in the notion of energy-variational solutions. They feature an additional scalar energy variable that majorizes the system energy along solutions and they are further characterized by a variational inequality that combines an energy-dissipation estimate with the weak formulation of the system adding an error term that accounts for the mismatch between the energy variable and the system energy multiplied by a suitable regularity weight. We give a comparison of these two concepts. We further study different phase-field potentials for the geodynamical two-phase flow model. In particular, we address the variational limit from a potential with a logarithmic contribution to a double-obstacle potential, then also allowing for the emergence of pure phases. This study underlines that, thanks to its structure, the energy-variational solution is better suited for variational convergence methods than the dissipative solution.

preprint, 2026, arXiv

Incantation: Natural Language as the Action Interface for Multi-Entity Video World Models

Modern interactive video world models have achieved impressive visual fidelity, yet lack fine-grained multi-entity control and cross-entity, cross-world generalization. We trace this gap to the action interface: standard control protocols (e.g. animation IDs, device inputs, scene-level captions) bind action semantics to specific entities or engines at design time. We propose natural language as the interface to unlock expressiveness that no prior interface can achieve, and we present Incantation, the first interactive video world model with per-latent-frame (0.25 s) natural-language conditioning that supports simultaneous multi-entity control and concept-level cross-entity transfer beyond any fixed rendering pipeline. We pair a pretrained bidirectional video backbone with frame-local text cross-attention, and enable real-time long-horizon streaming through ODE-initialized Self-Forcing distillation with a RoPE-decoupled sliding KV-cache. We surpass the Action-Index baseline on cross-entity transfer (89% vs. 43%) and out-of-vocabulary prompts (90% vs. 0%), and our 2-step student sustains 19.7 FPS at 480p with stable FVD over 2-hour rollouts. We further apply the same architecture and training recipe to The King of Fighters, changing only the per-entity action vocabulary slots. We have released a preview subset of the Incantation dataset at https://huggingface.co/datasets/zhush/incantation-elden-ring-scenes, containing manually collected Elden Ring player-boss combat clips with structured action-oriented metadata. Larger-scale Elden Ring and KOF data will be released with the full project.

preprint, 2026, arXiv

TIE: Time Interval Encoding for Video Generation over Events

Director-style prompting, robotic action prediction, and interactive video agents demand temporal grounding over concurrent events -- a regime in which 68% of general clips and over 99% of robotics/gameplay clips contain overlapping events, yet existing multi-event generators rest on a single-active-prompt assumption. However, modern video generators, such as Diffusion Transformers (DiT), represent time as discrete points through point-wise positional encodings. This formulation creates a fundamental dimension mismatch: temporally extended intervals and overlapping events are mathematically unrepresentable to the attention mechanism. In this paper, we propose Time Interval Encoding (TIE), a principled, plug-and-play interval-aware generalization of rotary embeddings that elevates time intervals to first-class primitives inside DiT cross-attention. Rather than introducing another heuristic interval embedding, we show that, within RoPE-compatible bilinear attention, TIE is characterized by two basic principles: Temporal Integrability, which requires an event to aggregate positional evidence over its full duration, and Duration Invariance, which removes the trivial bias toward longer intervals. Under a uniform kernel, this characterization yields an efficient closed-form sinc-based solution that preserves the standard attention interface and naturally attenuates boundary noise through interval integration. Empirically, TIE preserves the visual quality of the base DiT model while substantially improving temporal controllability. In our experiments on the OmniEvents dataset, it improves human-verified Temporal Constraint Satisfaction Rate from 77.34% to 96.03% and reduces temporal boundary error from 0.261s to 0.073s, while also improving trajectory-level temporal alignment metrics. The code and dataset are available at https://github.com/MatrixTeam-AI/TIE.
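To make the "closed-form sinc-based solution" more concrete, here is a minimal sketch of the underlying identity: averaging a rotary phase exp(i*omega*t) uniformly over an interval gives the phase at the interval midpoint times a sinc factor in the duration. This is a generic calculation under a uniform kernel and is not claimed to be the exact TIE formulation; the function names and the choice to normalise by duration are assumptions.

```python
import numpy as np

def interval_phase_closed_form(omega, start, end):
    """Uniform average of exp(1j*omega*t) over [start, end], in closed form."""
    center = 0.5 * (start + end)
    duration = end - start
    # np.sinc(x) = sin(pi*x)/(pi*x), so rescale to obtain sin(u)/u
    # with u = omega*duration/2.
    return np.exp(1j * omega * center) * np.sinc(omega * duration / (2.0 * np.pi))

def interval_phase_numeric(omega, start, end, samples=20000):
    """The same average, approximated with a midpoint rule for comparison."""
    duration = end - start
    t = start + (np.arange(samples) + 0.5) * duration / samples
    return np.mean(np.exp(1j * omega * t))

for omega in (0.0, 0.3, 2.0, 10.0):
    exact = interval_phase_closed_form(omega, 1.0, 2.5)
    approx = interval_phase_numeric(omega, 1.0, 2.5)
    print(omega, abs(exact - approx))
```

In this sketch the sinc factor shrinks as omega times duration grows, so high-frequency components contribute less for long intervals, while dividing by the duration keeps the magnitude bounded by one regardless of interval length.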

preprint, 2024, arXiv

DAFD: Domain Adaptation via Feature Disentanglement for Image Classification

A good feature representation is the key to image classification. In practice, image classifiers may be applied in scenarios different from what they have been trained on. This so-called domain shift leads to a significant performance drop in image classification. Unsupervised domain adaptation (UDA) reduces the domain shift by transferring the knowledge learned from a labeled source domain to an unlabeled target domain. We perform feature disentanglement for UDA by distilling category-relevant features and excluding category-irrelevant features from the global feature maps. This disentanglement prevents the network from overfitting to category-irrelevant information and makes it focus on information useful for classification. This reduces the difficulty of domain alignment and improves the classification accuracy on the target domain. We propose a coarse-to-fine domain adaptation method called Domain Adaptation via Feature Disentanglement (DAFD), which has two components: (1) the Category-Relevant Feature Selection (CRFS) module, which disentangles the category-relevant features from the category-irrelevant features, and (2) the Dynamic Local Maximum Mean Discrepancy (DLMMD) module, which achieves fine-grained alignment by reducing the discrepancy within the category-relevant features from different domains. Combined with the CRFS, the DLMMD module can align the category-relevant features properly. We conduct comprehensive experiments on four standard datasets. Our results clearly demonstrate the robustness and effectiveness of our approach in domain adaptive image classification tasks and its competitiveness with the state of the art.

preprint, 2022, arXiv

A Reformulation of Gaussian Completely Monotone Conjecture: A Hodge Structure on the Fisher Information along Heat Flow

In the past decade, J. Huh solved several long-standing open problems on log-concave sequences in combinatorics. The ground-breaking techniques developed in those works are from algebraic geometry: "We believe that behind any log-concave sequence that appears in nature there is such a Hodge structure responsible for the log-concavity". A function is called completely monotone if its derivatives alternate in signs; e.g., $e^{-t}$. A fundamental conjecture in mathematical physics and Shannon information theory is on the complete monotonicity of Gaussian distribution (GCMC), which states that $I(X+Z_t)$ is completely monotone in $t$, where $I$ is Fisher information, random variables $X$ and $Z_t$ are independent and $Z_t\sim\mathcal{N}(0,t)$ is Gaussian. (The probability density function of $X+Z_t$ is called "heat flow" in mathematical physics.) Inspired by the algebraic geometry method introduced by J. Huh, GCMC is reformulated in the form of a log-convex sequence. In general, a completely monotone function can admit a log-convex sequence and a log-convex sequence can further induce a log-concave sequence. The new formulation may guide GCMC to the marvelous temple of algebraic geometry. Moreover, to make GCMC more accessible to researchers from both information theory and mathematics, together with some new findings, a thorough summary of the origin, the implication and further study on GCMC is presented. (The author was not familiar with algebraic geometry. The paper is also aimed at providing people outside information theory with the necessary background on the history of GCMC in theory and application.)

preprint, 2022, arXiv

Computationally Efficient Learning of Statistical Manifolds

Analyzing high-dimensional data with manifold learning algorithms often requires searching for the nearest neighbors of all observations. This presents a computational bottleneck in statistical manifold learning when observations of probability distributions rather than vector-valued variables are available or when data size is large. We resolve this problem by proposing a new method for approximation in statistical manifold learning. The novelty of our approximation is the strongly consistent distance estimators based on independent and identically distributed samples from probability distributions. By exploiting the connection between Hellinger/total variation distance for discrete distributions and the L2/L1 norm, we demonstrate that the proposed distance estimators, combined with approximate nearest neighbor searching, could largely improve the computational efficiency with little to no loss in the accuracy of manifold embedding. The result is robust to different manifold learning algorithms and different approximate nearest neighbor algorithms. The proposed method is applied to learning statistical manifolds of electricity usage. This application demonstrates how underlying structures in high dimensional data, including anomalies, can be visualized and identified, in a way that is scalable to large datasets.
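The connection mentioned above can be sketched as follows: for discrete distributions, the Hellinger distance is a scaled L2 distance between square-root probability vectors, and the total variation distance is half the L1 distance. The plug-in estimator from empirical frequencies shown here is an illustrative assumption and may differ from the estimators analysed in the paper; all function names are hypothetical.

```python
import numpy as np

def empirical_pmf(samples, num_categories):
    """Relative frequencies of integer-coded i.i.d. samples."""
    counts = np.bincount(samples, minlength=num_categories)
    return counts / counts.sum()

def hellinger(p, q):
    """Hellinger distance as an L2 norm of square-root probability vectors."""
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2.0)

def total_variation(p, q):
    """Total variation distance as half the L1 norm of the difference."""
    return 0.5 * np.abs(p - q).sum()

p = np.array([0.5, 0.5, 0.0])
q = np.array([0.0, 0.5, 0.5])

rng = np.random.default_rng(0)
p_hat = empirical_pmf(rng.choice(3, size=200_000, p=p), 3)
q_hat = empirical_pmf(rng.choice(3, size=200_000, p=q), 3)

print(hellinger(p, q), hellinger(p_hat, q_hat))
print(total_variation(p, q), total_variation(p_hat, q_hat))
```

Because both distances reduce to ordinary vector norms, the transformed frequency vectors can be handed to an off-the-shelf approximate nearest neighbor index that supports L2 or L1 metrics.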

preprint, 2022, arXiv

FORTAP: Using Formulas for Numerical-Reasoning-Aware Table Pretraining

Tables store rich numerical data, but numerical reasoning over tables is still a challenge. In this paper, we find that the spreadsheet formula, which performs calculations on numerical values in tables, is naturally a strong supervision of numerical reasoning. More importantly, large amounts of spreadsheets with expert-made formulae are available on the web and can be obtained easily. FORTAP is the first method for numerical-reasoning-aware table pretraining by leveraging a large corpus of spreadsheet formulae. We design two formula pretraining tasks to explicitly guide FORTAP to learn numerical reference and calculation in semi-structured tables. FORTAP achieves state-of-the-art results on two representative downstream tasks, cell type classification and formula prediction, showing the great potential of numerical-reasoning-aware pretraining.

preprint, 2020, arXiv

An Adaptive MMC Synchronous Stability Control Method Based on Local PMU measurements

Reducing the current is a common method to ensure the synchronous stability of a modular multilevel converter (MMC) when there is a short-circuit fault at its AC side. However, the uncertainty of the fault location of the AC system leads to a significant difference in the maximum allowable stable operating current during the fault. This paper proposes an adaptive MMC fault-current control method using local phasor measurement unit (PMU) measurements. Based on the estimated Thevenin equivalent (TE) parameters of the system, the current can be directly calculated to ensure the maximum output power of the MMC during the fault. This control method does not rely on off-line simulation and adapts itself to various fault conditions. The effective measurements are first selected by the voltage threshold and parameter constraints, which allow us to handle the error due to changes on the system side. The proposed TE estimation method can quickly track the change of the system impedance without depending on the initial value and can deal with the TE potential changes after a large disturbance. The simulation shows that the TE estimation can accurately track the TE parameters after the fault, and the current control instruction during an MMC fault can ensure the maximum output power of the MMC.
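As a rough illustration of estimating Thevenin equivalent parameters from local phasor measurements, the sketch below fits a constant source E and impedance Z by complex least squares. It is a generic textbook-style fit, not the paper's estimator: the sign convention V = E + Z*I (current injected into the system), the assumption that E and Z stay constant over the measurement window, and the name `estimate_thevenin` are all assumptions, and the paper's measurement selection by voltage threshold and parameter constraints is not reproduced.

```python
import numpy as np

def estimate_thevenin(voltages, currents):
    """Least-squares fit of V_k = E + Z * I_k over several phasor measurements.

    Assumes E and Z are constant across the measurements and that at least
    two distinct operating points are available (hypothetical helper).
    """
    v = np.asarray(voltages, dtype=complex)
    i = np.asarray(currents, dtype=complex)
    design = np.column_stack([np.ones_like(i), i])  # unknowns: [E, Z]
    solution, *_ = np.linalg.lstsq(design, v, rcond=None)
    e_th, z_th = solution
    return e_th, z_th

# Synthetic check with made-up per-unit values.
true_e, true_z = 1.0 + 0.0j, 0.05 + 0.30j
currents = np.array([0.2 + 0.0j, 0.5 - 0.1j, 0.8 + 0.2j])
voltages = true_e + true_z * currents

e_hat, z_hat = estimate_thevenin(voltages, currents)
print(e_hat, z_hat)
```

With noisy measurements, more operating points and some form of outlier rejection would be needed, which is the role the abstract assigns to its selection step.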

preprint, 2016, arXiv

A Numerical Study on the Wiretap Network with a Simple Network Topology

In this paper, we study a security problem on a simple wiretap network, consisting of a source node S, a destination node D, and an intermediate node R. The intermediate node connects the source and the destination nodes via a set of noiseless parallel channels, with sizes $n_1$ and $n_2$, respectively. A message $M$ is to be sent from S to D. The information in the network may be eavesdropped by a set of wiretappers. The wiretappers cannot communicate with one another. Each wiretapper can access a subset of channels, called a wiretap set. All the chosen wiretap sets form a wiretap pattern. A random key $K$ is generated at S and a coding scheme on $(M, K)$ is employed to protect $M$. We define two decoding classes at D: In Class-I, only $M$ is required to be recovered and in Class-II, both $M$ and $K$ are required to be recovered. The objective is to minimize $H(K)/H(M)$ for a given wiretap pattern under the perfect secrecy constraint. The first question we address is whether routing is optimal on this simple network. By enumerating all the wiretap patterns on the Class-I/II $(3,3)$ networks and harnessing the power of Shannon-type inequalities, we find that gaps exist between the bounds implied by routing and the bounds implied by Shannon-type inequalities for a small fraction ($<2\%$) of all the wiretap patterns. The second question we investigate is the following: What is $\min H(K)/H(M)$ for the remaining wiretap patterns where gaps exist? We study some simple wiretap patterns and find that their Shannon bounds (i.e., the lower bound induced by Shannon-type inequalities) can be achieved by linear codes, which means routing is not sufficient even for the $(3,3)$ network. For some complicated wiretap patterns, we study the structures of linear coding schemes under the assumption that they can achieve the corresponding Shannon bounds....

preprint, 2015, arXiv

Higher Order Derivatives in Costa's Entropy Power Inequality

Let $X$ be an arbitrary continuous random variable and $Z$ be an independent Gaussian random variable with zero mean and unit variance. For $t > 0$, Costa proved that $e^{2h(X+\sqrt{t}Z)}$ is concave in $t$, where the proof hinged on the first and second order derivatives of $h(X+\sqrt{t}Z)$. Specifically, these two derivatives are signed, i.e., $\frac{\partial}{\partial t}h(X+\sqrt{t}Z) \geq 0$ and $\frac{\partial^2}{\partial t^2}h(X+\sqrt{t}Z) \leq 0$. In this paper, we show that the third order derivative of $h(X+\sqrt{t}Z)$ is nonnegative, which implies that the Fisher information $J(X+\sqrt{t}Z)$ is convex in $t$. We further show that the fourth order derivative of $h(X+\sqrt{t}Z)$ is nonpositive. Following the first four derivatives, we make two conjectures on $h(X+\sqrt{t}Z)$: the first is that $\frac{\partial^n}{\partial t^n} h(X+\sqrt{t}Z)$ is nonnegative in $t$ if $n$ is odd, and nonpositive otherwise; the second is that $\log J(X+\sqrt{t}Z)$ is convex in $t$. The first conjecture can be rephrased in the context of completely monotone functions: $J(X+\sqrt{t}Z)$ is completely monotone in $t$. The history of the first conjecture may date back to a problem in mathematical physics studied by McKean in 1966. Apart from these results, we provide a geometrical interpretation to the covariance-preserving transformation and study the concavity of $h(\sqrt{t}X+\sqrt{1-t}Z)$, revealing its connection with Costa's EPI.
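Both conjectures are easy to check by hand in the Gaussian special case, which may help in reading the abstract. The worked example below assumes $X \sim \mathcal{N}(0,\sigma^2)$; it illustrates the sign pattern only and says nothing about general $X$.

```latex
% Worked example, assuming X ~ N(0, sigma^2) and Z ~ N(0, 1) independent.
\begin{align*}
h(X+\sqrt{t}Z) &= \tfrac{1}{2}\log\bigl(2\pi e(\sigma^2+t)\bigr), \\
\frac{\partial^n}{\partial t^n} h(X+\sqrt{t}Z)
  &= \frac{(-1)^{n-1}(n-1)!}{2(\sigma^2+t)^n}, \qquad n \geq 1, \\
J(X+\sqrt{t}Z) &= \frac{1}{\sigma^2+t}, \qquad
\frac{\partial^n}{\partial t^n} J(X+\sqrt{t}Z)
  = \frac{(-1)^{n} n!}{(\sigma^2+t)^{n+1}}, \\
\frac{\partial^2}{\partial t^2} \log J(X+\sqrt{t}Z)
  &= \frac{1}{(\sigma^2+t)^2} > 0.
\end{align*}
```

The derivatives of $h$ are positive for odd $n$ and negative for even $n$, the derivatives of $J$ alternate in sign starting from a positive function, so $J$ is completely monotone, and $\log J$ is convex in $t$.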

preprint, 2014, arXiv

Generalization of Mrs. Gerber's Lemma

Mrs. Gerber's Lemma (MGL) hinges on the convexity of $H(p*H^{-1}(u))$, where $H(u)$ is the binary entropy function. In this work, we prove that $H(p*f(u))$ is convex in $u$ for every $p\in [0,1]$ provided $H(f(u))$ is convex in $u$, where $f(u) : (a, b) \to [0, \frac12]$. Moreover, our result subsumes MGL and simplifies the original proof. We show that the generalized MGL can be applied to the binary broadcast channel to simplify some of the discussion.
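The convexity that MGL rests on can be checked numerically. The sketch below evaluates $H(p*H^{-1}(u))$ on a grid, taking $p*q = p(1-q) + q(1-p)$ as the binary convolution and inverting the binary entropy on $[0, \frac12]$ by bisection. It is a sanity check under those assumptions, not a proof, and the function names are illustrative.

```python
import math

def binary_entropy(x):
    """Binary entropy in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def inverse_binary_entropy(u, iterations=60):
    """Return x in [0, 1/2] with binary_entropy(x) close to u, by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if binary_entropy(mid) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def convolve(p, q):
    """Binary convolution p * q = p(1 - q) + q(1 - p)."""
    return p * (1.0 - q) + q * (1.0 - p)

def mgl_curve(p, u):
    """Evaluate H(p * H^{-1}(u))."""
    return binary_entropy(convolve(p, inverse_binary_entropy(u)))

def min_second_difference(p, points=201):
    """Smallest discrete second difference of u -> mgl_curve(p, u) on [0, 1]."""
    step = 1.0 / (points - 1)
    values = [mgl_curve(p, k * step) for k in range(points)]
    return min(values[k - 1] - 2.0 * values[k] + values[k + 1]
               for k in range(1, points - 1))

for p in (0.05, 0.11, 0.3):
    print(p, min_second_difference(p))
```

A nonnegative minimum second difference (up to rounding) on the grid is consistent with convexity in $u$ for the tested values of $p$.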

preprint, 2014, arXiv

Imperfect Secrecy in Wiretap Channel II

In a point-to-point communication system which consists of a sender, a receiver and a set of noiseless channels, the sender wishes to transmit a private message to the receiver through the channels which may be eavesdropped by a wiretapper. The set of wiretap sets is arbitrary. The wiretapper can access any one but not more than one wiretap set. From each wiretap set, the wiretapper can obtain some partial information about the private message which is measured by the equivocation of the message given the symbols obtained by the wiretapper. The security strategy is to encode the message with some random key at the sender. Only the message is required to be recovered at the receiver. Under this setting, we define an achievable rate tuple consisting of the size of the message, the size of the key, and the equivocation for each wiretap set. We first prove a tight rate region when both the message and the key are required to be recovered at the receiver. Then we extend the result to the general case when only the message is required to be recovered at the receiver. Moreover, we show that even if stochastic encoding is employed at the sender, the message rate cannot be increased.

preprint, 2014, arXiv

Performance Bounds on a Wiretap Network with Arbitrary Wiretap Sets

Consider a communication network represented by a directed graph $\mathcal{G}=(\mathcal{V},\mathcal{E})$, where $\mathcal{V}$ is the set of nodes and $\mathcal{E}$ is the set of point-to-point channels in the network. On the network a secure message $M$ is transmitted, and there may exist wiretappers who want to obtain information about the message. In secure network coding, we aim to find a network code which can protect the message against the wiretapper whose power is constrained. Cai and Yeung studied the model in which the wiretapper can access any one but not more than one set of channels, called a wiretap set, out of a collection $\mathcal{A}$ of all possible wiretap sets. In order to protect the message, the message needs to be mixed with a random key $K$. They proved tight fundamental performance bounds when $\mathcal{A}$ consists of all subsets of $\mathcal{E}$ of a fixed size $r$. However, beyond this special case, obtaining such bounds is much more difficult. In this paper, we investigate the problem when $\mathcal{A}$ consists of arbitrary subsets of $\mathcal{E}$ and obtain the following results: 1) an upper bound on $H(M)$; 2) a lower bound on $H(K)$ in terms of $H(M)$. The upper bound on $H(M)$ is explicit, while the lower bound on $H(K)$ can be computed in polynomial time when $|\mathcal{A}|$ is fixed. The tightness of the lower bound for the point-to-point communication system is also proved.
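The role of the key $K$ can be seen in the smallest possible instance. The sketch below is a toy example, not taken from the paper: two parallel one-bit channels, a uniform message bit $M$ and key bit $K$, with $K$ sent on one channel and $M \oplus K$ on the other, and a wiretap collection consisting of the two single channels. It enumerates all outcomes and computes how many bits about $M$ each wiretap set reveals. All names are hypothetical.

```python
import itertools
import math
from collections import Counter

def mutual_information(pairs):
    """I(A;B) in bits, for a list of equally likely (a, b) outcomes."""
    n = len(pairs)
    joint = Counter(pairs)
    marg_a = Counter(a for a, _ in pairs)
    marg_b = Counter(b for _, b in pairs)
    return sum(
        (count / n) * math.log2((count / n) / ((marg_a[a] / n) * (marg_b[b] / n)))
        for (a, b), count in joint.items()
    )

def leakage(wiretap_set):
    """Bits of information about M revealed by observing the given channels."""
    outcomes = []
    for m, k in itertools.product((0, 1), repeat=2):
        symbols = (k, m ^ k)  # channel 0 carries K, channel 1 carries M xor K
        observed = tuple(symbols[c] for c in wiretap_set)
        outcomes.append((m, observed))
    return mutual_information(outcomes)

print(leakage((0,)))    # wiretapper sees only K
print(leakage((1,)))    # wiretapper sees only M xor K
print(leakage((0, 1)))  # both channels, which is outside the wiretap collection
```

Either single channel reveals nothing about $M$, while the receiver, who sees both channels, recovers $M$ exactly; in this toy scheme $H(K)/H(M) = 1$.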
