Source author record

Mingming Zhang

Mingming Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Information Theory; math.IT; Machine Learning; math.NA; Numerical Analysis; Artificial Intelligence; Computation and Language; Computational Engineering, Finance, and Science; Computer Vision; Information Retrieval; math.CO; physics.optics

Catalog footprint

What is connected

11 works

12 topics

4 close collaborators

preprint, 2026, arXiv

A Construction Framework of Coded Caching Scheme for Multi-Access MISO Systems via Knapsack Problem

This paper investigates the coded caching problem in a multi-access multiple-input single-output (MAMISO) network with the combinatorial topology. The considered system consists of a server containing $N$ files, $\Lambda$ cache nodes, and $K$ cache-less users, where each user can access a unique subset of $r$ cache nodes. The server is equipped with $L$ transmit antennas. Our objective is to design a caching scheme that simultaneously achieves a high sum Degree of Freedom (sum-DoF) and low subpacketization complexity. To address this challenge, we formulate the design of multi-antenna placement delivery arrays (MAPDA) as a $0$--$1$ knapsack problem to maximize the achievable DoF, thereby transforming the complex combinatorial caching structure into a tractable optimization framework that yields efficient cache placement and flexible delivery strategies. Theoretical and numerical analyses demonstrate that, for networks with combinatorial topologies, the proposed scheme achieves a higher sum-DoF than existing schemes. Under identical cache size constraints, the subpacketization level remains comparable to that of existing linear subpacketization schemes. Moreover, under specific system conditions, the proposed scheme attains the theoretical maximum sum-DoF of $\min\{L+KM/N, K\}$ while achieving further reductions in subpacketization. For particular combinatorial structures, we further derive optimized constructions that achieve even higher sum-DoF with lower subpacketization.
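As a rough illustration of the two ingredients named in this abstract, the sketch below solves a generic $0$--$1$ knapsack instance by dynamic programming and evaluates the bound $\min\{L+KM/N, K\}$. It is not the paper's MAPDA construction: the item values, weights and capacity are hypothetical placeholders, the function names `knapsack_01` and `max_sum_dof` are invented for this example, and $M$ is read here as the per-node cache size in files, which the abstract itself does not define.

```python
from typing import List, Tuple


def knapsack_01(values: List[int], weights: List[int], capacity: int) -> Tuple[int, List[int]]:
    """Solve a generic 0-1 knapsack (non-negative integer weights) by dynamic programming.

    Returns the best total value and the indices of the chosen items.
    The items are illustrative only; how the paper maps MAPDA design
    onto values and weights is not stated in the abstract.
    """
    n = len(values)
    # best[i][c] = maximum value using only the first i items within capacity c
    best = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]  # skip item i-1
            if w <= c:  # or take it, if it fits
                best[i][c] = max(best[i][c], best[i - 1][c - w] + v)

    # Walk back through the table to recover which items were taken.
    chosen: List[int] = []
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            chosen.append(i - 1)
            c -= weights[i - 1]
    chosen.reverse()
    return best[n][capacity], chosen


def max_sum_dof(L: int, K: int, M: float, N: float) -> float:
    """Evaluate min{L + K*M/N, K}, the maximum sum-DoF quoted in the abstract."""
    return min(L + K * M / N, K)


if __name__ == "__main__":
    print(knapsack_01([60, 100, 120], [10, 20, 30], 50))
    print(max_sum_dof(L=2, K=10, M=1, N=5))
```

The dynamic program runs in time proportional to the number of items times the capacity, which is why a knapsack formulation is usually regarded as tractable when the capacity is moderate.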

preprint, 2026, arXiv

First Thin-Film Lithium Tantalate Polarization Controller Enabling Reset-Free Mrad/s Tracking for Optical Interconnects

The rapid escalation of computing power driven by large-scale artificial intelligence is placing unprecedented demands on the bandwidth, latency, and energy efficiency of data-center interconnects (DCIs). Self-homodyne coherent (SHC) transmission is a promising architecture because it preserves the spectral efficiency of coherent detection while greatly simplifying digital signal processing, but its practical deployment is critically limited by random and often ultrafast state-of-polarization (SOP) fluctuations that induce carrier fading and destabilize coherent reception. Here we report the first integrated polarization controller based on thin-film lithium tantalate (TFLT), enabling reset-free polarization tracking at Mrad/s speeds. The four-stage electro-optic device exhibits polarization-dependent loss (PDL) below 0.3 dB, a half-wave voltage below 2.5 V, high modulation bandwidth, and negligible DC drift. To accommodate the finite tuning range of integrated phase shifters, we develop a finite-boundary gradient-descent (FBGD) control algorithm that ensures reset-free SOP evolution with no phase jump. The implemented adaptive polarization controller (APC) is validated through both standalone polarization-tracking measurements and a dual-polarization 16-QAM SHC 400-Gbps transmission system. Transient polarization disturbances can be tracked at speeds up to 2 Mrad/s, while stable reset-free operation under continuous polarization disturbances is maintained up to 1 Mrad/s. This reset-free performance more than doubles the state of the art, while the pre-FEC bit-error rates remain below the HD-FEC threshold under realistic DCI conditions and lightning-scale polarization disturbances. These results establish TFLT as a new platform for ultrafast, low-power, reset-free, and drift-free polarization control in coherent optical interconnects and beyond.

preprint, 2026, arXiv

Generative Auto-Bidding with Unified Modeling and Exploration

Automated bidding is central to modern digital advertising. Early rule-based methods lacked adaptability, while subsequent Reinforcement Learning approaches modeled bidding as a Markov Decision Process but struggled with long-term dependencies. Recent generative models show promise, yet they lack explicit mechanisms to balance exploration and safety, relying solely on action perturbations or trajectory guidance without a safety fallback. This results in inefficient exploration and elevated financial risk for advertising platforms. To address this gap, we propose GUIDE (Generative Auto-Bidding with Unified Modeling and Exploration), a framework that synergistically integrates directed exploration with a safe fallback mechanism. GUIDE employs a Decision Transformer (DT) to jointly model historical bidding actions and environmental state transitions. A Q-value module guides the DT's exploration via regularization constraints, while an Inverse Dynamics Module (IDM) leverages DT-predicted future states to infer robust, behaviorally consistent actions as a safe policy fallback. The Q-value module then adaptively selects the final action between these two options, balancing exploration and safety. Together, these components form an integrated "explore-safeguard-select" pipeline that unifies efficiency and safety. We conduct extensive experiments on public datasets, in simulated auction environments, and through large-scale online deployment on Taobao, a leading Chinese advertising platform. Results show GUIDE consistently outperforms state-of-the-art baselines across all scenarios. In real-world deployment, GUIDE achieves notable gains: +4.10% ad GMV, +1.40% ad clicks, +1.66% ad cost, and +3.52% ad ROI, demonstrating its effectiveness and strong industrial applicability.

preprint, 2026, arXiv

TabEmbed: Benchmarking and Learning Generalist Embeddings for Tabular Understanding

Foundation models have established unified representations for natural language processing, yet this paradigm remains largely unexplored for tabular data. Existing methods face fundamental limitations: LLM-based approaches lack retrieval-compatible vector outputs, whereas text embedding models often fail to capture tabular structure and numerical semantics. To bridge this gap, we first introduce the Tabular Embedding Benchmark (TabBench), a comprehensive suite designed to evaluate the tabular understanding capability of embedding models. We then propose TabEmbed, the first generalist embedding model that unifies tabular classification and retrieval within a shared embedding space. By reformulating diverse tabular tasks as semantic matching problems, TabEmbed leverages large-scale contrastive learning with positive-aware hard negative mining to discern fine-grained structural and numerical nuances. Experimental results on TabBench demonstrate that TabEmbed significantly outperforms state-of-the-art text embedding models, establishing a new baseline for universal tabular representation learning. Code and datasets are publicly available at https://github.com/qiangminjie27/TabEmbed and https://huggingface.co/datasets/qiangminjie27/TabBench.

preprint, 2022, arXiv

An Adaptive Finite Element DtN Method for Maxwell's Equations

This paper is concerned with a numerical solution to the scattering of a time-harmonic electromagnetic wave by a bounded and impenetrable obstacle in three dimensions. The electromagnetic wave propagation is modeled by a boundary value problem of Maxwell's equations in the exterior domain of the obstacle. Based on the Dirichlet-to-Neumann (DtN) operator, which is defined by an infinite series, an exact transparent boundary condition is introduced and the scattering problem is reduced equivalently into a bounded domain. An a posteriori error estimate based adaptive finite element DtN method is developed to solve the discrete variational problem, where the DtN operator is truncated into a sum of finitely many terms. The a posteriori error estimate takes into account both the finite element approximation error and the truncation error of the DtN operator. The latter is shown to decay exponentially with respect to the truncation parameter. Numerical experiments are presented to illustrate the effectiveness of the proposed method.

preprint, 2022, arXiv

Coded Caching for Two-Dimensional Multi-Access Networks

This paper studies a novel multi-access coded caching (MACC) model in the two-dimensional (2D) topology, which is a generalization of the one-dimensional (1D) MACC model proposed by Hachem et al. The 2D MACC model is formed by a server containing $N$ files, $K_1\times K_2$ cache-nodes with $M$ files located at a grid with $K_1$ rows and $K_2$ columns, and $K_1\times K_2$ cache-less users where each user is connected to $L^2$ nearby cache-nodes. The server is connected to the users through an error-free shared link, while the users can retrieve the cached content of the connected cache-nodes without cost. Our objective is to minimize the worst-case transmission load over all possible users' demands. In this paper, we first propose a grouping scheme for the case where $K_1$ and $K_2$ are divisible by $L$. By partitioning the cache-nodes and users into $L^2$ groups such that no two users in the same group share any cache-node, we use the shared-link coded caching scheme proposed by Maddah-Ali and Niesen for each group. Then for any model parameters satisfying $\min\{K_1,K_2\}>L$, we propose a transformation approach which constructs a 2D MACC scheme from two classes of 1D MACC schemes in vertical and horizontal projections, respectively. As a result, we can construct 2D MACC schemes that achieve maximum local caching gain and improved coded caching gain, compared to the baseline scheme by a direct extension from 1D MACC schemes.
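The grouping idea in this abstract can be checked on a small grid. The sketch below is only an illustration under stated assumptions, not the paper's construction: it assumes each user at grid position $(i, j)$ reaches the $L \times L$ window of cache-nodes starting at its own position with cyclic wrap-around, and it groups users by the residues of their coordinates modulo $L$. Both the access rule and the grouping rule are hypothetical choices made for this example, as are the function names.

```python
from itertools import product
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]


def accessed_nodes(user: Node, K1: int, K2: int, L: int) -> Set[Node]:
    """Cache-nodes reached by a user: an L x L window with cyclic wrap-around (assumed)."""
    i, j = user
    return {((i + a) % K1, (j + b) % K2) for a, b in product(range(L), repeat=2)}


def group_users(K1: int, K2: int, L: int) -> Dict[Node, List[Node]]:
    """Partition the K1 x K2 users into L^2 groups by (row mod L, column mod L)."""
    if K1 % L or K2 % L:
        raise ValueError("this sketch needs K1 and K2 to be divisible by L")
    groups: Dict[Node, List[Node]] = {}
    for i, j in product(range(K1), range(K2)):
        groups.setdefault((i % L, j % L), []).append((i, j))
    return groups


def groups_are_disjoint(groups: Dict[Node, List[Node]], K1: int, K2: int, L: int) -> bool:
    """Check that no two users in the same group share a cache-node."""
    for members in groups.values():
        seen: Set[Node] = set()
        for user in members:
            nodes = accessed_nodes(user, K1, K2, L)
            if seen & nodes:
                return False
            seen |= nodes
    return True


if __name__ == "__main__":
    K1, K2, L = 4, 6, 2
    groups = group_users(K1, K2, L)
    print(len(groups), groups_are_disjoint(groups, K1, K2, L))
```

Under these assumptions, users in the same group sit a multiple of $L$ apart in each coordinate, so their windows cannot overlap. Each group can then be treated as a separate shared-link system, which is where the abstract applies the Maddah-Ali and Niesen scheme.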

preprint, 2020, arXiv

An adaptive finite element DtN method for the three-dimensional acoustic scattering problem

This paper is concerned with a numerical solution of the acoustic scattering by a bounded impenetrable obstacle in three dimensions. The obstacle scattering problem is formulated as a boundary value problem in a bounded domain by using a Dirichlet-to-Neumann (DtN) operator. An a posteriori error estimate is derived for the finite element method with the truncated DtN operator. The a posteriori error estimate consists of the finite element approximation error and the truncation error of the DtN operator, where the latter is shown to decay exponentially with respect to the truncation parameter. Based on the a posteriori error estimate, an adaptive finite element method is developed for the obstacle scattering problem. The truncation parameter is determined by the truncation error of the DtN operator and the mesh elements for local refinement are marked through the finite element approximation error. Numerical experiments are presented to demonstrate the effectiveness of the proposed method.

preprint, 2020, arXiv

Lifelong Property Price Prediction: A Case Study for the Toronto Real Estate Market

We present Luce, the first lifelong predictive model for automated property valuation. Luce addresses two critical issues of property valuation: the lack of recent sold prices and the sparsity of house data. It is designed to operate on a limited volume of recent house transaction data. As a departure from prior work, Luce organizes the house data in a heterogeneous information network (HIN) where graph nodes are house entities and attributes that are important for house price valuation. We employ a Graph Convolutional Network (GCN) to extract the spatial information from the HIN for house-related data like geographical locations, and then use a Long Short Term Memory (LSTM) network to model the temporal dependencies for house transaction data over time. Unlike prior work, Luce can make effective use of the limited house transaction data from the past few months to update valuation information for all house entities within the HIN. By providing a complete and up-to-date house valuation dataset, Luce thus massively simplifies the downstream valuation task for the target properties. We demonstrate the benefit of Luce by applying it to large, real-life datasets obtained from the Toronto real estate market. Extensive experimental results show that Luce not only significantly outperforms prior property valuation methods but also often reaches and sometimes exceeds the valuation accuracy given by independent experts when using the actual realization price as the ground truth.

preprint, 2020, arXiv

New coded caching schemes from placement delivery arrays

Coded caching schemes with low subpacketization and small transmission rate are desirable in practice due to the requirement of low implementation complexity and efficient transmission. Placement delivery arrays (PDAs) can be used to generate coded caching schemes. However, many known coded caching schemes have large memory ratios. In this paper, we observe that some schemes with low subpacketization generated by PDAs do not fully use the users' cached content to create multicasting opportunities, and we propose a way to overcome this drawback. As an application, we obtain two new schemes with low subpacketization, which have significant advantages in memory ratio and transmission rate compared with the original scheme.

preprint, 2019, arXiv

A lossless data hiding scheme in JPEG images with segment coding

In this paper, we propose a lossless data hiding scheme in JPEG images. After quantization, the DCT coefficients are sparsely distributed at high frequencies and have small absolute values. To improve encoding efficiency, we put forward an encoding algorithm that searches for a high frequency as a termination point and recodes the coefficients above it, so that spare space is reserved to embed secret data and appended data with no file expansion. The receiver can obtain the termination point through data analysis, extract the additional data, and losslessly recover the original JPEG image. Experimental results show that the proposed method has a larger capacity than state-of-the-art works.

preprint, 2012, arXiv

Explosion prediction of oil gas using SVM and Logistic Regression

The prevention of dangerous chemical accidents is a primary problem of industrial manufacturing. Among such accidents, oil gas explosions play an important role. The essential task of explosion prevention is to better estimate the explosion limit of a given oil gas. In this paper, Support Vector Machines (SVM) and Logistic Regression (LR) are used to predict the explosion of oil gas. LR yields an explicit probability formula for explosion, as well as the explosive range of oil gas concentrations as a function of the oxygen concentration. Meanwhile, SVM gives higher prediction accuracy. Furthermore, considering practical requirements, the effect of the penalty parameter on the distribution of the two types of errors is discussed.
