Source author record

Tianyu Zhang

Tianyu Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Machine Learning Artificial Intelligence Computer Vision math.ST Methodology Statistics Theory Human-Computer Interaction Multiagent Systems physics.class-ph Computation and Language Cryptography and Security cs.CY Distributed, Parallel, and Cluster Computing eess.SP eess.SY math.OC physics.data-an Robotics Systems and Control

Catalog footprint

What is connected

17works

19topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Diagnostic Performance of Universal-Learning Ultrasound AI Across Multiple Organs and Tasks: the UUSIC25 Challenge

IMPORTANCE: Modern ultrasound systems are universal diagnostic tools capable of imaging the entire body. However, current AI solutions remain fragmented into single-task tools. This critical gap between hardware versatility and software specificity limits workflow integration and clinical utility. OBJECTIVE: To evaluate the diagnostic accuracy, versatility, and efficiency of single general-purpose deep learning models for multi-organ classification and segmentation. DESIGN: The Universal UltraSound Image Challenge 2025 (UUSIC25) involved developing algorithms on 11,644 images aggregated from 12 sources (9 public, 3 private). Evaluation used an independent, multi-center private test set of 2,479 images, including data from a center completely unseen during training to assess generalization. OUTCOMES: Diagnostic performance (Dice Similarity Coefficient [DSC]; Area Under the Receiver Operating Characteristic Curve [AUC]) and computational efficiency (inference time, GPU memory). RESULTS: Of 15 valid algorithms, the top model (SMART) achieved a macro-averaged DSC of 0.854 across 5 segmentation tasks and AUC of 0.766 for binary classification. Models demonstrated high capability in anatomical segmentation (e.g., fetal head DSC: 0.942) but variability in complex diagnostic tasks subject to domain shift. Specifically, in breast cancer molecular subtyping, the top model's performance dropped from an AUC of 0.571 (internal) to 0.508 (unseen external center), highlighting the challenge of generalization. CONCLUSIONS: General-purpose AI models can achieve high accuracy and efficiency across multiple tasks using a single architecture. However, significant performance degradation on unseen data suggests domain generalization is critical for future clinical deployment.

preprint2026arXiv

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Large Language Models (LLMs) apply uniform computation to all tokens, despite language exhibiting highly non-uniform information density. This token-uniform regime wastes capacity on locally predictable spans while under-allocating computation to semantically critical transitions. We propose $\textbf{Dynamic Large Concept Models (DLCM)}$, a hierarchical language modeling framework that learns semantic boundaries from latent representations and shifts computation from tokens to a compressed concept space where reasoning is more efficient. DLCM discovers variable-length concepts end-to-end without relying on predefined linguistic units. Hierarchical compression fundamentally changes scaling behavior. We introduce the first $\textbf{compression-aware scaling law}$, which disentangles token-level capacity, concept-level reasoning capacity, and compression ratio, enabling principled compute allocation under fixed FLOPs. To stably train this heterogeneous architecture, we further develop a $\textbf{decoupled $μ$P parametrization}$ that supports zero-shot hyperparameter transfer across widths and compression regimes. At a practical setting ($R=4$, corresponding to an average of four tokens per concept), DLCM reallocates roughly one-third of inference compute into a higher-capacity reasoning backbone, achieving a $\textbf{+2.69$\%$ average improvement}$ across 12 zero-shot benchmarks under matched inference FLOPs.

preprint2026arXiv

Pareto-Guided Optimal Transport for Multi-Reward Alignment

Text-to-image generation models have achieved remarkable progress in preference optimization, yet achieving robust alignment across diverse reward models remains a significant challenge. Existing multi-reward fusion approaches rely on weighted summation, which is costly to tune and insufficient for balancing conflicting objectives. More critically, optimization with reward models is highly susceptible to reward hacking, where reward scores increase while the perceived quality of generated images deteriorates. We demonstrate that optimizing against a unified global target under heterogeneous reward upper bounds can induce reward hacking, a risk further exacerbated by the inherent instability of weak reward models. To mitigate this, we propose a Pareto Frontier-Guided Optimal Transport (PG-OT) framework. Our method constructs a prompt-specific Pareto frontier and maps dominated samples toward it via distribution-aware optimal transport. Furthermore, we develop both online and offline optimization strategies tailored to diverse reward signal characteristics. To provide a more rigorous assessment, we introduce the Joint Domination Rate (JDR) and Joint Collapse Rate (JCR) as principled metrics to quantify multi-reward synergy and reward hacking. Experimental results show that our approach outperforms strong baselines with an 11% gain in JDR and achieves a near 80% win rate in human evaluations.

preprint2026arXiv

When 2D Tasks Meet 1D Serialization: On Serialization Friction in Structured Tasks

Large language models (LLMs) conventionally process structured inputs as 1D token sequences. While natural for prose, such linearization may introduce additional representational burden for tasks whose computation depends directly on explicit 2D structure, because row--column alignment and local neighborhoods are no longer directly expressed in the input. We study this setting, which we refer to as serialization friction, on a small diagnostic testbed of synthetic tasks with explicit 2D structure: matrix transpose, Conway's Game of Life, and LU decomposition. To examine this question, we compare a text-only language pathway over serialized inputs with a vision-augmented pathway, built on the same language backbone, that receives the same underlying content rendered in task-faithful 2D layout, yielding a system-level comparison between two end-to-end input pathways. Across the tasks and settings we study, the visual pathway consistently outperforms the textual pathway; the gap often widens at larger dimensions, and error patterns under serialization become increasingly spatially structured. These findings indicate that the relationship between input representation and model performance on such tasks warrants further investigation, and suggest that preserving task-relevant 2D layout is a promising direction for structured 2D tasks.

preprint2026arXiv

Winners with Confidence: Discrete Argmin Inference with an Application to Model Selection

We study the problem of finding the index of the minimum value of a vector from noisy observations. This problem is relevant in population/policy comparison, discrete maximum likelihood, and model selection. We develop an asymptotically normal test statistic, even in high-dimensional settings and with potentially many ties in the population mean vector, by integrating concepts and tools from cross-validation and differential privacy. The key technical ingredient is a central limit theorem for globally dependent data. We also propose practical ways to select the tuning parameter that adapts to the signal landscape. Numerical experiments and data examples demonstrate the ability of the proposed method to achieve a favorable bias-variance trade-off in practical scenarios.

preprint2023arXiv

Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads

To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.

preprint2022arXiv

A Sieve Stochastic Gradient Descent Estimator for Online Nonparametric Regression in Sobolev ellipsoids

The goal of regression is to recover an unknown underlying function that best links a set of predictors to an outcome from noisy observations. In nonparametric regression, one assumes that the regression function belongs to a pre-specified infinite-dimensional function space (the hypothesis space). In the online setting, when the observations come in a stream, it is computationally-preferable to iteratively update an estimate rather than refitting an entire model repeatedly. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) when the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD has rate-optimal mean squared error (MSE) under a set of simple and direct conditions. The proposed estimator can be constructed with a low computational (time and space) expense: We also formally show that Sieve-SGD requires almost minimal memory usage among all statistically rate-optimal estimators.

preprint2022arXiv

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Comprehensive global cooperation is essential to limit global temperature increases while continuing economic development, e.g., reducing severe inequality or achieving long-term economic growth. Achieving long-term cooperation on climate change mitigation with n strategic agents poses a complex game-theoretic problem. For example, agents may negotiate and reach climate agreements, but there is no central authority to enforce adherence to those agreements. Hence, it is critical to design negotiation and agreement frameworks that foster cooperation, allow all agents to meet their individual policy objectives, and incentivize long-term adherence. This is an interdisciplinary challenge that calls for collaboration between researchers in machine learning, economics, climate science, law, policy, ethics, and other fields. In particular, we argue that machine learning is a critical tool to address the complexity of this domain. To facilitate this research, here we introduce RICE-N, a multi-region integrated assessment model that simulates the global climate and economy, and which can be used to design and evaluate the strategic outcomes for different negotiation and agreement frameworks. We also describe how to use multi-agent reinforcement learning to train rational agents using RICE-N. This framework underpinsAI for Global Climate Cooperation, a working group collaboration and competition on climate negotiation and agreement design. Here, we invite the scientific community to design and evaluate their solutions using RICE-N, machine learning, economic intuition, and other domain knowledge. More information can be found on www.ai4climatecoop.org.

preprint2022arXiv

Deep Reinforcement Learning for Exact Combinatorial Optimization: Learning to Branch

Branch-and-bound is a systematic enumerative method for combinatorial optimization, where the performance highly relies on the variable selection strategy. State-of-the-art handcrafted heuristic strategies suffer from relatively slow inference time for each selection, while the current machine learning methods require a significant amount of labeled data. We propose a new approach for solving the data labeling and inference latency issues in combinatorial optimization based on the use of the reinforcement learning (RL) paradigm. We use imitation learning to bootstrap an RL agent and then use Proximal Policy Optimization (PPO) to further explore global optimal actions. Then, a value network is used to run Monte-Carlo tree search (MCTS) to enhance the policy network. We evaluate the performance of our method on four different categories of combinatorial optimization problems and show that our approach performs strongly compared to the state-of-the-art machine learning and heuristics based methods.

preprint2022arXiv

PromotionLens: Inspecting Promotion Strategies of Online E-commerce via Visual Analytics

Promotions are commonly used by e-commerce merchants to boost sales. The efficacy of different promotion strategies can help sellers adapt their offering to customer demand in order to survive and thrive. Current approaches to designing promotion strategies are either based on econometrics, which may not scale to large amounts of sales data, or are spontaneous and provide little explanation of sales volume. Moreover, accurately measuring the effects of promotion designs and making bootstrappable adjustments accordingly remains a challenge due to the incompleteness and complexity of the information describing promotion strategies and their market environments. We present PromotionLens, a visual analytics system for exploring, comparing, and modeling the impact of various promotion strategies. Our approach combines representative multivariant time-series forecasting models and well-designed visualizations to demonstrate and explain the impact of sales and promotional factors, and to support "what-if" analysis of promotions. Two case studies, expert feedback, and a qualitative user study demonstrate the efficacy of PromotionLens.

preprint2022arXiv

Regression in Tensor Product Spaces by the Method of Sieves

Estimation of a conditional mean (linking a set of features to an outcome of interest) is a fundamental statistical task. While there is an appeal to flexible nonparametric procedures, effective estimation in many classical nonparametric function spaces (e.g., multivariate Sobolev spaces) can be prohibitively difficult -- both statistically and computationally -- especially when the number of features is large. In this paper, we present (penalized) sieve estimators for regression in nonparametric tensor product spaces: These spaces are more amenable to multivariate regression, and allow us to, in-part, avoid the curse of dimensionality. Our estimators can be easily applied to multivariate nonparametric problems and have appealing statistical and computational properties. Moreover, they can effectively leverage additional structures such as feature sparsity. In this manuscript, we give theoretical guarantees, indicating that the predictive performance of our estimators scale favorably in dimension. In addition, we also present numerical examples to compare the finite-sample performance of the proposed estimators with several popular machine learning methods.

preprint2021arXiv

A robust solution of a statistical inverse problem in multiscale computational mechanics using an artificial neural network

This work addresses the inverse identification of apparent elastic properties of random heterogeneous materials using machine learning based on artificial neural networks. The proposed neural network-based identification method requires the construction of a database from which an artificial neural network can be trained to learn the nonlinear relationship between the hyperparameters of a prior stochastic model of the random compliance field and some relevant quantities of interest of an ad hoc multiscale computational model. An initial database made up with input and target data is first generated from the computational model, from which a processed database is deduced by conditioning the input data with respect to the target data using the nonparametric statistics. Two-and three-layer feedforward artificial neural networks are then trained from each of the initial and processed databases to construct an algebraic representation of the nonlinear mapping between the hyperparameters (network outputs) and the quantities of interest (network inputs). The performances of the trained artificial neural networks are analyzed in terms of mean squared error, linear regression fit and probability distribution between network outputs and targets for both databases. An ad hoc probabilistic model of the input random vector is finally proposed in order to take into account uncertainties on the network input and to perform a robustness analysis of the network output with respect to the input uncertainties level. The capability of the proposed neural network-based identification method to efficiently solve the underlying statistical inverse problem is illustrated through two numerical examples developed within the framework of 2D plane stress linear elasticity, namely a first validation example on synthetic data obtained through computational simulations and a second application example on real experimental data obtained through a physical experiment monitored by digital image correlation on a real heterogeneous biological material (beef cortical bone).

preprint2021arXiv

ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods

Climate change is a major threat to humanity, and the actions required to prevent its catastrophic consequences include changes in both policy-making and individual behaviour. However, taking action requires understanding the effects of climate change, even though they may seem abstract and distant. Projecting the potential consequences of extreme climate events such as flooding in familiar places can help make the abstract impacts of climate change more concrete and encourage action. As part of a larger initiative to build a website that projects extreme climate events onto user-chosen photos, we present our solution to simulate photo-realistic floods on authentic images. To address this complex task in the absence of suitable training data, we propose ClimateGAN, a model that leverages both simulated and real data for unsupervised domain adaptation and conditional image generation. In this paper, we describe the details of our framework, thoroughly evaluate components of our architecture and demonstrate that our model is capable of robustly generating photo-realistic flooding.

preprint2020arXiv

Interactive Rainbow Score: A Visual-centered Multimodal Flute Tutoring System

Learning to play an instrument is intrinsically multimodal, and we have seen a trend of applying visual and haptic feedback in music games and computer-aided music tutoring systems. However, most current systems are still designed to master individual pieces of music; it is unclear how well the learned skills can be generalized to new pieces. We aim to explore this question. In this study, we contribute Interactive Rainbow Score, an interactive visual system to boost the learning of sight-playing, the general musical skill to read music and map the visual representations to performance motions. The key design of Interactive Rainbow Score is to associate pitches (and the corresponding motions) with colored notation and further strengthen such association via real-time interactions. Quantitative results show that the interactive feature on average increases the learning efficiency by 31.1%. Further analysis indicates that it is critical to apply the interaction in the early period of learning.

preprint2020arXiv

Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks

This paper investigates capabilities of Privacy-Preserving Deep Learning (PPDL) mechanisms against various forms of privacy attacks. First, we propose to quantitatively measure the trade-off between model accuracy and privacy losses incurred by reconstruction, tracing and membership attacks. Second, we formulate reconstruction attacks as solving a noisy system of linear equations, and prove that attacks are guaranteed to be defeated if condition (2) is unfulfilled. Third, based on theoretical analysis, a novel Secret Polarization Network (SPN) is proposed to thwart privacy attacks, which pose serious challenges to existing PPDL methods. Extensive experiments showed that model accuracies are improved on average by 5-20% compared with baseline mechanisms, in regimes where data privacy are satisfactorily protected.

preprint2020arXiv

Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization

The fundamental difficulty in person re-identification (ReID) lies in learning the correspondence among individual cameras. It strongly demands costly inter-camera annotations, yet the trained models are not guaranteed to transfer well to previously unseen cameras. These problems significantly limit the application of ReID. This paper rethinks the working mechanism of conventional ReID approaches and puts forward a new solution. With an effective operator named Camera-based Batch Normalization (CBN), we force the image data of all cameras to fall onto the same subspace, so that the distribution gap between any camera pair is largely shrunk. This alignment brings two benefits. First, the trained model enjoys better abilities to generalize across scenarios with unseen cameras as well as transfer across multiple training sets. Second, we can rely on intra-camera annotations, which have been undervalued before due to the lack of cross-camera information, to achieve competitive ReID performance. Experiments on a wide range of ReID tasks demonstrate the effectiveness of our approach. The code is available at https://github.com/automan000/Camera-based-Person-ReID.

preprint2020arXiv

Robust Multiscale Identification of Apparent Elastic Properties at Mesoscale for Random Heterogeneous Materials with Multiscale Field Measurements

The aim of this work is to efficiently and robustly solve the statistical inverse problem related to the identification of the elastic properties at both macroscopic and mesoscopic scales of heterogeneous anisotropic materials with a complex microstructure that usually cannot be properly described in terms of their mechanical constituents at microscale. Within the context of linear elasticity theory, the apparent elasticity tensor field at a given mesoscale is modeled by a prior non-Gaussian tensor-valued random field. A general methodology using multiscale displacement field measurements simultaneously made at both macroscale and mesoscale has been recently proposed for the identification the hyperparameters of such a prior stochastic model by solving a multiscale statistical inverse problem using a stochastic computational model and some information from displacement fields at both macroscale and mesoscale. This paper contributes to the improvement of the computational efficiency, accuracy and robustness of such a method by introducing (i) a mesoscopic numerical indicator related to the spatial correlation length(s) of kinematic fields, allowing the time-consuming global optimization algorithm (genetic algorithm) used in a previous work to be replaced with a more efficient algorithm and (ii) an ad hoc stochastic representation of the hyperparameters involved in the prior stochastic model in order to enhance both the robustness and the precision of the statistical inverse identification method. Finally, the proposed improved method is first validated on in silico materials within the framework of 2D plane stress and 3D linear elasticity (using multiscale simulated data obtained through numerical computations) and then exemplified on a real heterogeneous biological material (beef cortical bone) within the framework of 2D plane stress linear elasticity (using multiscale experimental data obtained through mechanical testing monitored by digital image correlation).

Tianyu Zhang

What is connected

Connect this record

See the researcher in context

Building this map preview

17 published item(s)

Diagnostic Performance of Universal-Learning Ultrasound AI Across Multiple Organs and Tasks: the UUSIC25 Challenge

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Pareto-Guided Optimal Transport for Multi-Reward Alignment

When 2D Tasks Meet 1D Serialization: On Serialization Friction in Structured Tasks

Winners with Confidence: Discrete Argmin Inference with an Application to Model Selection

Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads

A Sieve Stochastic Gradient Descent Estimator for Online Nonparametric Regression in Sobolev ellipsoids

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Deep Reinforcement Learning for Exact Combinatorial Optimization: Learning to Branch

PromotionLens: Inspecting Promotion Strategies of Online E-commerce via Visual Analytics

Regression in Tensor Product Spaces by the Method of Sieves

A robust solution of a statistical inverse problem in multiscale computational mechanics using an artificial neural network

ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods

Interactive Rainbow Score: A Visual-centered Multimodal Flute Tutoring System

Rethinking Privacy Preserving Deep Learning: How to Evaluate and Thwart Privacy Attacks

Rethinking the Distribution Gap of Person Re-identification with Camera-based Batch Normalization

Robust Multiscale Identification of Apparent Elastic Properties at Mesoscale for Random Heterogeneous Materials with Multiscale Field Measurements