Source author record

Sihan Wang

Sihan Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision eess.IV Computation and Language cond-mat.mtrl-sci eess.SP Information Theory Machine Learning math.IT physics.app-ph

Catalog footprint

What is connected

7works

9topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

DV-World: Benchmarking Data Visualization Agents in Real-World Scenarios

Real-world data visualization (DV) requires native environmental grounding, cross-platform evolution, and proactive intent alignment. Yet, existing benchmarks often suffer from code-sandbox confinement, single-language creation-only tasks, and assumption of perfect intent. To bridge these gaps, we introduce DV-World, a benchmark of 260 tasks designed to evaluate DV agents across real-world professional lifecycles. DV-World spans three domains: DV-Sheet for native spreadsheet manipulation including chart and dashboard creation as well as diagnostic repair; DV-Evolution for adapting and restructuring reference visual artifacts to fit new data across diverse programming paradigms and DV-Interact for proactive intent alignment with a user simulator that mimics real-world ambiguous requirements. Our hybrid evaluation framework integrates Table-value Alignment for numerical precision and MLLM-as-a-Judge with rubrics for semantic-visual assessment. Experiments reveal that state-of-the-art models achieve less than 50% overall performance, exposing critical deficits in handling the complex challenges of real-world data visualization. DV-World provides a realistic testbed to steer development toward the versatile expertise required in enterprise workflows. Our data and code are available at \href{https://github.com/DA-Open/DV-World}{this project page}.

preprint2026arXiv

FishBack: Pullback Fisher Geometry for Optimal Activation Steering in Transformers

Activation steering methods modify intermediate representations of language models to control output behavior, but universally assume the activation space is Euclidean. We show this assumption fails drastically: the local geometry induced by the model's own output behavior -- the Fisher information metric of the softmax layer, pulled back through the Jacobian of subsequent layers -- deviates from the Euclidean metric by over 97% in relative spectral norm on GPT-2, with an effective dimensionality of only 2--17% of the ambient space. From this pullback Fisher metric, we derive a closed-form steering equation that identifies the minimum-distortion direction for any target concept, yielding a closed-form optimal direction at each point that can be applied iteratively without manifold fitting or data-driven geometry estimation. We call the resulting framework FishBack. The metric admits a layer-wise recursive decomposition, which reveals that existing methods -- CAA, ActAdd, ITI, and others -- each implicitly adopt a particular approximate metric, and that their performance gaps are quantitatively predicted by a single spectral diagnostic: the ratio of their implicit metric's cost to the Fisher-optimal cost. On GPT-2, iterative pullback steering consistently outperforms all Euclidean baselines across three verb-morphology concepts and four layers, with off-target KL reductions of $1.3\times$--$2.5\times$ relative to Euclidean gradient ascent and $1.5\times$ relative to CAA at matched concept probability.

preprint2023arXiv

Unsupervised Cardiac Segmentation Utilizing Synthesized Images from Anatomical Labels

Cardiac segmentation is in great demand for clinical practice. Due to the enormous labor of manual delineation, unsupervised segmentation is desired. The ill-posed optimization problem of this task is inherently challenging, requiring well-designed constraints. In this work, we propose an unsupervised framework for multi-class segmentation with both intensity and shape constraints. Firstly, we extend a conventional non-convex energy function as an intensity constraint and implement it with U-Net. For shape constraint, synthetic images are generated from anatomical labels via image-to-image translation, as shape supervision for the segmentation network. Moreover, augmentation invariance is applied to facilitate the segmentation network to learn the latent features in terms of shape. We evaluated the proposed framework using the public datasets from MICCAI2019 MSCMR Challenge and achieved promising results on cardiac MRIs with Dice scores of 0.5737, 0.7796, and 0.6287 in Myo, LV, and RV, respectively.

preprint2022arXiv

AutoDisk: Automated Diffraction Processing and Strain Mapping in 4D-STEM

Development in lattice strain mapping using four-dimensional scanning transmission electron microscopy (4D-STEM) method now offers improved precision and feasibility. However, automatic and accurate diffraction analysis is still challenging due to noise and the complexity of intensity in diffraction patterns. In this work, we demonstrate an approach, employing the blob detection on cross-correlated diffraction patterns followed by lattice fitting algorithm, to automate the processing of four-dimensional data, including identifying and locating disks, and extracting local lattice parameters without prior knowledge about the material. The approach is both tested using simulated diffraction patterns and applied on experimental data acquired from a Pd@Pt core-shell nanoparticle. Our method shows robustness against various sample thicknesses and high noise, capability to handle complex patterns, and picometer-scale accuracy in strain measurement, making it a promising tool for high-throughput 4D-STEM data processing.

preprint2022arXiv

Characterizing the Energy-Efficiency Region of Symbiotic Radio Communications

Symbiotic radio (SR) communication is a promising technology to achieve spectrum- and energy-efficient wireless communication, by enabling passive backscatter devices (BDs) reuse not only the spectrum, but also the power of active primary transmitters (PTs). In this paper, we aim to characterize the energy-efficiency (EE) region of multiple-input single-output (MISO) SR systems, which is defined as all the achievable EE pairs by the active PT and passive BD. To this end, we first derive the maximum individual EE of the PT and BD, respectively, and show that there exists a non-trivial trade-off between these two EEs. To characterize such a trade-off, an optimization problem is formulated to find the Pareto boundary of the EE region by optimizing the transmit beamforming and power allocation. The formulated problem is non-convex and difficult to be directly solved. An efficient algorithm based on successive convex approximation (SCA) is proposed to find a Karush-Kuhn-Tucker (KKT) solution. Simulation results are provided to show that the proposed algorithm is able to effectively characterize the EE region of SR communication systems.

preprint2022arXiv

MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images

Assessment of myocardial viability is essential in diagnosis and treatment management of patients suffering from myocardial infarction, and classification of pathology on myocardium is the key to this assessment. This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which was first proposed in the MyoPS challenge, in conjunction with MICCAI 2020. The challenge provided 45 paired and pre-aligned CMR images, allowing algorithms to combine the complementary information from the three CMR sequences for pathology segmentation. In this article, we provide details of the challenge, survey the works from fifteen participants and interpret their methods according to five aspects, i.e., preprocessing, data augmentation, learning strategy, model architecture and post-processing. In addition, we analyze the results with respect to different factors, in order to examine the key obstacles and explore potential of solutions, as well as to provide a benchmark for future research. We conclude that while promising results have been reported, the research is still in the early stage, and more in-depth exploration is needed before a successful application to the clinics. Note that MyoPS data and evaluation tool continue to be publicly available upon registration via its homepage (www.sdspeople.fudan.edu.cn/zhuangxiahai/0/myops20/).

preprint2020arXiv

Multi-Modality Pathology Segmentation Framework: Application to Cardiac Magnetic Resonance Images

Multi-sequence of cardiac magnetic resonance (CMR) images can provide complementary information for myocardial pathology (scar and edema). However, it is still challenging to fuse these underlying information for pathology segmentation effectively. This work presents an automatic cascade pathology segmentation framework based on multi-modality CMR images. It mainly consists of two neural networks: an anatomical structure segmentation network (ASSN) and a pathological region segmentation network (PRSN). Specifically, the ASSN aims to segment the anatomical structure where the pathology may exist, and it can provide a spatial prior for the pathological region segmentation. In addition, we integrate a denoising auto-encoder (DAE) into the ASSN to generate segmentation results with plausible shapes. The PRSN is designed to segment pathological region based on the result of ASSN, in which a fusion block based on channel attention is proposed to better aggregate multi-modality information from multi-modality CMR images. Experiments from the MyoPS2020 challenge dataset show that our framework can achieve promising performance for myocardial scar and edema segmentation.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint