Source author record

Shanshan Wang

Shanshan Wang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Catalog footprint

What is connected

43 works

21 topics

4 close collaborators

preprint · 2026 · arXiv

MPM-LLM4DSE: Reaching the Pareto Frontier in HLS with Multimodal Learning and LLM-Driven Exploration

High-Level Synthesis (HLS) design space exploration (DSE) seeks Pareto-optimal designs within expansive pragma configuration spaces. To accelerate HLS DSE, graph neural networks (GNNs) are commonly employed as surrogates for HLS tools to predict quality of results (QoR) metrics, while multi-objective optimization algorithms expedite the exploration. However, GNN-based prediction methods may not fully capture the rich semantic features inherent in behavioral descriptions, and conventional multi-objective optimization algorithms often do not explicitly account for the domain-specific knowledge regarding how pragma directives influence QoR. To address these limitations, this paper proposes the MPM-LLM4DSE framework, which incorporates a multimodal prediction model (MPM) that simultaneously fuses features from behavioral descriptions and control and data flow graphs. Furthermore, the framework employs a large language model (LLM) as an optimizer, accompanied by a tailored prompt engineering methodology. This methodology incorporates pragma impact analysis on QoR to guide the LLM in generating high-quality configurations (LLM4DSE). Experimental results demonstrate that our multimodal predictive model significantly outperforms state-of-the-art work ProgSG by up to 10.25×. Furthermore, in DSE tasks, the proposed LLM4DSE achieves an average performance gain of 39.90% over prior methods, validating the effectiveness of our prompting methodology. Code and models are available at https://github.com/wslcccc/MPM-LLM4DSE.
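As a reader aid for the term "Pareto-optimal" used in this abstract, the following minimal sketch filters a list of candidate designs down to its non-dominated set. It is not the paper's method: the two objectives (latency and resource usage, both minimized), the sample values, and the function names are assumptions chosen purely for illustration.

```python
from typing import List, Tuple

# Hypothetical QoR tuple: (latency, resource_usage); both are minimized.
Design = Tuple[float, float]


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Made-up candidate configurations, not data from the paper.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (9.0, 8.0), (10.0, 6.0)]
front = pareto_front(candidates)
print(front)
```

The quadratic scan is fine for a handful of points; a real exploration over a large pragma space would need something more efficient.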

preprint · 2026 · arXiv

Towards Unified Surgical Scene Understanding: Bridging Reasoning and Grounding via MLLMs

Surgical scene understanding is a cornerstone of computer-assisted intervention. While recent advances, particularly in surgical image segmentation, have driven progress, real-world clinical applications require a more holistic understanding that jointly captures procedural context, semantic reasoning, and precise visual grounding. However, existing approaches typically address these components in isolation, leading to fragmented representations and limited semantic consistency. To address this limitation, we propose SurgMLLM, a unified surgical scene understanding framework that bridges high-level reasoning and low-level visual grounding within a single model. Given surgical videos, SurgMLLM fine-tunes a multimodal large language model (MLLM) to support structured interpretability reasoning, which is used to jointly model phases, instrument-verb-target (IVT) triplets, and triplet-entity segmentation tokens. These tokens are then temporally aggregated and serve as prompts for a segmentation network, enabling accurate pixel-wise grounding of triplet instruments and targets. The entire framework is trained end-to-end with a unified objective that couples language-based reasoning supervision with visual grounding losses, promoting coherent cross-task learning and clinically consistent scene representations. To facilitate unified evaluation, we introduce CholecT45-Scene, extending the CholecT45 dataset with 64,299 frames of pixel-level mask annotations for instruments and targets, aligned with existing triplet labels. Extensive experiments show that SurgMLLM significantly advances surgical scene understanding, improving the primary triplet recognition metric AP_IVT from 40.7% to 46.0% and consistently outperforming prior methods in phase recognition and segmentation. These results highlight the effectiveness of unified reasoning-and-grounding for reliable, context-aware surgical assistance.

preprint · 2024 · arXiv

AID-DTI: Accelerating High-fidelity Diffusion Tensor Imaging with Detail-Preserving Model-based Deep Learning

Deep learning has shown great potential in accelerating diffusion tensor imaging (DTI). Nevertheless, existing methods tend to suffer from Rician noise and detail loss in reconstructing the DTI-derived parametric maps especially when sparsely sampled q-space data are used. This paper proposes a novel method, AID-DTI (Accelerating hIgh fiDelity Diffusion Tensor Imaging), to facilitate fast and accurate DTI with only six measurements. AID-DTI is equipped with a newly designed Singular Value Decomposition (SVD)-based regularizer, which can effectively capture fine details while suppressing noise during network training. Experimental results on Human Connectome Project (HCP) data consistently demonstrate that the proposed method estimates DTI parameter maps with fine-grained details and outperforms three state-of-the-art methods both quantitatively and qualitatively.

preprint · 2024 · arXiv

LESEN: Label-Efficient deep learning for Multi-parametric MRI-based Visual Pathway Segmentation

Recent research has shown the potential of deep learning in multi-parametric MRI-based visual pathway (VP) segmentation. However, obtaining labeled data for training is laborious and time-consuming. Therefore, it is crucial to develop effective algorithms in situations with limited labeled samples. In this work, we propose a label-efficient deep learning method with self-ensembling (LESEN). LESEN incorporates supervised and unsupervised losses, enabling the student and teacher models to mutually learn from each other, forming a self-ensembling mean teacher framework. Additionally, we introduce a reliable unlabeled sample selection (RUSS) mechanism to further enhance LESEN's effectiveness. Our experiments on the Human Connectome Project (HCP) dataset demonstrate the superior performance of our method when compared to state-of-the-art techniques, advancing multimodal VP segmentation for comprehensive analysis in clinical and research settings. The implementation code will be available at: https://github.com/aldiak/Semi-Supervised-Multimodal-Visual-Pathway-Delineation.

preprint · 2024 · arXiv

MLIP: Medical Language-Image Pre-training with Masked Local Representation Learning

Existing contrastive language-image pre-training aims to learn a joint representation by matching abundant image-text pairs. However, the number of image-text pairs in medical datasets is usually orders of magnitude smaller than that in natural datasets. Besides, medical image-text pairs often involve numerous complex fine-grained correspondences. This paper aims to enhance the data efficiency by introducing multiple-to-multiple local relationship modeling to capture denser supervisions. More specifically, we propose a Medical Language-Image Pre-training (MLIP) framework, which exploits the limited image-text medical data more efficiently through patch-sentence matching. Furthermore, we introduce a masked contrastive learning strategy with semantic integrity estimation to reduce redundancy in images while preserving the underlying semantics. Our evaluation results show that MLIP outperforms previous work in zero/few-shot classification and few-shot segmentation tasks by a large margin.

preprint · 2024 · arXiv

Modality Exchange Network for Retinogeniculate Visual Pathway Segmentation

Accurate segmentation of the retinogeniculate visual pathway (RGVP) aids in the diagnosis and treatment of visual disorders by identifying disruptions or abnormalities within the pathway. However, the complex anatomical structure and connectivity of RGVP make it challenging to achieve accurate segmentation. In this study, we propose a novel Modality Exchange Network (ME-Net) that effectively utilizes multi-modal magnetic resonance (MR) imaging information to enhance RGVP segmentation. Our ME-Net has two main contributions. Firstly, we introduce an effective multi-modal soft-exchange technique. Specifically, we design a channel and spatially mixed attention module to exchange modality information between T1-weighted and fractional anisotropy MR images. Secondly, we propose a cross-fusion module that further enhances the fusion of information between the two modalities. Experimental results demonstrate that our method outperforms existing state-of-the-art approaches in terms of RGVP segmentation performance.

preprint · 2024 · arXiv

Simultaneous q-Space Sampling Optimization and Reconstruction for Fast and High-fidelity Diffusion Magnetic Resonance Imaging

Diffusion Magnetic Resonance Imaging (dMRI) plays a crucial role in the noninvasive investigation of tissue microstructural properties and structural connectivity in the in vivo human brain. However, to effectively capture the intricate characteristics of water diffusion at various directions and scales, it is important to employ comprehensive q-space sampling. Unfortunately, this requirement leads to long scan times, limiting the clinical applicability of dMRI. To address this challenge, we propose SSOR, a Simultaneous q-Space sampling Optimization and Reconstruction framework. We jointly optimize a subset of q-space samples using a continuous representation of spherical harmonic functions and a reconstruction network. Additionally, we integrate the unique properties of dMRI in both the q-space and image domains by applying l1-norm and total-variation regularization. The experiments conducted on HCP data demonstrate that SSOR has promising strengths both quantitatively and qualitatively and exhibits robustness to noise.

preprint · 2022 · arXiv

Artificial Intelligence Enabled Spectral Reconfigurable Fiber Laser

The combinations of artificial intelligence and lasers provide powerful ways to form smart light sources with ground-breaking functions. Here, a Raman fiber laser (RFL) with reconfigurable and programmable spectra in an ultra-wide bandwidth is developed based on spectral-spatial manipulation of light in multimode fiber (MMF). The proposed fiber laser uses nonlinear gain from cascaded stimulated Raman scattering, random distributed feedback from Rayleigh scattering, and point feedback from an MMF-based smart spectral filter. Through wavefront shaping controlled by a genetic algorithm, light of selective wavelength(s) can be focused in the MMF, forming the filter that, together with the active part of the laser, actively shapes the output spectrum with a high degree of freedom. We achieved arbitrary spectral shaping of the cascaded RFL (e.g., continuously tunable single-wavelength and multi-wavelength laser with customizable linewidth, mode separation, and power distribution) from the 1st- to the 3rd-order Stokes emission by adjusting the pump power and auto-optimization of the smart filter. Our research uses artificial-intelligence controlled light manipulation in a fiber platform with multi-eigenmodes and nonlinear gain, mapping the spatial control into the spectral domain as well as extending the linear control of light in MMF to active light emission, which is of great significance for applications in optical communication, sensing, and spectroscopy.

preprint · 2022 · arXiv

Collective behavior in the North Rhine-Westphalia motorway network

To understand the dynamics on complex networks, measurement of correlations is indispensable. In a motorway network, it is not sufficient to collect information on fluxes and velocities on all individual links, i.e. parts of the freeways between ramps and highway crosses. The interdependencies and mutual connections are also of considerable interest. We analyze correlations in the complete motorway network in North Rhine-Westphalia, the most populous state in Germany. We view the motorway network as a complex system consisting of road sections which interact via the motion of vehicles, implying structures in the corresponding correlation matrices. In particular, we focus on collective behavior, i.e. coherent motion in the whole network or in large parts of it. To this end, we study the eigenvalue and eigenvector statistics and identify significant sections in the motorway network. We find collective behavior in these significant sections and further explore its causes. We show that collectivity throughout the network cannot directly be related to the traffic states (free, synchronous and congested) in Kerner's three-phase theory. Hence, the degree of collectivity provides a new, complementary observable to characterize the motorway network.
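The correlation-matrix analysis this abstract describes can be illustrated with a small, dependency-free sketch. It is not the authors' pipeline: the three toy time series, the function names, and the use of power iteration are assumptions for illustration only. The idea it shows is a common one in correlation-matrix studies: when all sections move together, the largest eigenvalue of the correlation matrix approaches the number of sections and its eigenvector has nearly equal components.

```python
import math
from typing import List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long series (assumes neither is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(series: List[List[float]]) -> List[List[float]]:
    """Pairwise Pearson correlations between all series."""
    return [[pearson(a, b) for b in series] for a in series]


def largest_eigenpair(matrix: List[List[float]], iterations: int = 200) -> Tuple[float, List[float]]:
    """Approximate the dominant eigenvalue and eigenvector by power iteration."""
    n = len(matrix)
    # Deterministic, slightly non-uniform start vector (an arbitrary choice).
    v = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            raise ValueError("start vector is orthogonal to the dominant eigenvector")
        v = [c / norm for c in w]
    # Rayleigh quotient of the unit-length vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n))
    return eigenvalue, v


# Toy "traffic flow" series for three road sections that rise in lockstep (made-up data).
flows = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [10.0, 11.0, 12.0, 13.0]]
corr = correlation_matrix(flows)
eigenvalue, eigenvector = largest_eigenpair(corr)
print(eigenvalue, eigenvector)
```

With real measurements, the leading eigenvalue would fall somewhere between 1 and the number of sections, and the spread of eigenvector components would indicate which sections take part in the collective motion.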

preprint · 2022 · arXiv

DASP: Defect and Dopant ab-initio Simulation Package

In order to perform automated calculations of defect and dopant properties in semiconductors and insulators, we developed a software package, Defect and Dopant ab-initio Simulation Package (DASP), which is composed of four modules for calculating: (i) elemental chemical potentials, (ii) defect (dopant) formation energies and transition energy levels, (iii) defect and carrier densities and (iv) carrier dynamics properties of high-density defects. DASP uses the materials genome database for quick determination of competing secondary phases and calculation of the energy above convex hull when calculating the elemental chemical potential that stabilizes compound semiconductors, so it can perform high-throughput prediction of thermodynamic stability of multinary compounds. DASP calls ab-initio software packages to perform the total energy, structural relaxation and electronic structure calculations of the defect supercells with different structure configurations and charge states, based on which the defect formation energies and transition energy levels are calculated and the corrections for electrostatic potential alignment and image charge interaction can be included. Then DASP can calculate the equilibrium densities of defects and electron and hole carriers as well as the Fermi level in semiconductors under different chemical potential conditions and different growth/working temperatures. For high-density defects, DASP can calculate the carrier dynamics properties such as the photoluminescence (PL) spectrum, defect-related radiative and non-radiative carrier capture cross sections, and recombination lifetime of non-equilibrium carriers.

preprint · 2022 · arXiv

Expert Knowledge-guided Geometric Representation Learning for Magnetic Resonance Imaging-based Glioma Grading

Radiomics and deep learning have shown high popularity in automatic glioma grading. Radiomics can extract hand-crafted features that quantitatively describe the expert knowledge of glioma grades, and deep learning is powerful in extracting a large number of high-throughput features that facilitate the final classification. However, the performance of existing methods can still be improved as their complementary strengths have not been sufficiently investigated and integrated. Furthermore, lesion maps are usually needed for the final prediction at the testing phase, which is very troublesome. In this paper, we propose an expert knowledge-guided geometric representation learning (ENROL) framework. Geometric manifolds of hand-crafted features and learned features are constructed to mine the implicit relationship between deep learning and radiomics, and therefore to dig mutual consent and essential representation for the glioma grades. With a specially designed manifold discrepancy measurement, the grading model can exploit the input image data and expert knowledge more effectively in the training phase and get rid of the requirement of lesion segmentation maps at the testing phase. The proposed framework is flexible regarding deep learning architectures to be utilized. Three different architectures have been evaluated and five models have been compared, which show that our framework can always generate promising results.

preprint · 2022 · arXiv

K-space and Image Domain Collaborative Energy based Model for Parallel MRI Reconstruction

Decreasing magnetic resonance (MR) image acquisition times can potentially make MR examinations more accessible. Prior work, including deep learning models, has been devoted to solving the problem of long MRI imaging time. Recently, deep generative models have exhibited great potential in algorithm robustness and usage flexibility. Nevertheless, none of the existing schemes can be learned on or applied to the k-space measurement directly. Furthermore, how deep generative models can work well in the hybrid domain is also worth investigating. In this work, by taking advantage of the deep energy-based models, we propose a k-space and image domain collaborative generative model to comprehensively estimate the MR data from under-sampled measurements. Experimental comparisons with state-of-the-art methods demonstrated that the proposed hybrid method has less error in reconstruction accuracy and is more stable under different acceleration factors.

preprint · 2022 · arXiv

Mining Function Homology of Bot Loaders from Honeypot Logs

Self-contained loaders are widely adopted in botnets for injecting loading commands and spawning new bots. While researchers can dissect bot clients to get various information of botnets, the cloud-based and self-contained design of loaders effectively hinders researchers from understanding the loaders' evolution and variation using classic methods. The decoupled nature of bot loaders also dramatically reduces the feasibility of investigating relationships among clients and infrastructures. In this paper, we propose a text-based method to investigate and analyze details of bot loaders using honeypots. We leverage high interaction honeypots to collect request logs and define eight families of bot loaders based on the result of agglomerative clustering. At the function level, we push our study further to explore their homological relationship based on similarity analysis of request logs using sequence aligning techniques. This further exploration discloses that the released code of Mirai keeps spawning new generations of botnets both on the client and the server side. This paper uncovers the homology of active botnet infrastructures, providing a new prospect on finding covert relationships among cybercrimes. Bot loaders are precisely investigated at the function level to yield a new insight for researchers to identify the botnet's infrastructures and track their evolution over time.

preprint · 2022 · arXiv

Multi-Weight Respecification of Scan-specific Learning for Parallel Imaging

Parallel imaging is widely used in magnetic resonance imaging as an acceleration technology. Traditional linear reconstruction methods in parallel imaging often suffer from noise amplification. Recently, a non-linear robust artificial-neural-network for k-space interpolation (RAKI) exhibits superior noise resilience over other linear methods. However, RAKI performs poorly at high acceleration rates, and needs a large amount of autocalibration signals as the training samples. In order to tackle these issues, we propose a multi-weight method that implements multiple weighting matrices on the undersampled data, named MW-RAKI. Enforcing multiple weighted matrices on the measurements can effectively reduce the influence of noise and increase the data constraints. Furthermore, we incorporate the strategy of multiple weighting matrices into a residual version of RAKI, and form MW-rRAKI. Experimental comparisons with the alternative methods demonstrated noticeably better reconstruction performance, particularly at high acceleration rates.

preprint · 2022 · arXiv

Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding

Pre-trained language models (PLM) have demonstrated their effectiveness for a broad range of information retrieval and natural language processing tasks. As the core part of PLM, multi-head self-attention is appealing for its ability to jointly attend to information from different positions. However, researchers have found that PLM always exhibits fixed attention patterns regardless of the input (e.g., excessively paying attention to [CLS] or [SEP]), which we argue might neglect important information in the other positions. In this work, we propose a simple yet effective attention guiding mechanism to improve the performance of PLM by encouraging attention towards the established goals. Specifically, we propose two kinds of attention guiding methods, i.e., map discrimination guiding (MDG) and attention pattern decorrelation guiding (PDG). The former definitely encourages the diversity among multiple self-attention heads to jointly attend to information from different representation subspaces, while the latter encourages self-attention to attend to as many different positions of the input as possible. We conduct experiments with multiple general pre-trained models (i.e., BERT, ALBERT, and RoBERTa) and domain-specific pre-trained models (i.e., BioBERT, ClinicalBERT, BlueBERT, and SciBERT) on three benchmark datasets (i.e., MultiNLI, MedNLI, and Cross-genre-IR). Extensive experimental results demonstrate that our proposed MDG and PDG bring stable performance improvements on all datasets with high efficiency and low cost.

preprint · 2022 · arXiv

Rethinking the optimization process for self-supervised model-driven MRI reconstruction

Recovering high-quality images from undersampled measurements is critical for accelerated MRI reconstruction. Recently, various supervised deep learning-based MRI reconstruction methods have been developed. Despite their promising performance, these methods require fully sampled reference data, the acquisition of which is resource-intensive and time-consuming. Self-supervised learning has emerged as a promising solution to alleviate the reliance on fully sampled datasets. However, existing self-supervised methods suffer from reconstruction errors due to the insufficient constraint enforced on the non-sampled data points and the error accumulation that occurs during the iterative image reconstruction process for model-driven deep learning reconstructions. To address these challenges, we propose K2Calibrate, a K-space adaptation strategy for self-supervised model-driven MR reconstruction optimization. By iteratively calibrating the learned measurements, K2Calibrate can reduce the network's reconstruction deterioration caused by statistically dependent noise. Extensive experiments have been conducted on the open-source dataset FastMRI, and K2Calibrate achieves better results than five state-of-the-art methods. The proposed K2Calibrate is plug-and-play and can be easily integrated with different model-driven deep learning reconstruction methods.

preprint · 2022 · arXiv

SelfCoLearn: Self-supervised collaborative learning for accelerating dynamic MR imaging

Lately, deep learning has been extensively investigated for accelerating dynamic magnetic resonance (MR) imaging, with encouraging progress achieved. However, without fully sampled reference data for training, current approaches may have limited abilities in recovering fine details or structures. To address this challenge, this paper proposes a self-supervised collaborative learning framework (SelfCoLearn) for accurate dynamic MR image reconstruction from undersampled k-space data. The proposed framework is equipped with three important components, namely, dual-network collaborative learning, re-undersampling data augmentation and a specially designed co-training loss. The framework is flexible and can be integrated with both data-driven networks and model-based iterative unrolled networks. Our method has been evaluated on an in vivo dataset and compared to four state-of-the-art methods. Results show that our method possesses strong capabilities in capturing essential and inherent representations for direct reconstructions from the undersampled k-space data and thus enables high-quality and fast dynamic MR imaging.

preprint · 2022 · arXiv

Spatial Correlation Analysis of Traffic Flow on Parallel Motorways in Germany

With the widely used method of correlation matrix analysis, this study reveals the change of traffic states on parallel motorways in North Rhine-Westphalia, Germany. In terms of the time series of traffic flow and velocity, we carry out a quantitative analysis in correlations and reveal a high level of strongly positive traffic flow correlation and rich structural features in the corresponding correlation matrices. The strong correlation is mainly ascribed to the daily time evolution of traffic flow during the periods of rush hours and non-rush hours. In terms of free flow and congestion, the structural features are able to capture the average traffic situation we derive from our data. Furthermore, the structural features in correlation matrices for individual time periods corroborate our results from the correlation matrices regarding a whole day. The average correlations in traffic flows and velocities over all pairwise sections disclose the traffic behavior during each individual time period. Our contribution uncovers the potential application of correlation analysis on the study of traffic networks as a complex system.

preprint · 2022 · arXiv

Specificity-Preserving Federated Learning for MR Image Reconstruction

Federated learning (FL) can be used to improve data privacy and efficiency in magnetic resonance (MR) image reconstruction by enabling multiple institutions to collaborate without needing to aggregate local data. However, the domain shift caused by different MR imaging protocols can substantially degrade the performance of FL models. Recent FL techniques tend to solve this by enhancing the generalization of the global model, but they ignore the domain-specific features, which may contain important information about the device properties and be useful for local reconstruction. In this paper, we propose a specificity-preserving FL algorithm for MR image reconstruction (FedMRI). The core idea is to divide the MR reconstruction model into two parts: a globally shared encoder to obtain a generalized representation at the global level, and a client-specific decoder to preserve the domain-specific properties of each client, which is important for collaborative reconstruction when the clients have unique distributions. Such a scheme is then executed in the frequency space and the image space respectively, allowing exploration of generalized representation and client-specific properties simultaneously in different spaces. Moreover, to further boost the convergence of the globally shared encoder when a domain shift is present, a weighted contrastive regularization is introduced to directly correct any deviation between the client and server during optimization. Extensive experiments demonstrate that our FedMRI's reconstructed results are the closest to the ground-truth for multi-institutional data, and that it outperforms state-of-the-art FL methods.

preprint · 2022 · arXiv

Universal Generative Modeling for Calibration-free Parallel MR Imaging

The integration of compressed sensing and parallel imaging (CS-PI) provides a robust mechanism for accelerating MRI acquisitions. However, most such strategies require the explicit formation of either coil sensitivity profiles or a cross-coil correlation operator, and as a result reconstruction corresponds to solving a challenging bilinear optimization problem. In this work, we present an unsupervised deep learning framework for calibration-free parallel MRI, coined universal generative modeling for parallel imaging (UGM-PI). More precisely, we make use of the merits of both wavelet transform and the adaptive iteration strategy in a unified framework. We train a powerful noise conditional score network by forming wavelet tensor as the network input at the training phase. Experimental results on both physical phantom and in vivo datasets implied that the proposed method is comparable and even superior to state-of-the-art CS-PI reconstruction approaches.

preprint · 2022 · arXiv

Variable Augmented Network for Invertible MR Coil Compression

A large number of coils are able to provide enhanced signal-to-noise ratio and improve imaging performance in parallel imaging. Nevertheless, the growing number of coils simultaneously aggravates the drawbacks of data storage and reconstruction speed, especially in some iterative reconstructions. Coil compression addresses these issues by generating fewer virtual coils. In this work, a novel variable augmentation network for invertible coil compression termed VAN-ICC is presented. It utilizes inherent reversibility of normalizing flow-based models for high-precision compression and invertible recovery. By employing the variable augmentation technology to image/k-space variables from multi-coils, VAN-ICC trains invertible networks by finding an invertible and bijective function, which can map the original data to the compressed counterpart and vice versa. Experiments conducted on both fully-sampled and under-sampled data verified the effectiveness and flexibility of VAN-ICC. Quantitative and qualitative comparisons with traditional non-deep learning-based approaches demonstrated that VAN-ICC can carry much higher compression effects. Additionally, its performance is not susceptible to different numbers of virtual coils.

preprint · 2021 · arXiv

A coarse-to-fine framework for unsupervised multi-contrast MR image deformable registration with dual consistency constraint

Multi-contrast magnetic resonance (MR) image registration is useful in the clinic to achieve fast and accurate imaging-based disease diagnosis and treatment planning. Nevertheless, the efficiency and performance of the existing registration algorithms can still be improved. In this paper, we propose a novel unsupervised learning-based framework to achieve accurate and efficient multi-contrast MR image registrations. Specifically, an end-to-end coarse-to-fine network architecture consisting of affine and deformable transformations is designed to improve the robustness and achieve end-to-end registration. Furthermore, a dual consistency constraint and a new prior knowledge-based loss function are developed to enhance the registration performances. The proposed method has been evaluated on a clinical dataset containing 555 cases, and encouraging performances have been achieved. Compared to the commonly utilized registration methods, including VoxelMorph, SyN, and LT-Net, the proposed method achieves better registration performance with a Dice score of 0.8397 in identifying stroke lesions. With regard to the registration speed, our method is about 10 times faster than the most competitive method of SyN (Affine) when testing on a CPU. Moreover, we prove that our method can still perform well on more challenging tasks in which scanning information is lacking, showing high robustness for the clinical application.
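For readers unfamiliar with the Dice score quoted in this abstract, the sketch below computes it for two small binary masks. The masks and the function name are made up for illustration, and the evaluation protocol used in the paper may differ; only the standard definition, twice the overlap divided by the sum of the two mask sizes, is taken as given.

```python
from typing import Sequence


def dice_score(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks of equal shape."""
    intersection = 0
    pred_sum = 0
    target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0
    if pred_sum + target_sum == 0:
        # Convention assumed here: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / (pred_sum + target_sum)


# Toy masks: 1 marks lesion pixels, 0 marks background (made-up data).
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = dice_score(predicted, reference)
print(round(score, 4))
```

A score of 1.0 means the masks coincide exactly and 0.0 means they do not overlap at all.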

preprint · 2021 · arXiv

A Curated Dataset of Urban Scenes for Audio-Visual Scene Analysis

This paper introduces a curated dataset of urban scenes for audio-visual scene analysis which consists of carefully selected and recorded material. The data was recorded in multiple European cities, using the same equipment, in multiple locations for each scene, and is openly available. We also present a case study for audio-visual scene recognition and show that joint modeling of audio and visual modalities brings significant performance gain compared to state of the art uni-modal systems. Our approach obtained an 84.8% accuracy compared to 75.8% for the audio-only and 68.4% for the video-only equivalent systems.

preprint · 2021 · arXiv

Homotopic Gradients of Generative Density Priors for MR Image Reconstruction

Deep learning, particularly the generative model, has demonstrated tremendous potential to significantly speed up image reconstruction with reduced measurements recently. Rather than the existing generative models that often optimize the density priors, in this work, by taking advantage of the denoising score matching, homotopic gradients of generative density priors (HGGDP) are proposed for magnetic resonance imaging (MRI) reconstruction. More precisely, to tackle the low-dimensional manifold and low data density region issues in generative density prior, we estimate the target gradients in higher-dimensional space. We train a more powerful noise conditional score network by forming high-dimensional tensor as the network input at the training phase. More artificial noise is also injected in the embedding space. At the reconstruction stage, a homotopy method is employed to pursue the density prior, such as to boost the reconstruction performance. Experiment results imply the remarkable performance of HGGDP in terms of high reconstruction accuracy; only 10% of the k-space data can still generate images of high quality as effectively as standard MRI reconstruction with the fully sampled data.

preprint2020arXiv

Bounding boxes for weakly supervised segmentation: Global constraints get close to full supervision

We propose a novel weakly supervised segmentation method based on several global constraints derived from box annotations. In particular, we bring a classical tightness prior to a deep learning setting by imposing a set of constraints on the network outputs. Such a powerful topological prior prevents solutions from shrinking excessively by requiring any horizontal or vertical line within the bounding box to contain at least one pixel of the foreground region. Furthermore, we integrate our deep tightness prior with a global background emptiness constraint, guiding training with information outside the bounding box. We demonstrate experimentally that such a global constraint is much more powerful than standard cross-entropy for the background class. Our optimization problem is challenging as it takes the form of a large set of inequality constraints on the outputs of deep networks. We solve it with a sequence of unconstrained losses based on a recent powerful extension of the log-barrier method, which is well known in the context of interior-point methods. This accommodates standard stochastic gradient descent (SGD) for training deep networks, while avoiding computationally expensive and unstable Lagrangian dual steps and projections. Extensive experiments over two different public data sets and applications (prostate and brain lesions) demonstrate that the synergy between our global tightness and emptiness priors yields very competitive performance, approaching full supervision and significantly outperforming DeepCut. Furthermore, our approach removes the need for computationally expensive proposal generation. Our code is shared anonymously.
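
To make the two box-derived constraints concrete, here is a minimal sketch that checks them on a hard binary mask. It is only an illustration: the abstract describes inequality constraints on network outputs handled with a log-barrier extension, whereas this sketch simply counts violations after thresholding. The function names and the (top, left, bottom, right) box convention are assumptions, not taken from the paper.

```python
from typing import List, Tuple

Mask = List[List[int]]          # 2D binary mask, 1 = foreground
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive; assumed convention


def tightness_violations(mask: Mask, box: Box) -> int:
    """Count rows and columns of the box that contain no foreground pixel."""
    top, left, bottom, right = box
    violations = 0
    # Every horizontal line inside the box should cross the foreground.
    for r in range(top, bottom + 1):
        if not any(mask[r][c] for c in range(left, right + 1)):
            violations += 1
    # Every vertical line inside the box should cross the foreground.
    for c in range(left, right + 1):
        if not any(mask[r][c] for r in range(top, bottom + 1)):
            violations += 1
    return violations


def background_is_empty(mask: Mask, box: Box) -> bool:
    """Return True if no foreground pixel lies outside the box."""
    top, left, bottom, right = box
    for r, row in enumerate(mask):
        for c, value in enumerate(row):
            inside = top <= r <= bottom and left <= c <= right
            if value and not inside:
                return False
    return True


box = (1, 1, 2, 2)
tight = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
shrunk = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

print(tightness_violations(tight, box))   # mask touches every row and column of the box
print(tightness_violations(shrunk, box))  # one empty row and one empty column
print(background_is_empty(tight, box))
```

The shrunk mask illustrates the failure mode the tightness prior is meant to rule out: a prediction that stays inside the box but no longer spans it.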

preprint2020arXiv

Parameter-Transferred Wasserstein Generative Adversarial Network (PT-WGAN) for Low-Dose PET Image Denoising

Due to the widespread use of positron emission tomography (PET) in clinical practice, the potential risk of PET-associated radiation dose to patients needs to be minimized. However, with the reduction in the radiation dose, the resultant images may suffer from noise and artifacts that compromise diagnostic performance. In this paper, we propose a parameter-transferred Wasserstein generative adversarial network (PT-WGAN) for low-dose PET image denoising. The contributions of this paper are twofold: i) a PT-WGAN framework is designed to denoise low-dose PET images without compromising structural details, and ii) a task-specific initialization based on transfer learning is developed to train PT-WGAN using trainable parameters transferred from a pretrained model, which significantly improves the training efficiency of PT-WGAN. The experimental results on clinical data show that the proposed network can suppress image noise more effectively while preserving better image fidelity than recently published state-of-the-art methods. We make our code available at https://github.com/90n9-yu/PT-WGAN.

preprint2020arXiv

Self-adaptive Re-weighted Adversarial Domain Adaptation

Existing adversarial domain adaptation methods mainly consider the marginal distribution, which may lead to either under-transfer or negative transfer. To address this problem, we present a self-adaptive re-weighted adversarial domain adaptation approach, which tries to enhance domain alignment from the perspective of the conditional distribution. To promote positive transfer and combat negative transfer, we reduce the weight of the adversarial loss for well-aligned features while increasing the adversarial force for poorly aligned ones, with alignment measured by the conditional entropy. Additionally, a triplet loss leveraging source samples and pseudo-labeled target samples is employed on the confusing domain. This metric loss keeps intra-class sample pairs closer than inter-class pairs to achieve class-level alignment. In this way, highly accurate pseudo-labeled target samples and semantic alignment can be captured simultaneously in the co-training process. Our method achieves a low joint error of the ideal source and target hypothesis. The expected target error can then be upper-bounded following Ben-David's theorem. Empirical evidence demonstrates that the proposed model outperforms the state of the art on standard domain adaptation datasets.

preprint2019arXiv

CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke

Segmenting stroke lesions from T1-weighted MR images is of great value for large-scale stroke rehabilitation neuroimaging analyses. Nevertheless, the task poses great challenges, such as the large range of stroke lesion scales and the similarity of tissue intensities. The well-known encoder-decoder convolutional neural network, although it has achieved great success in medical image segmentation, may fail to address these challenges because it makes insufficient use of multi-scale features and context information. To address these challenges, this paper proposes a Cross-Level fusion and Context Inference Network (CLCI-Net) for chronic stroke lesion segmentation from T1-weighted MR images. Specifically, a Cross-Level feature Fusion (CLF) strategy was developed to make full use of different scale features across different levels. By extending Atrous Spatial Pyramid Pooling (ASPP) with CLF, we have enriched multi-scale features to handle the different lesion sizes. In addition, convolutional long short-term memory (ConvLSTM) is employed to infer context information and thus capture fine structures to address the intensity similarity issue. The proposed approach was evaluated on an open-source dataset, the Anatomical Tracings of Lesions After Stroke (ATLAS), with the results showing that our network outperforms five state-of-the-art methods. We make our code and models available at https://github.com/YH0517/CLCI_Net.

preprint2019arXiv

D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation

Assessing the location and extent of lesions caused by chronic stroke is critical for medical diagnosis, surgical planning, and prognosis. In recent years, with the rapid development of 2D and 3D convolutional neural networks (CNN), the encoder-decoder structure has shown great potential in the field of medical image segmentation. However, the 2D CNN ignores the 3D information of medical images, while the 3D CNN suffers from high computational resource demands. This paper proposes a new architecture called dimension-fusion-UNet (D-UNet), which innovatively combines 2D and 3D convolution in the encoding stage. The proposed architecture achieves better segmentation performance than 2D networks, while requiring significantly less computation time than 3D networks. Furthermore, to alleviate the data imbalance between positive and negative samples during network training, we propose a new loss function called Enhance Mixing Loss (EML). This function adds a weighted focal coefficient and combines two traditional loss functions. The proposed method has been tested on the ATLAS dataset and compared to three state-of-the-art methods. The results demonstrate that the proposed method achieves the best performance, with DSC = 0.5349 ± 0.2763 and precision = 0.6331 ± 0.295.
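
For readers unfamiliar with the two reported metrics, the following sketch computes the Dice similarity coefficient (DSC) and precision for a pair of flattened binary masks using their standard definitions. The function name and the handling of empty masks are illustrative assumptions; the abstract does not say how the authors treat those edge cases.

```python
from typing import Sequence, Tuple


def dice_and_precision(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Return (DSC, precision) for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    true_pos = sum(1 for p, t in zip(pred, target) if p and t)
    pred_pos = sum(1 for p in pred if p)
    target_pos = sum(1 for t in target if t)
    # DSC = 2 |A ∩ B| / (|A| + |B|); two empty masks are treated as a perfect match (assumption).
    dice = 2 * true_pos / (pred_pos + target_pos) if (pred_pos + target_pos) else 1.0
    # Precision = TP / (TP + FP); defined as 0 when nothing is predicted (assumption).
    precision = true_pos / pred_pos if pred_pos else 0.0
    return dice, precision


pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
dice, precision = dice_and_precision(pred, target)
print(dice, precision)
```

Values of the form mean ± standard deviation, as in the abstract, would typically come from evaluating such a function per subject and then aggregating.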

preprint2019arXiv

X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies

The morbidity of brain stroke has increased rapidly in the past few years. To help specialists with lesion measurement and treatment planning, automatic segmentation methods are critically required in clinical practice. Recently, approaches based on deep learning and methods for contextual information extraction have been applied to many image segmentation tasks. However, their performance is limited by the insufficient training of a large number of parameters, and they sometimes fail to capture long-range dependencies. To address these issues, we propose a depthwise separable convolution based X-Net that includes a nonlocal operation, namely the Feature Similarity Module (FSM), to capture long-range dependencies. The adopted depthwise convolution reduces the network size, while the developed FSM provides more effective, dense contextual information extraction and thus facilitates better segmentation. The effectiveness of X-Net was evaluated on an open dataset, Anatomical Tracings of Lesions After Stroke (ATLAS), with superior performance achieved compared to six other state-of-the-art approaches. We make our code and models available at https://github.com/Andrewsher/X-Net.

preprint2018arXiv

Object Activity Scene Description, Construction and Recognition

Action recognition is a critical task for social robots to meaningfully engage with their environment. 3D human skeleton-based action recognition has been an attractive research area in recent years. Although existing approaches are good at action recognition, recognizing a group of actions in an activity scene remains a great challenge. To tackle this problem, we first partition the scene into several primitive actions (PAs) based on a motion attention mechanism. Then, the primitive actions are described by the trajectory vectors of the corresponding joints. After that, motivated by text classification based on word embedding, we employ a convolutional neural network (CNN) to recognize activity scenes by treating the motion of joints as the "words" of an activity. The experimental results on the scenes of human activity dataset show the efficiency of the proposed approach.

preprint2016arXiv

Average cross-responses in correlated financial market

There are non-vanishing price responses across different stocks in correlated financial markets. We further study this issue by performing different averages, which identify active and passive cross-responses. The two average cross-responses show different characteristic dependences on the time lag. The passive cross-response exhibits a shorter response period with sizeable volatilities, while the corresponding period for the active cross-response is longer. The average cross-responses for a given stock are evaluated either with respect to the whole market or to different sectors. Using the response strength, the influences of individual stocks are identified and discussed. Moreover, the various cross-responses as well as the average cross-responses are compared with the self-responses. In contrast to the short memory of the trade sign cross-correlation for individual stock pairs, the sign cross-correlation has long memory when averaged over different pairs of stocks.

preprint2016arXiv

Cross Domain Adaptation by Learning Partially Shared Classifiers and Weighting Source Data Points in the Shared Subspaces

Transfer learning is a problem defined over two domains. These two domains share the same feature space and class label space, but have significantly different distributions. One domain, called the source domain, has sufficient labels, and the other, called the target domain, has few labels. The problem is to learn an effective classifier for the target domain. In this paper, we propose a novel transfer learning method for this problem by learning a partially shared classifier for the target domain and weighting the source domain data points. We learn shared subspaces for the data points of both domains, and a shared classifier in the shared subspaces. We expect that in the shared subspaces the distributions of the two domains can match each other well, and to match the distributions, we weight the source domain data points with different weighting factors. Moreover, we adapt the shared classifier to each domain by learning different adaptation functions. To learn the subspace transformation matrices, the classifier parameters, and the adaptation parameters, we build an objective function with weighted classification errors, parameter regularization, local reconstruction regularization, and distribution matching. This objective function is minimized by an iterative algorithm. Experiments show its effectiveness over benchmark data sets, including a travel destination review data set, a face expression data set, a spam email data set, etc.

preprint2016arXiv

Cross-response in correlated financial markets: individual stocks

Previous studies of the stock price response to trades focused on the dynamics of single stocks, i.e. they addressed the self-response. We empirically investigate the price response of one stock to the trades of other stocks in a correlated market, i.e. the cross-responses. How large is the impact of one stock on others and vice versa? -- This impact of trades on the price change across stocks appears to be transient instead of permanent as we discuss from the viewpoint of market efficiency. Furthermore, we compare the self-responses on different scales and the self- and cross-responses on the same scale. We also find that the cross-correlation of the trade signs turns out to be a short-memory process.
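
As a rough illustration of what a price cross-response measures, the sketch below averages the log-return of stock i over a time lag, signed by the trade direction of stock j at the start of the interval. The specific estimator, the function name, and the convention that a trade sign of +1 is a buy, -1 a sell, and 0 no trade are assumptions made for this example; the abstract itself does not give the formula.

```python
import math
from typing import Sequence


def cross_response(mid_i: Sequence[float], signs_j: Sequence[int], lag: int) -> float:
    """Average of (log m_i(t + lag) - log m_i(t)) * sign_j(t) over all valid t.

    mid_i   : midpoint prices of stock i, one value per time step
    signs_j : trade signs of stock j (+1 buy, -1 sell, 0 none), aligned with mid_i
    lag     : time lag in steps, must be positive
    """
    n = min(len(mid_i), len(signs_j)) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be positive and shorter than the series")
    total = 0.0
    for t in range(n):
        log_return = math.log(mid_i[t + lag]) - math.log(mid_i[t])
        total += log_return * signs_j[t]
    return total / n


# Toy series: stock i ticks up after buys in stock j and down after sells.
mid_i = [100.0, 101.0, 100.0, 101.0, 100.0]
signs_j = [1, -1, 1, -1, 0]
response = cross_response(mid_i, signs_j, lag=1)
print(response)
```

Passing the same stock's prices and trade signs gives the corresponding self-response, which is the quantity the abstract compares the cross-responses against.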

preprint2016arXiv

Distributed Power Allocation and Channel Access Probability Assignment for Cognitive Radio

In this paper, we present a framework for distributively optimizing the transmission strategies of secondary users in an ad hoc cognitive radio network. In particular, the proposed approach allows secondary users to set their transmit powers and channel access probabilities such that, on average, the quality of service of both the primary and secondary networks is satisfied. The system under consideration assumes several primary and secondary transceiver pairs and assumes no cooperation or information exchange between primary and secondary users or among secondary users. The outage probability, and consequently the connection probability, is derived for the system using tools from stochastic geometry and is used to define a new performance metric in the optimization problem. We refer to this metric as the spatial density of successful transmission. We corroborate our derivations through numerical evaluations. We further demonstrate that even in the absence of any form of cooperation, an acceptable quality of service can be attained in the cognitive radio environment.

preprint2016arXiv

On Spectrum and Infrastructure Sharing in Multi-Operator Cellular Networks

In this paper, we introduce a mathematical framework for analyzing and optimizing multi-operator cellular networks that are allowed to share spectrum licenses and infrastructure elements. The proposed approach exploits stochastic geometry for modeling the locations of cellular base stations and for computing the aggregate average rate. The trade-offs that emerge from sharing spectrum frequencies and cellular base stations are quantified and discussed.

preprint2016arXiv

Price response in correlated financial markets: empirical results

Previous studies of the stock price response to individual trades focused on single stocks. We empirically investigate the price response of one stock to the trades of other stocks. How large is the impact of one stock on others and vice versa? -- This impact of trades on the price change across stocks appears to be transient instead of permanent. Performing different averages, we distinguish active and passive responses. The two average responses show different characteristic dependences on the time lag. The passive response exhibits a shorter response period with sizeable volatilities, and the active response a longer period. We also study the response for a given stock with respect to different sectors and to the whole market. Furthermore, we compare the self-response with the various cross-responses. The correlation of the trade signs is a short-memory process for a pair of stocks, but it turns into a long-memory process when averaged over different pairs of stocks.

preprint2015arXiv

Directional antennas improve the link-connectivity of interference limited ad hoc networks

We study wireless ad hoc networks in the absence of any channel contention or transmit power control and ask how antenna directivity affects network connectivity in the interference limited regime. We answer this question by deriving closed-form expressions for the outage probability, capacity and mean node degree of the network using tools from stochastic geometry. These novel results provide valuable insights for the design of future ad hoc networks. Significantly, our results suggest that the more directional the interfering transmitters are, the less detrimental are the effects of interference to individual links. We validate our analytical results through computer simulations.

preprint2015arXiv

Er3+-doped Na0.5Bi0.5TiO3 Ferroelectric Thin Films with Enhanced Electrical Properties and Strong Green Up-conversion Luminescence

Ferroelectric materials with up-conversion luminescence (UCL) properties have potential opto-electric applications such as display and sensing. Here, we demonstrate strong green UCL and enhanced electrical properties in Er3+-doped Na0.5Bi0.5TiO3 thin films. The thin films are prepared using a modified chemical solution deposition method. These thin films are phase-pure and crystallized in the perovskite structure. The largest remnant polarization (Pr) and highest dielectric constant are obtained from Na0.5Bi0.49Er0.01TiO3 thin films, and their values are 0.22 C/m2 and 1166, respectively. Meanwhile, strong green UCL at 525 nm and 548 nm is observed in Er3+-doped thin films. These emissions are attributed to the 2H11/2 to 4I15/2 and 4S3/2 to 4I15/2 transitions of Er3+ ions. These thin films have potential in optoelectrical device applications.

preprint2015arXiv

Location, location, location: Border effects in interference limited ad hoc networks

Wireless networks are fundamentally limited by the intensity of the received signals and by their inherent interference. It is shown here that in finite ad hoc networks where node placement is modelled according to a Poisson point process and no carrier sensing is employed for medium access, the SINR received by nodes located at the border of the network deployment/operation region is on average greater than the rest. This is primarily due to the uneven interference landscape of such networks which is particularly kind to border nodes giving rise to all sorts of performance inhomogeneities and access unfairness. Using tools from stochastic geometry we quantify these spatial variations and provide closed form communication-theoretic results showing why the receiver's location is so important.

preprint2014arXiv

Partial Penalized Likelihood Ratio Test under Sparse Case

This work is concerned with testing the low-dimensional parameters of interest with divergent dimensional data and variable selection for the rest under the sparse case. A consistent test via the partial penalized likelihood approach, called the partial penalized likelihood ratio test statistic, is derived, and its asymptotic distributions under the null hypothesis and the local alternatives of order $n^{-1/2}$ are obtained under some regularity conditions. Meanwhile, the oracle property of the partial penalized likelihood estimator also holds. The proposed partial penalized likelihood ratio test statistic outperforms the full penalized likelihood ratio test statistic in terms of size and power, and performs as well as the classical likelihood ratio test statistic. Moreover, the proposed method obtains the variable selection results as well as the p-values of testing. Numerical simulations and an analysis of Prostate Cancer data confirm our theoretical findings and demonstrate the promising performance of the proposed partial penalized likelihood in hypothesis testing and variable selection.

preprint2012arXiv

Exploiting Channel Correlation and PU Traffic Memory for Opportunistic Spectrum Scheduling

We consider a cognitive radio network with multiple primary users (PUs) and one secondary user (SU), where a spectrum server is utilized for spectrum sensing and scheduling the SU to transmit over one of the PU channels opportunistically. One practical yet challenging scenario is when both the PU occupancy and the channel fading vary over time and exhibit temporal correlations. Little work has been done for exploiting such temporal memory in the channel fading and the PU occupancy simultaneously for opportunistic spectrum scheduling. A main goal of this work is to understand the intricate tradeoffs resulting from the interactions of the two sets of system states - the channel fading and the PU occupancy, by casting the problem as a partially observable Markov decision process. We first show that a simple greedy policy is optimal in some special cases. To build a clear understanding of the tradeoffs, we then introduce a full-observation genie-aided system, where the spectrum server collects channel fading states from all PU channels. The genie-aided system is used to decompose the tradeoffs in the original system into multiple tiers, which are examined progressively. Numerical examples indicate that the optimal scheduler in the original system, with observation on the scheduled channel only, achieves a performance very close to the genie-aided system. Further, as expected, the optimal policy in the original system significantly outperforms randomized scheduling, pointing to the merit of exploiting the temporal correlation structure in both channel fading and PU occupancy.

preprint2010arXiv

Spectrum Shaping via Network Coding in Cognitive Radio Networks

We consider a cognitive radio network where primary users (PUs) employ network coding for data transmissions. We view network coding as a spectrum shaper, in the sense that it increases spectrum availability to secondary users (SUs) and offers more structure of spectrum holes that improves the predictability of the primary spectrum. With this spectrum shaping effect of network coding, each SU can carry out adaptive channel sensing by dynamically updating the list of the PU channels predicted to be idle while giving priority to these channels when sensing. This dynamic spectrum access approach with network coding improves how SUs detect and utilize temporal spectrum holes over PU channels. Our results show that compared to the existing approaches based on retransmission, both PUs and SUs can achieve higher stable throughput, thanks to the spectrum shaping effect of network coding.

43 published item(s)

MPM-LLM4DSE: Reaching the Pareto Frontier in HLS with Multimodal Learning and LLM-Driven Exploration

Towards Unified Surgical Scene Understanding: Bridging Reasoning and Grounding via MLLMs

AID-DTI: Accelerating High-fidelity Diffusion Tensor Imaging with Detail-Preserving Model-based Deep Learning

LESEN: Label-Efficient deep learning for Multi-parametric MRI-based Visual Pathway Segmentation

MLIP: Medical Language-Image Pre-training with Masked Local Representation Learning

Modality Exchange Network for Retinogeniculate Visual Pathway Segmentation

Simultaneous q-Space Sampling Optimization and Reconstruction for Fast and High-fidelity Diffusion Magnetic Resonance Imaging

Artificial Intelligence Enabled Spectral Reconfigurable Fiber Laser

Collective behavior in the North Rhine-Westphalia motorway network

DASP: Defect and Dopant ab-initio Simulation Package

Expert Knowledge-guided Geometric Representation Learning for Magnetic Resonance Imaging-based Glioma Grading

K-space and Image Domain Collaborative Energy based Model for Parallel MRI Reconstruction

Mining Function Homology of Bot Loaders from Honeypot Logs

Multi-Weight Respecification of Scan-specific Learning for Parallel Imaging

Paying More Attention to Self-attention: Improving Pre-trained Language Models via Attention Guiding

Rethinking the optimization process for self-supervised model-driven MRI reconstruction

SelfCoLearn: Self-supervised collaborative learning for accelerating dynamic MR imaging

Spatial Correlation Analysis of Traffic Flow on Parallel Motorways in Germany

Specificity-Preserving Federated Learning for MR Image Reconstruction

Universal Generative Modeling for Calibration-free Parallel MR Imaging

Variable Augmented Network for Invertible MR Coil Compression

A coarse-to-fine framework for unsupervised multi-contrast MR image deformable registration with dual consistency constraint

A Curated Dataset of Urban Scenes for Audio-Visual Scene Analysis

Homotopic Gradients of Generative Density Priors for MR Image Reconstruction

Bounding boxes for weakly supervised segmentation: Global constraints get close to full supervision

Parameter-Transferred Wasserstein Generative Adversarial Network (PT-WGAN) for Low-Dose PET Image Denoising

Self-adaptive Re-weighted Adversarial Domain Adaptation

CLCI-Net: Cross-Level fusion and Context Inference Networks for Lesion Segmentation of Chronic Stroke

D-UNet: a dimension-fusion U shape network for chronic stroke lesion segmentation

X-Net: Brain Stroke Lesion Segmentation Based on Depthwise Separable Convolution and Long-range Dependencies

Object Activity Scene Description, Construction and Recognition

Average cross-responses in correlated financial market

Cross Domain Adaptation by Learning Partially Shared Classifiers and Weighting Source Data Points in the Shared Subspaces

Cross-response in correlated financial markets: individual stocks

Distributed Power Allocation and Channel Access Probability Assignment for Cognitive Radio

On Spectrum and Infrastructure Sharing in Multi-Operator Cellular Networks

Price response in correlated financial markets: empirical results

Directional antennas improve the link-connectivity of interference limited ad hoc networks

Er3+-doped Na0.5Bi0.5TiO3 Ferroelectric Thin Films with Enhanced Electrical Properties and Strong Green Up-conversion Luminescence

Location, location, location: Border effects in interference limited ad hoc networks

Partial Penalized Likelihood Ratio Test under Sparse Case

Exploiting Channel Correlation and PU Traffic Memory for Opportunistic Spectrum Scheduling

Spectrum Shaping via Network Coding in Cognitive Radio Networks