Source author record

Yihang Liu

Yihang Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, cond-mat.mtrl-sci, Artificial Intelligence, cond-mat.mes-hall, eess.IV, Machine Learning

Catalog footprint

What is connected

7 works

6 topics

4 close collaborators

preprint | 2026 | arXiv

Evading Visual Aphasia: Contrastive Adaptive Semantic Token Pruning for Vision-Language Models

Are low-attention visual tokens truly redundant in vision-language reasoning? Existing pruning methods often assume so, ranking visual tokens by shallow text-to-image attention and discarding low-scoring patches to accelerate LVLM inference. We show that this scalar criterion is unreliable for compositional reasoning: tokens ignored in early layers can later become essential for resolving secondary objects, spatial relations, and contextual cues. Premature pruning can therefore induce Visual Aphasia, a failure mode in which the model loses visual grounding and falls back on language priors. We introduce COAST (COntrastive Adaptive Semantic Token Pruning), a training-free pruning framework that casts compression as adaptive semantic routing. COAST uses native cross-modal attention to identify query-specific anchors and estimate contextual dispersion via attention entropy, then adapts the retention trade-off between semantic evidence and spatial context. It further uses a contrastive routing score to preserve both anchor-aligned evidence and complementary spatial context. Across seven benchmarks, COAST reduces visual tokens by 77.8% and achieves a 2.15x latency speedup while retaining 98.64% of the original average performance. Beyond a single backbone or compression setting, COAST consistently outperforms strong pruning baselines across token budgets and generalizes across multiple LVLM families, showing that adaptive semantic routing is a robust alternative to one-shot scalar pruning.
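
The abstract above mentions estimating contextual dispersion via attention entropy but does not give a formula. As a rough illustration only, the sketch below computes the Shannon entropy of a set of attention weights and normalizes it to the range [0, 1]. The function names and the normalization by log(n) are assumptions made for this example and are not taken from COAST.

```python
import math


def attention_entropy(weights):
    """Shannon entropy (in nats) of non-negative attention weights.

    The weights are renormalized to sum to 1 before the entropy is taken.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("attention weights must have a positive sum")
    probs = [w / total for w in weights]
    # Zero-probability entries contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0)


def normalized_entropy(weights):
    """Entropy divided by its maximum, log(n), so the result lies in [0, 1].

    Hypothetical helper for illustration: values near 0 mean attention is
    concentrated on a few tokens, values near 1 mean it is spread evenly.
    """
    n = len(weights)
    if n < 2:
        return 0.0
    return attention_entropy(weights) / math.log(n)


# Example inputs: a concentrated and a dispersed attention pattern.
focused = [0.97, 0.01, 0.01, 0.01]
dispersed = [0.25, 0.25, 0.25, 0.25]
focused_score = normalized_entropy(focused)
dispersed_score = normalized_entropy(dispersed)
```

In this sketch a dispersed pattern scores higher than a focused one, which is the kind of signal a pruning rule could use to decide how much spatial context to retain.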

preprint | 2026 | arXiv

PRISM: Iterative Cross-Modal Posterior Refinement for Dynamic Text-Attributed Graphs

Dynamic text-attributed graphs (DyTAGs) provide a powerful framework for modeling evolving systems in which node semantics and time-dependent interactions are tightly coupled. Recently, multimodal learning has emerged as a promising yet underexplored direction for enhancing DyTAG representation learning. However, existing methods typically rely on rigid modality partitions and one-shot fusion strategies, which limit their ability to capture the intrinsic and evolving dependencies between node semantics and interaction behaviors. To address these limitations, we propose PRISM, an iterative cross-modal posterior refinement framework for DyTAG representation learning. PRISM organizes DyTAG information into semantic and behavioral modalities, providing a more intrinsic alternative to carrier-level modality partitions. Instead of fusing the two modalities in a single step, PRISM learns a refinement trajectory that progressively transforms semantic priors into behavior-conditioned posterior states through cross-modal interaction with behavioral evidence. Extensive experiments on DTGB benchmark datasets show that PRISM achieves strong performance on temporal link prediction and destination node retrieval tasks. Further ablation studies validate the effectiveness of semantic-behavioral modeling and iterative posterior refinement.

preprint | 2023 | arXiv

Hierarchical Dynamic Masks for Visual Explanation of Neural Networks

Saliency methods, which generate visual explanatory maps representing the importance of image pixels for model classification, are a popular technique for explaining neural network decisions. This paper proposes hierarchical dynamic masks (HDM), a novel method for generating explanatory maps, to enhance the granularity and comprehensiveness of saliency maps. First, we propose dynamic masks (DM), which enable multiple small-sized benchmark mask vectors to roughly learn the critical information in the image through an optimization method. The benchmark mask vectors then guide the learning of large-sized auxiliary mask vectors so that their superimposed mask can accurately learn fine-grained pixel importance information and reduce the sensitivity to adversarial perturbations. In addition, we construct the HDM by concatenating DM modules. These DM modules are used to find and fuse the regions of interest in the remaining neural network classification decisions in the mask image in a learning-based way. Since HDM forces DM to perform importance analysis in different areas, it makes the fused saliency map more comprehensive. The proposed method significantly outperformed previous approaches in recognition and localization capabilities when tested on natural and medical datasets.

preprint | 2022 | arXiv

MDM: Multiple Dynamic Masks for Visual Explanation of Neural Networks

The Class Activation Map (CAM) lookup of a neural network tells us which regions the neural network focuses on when it makes a decision. In the past, the CAM search method depended on a specific internal module of the network, which placed specific constraints on the structure of the neural network. To make the CAM search general and high-performing, we propose a learning-based algorithm, namely Multiple Dynamic Masks (MDM). It is based on the common understanding that only the active features of a picture related to classification affect the classification results of the neural network, while other features hardly affect them. The mask generated by MDM conforms to this understanding. It trains mask vectors of different sizes by constraining mask values and activating consistency, then stacks masks of different scales to generate a CAM that balances spatial information and semantic information. Compared with recent advanced CAM search methods, MDM achieves state-of-the-art performance. We applied the MDM method to the interpretable neural networks ProtoPNet and XProtoNet, which improved the performance of the models in the explainable prototype search. Finally, we visualized the CAM generation effect of MDM on neural networks of different architectures, verifying the generality of the MDM method.

preprint | 2020 | arXiv

High-speed and high-efficiency three-dimensional shape measurement based on Gray-coded light

Fringe projection profilometry has been increasingly sought and applied in dynamic three-dimensional (3D) shape measurement. In this work, a robust and high-efficiency 3D measurement method based on Gray-coded light is proposed. Unlike the traditional method, a novel tripartite phase unwrapping method is proposed to avoid the jump errors on the boundary of code words, which are mainly caused by the defocusing of the projector and the motion of the tested object. Subsequently, a time-overlapping coding strategy is presented to greatly increase the coding efficiency, decreasing the number of projected patterns in each group, e.g. from 7 (3 + 4) to 4 (3 + 1) for one restored 3D frame. Combining the two proposed techniques allows reconstruction of a pixel-wise and unambiguous 3D geometry of dynamic scenes with strong noise using every 4 projected patterns. The presented techniques preserve the high anti-noise ability of Gray-code-based methods while overcoming the drawbacks of jump errors and low coding efficiency. Experiments have demonstrated that the proposed method can achieve robust and high-efficiency 3D shape measurement of high-speed dynamic scenes even when polluted by strong noise.
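
For readers unfamiliar with Gray coding, the sketch below shows the standard binary-reflected Gray code and how its bits can be laid out as stripe patterns, one pattern per bit. It illustrates only the general encoding, in which neighbouring code words differ in exactly one bit. It does not implement the tripartite phase unwrapping or the time-overlapping strategy described in the abstract, and the function names are chosen for this example.

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Recover the original integer from its binary-reflected Gray code."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def gray_patterns(num_bits: int) -> list[list[int]]:
    """Build one stripe pattern per bit, most significant bit first.

    Each row holds a 0/1 value for every code-word column, so stacking
    the rows at a given column yields that column's Gray code word.
    """
    codes = [to_gray(i) for i in range(1 << num_bits)]
    return [
        [(code >> (num_bits - 1 - bit)) & 1 for code in codes]
        for bit in range(num_bits)
    ]


# Neighbouring code words differ in exactly one bit, so a decoding error
# at a stripe boundary changes the decoded code word by at most one step.
codes = [to_gray(i) for i in range(8)]  # [0, 1, 3, 2, 6, 7, 5, 4]
bit_flips = [bin(a ^ b).count("1") for a, b in zip(codes, codes[1:])]
```

Every entry of `bit_flips` is 1, which is the property that makes Gray-coded patterns tolerant of noise away from stripe edges.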

preprint | 2016 | arXiv

High Performance WSe2 Field-Effect Transistors via Controlled Formation of In-Plane Heterojunctions

Monolayer WSe2 is a two-dimensional (2D) semiconductor with a direct bandgap, and it has been recently explored as a promising material for electronics and optoelectronics. Low field-effect mobility is the main constraint preventing WSe2 from becoming one of the competing channel materials for field-effect transistors (FETs). Recent results have demonstrated that chemical treatments can modify the electrical properties of transition metal dichalcogenides (TMDCs) including MoS2 and WSe2. Here, we report that controlled heating in air significantly improves device performance of WSe2 FETs in terms of on-state currents and field-effect mobilities. Specifically, after heating at optimized conditions, chemical vapor deposition grown monolayer WSe2 FETs showed an average FET mobility of 31 cm²/Vs and on/off current ratios up to 5 × 10⁸. For few-layer WSe2 FETs, after the same treatment was applied, we achieved a high mobility up to 92 cm²/Vs. These values are significantly higher than those of FETs fabricated using as-grown WSe2 flakes without heating treatment, demonstrating the effectiveness of air heating on the performance improvements of WSe2 FETs. The underlying chemical processes involved during air heating and the formation of in-plane heterojunctions of WSe2 and WO3-x, which is believed to be the reason for the improved FET performance, were studied by spectroscopy and transmission electron microscopy. We further demonstrated that by combining the air heating method developed in this work with supporting 2D materials on a BN substrate, we achieved a noteworthy field-effect mobility of 83 cm²/Vs for monolayer WSe2 FETs. This work is a step towards controlled modification of the properties of WSe2 and potentially other TMDCs, and may greatly improve device performance for future applications of 2D materials in electronics and optoelectronics.

preprint | 2014 | arXiv

Screw-Dislocation-Driven Growth of Two-Dimensional Few-Layer and Pyramid-Like WSe2 by Sulfur-Assisted Chemical Vapor Deposition

Two-dimensional (2D) layered tungsten diselenide (WSe2) has recently drawn a lot of attention due to its unique optoelectronic properties and ambipolar transport behavior. However, direct chemical vapor deposition (CVD) synthesis of 2D WSe2 is not as straightforward as for other 2D materials due to the low reactivity between reactants in WSe2 synthesis. In addition, the growth mechanism of WSe2 in such a CVD process remains unclear. Here we report the observation of a screw-dislocation-driven (SDD) spiral growth of 2D WSe2 flakes and pyramid-like structures using a sulfur-assisted CVD method. Few-layer and pyramid-like WSe2 flakes instead of monolayers were synthesized by introducing a small amount of sulfur as a reducer to help the selenization of WO3, which is the precursor of tungsten. Clear observations of steps, helical fringes, and herring-bone contours under atomic force microscope characterization reveal the existence of screw dislocations in the as-grown WSe2. The generation and propagation mechanisms of screw dislocations during the growth of WSe2 were discussed. Back-gated field-effect transistors were made on these 2D WSe2 materials, which show on/off current ratios of 10⁶ and mobility up to 44 cm²/Vs.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint