Source author record

Yifan He

Yifan He appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Artificial Intelligence, Computation and Language, Machine Learning, Neural and Evolutionary Computing, Computer Vision, eess.SP, eess.SY, Information Retrieval, math.OC, Quantitative Methods, Systems and Control

Catalog footprint

What is connected

8 works

11 topics

4 close collaborators


Preprint, 2026, arXiv

GTR-CoT: Graph Traversal as Visual Chain of Thought for Molecular Structure Recognition

Optical Chemical Structure Recognition (OCSR) is essential for converting molecular images into machine-readable formats. While recent vision-language models (VLMs) have shown promise, their image-captioning approach often struggles with complex molecular structures and inconsistent annotations. To address these issues, we introduce GTR-VL, featuring two key innovations: (1) the "Graph Traversal as Visual Chain of Thought" mechanism that emulates human reasoning by incrementally parsing molecular graphs through sequential atom-bond predictions, and (2) the data-centric "Faithfully Recognize What You've Seen" principle, which aligns abbreviated structures in images with their expanded annotations. For hand-drawn OCSR tasks, where datasets lack graph annotations and only provide final SMILES, we apply reinforcement learning using the GRPO method, introducing reward mechanisms like format reward, graph reward, and SMILES reward. This approach significantly enhances performance in hand-drawn recognition tasks through weak supervision. We developed GTR-1.3M, a large-scale instruction-tuning dataset with corrected annotations, and MolRec-Bench, the first benchmark for fine-grained evaluation of graph-parsing accuracy in OCSR. Our two-stage training scheme involves SFT training for printed images and the GRPO method for transferring capabilities to hand-drawn tasks. Experiments show that GTR-VL outperforms specialist models, chemistry-domain VLMs, and commercial VLMs on both printed and hand-drawn datasets.
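The reinforcement-learning step mentioned in this abstract can be illustrated with a minimal sketch of GRPO's group-relative advantage, which standardizes each sampled response's reward against the other responses in its group. The abstract names format, graph, and SMILES rewards but does not say how they are combined, so the weighted sum, the weights, and the function names below are hypothetical and not the authors' implementation.

```python
from statistics import mean, pstdev

# Hypothetical weights: the abstract names the three rewards but not their combination.
REWARD_WEIGHTS = {"format": 0.2, "graph": 0.4, "smiles": 0.4}

def combined_reward(format_r, graph_r, smiles_r, weights=REWARD_WEIGHTS):
    """Weighted sum of the three reward terms (each assumed to lie in [0, 1])."""
    return (weights["format"] * format_r
            + weights["graph"] * graph_r
            + weights["smiles"] * smiles_r)

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this detail
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled outputs for one hand-drawn image (made-up scores).
group = [
    combined_reward(1.0, 1.0, 1.0),
    combined_reward(1.0, 0.5, 0.0),
    combined_reward(0.0, 0.0, 0.0),
    combined_reward(1.0, 0.5, 0.0),
]
advantages = group_relative_advantages(group)
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one, so no separate value model is needed to rank them.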

Preprint, 2025, arXiv

Adaptive Clutter Suppression via Convex Optimization

Passive and bistatic radar systems are often limited by strong clutter and direct-path interference that mask weak moving targets. Conventional cancellation methods such as the extensive cancellation algorithm require careful tuning and can distort the delay-Doppler response. This paper introduces a convex optimization framework that adaptively synthesizes per-cell delay-Doppler filters to suppress clutter while preserving the canonical cross-ambiguity function (CAF). The approach formulates a quadratic program that minimizes distortion of the CAF surface subject to linear clutter-suppression constraints, eliminating the need for a separate cancellation stage. Monte Carlo simulations using common communication waveforms demonstrate strong clutter suppression, accurate CFAR calibration, and major detection-rate gains over the classical CAF. The results highlight a scalable, CAF-faithful method for adaptive clutter mitigation in passive radar.
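As a toy illustration of "minimize distortion subject to linear suppression constraints", the sketch below solves the simplest such quadratic program in closed form: find the filter closest (in least squares) to a reference filter `w0` while being orthogonal to a single clutter signature `c`. The single-constraint setup, the squared-error distortion measure, and all names are assumptions made for illustration; the paper's per-cell formulation is not given in the abstract and is likely more general.

```python
def project_out_clutter(w0, c):
    """Closest filter to w0 (least squares) satisfying the constraint c^H w = 0.

    Closed-form solution: w = w0 - c * (c^H w0) / (c^H c).
    """
    inner = sum(ci.conjugate() * wi for ci, wi in zip(c, w0))  # c^H w0
    norm2 = sum(abs(ci) ** 2 for ci in c)                      # c^H c
    if norm2 == 0.0:
        return list(w0)  # no constraint to enforce
    return [wi - ci * inner / norm2 for wi, ci in zip(w0, c)]

# Made-up reference filter and clutter signature.
w0 = [1 + 0j, 1j, -1 + 0j]
c = [1 + 0j, 1 + 0j, 0j]
w = project_out_clutter(w0, c)

# Response of the new filter to the clutter signature (should vanish).
residual = sum(ci.conjugate() * wi for ci, wi in zip(c, w))
```

Components of `w0` outside the clutter direction are left untouched, which is the sense in which the reference response is preserved in this simplified setting.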

Preprint, 2022, arXiv

Knowledge-Driven Program Synthesis via Adaptive Replacement Mutation and Auto-constructed Subprogram Archives

We introduce Knowledge-Driven Program Synthesis (KDPS) as a variant of the program synthesis task that requires the agent to solve a sequence of program synthesis problems. In KDPS, the agent should use knowledge from the earlier problems to solve the later ones. We propose a novel method based on PushGP to solve the KDPS problem, which takes subprograms as knowledge. The proposed method extracts subprograms from the solution of previously solved problems by the Even Partitioning (EP) method and uses these subprograms to solve the upcoming programming task using Adaptive Replacement Mutation (ARM). We call this method PushGP+EP+ARM. With PushGP+EP+ARM, no human effort is required in the knowledge extraction and utilization processes. We compare the proposed method with PushGP, as well as a method using subprograms manually extracted by a human. Our PushGP+EP+ARM achieves better train error, success count, and faster convergence than PushGP. Additionally, we demonstrate the superiority of PushGP+EP+ARM when consecutively solving a sequence of six program synthesis problems.

Preprint, 2021, arXiv

Machine Learning for Electronic Design Automation: A Survey

With the down-scaling of CMOS technology, the design complexity of very large-scale integration (VLSI) is increasing. Although the application of machine learning (ML) techniques in electronic design automation (EDA) can trace its history back to the 90s, the recent breakthroughs in ML and the increasing complexity of EDA tasks have aroused more interest in incorporating ML to solve EDA tasks. In this paper, we present a comprehensive review of existing ML for EDA studies, organized following the EDA hierarchy.

Preprint, 2020, arXiv

Deep Interleaved Network for Image Super-Resolution With Asymmetric Co-Attention

Recently, Convolutional Neural Network (CNN) based image super-resolution (SR) methods have shown significant success in the literature. However, these methods are implemented as a single-path stream to enrich feature maps from the input for the final prediction, which fails to fully incorporate former low-level features into later high-level features. In this paper, to tackle this problem, we propose a deep interleaved network (DIN) to learn how information at different states should be combined for image SR, where shallow information guides the prediction of deep representative features. Our DIN follows a multi-branch pattern allowing multiple interconnected branches to interleave and fuse at different states. In addition, asymmetric co-attention (AsyCA) is proposed and attached to the interleaved nodes to adaptively emphasize informative features from different states and improve the discriminative ability of networks. Extensive experiments demonstrate the superiority of our proposed DIN in comparison with the state-of-the-art SR methods.

Preprint, 2020, arXiv

Robust Layout-aware IE for Visually Rich Documents with Pre-trained Language Models

Many business documents processed in modern NLP and IR pipelines are visually rich: in addition to text, their semantics can also be captured by visual traits such as layout, format, and fonts. We study the problem of information extraction from visually rich documents (VRDs) and present a model that combines the power of large pre-trained language models and graph neural networks to efficiently encode both textual and visual information in business documents. We further introduce new fine-tuning objectives to improve in-domain unsupervised fine-tuning and better utilize large amounts of unlabeled in-domain data. We experiment on real-world invoice and resume data sets and show that the proposed method outperforms strong text-based RoBERTa baselines by 6.3% absolute F1 on invoices and 4.7% absolute F1 on resumes. When evaluated in a few-shot setting, our method requires up to 30x less annotation data than the baseline to achieve the same level of performance at ~90% F1.

Preprint, 2020, arXiv

Solving Portfolio Optimization Problems Using MOEA/D and Levy Flight

Portfolio optimization is a financial task which requires the allocation of capital on a set of financial assets to achieve a better trade-off between return and risk. To solve this problem, recent studies applied multi-objective evolutionary algorithms (MOEAs) for its natural bi-objective structure. This paper presents a method injecting a distribution-based mutation method named Lévy Flight into a decomposition-based MOEA named MOEA/D. The proposed algorithm is compared with three MOEA/D-like algorithms, NSGA-II, and other distribution-based mutation methods on five unconstrained portfolio optimization benchmarks, sized from 31 to 225 assets, from the OR-Library, and is assessed with six metrics. Numerical results and statistical tests indicate that this method can outperform comparison methods in most cases. We analyze how Lévy Flight contributes to this improvement by promoting global search early in the optimization. We explain this improvement by considering the interaction between the mutation method and the properties of the problem.
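A Lévy-flight mutation of the kind this abstract describes can be sketched as follows. The abstract does not say how the heavy-tailed steps are generated or how mutated portfolios are kept feasible, so this sketch assumes Mantegna's algorithm for the step and a clip-and-renormalize repair; `levy_step`, `levy_mutate`, `scale`, and `beta` are hypothetical names and parameters, not the paper's.

```python
import math
import random

def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's algorithm (an assumed choice)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (practically impossible) exact-zero case
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)

def levy_mutate(weights, scale=0.01, beta=1.5, rng=random):
    """Perturb portfolio weights with Lévy steps, then repair to non-negative weights summing to 1."""
    mutated = [max(0.0, w + scale * levy_step(beta, rng)) for w in weights]
    total = sum(mutated)
    if total == 0.0:
        return list(weights)  # degenerate mutation: keep the parent unchanged
    return [w / total for w in mutated]

random.seed(0)
parent = [0.25, 0.25, 0.25, 0.25]
child = levy_mutate(parent)
```

Most steps are small, but the heavy tail occasionally produces a large jump, which is the usual argument for why such a mutation encourages global search.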

Preprint, 2015, arXiv

Jointly Embedding Relations and Mentions for Knowledge Population

This paper contributes a joint embedding model for predicting relations between a pair of entities in the scenario of relation inference. It differs from most stand-alone approaches, which operate separately on either knowledge bases or free texts. The proposed model simultaneously learns low-dimensional vector representations for both triplets in knowledge repositories and the mentions of relations in free texts, so that we can leverage the evidence from both resources to make more accurate predictions. We use NELL to evaluate the performance of our approach, compared with cutting-edge methods. Results of extensive experiments show that our model achieves significant improvement on relation extraction.
