Source author record

Tao Lu

Tao Lu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, physics.optics, Cryptography and Security, eess.IV, Machine Learning, physics.app-ph, physics.comp-ph

Catalog footprint

What is connected

12 works

7 topics

4 close collaborators

preprint, 2026, arXiv

Towards Trustworthy Dermatology MLLMs: A Benchmark and Multimodal Evaluator for Diagnostic Narratives

Multimodal large language models (LLMs) are increasingly used to generate dermatology diagnostic narratives directly from images. However, reliable evaluation remains the primary bottleneck for responsible clinical deployment. We introduce a novel evaluation framework that combines DermBench, a meticulously curated benchmark, with DermEval, a robust automatic evaluator, to enable clinically meaningful, reproducible, and scalable assessment. We build DermBench, which pairs 4,000 real-world dermatology images with expert-certified diagnostic narratives and uses an LLM-based judge to score candidate narratives across clinically grounded dimensions, enabling consistent and comprehensive evaluation of multimodal models. For individual case assessment, we train DermEval, a reference-free multimodal evaluator. Given an image and a generated narrative, DermEval produces a structured critique along with an overall score and per-dimension ratings. This capability enables fine-grained, per-case analysis, which is critical for identifying model limitations and biases. Experiments on a diverse dataset of 4,500 cases demonstrate that DermBench and DermEval achieve close alignment with expert ratings, with mean deviations of 0.251 and 0.117 (out of 5), respectively, providing reliable measurement of diagnostic ability and trustworthiness across different multimodal LLMs.
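As a rough illustration of how an alignment figure like the ones quoted above could be computed, the sketch below takes the mean absolute difference between evaluator scores and expert ratings on a 5-point scale. Reading "mean deviation" as a mean absolute difference is an assumption, and the function name and sample scores are hypothetical; none of this is taken from the paper.

```python
def mean_abs_deviation(predicted, expert):
    """Mean absolute difference between two equally long score lists."""
    if not predicted or len(predicted) != len(expert):
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, expert)) / len(predicted)


# Hypothetical scores on a 0-5 scale, invented for illustration only.
evaluator_scores = [4.0, 3.5, 2.0]
expert_scores = [4.5, 3.0, 2.0]
deviation = mean_abs_deviation(evaluator_scores, expert_scores)
```

A lower value means the automatic scores track the expert ratings more closely.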

preprint, 2022, arXiv

APP-Net: Auxiliary-point-based Push and Pull Operations for Efficient Point Cloud Classification

Aggregating neighbor features is essential for point cloud classification. In existing work, each point in the cloud may inevitably be selected as a neighbor of multiple aggregation centers, as all centers gather neighbor features from the whole point cloud independently. Thus each point has to participate in the calculation repeatedly and generates redundant duplicates in memory, leading to intensive computation costs and memory consumption. Meanwhile, to pursue higher accuracy, previous methods often rely on a complex local aggregator to extract fine geometric representations, which further slows down the classification pipeline. To address these issues, we propose a new local aggregator of linear complexity for point cloud classification, coined APP. Specifically, we introduce an auxiliary container as an anchor to exchange features between the source point and the aggregating center. Each source point pushes its feature to only one auxiliary container, and each center point pulls features from only one auxiliary container. This avoids the re-computation issue of each source point. To facilitate learning the local structure of the point cloud, we use an online normal estimation module to provide explainable geometric information that enhances the modeling capability of APP. The resulting network is more efficient than all previous baselines by a clear margin while consuming less memory. Experiments on both synthetic and real datasets demonstrate that APP-Net reaches accuracies comparable to other networks. It can process more than 10,000 samples per second with less than 10GB of memory on a single GPU. We will release the code at https://github.com/MCG-NJU/APP-Net.
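The push and pull idea described above can be sketched in a few lines. This is a minimal illustration, not the APP-Net implementation: it assumes the auxiliary containers are cells of a uniform grid, uses scalar features, and averages the pushed features, all of which are choices made here for brevity. The names `container_id` and `push_pull` are hypothetical.

```python
from collections import defaultdict


def container_id(point, cell_size):
    """Map a 3D point to the grid cell (assumed auxiliary container) that holds it."""
    return tuple(int(c // cell_size) for c in point)


def push_pull(points, features, centers, cell_size=1.0):
    """Aggregate features with one push per source point and one pull per center."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    # Push: every source point writes its feature to exactly one container.
    for p, f in zip(points, features):
        cid = container_id(p, cell_size)
        sums[cid] += f
        counts[cid] += 1
    # Pull: every center reads the mean feature of exactly one container.
    pulled = []
    for c in centers:
        cid = container_id(c, cell_size)
        pulled.append(sums[cid] / counts[cid] if counts[cid] else 0.0)
    return pulled


points = [(0.1, 0.2, 0.3), (0.5, 0.5, 0.5), (1.5, 0.2, 0.1)]
features = [1.0, 3.0, 10.0]
centers = [(0.9, 0.9, 0.9), (1.1, 0.1, 0.1), (5.0, 5.0, 5.0)]
aggregated = push_pull(points, features, centers)
```

Each point and each center is visited once, so the cost grows linearly with the number of points plus the number of centers, rather than with the number of point and center pairs.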

preprint, 2022, arXiv

CamLiFlow: Bidirectional Camera-LiDAR Fusion for Joint Optical Flow and Scene Flow Estimation

In this paper, we study the problem of jointly estimating the optical flow and scene flow from synchronized 2D and 3D data. Previous methods either employ a complex pipeline that splits the joint task into independent stages, or fuse 2D and 3D information in an "early-fusion" or "late-fusion" manner. Such one-size-fits-all approaches suffer from a dilemma of failing to fully utilize the characteristic of each modality or to maximize the inter-modality complementarity. To address the problem, we propose a novel end-to-end framework, called CamLiFlow. It consists of 2D and 3D branches with multiple bidirectional connections between them in specific layers. Different from previous work, we apply a point-based 3D branch to better extract the geometric features and design a symmetric learnable operator to fuse dense image features and sparse point features. Experiments show that CamLiFlow achieves better performance with fewer parameters. Our method ranks 1st on the KITTI Scene Flow benchmark, outperforming the previous art with 1/7 parameters. Code is available at https://github.com/MCG-NJU/CamLiFlow.

preprint, 2022, arXiv

Essential Number of Principal Components and Nearly Training-Free Model for Spectral Analysis

Through a study of multi-gas mixture datasets, we show that in multi-component spectral analysis, the number of functional or non-functional principal components required to retain the essential information is the same as the number of independent constituents in the mixture set. Due to the mutual independence among different gas molecules, a near one-to-one projection from the principal component to the mixture constituent can be established, leading to a significant simplification of spectral quantification. Further, with knowledge of the molar extinction coefficients of each constituent, a complete principal component set can be extracted from the coefficients directly, and few to no training samples are required for the learning model. Compared to other approaches, the proposed methods provide fast and accurate spectral quantification solutions with a small memory footprint.
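To make concrete why known extinction coefficients can remove the need for training data, the sketch below recovers two concentrations from a mixture spectrum by ordinary least squares under a linear Beer-Lambert mixing model. This is a stand-in technique chosen for illustration, not the principal-component method of the paper. The function name and the three-wavelength spectra are hypothetical.

```python
def solve_two_component(eps_a, eps_b, absorbance, path_length=1.0):
    """Least-squares (c_a, c_b) for A = path_length * (eps_a * c_a + eps_b * c_b)."""
    xa = [path_length * e for e in eps_a]
    xb = [path_length * e for e in eps_b]
    # Entries of the 2x2 normal equations.
    saa = sum(a * a for a in xa)
    sbb = sum(b * b for b in xb)
    sab = sum(a * b for a, b in zip(xa, xb))
    say = sum(a * y for a, y in zip(xa, absorbance))
    sby = sum(b * y for b, y in zip(xb, absorbance))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("extinction spectra are linearly dependent")
    c_a = (say * sbb - sby * sab) / det
    c_b = (sby * saa - say * sab) / det
    return c_a, c_b


# Hypothetical extinction spectra sampled at three wavelengths.
eps_a = [1.0, 0.0, 2.0]
eps_b = [0.0, 1.0, 1.0]
# Noise-free mixture spectrum generated with c_a = 0.5 and c_b = 2.0.
mixture = [0.5, 2.0, 3.0]
c_a, c_b = solve_two_component(eps_a, eps_b, mixture)
```

With two independent constituents, two unknowns describe the whole spectrum, which mirrors the abstract's point that the number of essential components equals the number of independent constituents.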

preprint, 2022, arXiv

GLSD: The Global Large-Scale Ship Database and Baseline Evaluations

In this paper, we introduce a challenging global large-scale ship database (called GLSD), designed specifically for ship detection tasks. The designed GLSD database includes a total of 212,357 annotated instances from 152,576 images. Based on the collected images, we propose 13 ship categories that widely exist in international routes. These categories include Sailing boat, Fishing boat, Passenger ship, Warship, General cargo ship, Container ship, Bulk cargo carrier, Barge, Ore carrier, Speed boat, Canoe, Oil carrier, and Tug. The motivations for developing GLSD include the following: 1) providing a refined and extensive ship detection database that benefits the object detection community, 2) establishing a database with exhaustive labels (bounding boxes and ship class categories) in a uniform classification scheme, and 3) providing a large-scale ship database with geographic information (covering more than 3000 ports and 33 routes) that benefits multi-modal analysis. In addition, we discuss the evaluation protocols corresponding to image characteristics in GLSD and analyze the performance of selected state-of-the-art object detection algorithms on GLSD, aiming to establish baselines for future studies. More information regarding the designed GLSD can be found at https://github.com/jiaming-wang/GLSD.

preprint, 2021, arXiv

SSCAN: A Spatial-spectral Cross Attention Network for Hyperspectral Image Denoising

Hyperspectral images (HSIs) have been widely used in a variety of applications thanks to the rich spectral information they are able to provide. Among all HSI processing tasks, HSI denoising is a crucial step. Recently, deep learning-based image denoising methods have made great progress and achieved great performance. However, existing methods tend to ignore the correlations between adjacent spectral bands, leading to problems such as spectral distortion and blurred edges in denoised results. In this study, we propose a novel HSI denoising network, termed SSCAN, that combines group convolutions and attention modules. Specifically, we use a group convolution with a spatial attention module to facilitate feature extraction by directing models' attention to band-wise important features. We propose a spectral-spatial attention block (SSAB) to exploit the spatial and spectral information in hyperspectral images in an effective manner. In addition, we adopt residual learning operations with skip connections to ensure training stability. The experimental results indicate that the proposed SSCAN outperforms several state-of-the-art HSI denoising algorithms.

preprint, 2020, arXiv

Feature-Driven Super-Resolution for Object Detection

Although some convolutional neural network (CNN) based super-resolution (SR) algorithms have recently yielded good visual performance on single images, most of them focus on perfect perceptual quality but ignore the specific needs of the subsequent detection task. This paper proposes a simple but powerful feature-driven super-resolution (FDSR) method to improve the detection performance on low-resolution (LR) images. First, the proposed method uses a feature-domain prior, extracted from an existing detector backbone, to guide the high-resolution (HR) image reconstruction. Then, with the aligned features, FDSR updates the SR parameters for better detection performance. Compared with some state-of-the-art SR algorithms at a 4$\times$ scale factor, FDSR achieves higher detection mAP on the MS COCO validation and VOC2007 databases, with good generalization to other detection networks.

preprint, 2019, arXiv

Efficient Electro-optical Tuning of Optical Frequency Microcomb on a Monolithically Integrated High-Q Lithium Niobate Microdisk

We demonstrate efficient tuning of a monolithically integrated lithium niobate (LN) microdisk optical frequency microcomb. Utilizing the high optical quality (Q) factor (i.e., Q~7.1*10^6) of the microdisk, the microcomb spans a spectral bandwidth of ~200 nm at a pump power as low as 20.4 mW. Combining the large electro-optic coefficient of LN with an optimized geometry of the microelectrodes, we demonstrate electro-optical tuning of the comb with a spectral range of 400 pm and a tuning efficiency of ~38 pm/100V.

preprint, 2016, arXiv

Hierarchical Online Intrusion Detection for SCADA Networks

We propose a novel hierarchical online intrusion detection system (HOIDS) for supervisory control and data acquisition (SCADA) networks based on machine learning algorithms. By utilizing the server-client topology while keeping clients distributed for global protection, a high detection rate is achieved with minimum network impact. We implement accurate models of normal-abnormal binary detection and multi-attack identification based on logistic regression and a quasi-Newton optimization algorithm using the Broyden-Fletcher-Goldfarb-Shanno approach. The detection system is capable of accelerating detection through information-gain-based feature selection or principal-component-analysis-based dimension reduction. By evaluating our system using the KDD99 dataset and the industrial control system dataset, we demonstrate that HOIDS is highly scalable, efficient and cost effective for securing SCADA infrastructures.
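Information-gain feature selection, mentioned above as one way to speed up detection, can be sketched as follows. This is a generic textbook version for discrete features, not the HOIDS code. The feature names and toy records are hypothetical and are not drawn from KDD99.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_k_features(columns, labels, k):
    """Return the k feature names with the highest information gain."""
    ranked = sorted(columns, key=lambda name: information_gain(columns[name], labels), reverse=True)
    return ranked[:k]


# Toy data: 1 marks an attack record, 0 a normal record (hypothetical values).
labels = [0, 0, 1, 1]
columns = {
    "suspicious_flag": [0, 0, 1, 1],  # perfectly separates the two classes
    "noise_bit": [0, 1, 0, 1],        # carries no information about the class
}
selected = top_k_features(columns, labels, k=1)
```

Keeping only the highest-ranked features shrinks the input to the classifier, which is the mechanism by which such selection can reduce detection time.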

preprint, 2014, arXiv

Generalized Full-Vector Multi-Mode Matching Analysis of Whispering-Gallery Microcavities

We outline a full-vectorial three-dimensional multi-mode matching technique in a cylindrical coordinate system that addresses the mutual coupling among multiple modes copropagating in a perturbed whispering-gallery-mode microcavity. In addition to its superior accuracy with respect to our previously implemented single-mode matching technique, the current technique is suitable for modelling waveguide-to-cavity coupling where the influence of multi-mode coupling is non-negligible. Using this methodology, a robust scheme for hybrid integration of a microcavity onto a silicon-on-insulator platform is proposed.

preprint, 2014, arXiv

Highly Efficient Boundary Element Analysis of Whispering Gallery Microcavities

We demonstrate that the efficiency of the boundary element whispering gallery microcavity analysis can be improved by orders of magnitude with the inclusion of the Fresnel approximation. Using this formulation, simulation of a microdisk with a wave-number-radius product as large as $kR\approx8,000$ was demonstrated, in contrast to a previous record of $kR\approx100$. In addition to its high accuracy in computing the modal field distribution and resonance wavelength, this method yields a relative error of $10\%$ in calculating quality factors as high as $10^{11}$ through a direct root searching method, which the conventional boundary element method failed to achieve. Finally, quadrupole shaped cavities and double disks as large as $100\,\mu\mathrm{m}$ in diameter were modeled by employing as few as $512$ boundary elements, whilst the simulation of such large cavities using the conventional boundary element method has not been reported previously.

preprint, 2013, arXiv

Cylindrical Beam Propagation Modelling of Perturbed Whispering-Gallery Mode Microcavities

We simulate light propagation in perturbed whispering-gallery mode microcavities using a two-dimensional finite-difference beam propagation method in a cylindrical coordinate system. Optical properties of whispering-gallery microcavities perturbed by polystyrene nanobeads are investigated through this formulation. The light perturbation as well as quality factor degradation arising from cavity ellipticity are also studied.
