Researcher profile

Kun Li

Kun Li contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging · Verification L1 · Unclaimed author
18 works
0 followers
14 topics
4 close collaborators

Published work

18 published items

preprint · 2026 · arXiv

EmbodiSkill: Skill-Aware Reflection for Self-Evolving Embodied Agents

Embodied agents can benefit from skills that guide object search, action execution, and state changes across diverse environments. Since embodied environments vary across layouts, object states, and other execution factors, these skills must self-evolve from trajectories generated during task execution. However, existing skill self-evolution methods are mainly developed in digital environments and often convert trajectories into coarse skill updates. Directly applying this paradigm to embodied settings is problematic, because a failed task execution may reflect not only incorrect skill content, but also an execution lapse in which the agent fails to follow valid guidance. We propose EmbodiSkill, a training-free framework for embodied skill self-evolution through skill-aware reflection and targeted revision. EmbodiSkill interprets each trajectory with respect to the current skill, uses skill-changing evidence to update the skill body, and uses execution-lapse evidence to preserve and emphasize valid guidance. Experiments on ALFWorld and EmbodiedBench show that EmbodiSkill consistently improves embodied task success. On ALFWorld, EmbodiSkill enables a frozen Qwen3.5-27B executor to reach 93.28% task success, outperforming GPT-5.2 used as a direct agent without skills by 31.58%. These results show that skill-aware self-evolution helps embodied agents accumulate reusable procedural knowledge from their own trajectories.

preprint · 2024 · arXiv

KeDuSR: Real-World Dual-Lens Super-Resolution via Kernel-Free Matching

Dual-lens super-resolution (SR) is a practical scenario for reference-based (Ref) SR that uses the telephoto image (Ref) to assist the super-resolution of the low-resolution wide-angle image (LR input). Unlike general RefSR, the Ref in dual-lens SR only covers the overlapped field of view (FoV) area. However, current dual-lens SR methods rarely utilize these specific characteristics and directly perform dense matching between the LR input and Ref. Due to the resolution gap between LR and Ref, the matching may miss the best-matched candidate and destroy the consistent structures in the overlapped FoV area. In contrast, we propose to first align the Ref with the center region (namely the overlapped FoV area) of the LR input by combining global warping and local warping, so that the aligned Ref is sharp and consistent. Then, we formulate the aligned Ref and LR center as value-key pairs, and the corner region of the LR is formulated as queries. In this way, we propose a kernel-free matching strategy by matching between the LR-corner (query) and LR-center (key) regions, and the corresponding aligned Ref (value) can be warped to the corner region of the target. Our kernel-free matching strategy avoids the resolution gap between LR and Ref, which gives our network better generalization ability. In addition, we construct a DuSR-Real dataset with (LR, Ref, HR) triples, where the LR and HR are well aligned. Experiments on three datasets demonstrate that our method outperforms the second-best method by a large margin. Our code and dataset are available at https://github.com/ZifanCui/KeDuSR.

preprint · 2023 · arXiv

HLC2: a highly efficient cross-matching framework for large astronomical catalogues on heterogeneous computing environments

The cross-matching operation, which finds corresponding data for the same celestial object or region in multiple catalogues, is indispensable to astronomical data analysis and research. Due to the large number of astronomical catalogues generated by ongoing and next-generation large-scale sky surveys, the time complexity of cross-matching is increasing dramatically. Heterogeneous computing environments provide a theoretical possibility to accelerate cross-matching, but the performance advantages of heterogeneous computing resources have not been fully utilized. To meet the challenge of cross-matching the substantially increasing amount of astronomical observation data, this paper proposes the Heterogeneous-computing-enabled Large Catalogue Cross-matcher (HLC2), a high-performance cross-matching framework based on spherical position deviation on CPU-GPU heterogeneous computing platforms. It supports scalable and flexible cross-matching and can be directly applied to the fusion of large astronomical catalogues from survey missions and astronomical data centres. A performance estimation model is proposed to locate the performance bottlenecks and guide the optimizations. A two-level partitioning strategy is designed to generate an optimized data placement according to the positions of celestial objects to increase throughput. To make HLC2 a more adaptive solution, architecture-aware task splitting, thread parallelization, and concurrent scheduling strategies are designed and integrated. Moreover, a novel quad-direction strategy is proposed for the boundary problem to effectively balance performance and completeness. We have experimentally evaluated HLC2 using publicly released catalogue data. Experiments demonstrate that HLC2 scales well on different sizes of catalogues and that the cross-matching speed is significantly improved compared to state-of-the-art cross-matchers.
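
As a rough illustration of what cross-matching by spherical position deviation means, the sketch below pairs each source in one catalogue with the nearest source in another when their great-circle separation is within a tolerance. It is not the HLC2 implementation: the function names, the brute-force double loop and the matching radius are assumptions made for this example, and the partitioning, boundary handling and CPU-GPU parallelism described in the abstract are left out.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # The haversine form stays numerically stable for very small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(cat_a, cat_b, radius_deg):
    """For each (ra, dec) in cat_a, return the index of the nearest cat_b
    source within radius_deg, or None when no source is close enough.

    Hypothetical helper: a brute-force O(len(cat_a) * len(cat_b)) scan.
    """
    matches = []
    for ra_a, dec_a in cat_a:
        best_idx, best_sep = None, radius_deg
        for j, (ra_b, dec_b) in enumerate(cat_b):
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_idx, best_sep = j, sep
        matches.append(best_idx)
    return matches


if __name__ == "__main__":
    catalogue_a = [(10.0, 20.0), (180.0, -45.0)]
    catalogue_b = [(10.0002, 20.0001), (50.0, 5.0)]
    # Match within one arcsecond (an illustrative tolerance).
    print(cross_match(catalogue_a, catalogue_b, 1.0 / 3600.0))
```

The quadratic scan is only practical for small inputs; it is exactly this cost that spatial partitioning and parallel hardware are meant to reduce for large catalogues.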

preprint · 2022 · arXiv

AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds

There have been two streams in 3D detection from point clouds: single-stage methods and two-stage methods. While the former is more computationally efficient, the latter usually provides better detection accuracy. By carefully examining the two-stage approaches, we have found that if appropriately designed, the first stage can produce accurate box regression. In this scenario, the second stage mainly rescores the boxes such that the boxes with better localization get selected. From this observation, we have devised a single-stage anchor-free network that can fulfill these requirements. This network, named AFDetV2, extends the previous work by incorporating a self-calibrated convolution block in the backbone, a keypoint auxiliary supervision, and an IoU prediction branch in the multi-task head. As a result, the detection accuracy is drastically boosted in the single stage. To evaluate our approach, we have conducted extensive experiments on the Waymo Open Dataset and the nuScenes Dataset. We have observed that AFDetV2 achieves state-of-the-art results on these two datasets, superior to all prior art, including both single-stage and two-stage 3D detectors. AFDetV2 won 1st place in the Real-Time 3D Detection of the Waymo Open Dataset Challenge 2021. In addition, a variant of our model, AFDetV2-Base, was named the "Most Efficient Model" by the Challenge Sponsor, showing superior computational efficiency. To demonstrate the generality of this single-stage method, we have also applied it to the first stage of the two-stage networks. Without exception, the results show that with the strengthened backbone and the rescoring approach, the second-stage refinement is no longer needed.

preprint · 2022 · arXiv

Group Gated Fusion on Attention-based Bidirectional Alignment for Multimodal Emotion Recognition

Emotion recognition is a challenging and actively studied research area that plays a critical role in emotion-aware human-computer interaction systems. In a multimodal setting, temporal alignment between different modalities has not yet been well investigated. This paper presents a new model named Gated Bidirectional Alignment Network (GBAN), which consists of an attention-based bidirectional alignment network over LSTM hidden states to explicitly capture the alignment relationship between speech and text, and a novel group gated fusion (GGF) layer to integrate the representations of different modalities. We empirically show that the attention-aligned representations significantly outperform the last hidden states of the LSTM, and that the proposed GBAN model outperforms existing state-of-the-art multimodal approaches on the IEMOCAP dataset.

preprint · 2022 · arXiv

Model order reduction for parameterized electromagnetic problems using matrix decomposition and deep neural networks

A non-intrusive model order reduction (MOR) method for solving parameterized electromagnetic scattering problems is proposed in this paper. A database collecting snapshots of high-fidelity solutions is built by solving the parameterized time-domain Maxwell equations for some values of the material parameters using a full-wave solver based on a high-order discontinuous Galerkin time-domain (DGTD) method. To perform a prior dimensionality reduction, a set of reduced basis (RB) functions is extracted from the database via a two-step proper orthogonal decomposition (POD) method. Projection coefficients of the reduced basis functions are further compressed through a convolutional autoencoder (CAE) network. Singular value decomposition (SVD) is then used to extract the principal components of the reduced-order matrices generated by the CAE, and a cubic spline interpolation-based (CSI) approach is employed for approximating the dominating time- and parameter-modes of the reduced-order matrices. The generation of the reduced basis and the training of the CAE and CSI are accomplished in the offline stage, so the RB solution for given time/parameter values can be quickly recovered via outputs of the interpolation model and decoder network. In particular, the offline and online stages of the proposed RB method are completely decoupled, which ensures the validity of the method. The performance of the proposed CAE-CSI ROM is illustrated with numerical experiments for scattering of a plane wave by a 2-D dielectric disk and a multi-layer heterogeneous medium.

preprint · 2021 · arXiv

Polycaprolactone/graphite nanoplates composite nanopapers

Nanopapers based on graphene and related materials were recently proposed for use in heat spreader applications. To overcome the typical brittleness of such materials, this work addressed the combination of graphite nanoplatelets (GNP) with a soft, tough and crystalline polymer, acting as an efficient binder between nanoplates. With this aim, polycaprolactone (PCL) was selected and exploited in this paper. The crystalline organization of PCL within the nanopaper was studied to investigate the effect of polymer confinement between GNP. Thermomechanical properties were studied by dynamic mechanical analyses at variable temperature and creep measurements at high temperature, demonstrating superior resistance at temperatures well above PCL melting. Finally, the heat conduction properties of the nanopapers were evaluated, resulting in outstanding values above 150 W m⁻¹ K⁻¹.

preprint · 2020 · arXiv

4D Association Graph for Realtime Multi-person Motion Capture Using Multiple Video Cameras

This paper contributes a novel realtime multi-person motion capture algorithm using multiview video inputs. Due to the heavy occlusions in each view, joint optimization on the multiview images and multiple temporal frames is indispensable, which brings up the essential challenge of realtime efficiency. To this end, for the first time, we unify per-view parsing, cross-view matching, and temporal tracking into a single optimization framework, i.e., a 4D association graph in which each dimension (image space, viewpoint and time) can be treated equally and simultaneously. To solve the 4D association graph efficiently, we further contribute the idea of 4D limb bundle parsing based on heuristic searching, followed by limb bundle assembling using a proposed bundle Kruskal's algorithm. Our method enables a realtime online motion capture system running at 30fps using 5 cameras on a 5-person scene. Benefiting from the unified parsing, matching and tracking constraints, our method is robust to noisy detection and achieves high-quality online pose reconstruction. The proposed method outperforms the state-of-the-art method quantitatively without using high-level appearance information. We also contribute a multiview video dataset synchronized with a marker-based motion capture system for scientific evaluation.

preprint · 2020 · arXiv

A Comparative Analysis of Inconel 718 Made by Additive Manufacturing and Suction Casting: Microstructure Evolution in Homogenization

Homogenization is one of the critical stages in the post-heat treatment of additive manufacturing (AM) components to achieve a uniform microstructure. During homogenization, grain coarsening could be an issue for preserving strength, which requires careful design of both time and temperature. Therefore, a proper design of homogenization becomes particularly important for AM design, for which work hardening is usually no longer an option. In this work, we discovered an intriguing phenomenon during homogenization of suction-cast and AM Inconel 718 superalloys. Through both short and long-term isothermal heat treatments at 1180°C, we observed abnormal grain growth in the suction-cast alloy but continuous recrystallization in the alloy made by laser powder bed fusion (LPBF). The grain size of AM samples remains as small as 130 μm and is even slightly reduced after homogenization for 12 hours. The homogeneity of Nb in the AM alloys is identified as the critical factor for NbC formation, which further influences the recrystallization kinetics at 1180°C. Multi-type dislocation behaviors are studied to elucidate the grain refinement observed in homogenized alloys after LPBF. This work provides a new pathway for microstructure engineering of AM alloys for improved mechanical performance superior to traditionally manufactured ones.

preprint · 2020 · arXiv

A Federated F-score Based Ensemble Model for Automatic Rule Extraction

In this manuscript, we propose a federated F-score based ensemble tree model for automatic rule extraction, namely Fed-FEARE. Under the premise of data privacy protection, Fed-FEARE enables multiple agencies to jointly extract a set of rules both vertically and horizontally. Compared with the model without federated learning, the measures used to evaluate model performance are substantially improved. At present, Fed-FEARE has already been applied to multiple businesses, including anti-fraud and precision marketing, in a nationwide financial holdings group in China.

preprint · 2020 · arXiv

A Misreport- and Collusion-Proof Crowdsourcing Mechanism without Quality Verification

Quality control plays a critical role in crowdsourcing. The state-of-the-art work is not suitable for large-scale crowdsourcing applications, since it is a long haul for the requestor to verify task quality or select professional workers one by one. In this paper, we propose a misreport- and collusion-proof crowdsourcing mechanism that guides workers to truthfully report the quality of submitted tasks without collusion, so that workers have to act the way the requestor would like. In detail, the mechanism proposed by the requestor leaves no room for the workers to obtain profit through quality misreport and collusion, and thus the quality can be controlled without any verification. Extensive simulation results verify the effectiveness of the proposed mechanism. Finally, the importance and originality of our work lie in that it reveals some interesting and even counterintuitive findings: 1) a high-quality worker may pretend to be a low-quality one; 2) the rise of task quality from high-quality workers may not result in increased utility for the requestor; 3) the utility of the requestor may not improve with an increasing number of workers. These findings can support forward-looking and strategic planning solutions for crowdsourcing.

preprint · 2020 · arXiv

A new high-throughput method using additive manufacturing for alloy design and heat treatment optimization

Many alloys made by Additive Manufacturing (AM) require careful design of post-heat treatment as an indispensable step of microstructure engineering to further enhance performance. We developed a high-throughput approach by fabricating a long-bar sample heat-treated under a monitored gradient temperature zone for phase transformation study to accelerate the post-heat treatment development of AM alloys. This approach has been proven efficient in determining the aging temperature with peak hardness. We observed that precipitation strengthening is predominant for the studied superalloy made by laser powder bed fusion, and that the grain size variation is insensitive to temperature between 605 and 825 °C. This new approach can be applied to post-heat treatment optimization of other materials made by AM, and can further assist new alloy development.

preprint · 2020 · arXiv

A Vertical Federated Learning Method for Interpretable Scorecard and Its Application in Credit Scoring

With the success of big data and artificial intelligence in many fields, big-data-driven models are expected to find applications in financial risk management, especially credit scoring and rating. Under the premise of data privacy protection, we propose a projected gradient-based method in the vertical federated learning framework for the traditional scorecard, which is based on logistic regression with bounded constraints, namely FL-LRBC. The latter enables multiple agencies to jointly train an optimized scorecard model in a single training session. It leads to the formation of a model with positive coefficients, while the time-consuming parameter-tuning process can be avoided. Moreover, the performance in terms of both AUC and the Kolmogorov-Smirnov (KS) statistic is significantly improved due to data enrichment using FL-LRBC. At present, FL-LRBC has already been applied to credit business in a nationwide financial holdings group in China.
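
For readers unfamiliar with projected gradient methods, the sketch below shows the basic idea on a single machine: take an ordinary gradient step on the logistic loss, then clip each coefficient back into its allowed interval. It is a plain, non-federated illustration and not the FL-LRBC algorithm; the function names, the bounds of 0 and 10, the learning rate and the toy data are all assumptions for this example, and the federated exchange between agencies is omitted entirely.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_bounded_logreg(X, y, lower=0.0, upper=10.0, lr=0.5, steps=500):
    """Logistic regression with box-constrained coefficients, trained by
    projected gradient descent (hypothetical helper, single-party only).

    The intercept b is left unconstrained; only the weights are projected.
    """
    n, d = len(X), len(X[0])
    w = [min(upper, max(lower, 0.0))] * d  # start from a feasible point
    b = 0.0
    for _ in range(steps):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        # Gradient step followed by projection onto the box [lower, upper].
        w = [min(upper, max(lower, wj - lr * gj)) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


if __name__ == "__main__":
    # Toy data: the first feature raises the odds of y = 1, the second lowers them.
    X = [[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 0.0]]
    y = [1, 1, 1, 0, 0, 0]
    print(fit_bounded_logreg(X, y))
```

On the toy data an unconstrained fit would tend to give the second feature a negative weight; with a lower bound of zero the projection holds that weight at the boundary instead, which is the mechanism by which bounded constraints can keep scorecard coefficients non-negative.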

preprint · 2020 · arXiv

Adaptive 3D Face Reconstruction from a Single Image

3D face reconstruction from a single image is a challenging problem, especially under partial occlusions and extreme poses. This is because the uncertainty of the estimated 2D landmarks will affect the quality of face reconstruction. In this paper, we propose a novel joint 2D and 3D optimization method to adaptively reconstruct 3D face shapes from a single image, which combines the depths of 3D landmarks to resolve the uncertain detections of invisible landmarks. The strategy of our method involves two aspects: a coarse-to-fine pose estimation using both 2D and 3D landmarks, and an adaptive 2D and 3D re-weighting based on the refined pose parameter to recover accurate 3D faces. Experimental results on multiple datasets demonstrate that our method can generate high-quality reconstruction from a single color image and is robust to self-occlusion and large poses.

preprint · 2020 · arXiv

AstroCatR: a Mechanism and Tool for Efficient Time Series Reconstruction of Large-Scale Astronomical Catalogues

Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time-domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely hard and infeasible for continuously analyzing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series datasets. To support the high-performance parallel processing of large-scale datasets, AstroCatR uses the extract-transform-load (ETL) preprocessing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed into other formats as needed. In addition, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3X faster than methods using relational database management systems at matching massive catalogues.

preprint · 2020 · arXiv

Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation

Aspect term extraction aims to extract aspect terms from review texts as opinion targets for sentiment analysis. One of the big challenges with this task is the lack of sufficient annotated data. While data augmentation is potentially an effective technique to address the above issue, it is uncontrollable as it may change aspect words and aspect labels unexpectedly. In this paper, we formulate the data augmentation as a conditional generation task: generating a new sentence while preserving the original opinion targets and labels. We propose a masked sequence-to-sequence method for conditional augmentation of aspect term extraction. Unlike existing augmentation approaches, ours is controllable and allows us to generate more diversified sentences. Experimental results confirm that our method alleviates the data scarcity problem significantly. It also effectively boosts the performances of several current models for aspect term extraction.

preprint · 2020 · arXiv

Geometry-guided Dense Perspective Network for Speech-Driven Facial Animation

Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent realistic 3D facial animation. The encoder is designed with dense connections to strengthen feature propagation and encourage the re-use of audio features, and the decoder is integrated with an attention mechanism to adaptively recalibrate point-wise feature responses by explicitly modeling interdependencies between different neuron units. We also introduce a non-linear face reconstruction representation as guidance for the latent space to obtain more accurate deformation, which helps solve the geometry-related deformation and is good for generalization across subjects. Huber and HSIC (Hilbert-Schmidt Independence Criterion) constraints are adopted to promote the robustness of our model and to better exploit the non-linear and high-order correlations. Experimental results on a public dataset and a real scanned dataset validate the superiority of our proposed GDPnet compared with state-of-the-art models.

preprint · 2020 · arXiv

Post-Heat Treatment Design of High-Strength Low-Alloy Steels Processed by Laser Powder Bed Fusion

In this study, a post-heat treatment design for additively manufactured copper-bearing high-strength low-alloy (HSLA)-100 steel is performed by understanding the process-structure-property relationships. Hot isostatic pressing (HIP) is designed to reduce the porosity from 3% to less than 1% for the HSLA-100 steel processed by laser powder bed fusion (LPBF). Quenching dilatometry is employed to design the HIP parameters with the optimized cooling rate for the maximum amount of martensite transformed after HIP. Afterward, a post-heat treatment step with cyclic re-austenitization is introduced for effective grain refinement to compensate for the coarsened microstructure after HIP. Finally, tempering is optimized through microstructure characterization and microhardness. A two-fold increase in the yield strength of the HSLA with tailored microstructure during post-heat treatment is achieved in comparison with the as-built HSLA.