Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
20works
0followers
19topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

20 published item(s)

preprint2026arXiv

NanoResearch: Co-Evolving Skills, Memory, and Policy for Personalized Research Automation

LLM-powered multi-agent systems can now automate the full research pipeline from ideation to paper writing, but a fundamental question remains: automation for whom? Researchers operate under different resource configurations, hold different methodological preferences, and target different output formats. A system that produces uniform outputs regardless of these differences will systematically under-serve every individual user, making personalization a precondition for research automation to be genuinely usable. However, achieving it requires three capabilities that current systems lack: accumulating reusable procedural knowledge across projects, retaining user-specific experience across sessions, and internalizing implicit preferences that resist explicit formalization. We propose NanoResearch, a multi-agent framework that addresses these gaps through tri-level co-evolution. A skill bank distills recurring operations into compact procedural rules reusable across projects. A memory module maintains user- and project-specific experience that grounds planning decisions in each user's research history. A label-free policy learning converts free-form feedback into persistent parameter updates of the planner, reshaping subsequent coordination. These three layers co-evolve: reliable skills produce richer memory, richer memory informs better planning, and preference internalization continuously realigns the loop to each user. Extensive experiments demonstrate that NanoResearch delivers substantial gains over state-of-the-art AI research systems, and progressively refines itself to produce better research at lower cost over successive cycles.

preprint2024arXiv

CoMoSVC: Consistency Model-based Singing Voice Conversion

The diffusion-based Singing Voice Conversion (SVC) methods have achieved remarkable performances, producing natural audios with high similarity to the target timbre. However, the iterative sampling process results in slow inference speed, and acceleration thus becomes crucial. In this paper, we propose CoMoSVC, a consistency model-based SVC method, which aims to achieve both high-quality generation and high-speed sampling. A diffusion-based teacher model is first specially designed for SVC, and a student model is further distilled under self-consistency properties to achieve one-step sampling. Experiments on a single NVIDIA GTX4090 GPU reveal that although CoMoSVC has a significantly faster inference speed than the state-of-the-art (SOTA) diffusion-based SVC system, it still achieves comparable or superior conversion performance based on both subjective and objective metrics. Audio samples and codes are available at https://comosvc.github.io/.

preprint2024arXiv

Freeze the backbones: A Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-training

Modern healthcare often utilises radiographic images alongside textual reports for diagnostics, encouraging the use of Vision-Language Self-Supervised Learning (VL-SSL) with large pre-trained models to learn versatile medical vision representations. However, most existing VL-SSL frameworks are trained end-to-end, which is computation-heavy and can lose vital prior information embedded in pre-trained encoders. To address both issues, we introduce the backbone-agnostic Adaptor framework, which preserves medical knowledge in pre-trained image and text encoders by keeping them frozen, and employs a lightweight Adaptor module for cross-modal learning. Experiments on medical image classification and segmentation tasks across three datasets reveal that our framework delivers competitive performance while cutting trainable parameters by over 90% compared to current pre-training approaches. Notably, when fine-tuned with just 1% of data, Adaptor outperforms several Transformer-based methods trained on full datasets in medical image segmentation.

preprint2022arXiv

A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion

Multi-person motion capture can be challenging due to ambiguities caused by severe occlusion, fast body movement, and complex interactions. Existing frameworks build on 2D pose estimations and triangulate to 3D coordinates via reasoning the appearance, trajectory, and geometric consistencies among multi-camera observations. However, 2D joint detection is usually incomplete and with wrong identity assignments due to limited observation angle, which leads to noisy 3D triangulation results. To overcome this issue, we propose to explore the short-range autoregressive characteristics of skeletal motion using transformer. First, we propose an adaptive, identity-aware triangulation module to reconstruct 3D joints and identify the missing joints for each identity. To generate complete 3D skeletal motion, we then propose a Dual-Masked Auto-Encoder (D-MAE) which encodes the joint status with both skeletal-structural and temporal position encoding for trajectory completion. D-MAE's flexible masking and encoding mechanism enable arbitrary skeleton definitions to be conveniently deployed under the same framework. In order to demonstrate the proposed model's capability in dealing with severe data loss scenarios, we contribute a high-accuracy and challenging motion capture dataset of multi-person interactions with severe occlusion. Evaluations on both benchmark and our new dataset demonstrate the efficiency of our proposed model, as well as its advantage against the other state-of-the-art methods.

preprint2022arXiv

A Scalable Workflow to Build Machine Learning Classifiers with Clinician-in-the-Loop to Identify Patients in Specific Diseases

Clinicians may rely on medical coding systems such as International Classification of Diseases (ICD) to identify patients with diseases from Electronic Health Records (EHRs). However, due to the lack of detail and specificity as well as a probability of miscoding, recent studies suggest the ICD codes often cannot characterise patients accurately for specific diseases in real clinical practice, and as a result, using them to find patients for studies or trials can result in high failure rates and missing out on uncoded patients. Manual inspection of all patients at scale is not feasible as it is highly costly and slow. This paper proposes a scalable workflow which leverages both structured data and unstructured textual notes from EHRs with techniques including NLP, AutoML and Clinician-in-the-Loop mechanism to build machine learning classifiers to identify patients at scale with given diseases, especially those who might currently be miscoded or missed by ICD codes. Case studies in the MIMIC-III dataset were conducted where the proposed workflow demonstrates a higher classification performance in terms of F1 scores compared to simply using ICD codes on gold testing subset to identify patients with Ovarian Cancer (0.901 vs 0.814), Lung Cancer (0.859 vs 0.828), Cancer Cachexia (0.862 vs 0.650), and Lupus Nephritis (0.959 vs 0.855). Also, the proposed workflow that leverages unstructured notes consistently outperforms the baseline that uses structured data only with an increase of F1 (Ovarian Cancer 0.901 vs 0.719, Lung Cancer 0.859 vs 0.787, Cancer Cachexia 0.862 vs 0.838 and Lupus Nephritis 0.959 vs 0.785). Experiments on the large testing set also demonstrate the proposed workflow can find more patients who are miscoded or missed by ICD codes. Moreover, interpretability studies are also conducted to clinically validate the top impact features of the classifiers.

preprint2022arXiv

ChoreoGraph: Music-conditioned Automatic Dance Choreography over a Style and Tempo Consistent Dynamic Graph

To generate dance that temporally and aesthetically matches the music is a challenging problem, as the following factors need to be considered. First, the aesthetic styles and messages conveyed by the motion and music should be consistent. Second, the beats of the generated motion should be locally aligned to the musical features. And finally, basic choreomusical rules should be observed, and the motion generated should be diverse. To address these challenges, we propose ChoreoGraph, which choreographs high-quality dance motion for a given piece of music over a Dynamic Graph. A data-driven learning strategy is proposed to evaluate the aesthetic style and rhythmic connections between music and motion in a progressively learned cross-modality embedding space. The motion sequences will be beats-aligned based on the music segments and then incorporated as nodes of a Dynamic Motion Graph. Compatibility factors such as the style and tempo consistency, motion context connection, action completeness, and transition smoothness are comprehensively evaluated to determine the node transition in the graph. We demonstrate that our repertoire-based framework can generate motions with aesthetic consistency and robustly extensible in diversity. Both quantitative and qualitative experiment results show that our proposed model outperforms other baseline models.

preprint2022arXiv

Label Dependent Attention Model for Disease Risk Prediction Using Multimodal Electronic Health Records

Disease risk prediction has attracted increasing attention in the field of modern healthcare, especially with the latest advances in artificial intelligence (AI). Electronic health records (EHRs), which contain heterogeneous patient information, are widely used in disease risk prediction tasks. One challenge of applying AI models for risk prediction lies in generating interpretable evidence to support the prediction results while retaining the prediction ability. In order to address this problem, we propose the method of jointly embedding words and labels whereby attention modules learn the weights of words from medical notes according to their relevance to the names of risk prediction labels. This approach boosts interpretability by employing an attention mechanism and including the names of prediction tasks in the model. However, its application is only limited to the handling of textual inputs such as medical notes. In this paper, we propose a label dependent attention model LDAM to 1) improve the interpretability by exploiting Clinical-BERT (a biomedical language model pre-trained on a large clinical corpus) to encode biomedically meaningful features and labels jointly; 2) extend the idea of joint embedding to the processing of time-series data, and develop a multi-modal learning framework for integrating heterogeneous information from medical notes and time-series health status indicators. To demonstrate our method, we apply LDAM to the MIMIC-III dataset to predict different disease risks. We evaluate our method both quantitatively and qualitatively. Specifically, the predictive power of LDAM will be shown, and case studies will be carried out to illustrate its interpretability.

preprint2022arXiv

Medical Scientific Table-to-Text Generation with Human-in-the-Loop under the Data Sparsity Constraint

Structured (tabular) data in the preclinical and clinical domains contains valuable information about individuals and an efficient table-to-text summarization system can drastically reduce manual efforts to condense this data into reports. However, in practice, the problem is heavily impeded by the data paucity, data sparsity and inability of the state-of-the-art natural language generation models (including T5, PEGASUS and GPT-Neo) to produce accurate and reliable outputs. In this paper, we propose a novel table-to-text approach and tackle these problems with a novel two-step architecture which is enhanced by auto-correction, copy mechanism and synthetic data augmentation. The study shows that the proposed approach selects salient biomedical entities and values from structured data with improved precision (up to 0.13 absolute increase) of copying the tabular values to generate coherent and accurate text for assay validation reports and toxicology reports. Moreover, we also demonstrate a light-weight adaptation of the proposed system to new datasets by fine-tuning with as little as 40\% training examples. The outputs of our model are validated by human experts in the Human-in-the-Loop scenario.

preprint2022arXiv

Receding Neuron Importances for Structured Pruning

Structured pruning efficiently compresses networks by identifying and removing unimportant neurons. While this can be elegantly achieved by applying sparsity-inducing regularisation on BatchNorm parameters, an L1 penalty would shrink all scaling factors rather than just those of superfluous neurons. To tackle this issue, we introduce a simple BatchNorm variation with bounded scaling parameters, based on which we design a novel regularisation term that suppresses only neurons with low importance. Under our method, the weights of unnecessary neurons effectively recede, producing a polarised bimodal distribution of importances. We show that neural networks trained this way can be pruned to a larger extent and with less deterioration. We one-shot prune VGG and ResNet architectures at different ratios on CIFAR and ImagenNet datasets. In the case of VGG-style networks, our method significantly outperforms existing approaches particularly under a severe pruning regime.

preprint2022arXiv

Suggestive Annotation of Brain MR Images with Gradient-guided Sampling

Machine learning has been widely adopted for medical image analysis in recent years given its promising performance in image segmentation and classification tasks. The success of machine learning, in particular supervised learning, depends on the availability of manually annotated datasets. For medical imaging applications, such annotated datasets are not easy to acquire, it takes a substantial amount of time and resource to curate an annotated medical image set. In this paper, we propose an efficient annotation framework for brain MR images that can suggest informative sample images for human experts to annotate. We evaluate the framework on two different brain image analysis tasks, namely brain tumour segmentation and whole brain segmentation. Experiments show that for brain tumour segmentation task on the BraTS 2019 dataset, training a segmentation model with only 7% suggestively annotated image samples can achieve a performance comparable to that of training on the full dataset. For whole brain segmentation on the MALC dataset, training with 42% suggestively annotated image samples can achieve a comparable performance to training on the full dataset. The proposed framework demonstrates a promising way to save manual annotation cost and improve data efficiency in medical imaging applications.

preprint2022arXiv

Unsupervised Numerical Reasoning to Extract Phenotypes from Clinical Text by Leveraging External Knowledge

Extracting phenotypes from clinical text has been shown to be useful for a variety of clinical use cases such as identifying patients with rare diseases. However, reasoning with numerical values remains challenging for phenotyping in clinical text, for example, temperature 102F representing Fever. Current state-of-the-art phenotyping models are able to detect general phenotypes, but perform poorly when they detect phenotypes requiring numerical reasoning. We present a novel unsupervised methodology leveraging external knowledge and contextualized word embeddings from ClinicalBERT for numerical reasoning in a variety of phenotypic contexts. Comparing against unsupervised benchmarks, it shows a substantial performance improvement with absolute gains on generalized Recall and F1 scores up to 79% and 71%, respectively. In the supervised setting, it also surpasses the performance of alternative approaches with absolute gains on generalized Recall and F1 scores up to 70% and 44%, respectively.

preprint2021arXiv

A Blockchain-based Trust System for Decentralised Applications: When trustless needs trust

Blockchain technology has been envisaged to commence an era of decentralised applications and services (DApps) without the need for a trusted intermediary. Such DApps open a marketplace in which services are delivered to end-users by contributors which are then incentivised by cryptocurrencies in an automated, peer-to-peer, and trustless fashion. However, blockchain, consolidated by smart contracts, only ensures on-chain data security, autonomy and integrity of the business logic execution defined in smart contracts. It cannot guarantee the quality of service of DApps, which entirely depends on the services' performance. Thus, there is a critical need for a trust system to reduce the risk of dealing with fraudulent counterparts in a blockchain network. These reasons motivate us to develop a fully decentralised trust framework deployed on top of a blockchain platform, operating along with DApps in the marketplace to demoralise deceptive entities while encouraging trustworthy ones. The trust system works as an underlying decentralised service providing a feedback mechanism for end-users and maintaining trust relationships among them in the ecosystem accordingly. We believe this research fortifies the DApps ecosystem by introducing an universal trust middleware for DApps as well as shedding light on the implementation of a decentralised trust system.

preprint2021arXiv

A General Framework for Revealing Human Mind with auto-encoding GANs

Addressing the question of visualising human mind could help us to find regions that are associated with observed cognition and responsible for expressing the elusive mental image, leading to a better understanding of cognitive function. The traditional approach treats brain decoding as a classification problem, reading the mind through statistical analysis of brain activity. However, human thought is rich and varied, that it is often influenced by more of a combination of object features than a specific type of category. For this reason, we propose an end-to-end brain decoding framework which translates brain activity into an image by latent space alignment. To find the correspondence from brain signal features to image features, we embedded them into two latent spaces with modality-specific encoders and then aligned the two spaces by minimising the distance between paired latent representations. The proposed framework was trained by simultaneous electroencephalogram and functional MRI data, which were recorded when the subjects were viewing or imagining a set of image stimuli. In this paper, we focused on implementing the fMRI experiment. Our experimental results demonstrated the feasibility of translating brain activity to an image. The reconstructed image matches image stimuli approximate in both shape and colour. Our framework provides a promising direction for building a direct visualisation to reveal human mind.

preprint2021arXiv

Adversarially trained LSTMs on reduced order models of urban air pollution simulations

This paper presents an approach to improve computational fluid dynamics simulations forecasts of air pollution using deep learning. Our method, which integrates Principal Components Analysis (PCA) and adversarial training, is a way to improve the forecast skill of reduced order models obtained from the original model solution. Once the reduced-order model (ROM) is obtained via PCA, a Long Short-Term Memory network (LSTM) is adversarially trained on the ROM to make forecasts. Once trained, the adversarially trained LSTM outperforms a LSTM trained in a classical way. The study area is in London, including velocities and a concentration tracer that replicates a busy traffic junction. This adversarially trained LSTM-based approach is used on the ROM in order to produce faster forecasts of the air pollution tracer.

preprint2021arXiv

BU-Trace: A Permissionless Mobile System for Privacy-Preserving Intelligent Contact Tracing

The coronavirus disease 2019 (COVID-19) pandemic has caused an unprecedented health crisis for the global. Digital contact tracing, as a transmission intervention measure, has shown its effectiveness on pandemic control. Despite intensive research on digital contact tracing, existing solutions can hardly meet users' requirements on privacy and convenience. In this paper, we propose BU-Trace, a novel permissionless mobile system for privacy-preserving intelligent contact tracing based on QR code and NFC technologies. First, a user study is conducted to investigate and quantify the user acceptance of a mobile contact tracing system. Second, a decentralized system is proposed to enable contact tracing while protecting user privacy. Third, an intelligent behavior detection algorithm is designed to ease the use of our system. We implement BU-Trace and conduct extensive experiments in several real-world scenarios. The experimental results show that BU-Trace achieves a privacy-preserving and intelligent mobile system for contact tracing without requesting location or other privacy-related permissions.

preprint2021arXiv

Quantifying high-order interdependencies on individual patterns via the local O-information: theory and applications to music analysis

High-order, beyond-pairwise interdependencies are at the core of biological, economic, and social complex systems, and their adequate analysis is paramount to understand, engineer, and control such systems. This paper presents a framework to measure high-order interdependence that disentangles their effect on each individual pattern exhibited by a multivariate system. The approach is centred on the 'local O-information', a new measure that assesses the balance between synergistic and redundant interdependencies at each pattern. To illustrate the potential of this framework, we present a detailed analysis of music scores from J.S. Bach, which reveals how high-order interdependence is deeply connected with highly non-trivial aspects of the musical discourse. Our results place the local O-information as a promising tool of wide applicability, which opens new perspectives for analysing high-order relationships in the patterns exhibited by complex systems.

preprint2020arXiv

A Bayesian Updating Scheme for Pandemics: Estimating the Infection Dynamics of COVID-19

Epidemic models play a key role in understanding and responding to the emerging COVID-19 pandemic. Widely used compartmental models are static and are of limited use to evaluate intervention strategies with the emerging pandemic. Applying the technology of data assimilation, we propose a Bayesian updating approach for estimating epidemiological parameters using observable information for the purpose of assessing the impacts of different intervention strategies. We adopt a concise renewal model and propose new parameters by disentangling the reduction of instantaneous reproduction number Rt into mitigation and suppression factors for quantifying intervention impacts at a finer granularity. Then we developed a data assimilation framework for estimating these parameters including constructing an observation function and developing a Bayesian updating scheme. A statistical analysis framework is then built to quantify the impact of intervention strategies by monitoring the evolution of these estimated parameters. By Investigating the impacts of intervention measures of European countries, the United States and Wuhan with the framework, we reveal the effects of interventions in these countries and the resurgence risk in the USA.

preprint2020arXiv

Efficient Deep Representation Learning by Adaptive Latent Space Sampling

Supervised deep learning requires a large amount of training samples with annotations (e.g. label class for classification task, pixel- or voxel-wised label map for segmentation tasks), which are expensive and time-consuming to obtain. During the training of a deep neural network, the annotated samples are fed into the network in a mini-batch way, where they are often regarded of equal importance. However, some of the samples may become less informative during training, as the magnitude of the gradient start to vanish for these samples. In the meantime, other samples of higher utility or hardness may be more demanded for the training process to proceed and require more exploitation. To address the challenges of expensive annotations and loss of sample informativeness, here we propose a novel training framework which adaptively selects informative samples that are fed to the training process. The adaptive selection or sampling is performed based on a hardness-aware strategy in the latent space constructed by a generative model. To evaluate the proposed training framework, we perform experiments on three different datasets, including MNIST and CIFAR-10 for image classification task and a medical image dataset IVUS for biophysical simulation task. On all three datasets, the proposed framework outperforms a random sampling method, which demonstrates the effectiveness of proposed framework.

preprint2020arXiv

Suggestive Annotation of Brain Tumour Images with Gradient-guided Sampling

Machine learning has been widely adopted for medical image analysis in recent years given its promising performance in image segmentation and classification tasks. As a data-driven science, the success of machine learning, in particular supervised learning, largely depends on the availability of manually annotated datasets. For medical imaging applications, such annotated datasets are not easy to acquire. It takes a substantial amount of time and resource to curate an annotated medical image set. In this paper, we propose an efficient annotation framework for brain tumour images that is able to suggest informative sample images for human experts to annotate. Our experiments show that training a segmentation model with only 19% suggestively annotated patient scans from BraTS 2019 dataset can achieve a comparable performance to training a model on the full dataset for whole tumour segmentation task. It demonstrates a promising way to save manual annotation cost and improve data efficiency in medical imaging applications.

preprint2019arXiv

Integrated Multi-omics Analysis Using Variational Autoencoders: Application to Pan-cancer Classification

Different aspects of a clinical sample can be revealed by multiple types of omics data. Integrated analysis of multi-omics data provides a comprehensive view of patients, which has the potential to facilitate more accurate clinical decision making. However, omics data are normally high dimensional with large number of molecular features and relatively small number of available samples with clinical labels. The "dimensionality curse" makes it challenging to train a machine learning model using high dimensional omics data like DNA methylation and gene expression profiles. Here we propose an end-to-end deep learning model called OmiVAE to extract low dimensional features and classify samples from multi-omics data. OmiVAE combines the basic structure of variational autoencoders with a classification network to achieve task-oriented feature extraction and multi-class classification. The training procedure of OmiVAE is comprised of an unsupervised phase without the classifier and a supervised phase with the classifier. During the unsupervised phase, a hierarchical cluster structure of samples can be automatically formed without the need for labels. And in the supervised phase, OmiVAE achieved an average classification accuracy of 97.49% after 10-fold cross-validation among 33 tumour types and normal samples, which shows better performance than other existing methods. The OmiVAE model learned from multi-omics data outperformed that using only one type of omics data, which indicates that the complementary information from different omics datatypes provides useful insights for biomedical tasks like cancer classification.