Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
25 works
0 followers
21 topics
4 close collaborators

Published work

25 published items

preprint · 2026 · arXiv

Efficient State Preparation for Quantum Machine Learning

One of the key considerations in the development of Quantum Machine Learning (QML) protocols is the encoding of classical data onto a quantum device. In this chapter we introduce the Matrix Product State representation of quantum systems and show how it may be used to construct circuits which encode a desired state. Putting this in the context of QML, we show how this process may be modified to give a low-depth approximate encoding and, crucially, that this encoding does not hinder classification accuracy and indeed exhibits increased robustness against classical adversarial attacks. This is illustrated by demonstrations of adversarially robust variational quantum classifiers for the MNIST and FMNIST datasets, as well as a small-scale experimental demonstration on a superconducting quantum device.

preprint · 2026 · arXiv

Quantum Error Correction and Detection for Quantum Machine Learning

At the intersection of quantum computing and machine learning, quantum machine learning (QML) is poised to revolutionize artificial intelligence. However, the vulnerability of the current generation of quantum computers to noise and computational error poses a significant barrier to this vision. Whilst quantum error correction (QEC) offers a promising solution for almost any type of hardware noise, its application requires millions of qubits to encode even a simple logical algorithm, rendering it impractical in the near term. In this chapter, we examine strategies for integrating QEC and quantum error detection (QED) into QML under realistic resource constraints. We first quantify the resource demands of fully error-corrected QML and propose a partial QEC approach that reduces overhead while enabling error correction. We then demonstrate the application of a simple QED method, evaluating its impact on QML performance and highlighting challenges we have yet to overcome before we achieve fully fault-tolerant QML.

preprint · 2026 · arXiv

Time-Dynamic Circuits for Fault-Tolerant Shift Automorphisms in Quantum LDPC Codes

Quantum low-density parity-check (qLDPC) codes have emerged as a promising approach for realizing low-overhead logical quantum memories. Recent theoretical developments have established shift automorphisms as a fundamental building block for completing the universal set of logical gates for qLDPC codes. However, practical challenges remain because the existing SWAP-based shift automorphism yields logical error rates that are orders of magnitude higher than those for fault-tolerant idle operations. In this work, we address this issue by dynamically varying the syndrome measurement circuits to implement the shift automorphisms without reducing the circuit distance. We benchmark our approach on both twisted and untwisted weight-6 generalized toric codes, including the gross code family. Our time-dynamic circuits for shift automorphisms achieve performance comparable to the idle operations under the circuit-level noise model (SI1000). Specifically, the dynamic circuits achieve more than an order of magnitude reduction in logical error rates relative to the SWAP-based scheme for the gross code at a physical error rate of $10^{-3}$, employing the BP-OSD decoder. Our findings improve both the error resilience and the time overhead of the shift automorphisms in qLDPC codes. Furthermore, our work can lead to alternative syndrome extraction circuit designs, such as leakage removal protocols, providing a practical pathway to utilizing dynamic circuits that extend beyond surface codes towards qLDPC codes.

preprint · 2025 · arXiv

Symmetry-Checking in Band Structure Calculations on a Noisy Quantum Computer

Band crossings in electronic band structures play an important role in determining the electronic, topological, and transport properties of solid-state systems, making them central to both condensed matter physics and materials science. The emergence of noisy intermediate-scale quantum (NISQ) processors has sparked great interest in developing quantum algorithms to compute band structure properties of materials. While significant research has been reported on computing ground state and excited state energy bands in the presence of noise that breaks the degeneracy, identifying the symmetry at crossing points using quantum computers is still an open question. In this work, we propose a method for identifying the symmetry of bands around crossings and anti-crossings in the band structure of bilayer graphene with two distinct configurations on a NISQ device. The method utilizes eigenstates at neighbouring $\mathbf{k}$ points on either side of the touching point to recover the local symmetry by implementing a character-checking quantum circuit that uses ancilla qubit measurements for a probabilistic test. We then evaluate the performance of our method under a depolarizing noise model, using four distinct matrix representations of symmetry operations to assess its robustness. Finally, we demonstrate the reliability of our method by correctly identifying the band crossings of AA-stacked bilayer graphene around the $K$ point, using the character-checking circuit implemented on a noisy IBM quantum processor, $ibm\_marrakesh$.

preprint · 2025 · arXiv

Variational Quantum Machine Learning with Quantum Error Detection

Quantum machine learning (QML) is an emerging field that promises advantages such as faster training, improved reliability and superior feature extraction over classical counterparts. However, its implementation on quantum hardware is challenging due to the noise inherent in these systems, necessitating the use of quantum error correction (QEC) codes. Current QML research remains primarily theoretical, often assuming noise-free environments and offering little insight into the integration of QEC with QML implementations. To address this, we investigate the performance of a simple, parity-classifying Variational Quantum Classifier (VQC) implemented with the [[4,2,2]] error-detecting stabiliser code in a simulated noisy environment, marking the first study into the implementation of a QML algorithm with a QEC code. We invoke ancilla qubits to logically encode rotation gates, and classically simulate the logically-encoded VQC under two simple noise models representing gate noise and environmental noise. We demonstrate that the stabiliser code improves the training accuracy at convergence compared to noisy implementations without QEC. However, we find that the effectiveness and reliability of error detection are contingent upon keeping the ancilla qubit error rates below a specific threshold, due to the propagation of ancilla errors to the physical qubits. Our results provide an important insight: for QML implementations with QEC codes that both require ancilla qubits for logical rotations and cannot fully correct errors propagated between ancilla and physical qubits, the maximum achievable accuracy of the QML model is limited. This highlights the need for additional error correction or mitigation strategies to support the practical implementation of QML algorithms with QEC on quantum devices.
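As a rough illustration of error detection with the [[4,2,2]] code, the sketch below post-selects measured bitstrings on the ZZZZ stabiliser: valid codewords have even parity in the computational basis, so a shot with odd parity signals a detected bit-flip and is discarded. This is a classical toy example, not the circuit or noise model used in the paper; the function names and the sample shots are hypothetical, and the XXXX stabiliser (which detects phase flips) is not modelled here.

```python
def zzzz_parity_ok(bits):
    """Return True if a 4-bit measurement outcome has even parity.

    Even parity corresponds to the +1 eigenvalue of the ZZZZ stabiliser.
    """
    return sum(bits) % 2 == 0


def postselect(shots):
    """Keep only the shots that pass the ZZZZ parity check."""
    return [s for s in shots if zzzz_parity_ok(s)]


# Hypothetical measurement outcomes: three valid, two with a single bit-flip.
shots = [(0, 0, 0, 0), (1, 1, 1, 1), (0, 1, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0)]
kept = postselect(shots)
discard_rate = 1 - len(kept) / len(shots)
```

Because the code has distance 2, this check can only flag an error, not locate it, which is why the flagged shots are thrown away rather than repaired.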

preprint · 2022 · arXiv

An Overview of Structural Coverage Metrics for Testing Neural Networks

Deep neural network (DNN) models, including those used in safety-critical domains, need to be thoroughly tested to ensure that they can reliably perform well in different scenarios. In this article, we provide an overview of structural coverage metrics for testing DNN models, including neuron coverage (NC), k-multisection neuron coverage (kMNC), top-k neuron coverage (TKNC), neuron boundary coverage (NBC), strong neuron activation coverage (SNAC) and modified condition/decision coverage (MC/DC). We evaluate the metrics on realistic DNN models used for perception tasks (including LeNet-1, LeNet-4, LeNet-5, and ResNet20) as well as on networks used in autonomy (TaxiNet). We also provide a tool, DNNCov, which can measure the testing coverage for all these metrics. DNNCov outputs an informative coverage report to enable researchers and practitioners to assess the adequacy of DNN testing, compare different coverage measures, and to more conveniently inspect the model's internals during testing.
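To make the simplest of these metrics concrete, the sketch below computes neuron coverage (NC) as the fraction of neurons whose activation exceeds a threshold for at least one test input. It is a minimal illustration and not the DNNCov implementation; the function name, the threshold value and the activation table are assumptions made for the example.

```python
def neuron_coverage(activations, threshold=0.0):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations` is a list of rows, one per test input, where each row
    holds the recorded activation of every neuron for that input.
    """
    num_neurons = len(activations[0])
    covered = [False] * num_neurons
    for row in activations:
        for j, value in enumerate(row):
            if value > threshold:
                covered[j] = True
    return sum(covered) / num_neurons


# Hypothetical activations: 2 test inputs, 4 neurons.
acts = [
    [0.0, 0.7, 0.0, 0.2],
    [0.0, 0.1, 0.0, 0.9],
]
coverage = neuron_coverage(acts, threshold=0.5)
```

Here neurons 1 and 3 each exceed the threshold for some input while neurons 0 and 2 never do, so half of the neurons count as covered.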

preprint · 2022 · arXiv

AntidoteRT: Run-time Detection and Correction of Poison Attacks on Neural Networks

We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. There are several techniques proposed in the literature that aim to detect the attack but only a few also propose to defend against it, and they typically involve retraining the network, which is not always possible in practice. We propose lightweight automated detection and correction techniques against poisoning attacks, which are based on neuron patterns mined from the network using a small set of clean and poisoned test samples with known labels. The patterns built from the misclassified samples are used for run-time detection of new poisoned inputs. For correction, we propose an input correction technique that uses a differential analysis to identify the trigger in the detected poisoned images, which is then reset to a neutral color. Our detection and correction are performed at run-time and at the input level, in contrast to most existing work, which focuses on offline model-level defenses. We demonstrate that our technique outperforms existing defenses such as NeuralCleanse and STRIP on popular benchmarks such as MNIST, CIFAR-10, and GTSRB against the popular BadNets attack and the more complex DFST attack.

preprint · 2022 · arXiv

Comparative analysis of error mitigation techniques for variational quantum eigensolver implementations on IBM quantum system

Quantum computers are anticipated to transcend classical supercomputers for computationally intensive tasks by exploiting the principles of quantum mechanics. However, the capabilities of the current generation of quantum devices are limited due to noise or errors, and therefore implementation of error mitigation and/or correction techniques is pivotal to reliably process quantum algorithms. In this work, we have performed a comparative analysis of the error mitigation capability of the [[4,2,2]] quantum error-detecting code (QEC method), the duplicate circuit technique, and the Bayesian read-out error mitigation (BREM) approach in the context of proof-of-concept implementations of the variational quantum eigensolver (VQE) algorithm for determining the ground state energy of the H$_2$ molecule. Based on experiments on an IBM quantum device, our results show that the duplicate circuit approach outperforms the QEC method in the presence of hardware noise. A significant impact of cross-talk noise was observed when multiple mappings of the duplicate circuit and the QEC method were implemented simultaneously; again, the duplicate circuit approach performed better overall than the QEC method. To gain further insights into the performance of the studied error mitigation techniques, we also performed quantum simulations on an IBM system with varying strengths of depolarising circuit noise and read-out errors, which further supported the main finding of our work that the duplicate circuit approach offers superior performance in mitigating errors when compared to the QEC method. Our work reports a first assessment of the duplicate circuit approach for a quantum algorithm implementation, and the documented evidence will pave the way for future scalable implementations of the duplicate circuit technique for error-mitigated practical applications of near-term quantum computers.

preprint · 2022 · arXiv

Multiplier with Reduced Activities and Minimized Interconnect for Inner Product Arrays

We present a pipelined multiplier with reduced activities and minimized interconnect based on online digit-serial arithmetic. The working precision has been truncated such that $p<n$ bits are used to compute an $n$-bit product, resulting in significant savings in area and power. The digit slices follow variable precision according to the input, increasing up to $p$ and then decreasing according to the error profile. Pipelining has been done to achieve high throughput and low latency, which is desirable for compute-intensive inner products. Synthesis results of the proposed designs have been presented and compared with the non-pipelined online multiplier, the pipelined online multiplier with full working precision, and conventional serial-parallel and array multipliers. For $8, 16, 24$ and $32$ bit precision, the proposed low-power pipelined designs show up to $38\%$ and $44\%$ reduction in power and area, respectively, compared to the pipelined online multiplier without working precision truncation.

preprint · 2022 · arXiv

Optimizing Indoor Navigation Policies For Spatial Distancing

In this paper, we focus on the modification of policies that can lead to movement patterns and directional guidance of occupants, which are represented as agents in a 3D simulation engine. We demonstrate an optimization method that improves a spatial distancing metric by modifying the navigation graph, and we introduce a measure of spatial distancing of agents as a function of agent density (i.e., occupancy). Our optimization framework utilizes such metrics as the target function, using a hybrid approach that combines a genetic algorithm and simulated annealing. We show that within our framework, the simulation-optimization process can help to improve spatial distancing between agents by optimizing the navigation policies for a given indoor environment.

preprint · 2022 · arXiv

Performance analysis of coreset selection for quantum implementation of K-Means clustering algorithm

Quantum computing is anticipated to offer immense computational capabilities which could provide efficient solutions to many data science problems. However, the current generation of quantum devices is small and noisy, which makes it difficult to process large data sets relevant for practical problems. Coreset selection aims to circumvent this problem by reducing the size of input data without compromising the accuracy. Recent work has shown that coreset selection can help to implement quantum K-Means clustering. However, the impact of coreset selection on the performance of quantum K-Means clustering has not been explored. In this work, we compare the relative performance of two coreset techniques (BFL16 and ONESHOT), and the size of coreset construction in each case, with respect to a variety of data sets, and lay out the advantages and limitations of coreset selection in implementing quantum algorithms. We also investigate the effect of depolarisation quantum noise and bit-flip error, and implement the Quantum AutoEncoder technique for overcoming the noise effect. Our work provides useful insights for future implementation of data science algorithms on near-term quantum devices where problem size has been reduced by coreset selection.

preprint · 2022 · arXiv

Surgery transformations and eigenvalue estimates for quantum graphs with $\delta'$ vertex interactions

We extend the surgical toolbox for quantum graphs to anti-standard and $\delta'$ vertex conditions. Monotonicity properties of eigenvalues of the graph Laplacian with $\delta'$ interactions at vertices depend on the sign of the vertex parameter. Using several interlacing inequalities between eigenvalues of the graph Laplacian with different vertex conditions, together with surgery principles, we obtain new upper and lower bounds on the eigenvalues of $\delta$ and $\delta'$ Laplacians.

preprint · 2022 · arXiv

VPN: Verification of Poisoning in Neural Networks

Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result, researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely data poisoning. In this case an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger in an input causes the trained model to misclassify to some target class. We show how to formulate the check for data poisoning as a property that can be checked with off-the-shelf verification tools, such as Marabou and nnenum, where counterexamples of failed checks constitute the triggers. We further show that the discovered triggers are 'transferable' from a small model to a larger, better-trained model, allowing us to analyze state-of-the-art performant models trained for image classification tasks.

preprint · 2021 · arXiv

Compliance Requirements in Large-Scale Software Development: An Industrial Case Study

Regulatory compliance is a well-studied area, including research on how to model, check, analyse, enact, and verify compliance of software. However, while the theoretical body of knowledge is vast, empirical evidence on challenges with regulatory compliance, as faced by industrial practitioners particularly in the Software Engineering domain, is still lacking. In this paper, we report on an industrial case study which aims at providing insights into common practices and challenges with checking and analysing regulatory compliance, and we discuss our insights in direct relation to the state of reported evidence. Our study is performed at Ericsson AB, a large telecommunications company, which must comply with both locally and internationally governing regulatory entities and standards such as GDPR. The main contributions of this work are empirical evidence on challenges experienced by Ericsson that complement the existing body of knowledge on regulatory compliance.

preprint · 2021 · arXiv

NEUROSPF: A tool for the Symbolic Analysis of Neural Networks

This paper presents NEUROSPF, a tool for the symbolic analysis of neural networks. Given a trained neural network model, the tool extracts the architecture and model parameters and translates them into a Java representation that is amenable to analysis using the Symbolic PathFinder symbolic execution tool. Notably, NEUROSPF encodes specialized peer classes for parsing the model's parameters, thereby enabling efficient analysis. With NEUROSPF the user has the flexibility to specify either the inputs or the network internal parameters as symbolic, promoting the application of program analysis and testing approaches from software engineering to the field of machine learning. For instance, NEUROSPF can be used for coverage-based testing and test generation, finding adversarial examples and also constraint-based repair of neural networks, thus improving the reliability of neural networks and of the applications that use them. Video URL: https://youtu.be/seal8fG78LI

preprint · 2021 · arXiv

Recent Trends in Food Intake Monitoring using Wearable Sensors

Obesity and being overweight add to the risk of some major life-threatening diseases. According to the WHO, a considerable population suffers from these diseases, and poor nutrition plays an important role in this context. Traditional food activity monitoring systems such as food diaries allow manual record keeping of eating activities over time and support nutrition analysis. However, these systems are prone to the problems of manual record keeping and biased reporting. Therefore, over the last decade the research community has focused on designing automatic food monitoring systems that consist of one or multiple wearable sensors. These systems aim to provide different macro and micro activity detections, such as chewing, swallowing, eating episodes, and food types, as well as estimations such as food mass and eating duration. Researchers have emphasized high detection accuracy, low estimation errors, unobtrusiveness, low cost and real-life implementation while designing these systems; however, a comprehensive automatic food monitoring system has not yet been developed. Moreover, to the best of our knowledge, there is no comprehensive survey in this field that delineates the automatic food monitoring paradigm, covers a number of research studies, analyses these studies against food intake monitoring tasks using various parameters, lists the limitations and sets out future directions. In this research work, we delineate the automatic food intake monitoring paradigm and present a survey of research studies. With special focus on studies with wearable sensors, we analyze these studies against food activity monitoring tasks. We provide a brief comparison of these studies, along with their shortcomings, based on the experimental results reported in them. We set out future directions at the end to assist researchers working in this domain.

preprint · 2021 · arXiv

The Diabetic Buddy: A Diet Regulator and Tracking System for Diabetics

The prevalence of Diabetes mellitus (DM) in the Middle East is exceptionally high as compared to the rest of the world. In fact, the prevalence of diabetes in the Middle East is 17-20%, which is well above the global average of 8-9%. Research has shown that food intake has strong connections with the blood glucose levels of a patient. In this regard, there is a need to build automatic tools to monitor the blood glucose levels of diabetics and their daily food intake. This paper presents an automatic way of tracking continuous glucose and food intake of diabetics using off-the-shelf sensors and machine learning, respectively. Our system not only helps diabetics to track their daily food intake but also helps doctors analyze the impact of food intake on blood glucose in real time. For food recognition, we collected a large-scale Middle-Eastern food dataset and proposed a fusion-based framework incorporating several existing pre-trained deep models for Middle-Eastern food recognition.

preprint · 2020 · arXiv

A Study of the Learnability of Relational Properties: Model Counting Meets Machine Learning (MCML)

This paper introduces the MCML approach for empirically studying the learnability of relational properties that can be expressed in the well-known software design language Alloy. A key novelty of MCML is quantification of the performance of and semantic differences among trained machine learning (ML) models, specifically decision trees, with respect to entire (bounded) input spaces, and not just for given training and test datasets (as is the common practice). MCML reduces the quantification problems to the classic complexity theory problem of model counting, and employs state-of-the-art model counters. The results show that relatively simple ML models can achieve surprisingly high performance (accuracy and F1-score) when evaluated in the common setting of using training and test datasets - even when the training dataset is much smaller than the test dataset - indicating the seeming simplicity of learning relational properties. However, MCML metrics based on model counting show that the performance can degrade substantially when tested against the entire (bounded) input space, indicating the high complexity of precisely learning these properties, and the usefulness of model counting in quantifying the true performance.

preprint · 2020 · arXiv

Cross Lingual Speech Emotion Recognition: Urdu vs. Western Languages

Cross-lingual speech emotion recognition is an important task for practical applications. The performance of automatic speech emotion recognition systems degrades in cross-corpus scenarios, particularly in scenarios involving multiple languages or a previously unseen language such as Urdu, for which limited or no data is available. In this study, we investigate the problem of cross-lingual emotion recognition for the Urdu language and contribute URDU, the first ever spontaneous Urdu-language speech emotion database. Evaluations are performed using three different Western languages against Urdu, and experimental results on different possible scenarios suggest various interesting aspects for designing more adaptive emotion recognition systems for such limited-resource languages. The results show that selecting training instances from multiple languages can deliver results comparable to the baseline, and that augmenting the training data with a fraction of the testing-language data can help to boost accuracy for speech emotion recognition. URDU data is publicly available for further research.

preprint · 2020 · arXiv

Motion Corrected Multishot MRI Reconstruction Using Generative Networks with Sensitivity Encoding

Multishot Magnetic Resonance Imaging (MRI) is a promising imaging modality that can produce a high-resolution image with a relatively short data acquisition time. The downside of multishot MRI is that it is very sensitive to subject motion, and even small amounts of motion during the scan can produce artifacts in the final MR image that may cause misdiagnosis. Numerous efforts have been made to address this issue; however, all of these proposals are limited in terms of how much motion they can correct and the required computational time. In this paper, we propose a novel generative-network-based conjugate gradient SENSE (CG-SENSE) reconstruction framework for motion correction in multishot MRI. The proposed framework first employs CG-SENSE reconstruction to produce the motion-corrupted image, and then a generative adversarial network (GAN) is used to correct the motion artifacts. The proposed method has been rigorously evaluated on synthetically corrupted data with varying degrees of motion, numbers of shots, and encoding trajectories. Our analyses (both quantitative and qualitative/visual) establish that the proposed method is significantly robust, outperforms state-of-the-art motion correction techniques, and reduces computation time severalfold.

preprint · 2020 · arXiv

Phonocardiographic Sensing using Deep Learning for Abnormal Heartbeat Detection

Cardiac auscultation involves expert interpretation of abnormalities in heart sounds using a stethoscope. Deep learning based cardiac auscultation is of significant interest to the healthcare community as it can help reduce the burden of manual auscultation through automated detection of abnormal heartbeats. However, the problem of automatic cardiac auscultation is complicated by the requirement of reliability and high accuracy, and by the presence of background noise in the heartbeat sound. In this work, we propose a Recurrent Neural Network (RNN) based automated cardiac auscultation solution. Our choice of RNNs is motivated by the great success of deep learning in medical applications and by the observation that RNNs represent the deep learning configuration most suitable for dealing with sequential or temporal data even in the presence of noise. We explore the use of various RNN models, and demonstrate that these models deliver significantly improved abnormal heartbeat classification scores. Our proposed approach using RNNs can potentially be used for real-time abnormal heartbeat detection in the Internet of Medical Things for remote monitoring applications.

preprint · 2020 · arXiv

Polarization Independent Ground State Optical Transitions in Closely Stacked InAs/GaAs Columnar Quantum Dots

This work presents an analysis of the electronic and optical properties of InAs/GaAs columnar quantum dots (QDs) by performing multi-million-atom tight-binding simulations. The plots of the polarisation-dependent ground state optical transition strengths predict that a nearly zero degree of polarisation can be achieved at 1550 nm emission/absorption wavelength by engineering the number of QD layers in a columnar QD. These results are promising for the design of optical devices requiring polarisation insensitive optical response such as semiconductor optical amplifiers.

preprint · 2020 · arXiv

Security and Privacy in IoT Using Machine Learning and Blockchain: Threats & Countermeasures

Security and privacy of users have become significant concerns due to the involvement of Internet of Things (IoT) devices in numerous applications. Cyber threats are growing at an explosive pace, making the existing security and privacy measures inadequate. Hence, everyone on the Internet is a product for hackers. Consequently, Machine Learning (ML) algorithms are used to produce accurate outputs from large complex databases, where the generated outputs can be used to predict and detect vulnerabilities in IoT-based systems. Furthermore, Blockchain (BC) techniques are becoming popular in modern IoT applications to solve security and privacy issues. Several studies have been conducted on either ML algorithms or BC techniques. However, these studies target either security or privacy issues using ML algorithms or BC techniques, thus posing a need for a combined survey on efforts made in recent years addressing both security and privacy issues using ML algorithms and BC techniques. In this paper, we provide a summary of research efforts made from 2008 to 2019 addressing security and privacy issues using ML algorithms and BC techniques in the IoT domain. First, we discuss and categorize various security and privacy threats reported in the past twelve years in the IoT domain. Then, we classify the literature on security and privacy efforts based on ML algorithms and BC techniques in the IoT domain. Finally, we identify and illuminate several challenges and future research directions in using ML algorithms and BC techniques to address security and privacy issues in the IoT domain.

preprint · 2020 · arXiv

Volumetric Lung Nodule Segmentation using Adaptive ROI with Multi-View Residual Learning

Accurate quantification of pulmonary nodules can greatly assist the early diagnosis of lung cancer, which can enhance patient survival possibilities. A number of nodule segmentation techniques have been proposed; however, all of the existing techniques rely on a radiologist's 3-D volume of interest (VOI) input or use a constant region of interest (ROI), and only investigate the presence of nodule voxels within the given VOI. Such approaches prevent the solutions from investigating the nodule presence outside the given VOI and also include redundant structures in the VOI, which may lead to inaccurate nodule segmentation. In this work, a novel semi-automated approach for 3-D segmentation of nodules in volumetric computerized tomography (CT) lung scans is proposed. The proposed technique can be divided into two stages. In the first stage, it takes a 2-D ROI containing the nodule as input and performs a patch-wise investigation along the axial axis with a novel adaptive ROI strategy. The adaptive ROI algorithm enables the solution to dynamically select the ROI for the surrounding slices to investigate the presence of the nodule using a deep residual U-Net architecture. The first stage provides the initial estimation of the nodule, which is further utilized to extract the VOI. In the second stage, the extracted VOI is further investigated along the coronal and sagittal axes with two different networks and, finally, all the estimated masks are fed into the consensus module to produce the final volumetric segmentation of the nodule. The proposed approach has been rigorously evaluated on the LIDC dataset, which is the largest publicly available dataset. The results suggest that the approach is significantly more robust and accurate than previous state-of-the-art techniques.

preprint · 2019 · arXiv

Atomic-level Characterisation of Quantum Computer Arrays by Machine Learning

Atomic-level qubits in silicon are attractive candidates for large-scale quantum computing; however, their quantum properties and controllability are sensitive to details such as the number of donor atoms comprising a qubit and their precise location. This work combines machine learning techniques with million-atom simulations of scanning-tunnelling-microscope (STM) images of dopants to formulate a theoretical framework capable of determining the number of dopants at a particular qubit location and their positions with exact lattice-site precision. A convolutional neural network was trained on 100,000 simulated STM images, achieving a characterisation fidelity (number and absolute donor positions) of above 98% over a set of 17,600 test images including planar and blurring noise. The method established here will enable a high-precision post-fabrication characterisation of dopant qubits in silicon, with high throughput, potentially alleviating the requirements on the level of resource required for quantum-based characterisation, which may otherwise be a challenge in the context of large qubit arrays for universal quantum computing.