Researcher profile

Hien Nguyen

Hien Nguyen contributes to research discovery and scholarly infrastructure.

Researcher, Affiliation not imported, Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging, Verification L1, Unclaimed author
10 works
0 followers
12 topics
4 close collaborators

Published work

10 published items

preprint, 2022, arXiv

CapsNet for Medical Image Segmentation

Convolutional Neural Networks (CNNs) have been successful in solving tasks in computer vision including medical image segmentation due to their ability to automatically extract features from unstructured data. However, CNNs are sensitive to rotation and affine transformation and their success relies on huge-scale labeled datasets capturing various input variations. This network paradigm has posed challenges at scale because acquiring annotated data for medical segmentation is expensive and subject to strict privacy regulations. Furthermore, visual representation learning with CNNs has its own flaws, e.g., it is arguable that the pooling layer in traditional CNNs tends to discard positional information and CNNs tend to fail on input images that differ in orientations and sizes. Capsule network (CapsNet) is a recent architecture that has achieved better robustness in representation learning by replacing pooling layers with dynamic routing and convolutional strides, and it has shown promising results on popular tasks such as classification, recognition, segmentation, and natural language processing. Unlike CNNs, which produce scalar outputs, CapsNet returns vector outputs, which aim to preserve the part-whole relationships. In this work, we first introduce the limitations of CNNs and fundamentals of CapsNet. We then provide recent developments of CapsNet for the task of medical image segmentation. We finally discuss various effective network architectures to implement a CapsNet for both 2D images and 3D volumetric medical image segmentation.
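
To make the "vector outputs" idea concrete, here is a minimal sketch of the squashing non-linearity commonly used in capsule networks with dynamic routing. It is not taken from this preprint: the function name `squash`, the `eps` stabiliser, and the example vectors are assumptions for illustration. The point is that each capsule's output keeps its direction while its length is compressed into [0, 1), so the length can be read as a presence score.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-9):
    """Shrink each vector to a length in [0, 1) while keeping its direction.

    Assumed form: v = (|s|^2 / (1 + |s|^2)) * s / |s|.
    eps is an illustrative guard against division by zero.
    """
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    return scale * s / np.sqrt(sq_norm + eps)

# Two hypothetical capsule pre-activations: one short, one long.
capsules = np.array([[0.1, 0.0, 0.0],
                     [3.0, 4.0, 0.0]])
squashed = squash(capsules)
lengths = np.linalg.norm(squashed, axis=-1)
print(lengths)  # short vectors shrink towards 0, long vectors approach 1
```

A scalar activation would report only how strongly a feature fired; here the direction of each squashed vector is still available to encode pose-like properties.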

preprint, 2022, arXiv

SPHERExLabTools (SLT): A Python Data Acquisition System for SPHEREx Characterization and Calibration

Selected as the next NASA Medium Class Explorer mission, SPHEREx, the Spectro-Photometer for the History of the Universe, Epoch of Reionization, and Ices Explorer, is planned for launch in early 2025. SPHEREx calibration data products include detector spectral response, non-linearity, persistence, and telescope focus error measurements. To produce these calibration products, we have developed a dedicated data acquisition and instrument control system, SPHERExLabTools (SLT). SLT implements driver-level software for control of all testbed instrumentation, graphical interfaces for control of instruments and automated measurements, real-time data visualization, processing, and data archival tools for a variety of output file formats. This work outlines the architecture of the SLT software as a framework for general purpose laboratory data acquisition and instrument control. Initial SPHEREx calibration products acquired while using SLT are also presented.

preprint, 2021, arXiv

Explaining the data or explaining a model? Shapley values that uncover non-linear dependencies

Shapley values have become increasingly popular in the machine learning literature thanks to their attractive axiomatisation, flexibility, and uniqueness in satisfying certain notions of 'fairness'. The flexibility arises from the myriad potential forms of the Shapley value game formulation. Amongst the consequences of this flexibility is that there are now many types of Shapley values being discussed, with such variety being a source of potential misunderstanding. To the best of our knowledge, all existing game formulations in the machine learning and statistics literature fall into a category which we name the model-dependent category of game formulations. In this work, we consider an alternative and novel formulation which leads to the first instance of what we call model-independent Shapley values. These Shapley values use a (non-parametric) measure of non-linear dependence as the characteristic function. The strength of these Shapley values is in their ability to uncover and attribute non-linear dependencies amongst features. We introduce and demonstrate the use of the energy distance correlations, affine-invariant distance correlation, and Hilbert-Schmidt independence criterion as Shapley value characteristic functions. In particular, we demonstrate their potential value for exploratory data analysis and model diagnostics. We conclude with an interesting expository application to a classical medical survey data set.

preprint, 2021, arXiv

Shapley values for feature selection: The good, the bad, and the axioms

The Shapley value has become popular in the Explainable AI (XAI) literature, thanks, to a large extent, to a solid theoretical foundation, including four "favourable and fair" axioms for attribution in transferable utility games. The Shapley value is provably the only solution concept satisfying these axioms. In this paper, we introduce the Shapley value and draw attention to its recent uses as a feature selection tool. We call into question this use of the Shapley value, using simple, abstract "toy" counterexamples to illustrate that the axioms may work against the goals of feature selection. From this, we develop a number of insights that are then investigated in concrete simulation settings, with a variety of Shapley value formulations, including SHapley Additive exPlanations (SHAP) and Shapley Additive Global importancE (SAGE).

preprint, 2020, arXiv

DVNet: A Memory-Efficient Three-Dimensional CNN for Large-Scale Neurovascular Reconstruction

Maps of brain microarchitecture are important for understanding neurological function and behavior, including alterations caused by chronic conditions such as neurodegenerative disease. Techniques such as knife-edge scanning microscopy (KESM) provide the potential for whole organ imaging at sub-cellular resolution. However, multi-terabyte data sizes make manual annotation impractical and automatic segmentation challenging. Densely packed cells combined with interconnected microvascular networks are a challenge for current segmentation algorithms. The massive size of high-throughput microscopy data necessitates fast and largely unsupervised algorithms. In this paper, we investigate a fully-convolutional, deep, and densely-connected encoder-decoder for pixel-wise semantic segmentation. The excessive memory complexity often encountered with deep and dense networks is mitigated using skip connections, resulting in fewer parameters and enabling a significant performance increase over prior architectures. The proposed network provides superior performance for semantic segmentation problems applied to open-source benchmarks. We finally demonstrate our network for cellular and microvascular segmentation, enabling quantitative metrics for organ-scale neurovascular analysis.

preprint, 2020, arXiv

SARGDV: Efficient identification of groundwater-dependent vegetation using synthetic aperture radar

Groundwater depletion impacts the sustainability of numerous groundwater-dependent vegetation (GDV) communities globally, placing significant stress on their capacity to provide environmental and ecological support for flora, fauna, and anthropic benefits. Industries such as mining, agriculture, and plantations are heavily reliant on groundwater, the over-exploitation of which risks impacting groundwater regimes, quality, and accessibility for nearby GDVs. Cost-effective methods of GDV identification will enable strategic protection of these critical ecological systems, through improved and sustainable groundwater management by communities and industry. Recent application of synthetic aperture radar (SAR) earth observation data in Australia has demonstrated the utility of radar for identifying terrestrial groundwater-dependent ecosystems at scale. We propose a robust classification method to advance identification of GDVs at scale using processed SAR data products adapted from a recent method. The method includes the development of SARGDV, a binary classification model, which uses the extreme gradient boosting (XGBoost) algorithm in conjunction with three data cubes composed of Sentinel-1 SAR interferometric wide images. The images were collected as a one-year time series over Mount Gambier, a region in South Australia known to support GDVs. The SARGDV model demonstrated high performance for classifying GDVs with 77% precision, 76% true positive rate and 96% accuracy. This method may be used to support the protection of GDV communities globally by providing a long-term, cost-effective solution to identify GDVs over variable regions and climates, via the use of freely and globally available, high-resolution Sentinel-1 SAR data sets.

preprint, 2020, arXiv

Shapley value confidence intervals for attributing variance explained

The coefficient of determination, the $R^2$, is often used to measure the variance explained by an affine combination of multiple explanatory covariates. An attribution of this explanatory contribution to each of the individual covariates is often sought in order to draw inference regarding the importance of each covariate with respect to the response phenomenon. A recent method for ascertaining such an attribution is via the game theoretic Shapley value decomposition of the coefficient of determination. Such a decomposition has the desirable efficiency, monotonicity, and equal treatment properties. Under a weak assumption that the joint distribution is pseudo-elliptical, we obtain the asymptotic normality of the Shapley values. We then utilize this result in order to construct confidence intervals and hypothesis tests regarding such quantities. Monte Carlo studies regarding our results are provided. We found that our asymptotic confidence intervals are computationally superior to competing bootstrap methods and are able to improve upon the performance of such intervals. In an expository application to Australian real estate price modelling, we employ Shapley value confidence intervals to identify significant differences between the explanatory contributions of covariates, between models, which otherwise share approximately the same $R^2$ value. These different models are based on real estate data from the same periods in 2019 and 2020, the latter covering the early stages of the arrival of the novel coronavirus, COVID-19.
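
The decomposition described in this abstract can be illustrated with a small sketch. It covers only the point estimate, by exact enumeration over covariate subsets; the asymptotic confidence intervals from the preprint are not implemented. The function names `r_squared` and `shapley_r_squared` and the synthetic data are assumptions made for illustration, not the authors' code.

```python
from itertools import combinations
from math import factorial

import numpy as np

def r_squared(X, y, subset):
    """R^2 of an ordinary least squares fit (with intercept) on the chosen columns."""
    if not subset:
        return 0.0  # intercept-only model explains no variance
    A = np.column_stack([np.ones(len(y)), X[:, list(subset)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def shapley_r_squared(X, y):
    """Attribute the full-model R^2 to each covariate via Shapley values."""
    d = X.shape[1]
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            # Shapley weight for coalitions of this size that exclude covariate j
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for S in combinations(others, size):
                gain = r_squared(X, y, S + (j,)) - r_squared(X, y, S)
                phi[j] += weight * gain
    return phi

# Synthetic data (hypothetical): y depends strongly on column 0, weakly on column 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)

phi = shapley_r_squared(X, y)
total = r_squared(X, y, (0, 1, 2))
print(phi, phi.sum(), total)
```

By the efficiency property mentioned in the abstract, the attributions sum to the full-model $R^2$. The enumeration costs on the order of $2^d$ regressions, so this sketch only suits a handful of covariates.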

preprint, 2020, arXiv

Thermal Kinetic Inductance Detectors for millimeter-wave detection

Thermal Kinetic Inductance Detectors (TKIDs) combine the excellent noise performance of traditional bolometers with a radio frequency multiplexing architecture that enables the large detector counts needed for the next generation of millimeter-wave instruments. In this paper, we first discuss the expected noise sources in TKIDs and derive the limits where the phonon noise contribution dominates over the other detector noise terms: generation-recombination, amplifier, and two-level system (TLS) noise. Second, we characterize aluminum TKIDs in a dark environment. We present measurements of TKID resonators with quality factors of about $10^5$ at 80 mK. We also discuss the bolometer thermal conductance, heat capacity, and time constants. These were measured by the use of a resistor on the thermal island to excite the bolometers. These dark aluminum TKIDs demonstrate a noise equivalent power NEP = $2 \times 10^{-17}\ \mathrm{W}/\sqrt{\mathrm{Hz}}$, with a $1/f$ knee at 0.1 Hz, which provides background noise limited performance for ground-based telescopes observing at 150 GHz.

preprint, 2010, arXiv

A high signal to noise ratio map of the Sunyaev-Zel'dovich increment at 1.1 mm wavelength in Abell 1835

We present an analysis of an 8 arcminute diameter map of the area around the galaxy cluster Abell 1835 from jiggle map observations at a wavelength of 1.1 mm using the Bolometric Camera (Bolocam) mounted on the Caltech Submillimeter Observatory (CSO). The data is well described by a model including an extended Sunyaev-Zel'dovich (SZ) signal from the cluster gas plus emission from two bright background submm galaxies magnified by the gravitational lensing of the cluster. The best-fit values for the central Compton value for the cluster and the fluxes of the two main point sources in the field: SMM J140104+0252, and SMM J14009+0252 are found to be $y_{0}=(4.34\pm0.52\pm0.69)\times10^{-4}$, 6.5$\pm{2.0}\pm0.7$ mJy and 11.3$\pm{1.9}\pm1.1$ mJy, where the first error represents the statistical measurement error and the second error represents the estimated systematic error in the result. This measurement assumes the presence of dust emission from the cluster's central cD galaxy of $1.8\pm0.5$ mJy, based on higher frequency observations of Abell 1835. The cluster image represents one of the highest-significance SZ detections of a cluster in the positive region of the thermal SZ spectrum to date. The inferred central intensity is compared to other SZ measurements of Abell 1835 and this collection of results is used to obtain values for $y_{0} = (3.60\pm0.24)\times10^{-4}$ and the cluster peculiar velocity $v_{z} = -226\pm275$ km/s.

preprint, 2010, arXiv

The Herschel-SPIRE Legacy Survey (HSLS): the scientific goals of a shallow and wide submillimeter imaging survey with SPIRE

A large sub-mm survey with Herschel will enable many exciting science opportunities, especially in an era of wide-field optical and radio surveys and high resolution cosmic microwave background experiments. The Herschel-SPIRE Legacy Survey (HSLS) will lead to imaging data over 4000 sq. degrees at 250, 350, and 500 micron. Major goals of HSLS are: (a) Produce a catalog of 2.5 to 3 million galaxies down to 26, 27 and 33 mJy (50% completeness; 5 sigma confusion noise) at 250, 350 and 500 micron, respectively, in the southern hemisphere (3000 sq. degrees) and in an equatorial strip (1000 sq. degrees), areas which have extensive multi-wavelength coverage and are easily accessible from ALMA. Two thirds of the sources are expected to be at z > 1, one third at z > 2 and about 1000 at z > 5. (b) Remove point source confusion in secondary anisotropy studies with Planck and ground-based CMB data. (c) Find at least 1200 strongly lensed bright sub-mm sources leading to a 2% test of general relativity. (d) Identify 200 proto-cluster regions at z of 2 and perform an unbiased study of the environmental dependence of star formation. (e) Perform an unbiased survey for star formation and dust at high Galactic latitude and make a census of debris disks and dust around AGB stars and white dwarfs.