Source author record

Giang Nguyen

Giang Nguyen appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Artificial Intelligence, Computer Vision, math.PR, Methodology, Software Engineering, Biological Physics, Cell Behavior, cond-mat.soft, Human-Computer Interaction, Social and Information Networks

Catalog footprint

What is connected

11 works

11 topics

4 close collaborators

Preprint, 2022, arXiv

Manas: Mining Software Repositories to Assist AutoML

Today deep learning is widely used for building software. A software engineering problem with deep learning is that finding an appropriate convolutional neural network (CNN) model for the task can be a challenge for developers. Recent work on AutoML, more precisely neural architecture search (NAS), embodied by tools like Auto-Keras, aims to solve this problem by essentially viewing it as a search problem where the starting point is a default CNN model, and mutation of this CNN model allows exploration of the space of CNN models to find a CNN model that will work best for the problem. These works have had significant success in producing high-accuracy CNN models. There are two problems, however. First, NAS can be very costly, often taking several hours to complete. Second, CNN models produced by NAS can be very complex, which makes them harder to understand and costlier to train. We propose a novel approach for NAS, where instead of starting from a default CNN model, the initial model is selected from a repository of models extracted from GitHub. The intuition is that developers solving a similar problem may have developed a better starting point compared to the default model. We also analyze common layer patterns of CNN models in the wild to understand changes that the developers make to improve their models. Our approach uses commonly occurring changes as mutation operators in NAS. We have extended Auto-Keras to implement our approach. Our evaluation using 8 top-voted problems from Kaggle for tasks including image classification and image regression shows that given the same search time, without loss of accuracy, Manas produces models with 42.9% to 99.6% fewer parameters than Auto-Keras' models. Benchmarked on GPU, Manas' models train 30.3% to 641.6% faster than Auto-Keras' models.

Preprint, 2022, arXiv

The effectiveness of feature attribution methods and its correlation with automatic evaluation scores

Explaining the decisions of an Artificial Intelligence (AI) model is increasingly critical in many real-world, high-stakes applications. Hundreds of papers have proposed new feature attribution methods, or discussed or harnessed these tools in their work. However, despite humans being the target end-users, most attribution methods were only evaluated on proxy automatic-evaluation metrics (Zhang et al. 2018; Zhou et al. 2016; Petsiuk et al. 2018). In this paper, we conduct the first user study to measure attribution map effectiveness in assisting humans in ImageNet classification and Stanford Dogs fine-grained classification, and when an image is natural or adversarial (i.e., contains adversarial perturbations). Overall, feature attribution is surprisingly not more effective than showing humans nearest training-set examples. On a harder task of fine-grained dog categorization, presenting attribution maps to humans does not help, but instead hurts the performance of human-AI teams compared to AI alone. Importantly, we found automatic attribution-map evaluation measures to correlate poorly with the actual human-AI team performance. Our findings encourage the community to rigorously test their methods on the downstream human-in-the-loop applications and to rethink the existing evaluation metrics.

Preprint, 2020, arXiv

Bayesian estimation of trend components within Markovian regime-switching models for wholesale electricity prices: an application to the South Australian wholesale electricity market

We discuss and extend methods for estimating Markovian-Regime-Switching (MRS) and trend models for wholesale electricity prices. We argue that the existing methods of trend estimation used in the electricity price modelling literature either require an ambiguous definition of an extreme price, or lead to issues when implementing model selection [23]. The first main contribution of this paper is to design and infer a model which has a model-based definition of extreme prices and permits the use of model selection criteria. Due to the complexity of the MRS models, inference is not straightforward. In the existing literature, an approximate EM algorithm is used [26]. Another contribution of this paper is to implement exact inference in a Bayesian setting. This also allows the use of posterior predictive checks to assess model fit. We demonstrate the methodologies on the South Australian electricity market.

Preprint, 2020, arXiv

ContCap: A scalable framework for continual image captioning

While advanced image captioning systems are increasingly describing images coherently and exactly, recent progress in continual learning allows deep learning models to avoid catastrophic forgetting. However, the combination of image captioning with continual learning has not yet been explored. We define the task in which we consolidate continual learning and image captioning as continual image captioning. In this work, we propose ContCap, a framework that generates captions over a series of incoming new tasks, seamlessly integrating continual learning into image captioning while addressing catastrophic forgetting. After demonstrating forgetting in image captioning, we propose various techniques to overcome the forgetting dilemma, taking a simple fine-tuning schema as the baseline. We split the MS-COCO 2014 dataset to perform experiments in class-incremental settings without revisiting the data of previously provided tasks. Experiments show remarkable improvements in the performance on the old tasks, while the figures for the new tasks surprisingly surpass fine-tuning. Our framework also offers a scalable solution for continual image or video captioning.

Preprint, 2020, arXiv

Dissecting Catastrophic Forgetting in Continual Learning by Deep Visualization

Interpreting the behaviors of Deep Neural Networks (usually considered as a black box) is critical, especially as they are now being widely adopted across diverse aspects of human life. Drawing on advances in Explainable Artificial Intelligence, this paper proposes a novel technique called Auto DeepVis to dissect catastrophic forgetting in continual learning. A new method to deal with catastrophic forgetting, named critical freezing, is also introduced after investigating the dilemma with Auto DeepVis. Experiments on a captioning model meticulously present how catastrophic forgetting happens, particularly showing which components are forgetting or changing. The effectiveness of our technique is then assessed; more precisely, critical freezing achieves the best performance on both previous and coming tasks over baselines, demonstrating the value of the investigation. Our techniques could not only be supplementary to existing solutions for completely eradicating catastrophic forgetting for life-long learning but also explainable.

Preprint, 2020, arXiv

Dynamic Node Embeddings from Edge Streams

Networks evolve continuously over time with the addition, deletion, and changing of links and nodes. Such temporal networks (or edge streams) consist of a sequence of timestamped edges and are seemingly ubiquitous. Despite the importance of accurately modeling the temporal information, most embedding methods ignore it entirely or approximate the temporal network using a sequence of static snapshot graphs. In this work, we propose using the notion of temporal walks for learning dynamic embeddings from temporal networks. Temporal walks capture the temporally valid interactions (e.g., flow of information, spread of disease) in the dynamic network in a lossless fashion. Based on the notion of temporal walks, we describe a general class of embeddings called continuous-time dynamic network embeddings (CTDNEs) that completely avoid the issues and problems that arise when approximating the temporal network as a sequence of static snapshot graphs. Unlike previous work, CTDNEs learn dynamic node embeddings directly from the temporal network at the finest temporal granularity and thus use only temporally valid information. As such CTDNEs naturally support online learning of the node embeddings in a streaming real-time fashion. Finally, the experiments demonstrate the effectiveness of this class of embedding methods that leverage temporal walks as it achieves an average gain in AUC of 11.9% across all methods and graphs.
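
The temporal-walk idea in this abstract can be illustrated with a small sketch. It assumes that a walk is temporally valid when the timestamps of its consecutive edges are non-decreasing, that edges are directed, and that the next edge is chosen uniformly at random. The function names `build_index` and `temporal_walk` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def build_index(edges):
    """Group timestamped edges (u, v, t) by source node, sorted by time."""
    index = defaultdict(list)
    for u, v, t in edges:
        index[u].append((t, v))
    for u in index:
        index[u].sort()
    return index


def temporal_walk(index, start_edge, max_len, rng):
    """Sample one walk whose edge timestamps never decrease.

    Assumption: the next edge is drawn uniformly from the out-edges of the
    current node that occur at or after the current time.
    """
    u, v, t = start_edge
    walk, times = [u, v], [t]
    while len(walk) < max_len:
        # Only edges that respect time order are admissible next steps.
        candidates = [(t2, w) for t2, w in index.get(v, []) if t2 >= t]
        if not candidates:
            break
        t, v = rng.choice(candidates)
        walk.append(v)
        times.append(t)
    return walk, times


if __name__ == "__main__":
    stream = [("a", "b", 1), ("b", "c", 2), ("c", "a", 3),
              ("b", "d", 0), ("a", "c", 5)]
    idx = build_index(stream)
    print(temporal_walk(idx, ("a", "b", 1), 10, random.Random(0)))
```

Walks sampled this way could then be fed to a skip-gram style embedding model in place of ordinary random walks; that downstream step is not shown here.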

Preprint, 2020, arXiv

Estimation of Markovian-regime-switching models with independent regimes

Markovian-regime-switching (MRS) models are commonly used for modelling economic time series, including electricity prices where independent regime models are used, since they can more accurately and succinctly capture electricity price dynamics than dependent regime MRS models can. We can think of these independent regime MRS models for electricity prices as a collection of independent AR(1) processes, of which only one process is observed at each time; which one is observed is determined by a (hidden) Markov chain. Here we develop novel, computationally feasible methods for MRS models with independent regimes, including forward, backward and EM algorithms. The key idea is to augment the hidden process with a counter which records the time since the hidden Markov chain last visited each state that corresponds to an AR(1) process.
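
To make the model structure described here concrete, the following is a minimal simulation sketch, not the authors' estimation method. It assumes that every regime is a zero-mean AR(1) process with Gaussian noise, that every process advances at each step whether or not it is observed, and that the chain starts in state 0. The function name `simulate_mrs` and all parameter names are hypothetical.

```python
import random


def simulate_mrs(n, P, phi, sigma, seed=0):
    """Simulate an independent-regime MRS series of length n.

    P     : transition matrix of the hidden Markov chain (list of rows)
    phi   : AR(1) coefficient of each regime (assumed zero-mean processes)
    sigma : noise standard deviation of each regime
    Returns the hidden states, the observed values, and for each time the
    number of steps since the current state was last visited (the counter).
    """
    rng = random.Random(seed)
    k = len(P)
    state = 0                      # assumed initial state
    x = [0.0] * k                  # latent value of every AR(1) process
    counters = [0] * k             # steps since each state was last visited
    hidden, observed, lags = [], [], []
    for _ in range(n):
        # Assumption: all processes evolve, even the unobserved ones.
        x = [phi[i] * x[i] + sigma[i] * rng.gauss(0.0, 1.0) for i in range(k)]
        state = rng.choices(range(k), weights=P[state])[0]
        counters = [c + 1 for c in counters]
        lags.append(counters[state])
        counters[state] = 0
        hidden.append(state)
        observed.append(x[state])  # only the active regime is observed
    return hidden, observed, lags


if __name__ == "__main__":
    states, prices, lags = simulate_mrs(
        10, P=[[0.9, 0.1], [0.5, 0.5]], phi=[0.8, 0.2], sigma=[1.0, 5.0])
    for s, y, lag in zip(states, prices, lags):
        print(s, round(y, 3), lag)
```

The `lags` output shows why a counter is useful: the distribution of an observation from a regime depends on how many steps have passed since that regime was last observed.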

Preprint, 2020, arXiv

Repairing Deep Neural Networks: Fix Patterns and Challenges

Significant interest in applying Deep Neural Networks (DNNs) has fueled the need to support engineering of software that uses DNNs. Repairing software that uses DNNs is one such unmistakable SE need where automated tools could be beneficial; however, we do not fully understand challenges to repairing and patterns that are utilized when manually repairing DNNs. What challenges should automated repair tools address? What are the repair patterns whose automation could help developers? Which repair patterns should be assigned a higher priority for building automated bug repair tools? This work presents a comprehensive study of bug fix patterns to address these questions. We have studied 415 repairs from Stack Overflow and 555 repairs from GitHub for five popular deep learning libraries (Caffe, Keras, TensorFlow, Theano, and Torch) to understand challenges in repairs and bug repair patterns. Our key findings reveal that DNN bug fix patterns are distinctive compared to traditional bug fix patterns; the most common bug fix patterns are fixing data dimension and neural network connectivity; DNN bug fixes have the potential to introduce adversarial vulnerabilities; DNN bug fixes frequently introduce new bugs; and DNN bug localization, reuse of trained model, and coping with frequent releases are major challenges faced by developers when fixing bugs. We also contribute a benchmark of 667 DNN (bug, repair) instances.

Preprint, 2015, arXiv

Ligand mediated adhesive mechanics of two deformed spheres

A self-consistent model is developed to investigate attachment/detachment kinetics of two soft, deformable microspheres with irregular surfaces and coated with flexible binding ligands. The model highlights how the microscale binding kinetics of these ligands as well as the attractive/repulsive potential of the charged surface affects the static deformed configuration of the spheres. It is shown that in the limit of a smooth, neutrally charged surface (i.e., the Debye length, $\kappa \rightarrow \infty$), interacting via elastic binders (i.e., the stiffness coefficient, $\lambda \rightarrow 0$), the adhesion mechanics approaches the regime of application of the JKR theory, and in this particular limit, the contact radius scales with the particle radius according to the scaling law $R_c \propto R^{\frac{2}{3}}$. We show that adhesion dominates in larger particles with highly charged surfaces and with resilient binders. Normal stress distribution within the contact area fluctuates with the binder stiffness coefficient, from a maximum at the center to a maximum at the periphery of the region. Surface heterogeneities result in a diminished adhesion with a distinct reduction in the pull-off force, larger separation gap, weaker normal stress and limited area of adhesion. These results are in agreement with the published experimental findings.
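
The two-thirds exponent quoted in this abstract is the standard zero-load result of JKR theory. The short derivation below uses notation that is assumed here rather than taken from the paper: $E^*$ is the effective elastic modulus, $\Delta\gamma$ the work of adhesion, $F$ the applied normal load, and $R$ is read as the effective radius of the contact pair.

```latex
% JKR relation between contact radius R_c and applied load F
R_c^3 = \frac{3R}{4E^*}\left( F + 3\pi\Delta\gamma R
        + \sqrt{6\pi\Delta\gamma R F + \left(3\pi\Delta\gamma R\right)^2} \right)

% Zero applied load, F = 0
R_c^3 = \frac{3R}{4E^*}\cdot 6\pi\Delta\gamma R
      = \frac{9\pi\Delta\gamma R^2}{2E^*}
\quad\Longrightarrow\quad
R_c \propto R^{2/3}
```

For two identical spheres of radius $R_p$ the effective radius is $R_p/2$, so the same exponent holds when the law is written in terms of the particle radius.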

Preprint, 2012, arXiv

Extinction probabilities of branching processes with countably infinitely many types

We present two iterative methods for computing the global and partial extinction probability vectors for Galton-Watson processes with countably infinitely many types. The probabilistic interpretation of these methods involves truncated Galton-Watson processes with finite sets of types and modified progeny generating functions. In addition, we discuss the connection of the convergence norm of the mean progeny matrix with extinction criteria. Finally, we give a sufficient condition for a population to become extinct almost surely even though its population size explodes on average, which is impossible in a branching process with finitely many types. We conclude with some numerical illustrations for our algorithmic methods.
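
As background for the iterative methods mentioned here, the sketch below shows the classical functional iteration for a Galton-Watson process with finitely many types, which is a simplification and not the paper's algorithm for infinitely many types. It relies on the standard fact that the extinction probability vector is the minimal nonnegative solution of q = G(q), where G is the progeny generating function, and that iterating from q = 0 converges to it. The function name and the input format are hypothetical.

```python
def extinction_probabilities(offspring, tol=1e-12, max_iter=100_000):
    """Approximate extinction probabilities of a finite-type Galton-Watson process.

    offspring[i] is a dict mapping a tuple of child counts per type to the
    probability that a type-i individual produces exactly those children.
    """
    k = len(offspring)
    q = [0.0] * k                      # start below the minimal fixed point
    for _ in range(max_iter):
        new_q = []
        for dist in offspring:
            # Evaluate the progeny generating function of this type at q.
            g = 0.0
            for counts, p in dist.items():
                term = p
                for j, c in enumerate(counts):
                    term *= q[j] ** c
                g += term
            new_q.append(g)
        if max(abs(a - b) for a, b in zip(new_q, q)) < tol:
            return new_q
        q = new_q
    return q


if __name__ == "__main__":
    # One type: no children with probability 1/4, two children with 3/4.
    print(extinction_probabilities([{(0,): 0.25, (2,): 0.75}]))
```

For the single-type example, q solves q = 1/4 + (3/4) q^2, whose roots are 1/3 and 1; the iteration approaches the smaller root, 1/3.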

Preprint, 2012, arXiv

On the nature of Phase-Type Poisson distributions

Matrix-form Poisson probability distributions were recently introduced as one matrix generalization of Panjer distributions. We show in this paper that under the constraint that their representation is to be nonnegative, they have a physical interpretation as extensions of PH distributions, and we name this restricted family Phase-type Poisson. We use our physical interpretation to construct an EM algorithm-based estimation procedure.
