Source author record

Agus Sudjianto

Agus Sudjianto appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Machine Learning, Applications, Artificial Intelligence, Computation, Computation and Language, math.OC, q-fin.GN

Catalog footprint

What is connected

8 works

7 topics

4 close collaborators

preprint, 2022, arXiv

Explaining Adverse Actions in Credit Decisions Using Shapley Decomposition

When a financial institution declines an application for credit, an adverse action (AA) is said to occur. The applicant is then entitled to an explanation for the negative decision. This paper focuses on credit decisions based on a predictive model for probability of default and proposes a methodology for AA explanation. The problem involves identifying the important predictors responsible for the negative decision and is straightforward when the underlying model is additive. However, it becomes non-trivial even for linear models with interactions. We consider models with low-order interactions and develop a simple and intuitive approach based on first principles. We then show how the methodology generalizes to the well-known Shapley decomposition and the recently proposed concept of Baseline Shapley (B-Shap). Unlike other Shapley techniques in the literature for local interpretability of machine learning results, B-Shap is computationally tractable since it involves just function evaluations. An illustrative case study is used to demonstrate the usefulness of the method. The paper also discusses situations with highly correlated predictors and desirable properties of fitted models in the credit-lending context, such as monotonicity and continuity.
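As a rough illustration of why Baseline Shapley needs only function evaluations, the sketch below computes exact B-Shap values for a small model with one interaction. The scoring function `score`, the applicant point `x`, and the reference point `baseline` are hypothetical and are not taken from the paper; in an adverse-action setting the baseline would presumably be a profile that the model approves.

```python
from itertools import combinations
from math import factorial


def baseline_shapley(f, x, baseline):
    """Exact Baseline Shapley values of f at x relative to a baseline point."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the applicant's values; the rest stay at baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                s = set(s)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical score: linear terms plus one two-way interaction.
def score(z):
    return 2.0 * z[0] + 1.0 * z[1] + 4.0 * z[0] * z[1] + 0.5 * z[2]


x = [1.0, 1.0, 1.0]          # hypothetical declined applicant
baseline = [0.0, 0.0, 0.0]   # hypothetical reference point
phi = baseline_shapley(score, x, baseline)
```

In this toy case the interaction term is split equally between the two features involved, and the contributions sum to `score(x) - score(baseline)`. The enumeration grows exponentially with the number of features, so it is practical only for small models or low-order interactions.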

preprint, 2022, arXiv

Traversing the Local Polytopes of ReLU Neural Networks: A Unified Approach for Network Verification

Although neural networks (NNs) with ReLU activation functions have found success in a wide range of applications, their adoption in risk-sensitive settings has been limited by concerns about robustness and interpretability. Previous work on examining robustness and improving interpretability has partially exploited the piecewise linear functional form of ReLU NNs. In this paper, we explore the unique topological structure that ReLU NNs create in the input space, identifying the adjacency among the partitioned local polytopes and developing a traversing algorithm based on this adjacency. Our polytope traversing algorithm can be adapted to verify a wide range of network properties related to robustness and interpretability, providing a unified approach to examining network behavior. As the traversing algorithm explicitly visits all local polytopes, it returns a clear and full picture of the network behavior within the traversed region. The time and space complexity of the traversing algorithm are determined by the number of a ReLU NN's partitioning hyperplanes passing through the traversing region.
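The piecewise linear structure mentioned above can be seen in a minimal sketch for a one-hidden-layer ReLU network. The weights below are made up for illustration, and this is not the paper's traversing algorithm; it only shows that a fixed activation pattern identifies one local polytope, inside which the network reduces to a single linear model.

```python
def pre_activations(x, W1, b1):
    """Hidden-layer pre-activations W1 x + b1."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]


def relu_net(x, W1, b1, w2, b2):
    """Output of a one-hidden-layer ReLU network with a scalar output."""
    hidden = [max(p, 0.0) for p in pre_activations(x, W1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2


def activation_pattern(x, W1, b1):
    """On/off state of each hidden unit; points sharing a pattern share a polytope."""
    return tuple(int(p > 0) for p in pre_activations(x, W1, b1))


def local_linear_model(pattern, W1, b1, w2, b2):
    """Coefficients and intercept of the linear model valid inside one polytope."""
    d = len(W1[0])
    coef = [sum(a * v * row[j] for a, v, row in zip(pattern, w2, W1)) for j in range(d)]
    intercept = sum(a * v * b for a, v, b in zip(pattern, w2, b1)) + b2
    return coef, intercept


# Hypothetical weights: 2 inputs, 3 hidden units.
W1 = [[1.0, -1.0], [1.0, 1.0], [-1.0, 0.5]]
b1 = [0.0, -1.0, 0.5]
w2 = [1.0, -2.0, 3.0]
b2 = 0.1

x = [2.0, 0.5]
pattern = activation_pattern(x, W1, b1)
coef, intercept = local_linear_model(pattern, W1, b1, w2, b2)
linear_value = sum(c * xi for c, xi in zip(coef, x)) + intercept
```

Each hidden unit contributes one hyperplane, and flipping a single entry of the pattern corresponds to crossing one such hyperplane, which is the kind of adjacency a traversal could follow.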

preprint, 2020, arXiv

Adaptive Explainable Neural Networks (AxNNs)

While machine learning techniques have been successfully applied in several fields, the black-box nature of the models presents challenges for interpreting and explaining the results. We develop a new framework called Adaptive Explainable Neural Networks (AxNN) for achieving the dual goals of good predictive performance and model interpretability. For predictive performance, we build a structured neural network made up of ensembles of generalized additive model networks and additive index models (through explainable neural networks) using a two-stage process. This can be done using either a boosting or a stacking ensemble. For interpretability, we show how to decompose the results of AxNN into main effects and higher-order interaction effects. The computations are inherited from Google's open source tool AdaNet and can be efficiently accelerated by training with distributed computing. The results are illustrated on simulated and real datasets.

preprint, 2020, arXiv

An Effective and Efficient Initialization Scheme for Training Multi-layer Feedforward Neural Networks

Network initialization is the first and critical step for training neural networks. In this paper, we propose a novel network initialization scheme based on the celebrated Stein's identity. By viewing multi-layer feedforward neural networks as cascades of multi-index models, the projection weights to the first hidden layer are initialized using eigenvectors of the cross-moment matrix between the input's second-order score function and the response. The input data is then forward propagated to the next layer and such a procedure can be repeated until all the hidden layers are initialized. Finally, the weights for the output layer are initialized by generalized linear modeling. Such a proposed SteinGLM method is shown through extensive numerical results to be much faster and more accurate than other popular methods commonly used for training neural networks.
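The abstract does not state the identity it relies on. As a sketch under stated assumptions, the standard second-order form of Stein's identity is written out below for a standard Gaussian input and a multi-index model; the symbols $g$, $B$ and $\varepsilon$ are introduced here for illustration and are not taken from the paper.

```latex
% Assumptions: x ~ N(0, I_d), y = g(B^T x) + \varepsilon, E[\varepsilon | x] = 0,
% with B a d x k projection matrix and g twice differentiable.
\begin{align}
  S_2(x) &= \frac{\nabla^2 p(x)}{p(x)} = x x^{\top} - I_d
    && \text{(second-order score of } \mathcal{N}(0, I_d)\text{)} \\
  \mathbb{E}\!\left[\, y \, S_2(x) \,\right]
    &= \mathbb{E}\!\left[\, \nabla_x^2 \, g(B^{\top} x) \,\right]
    && \text{(second-order Stein's identity)} \\
    &= B \; \mathbb{E}\!\left[\, \nabla^2 g(B^{\top} x) \,\right] B^{\top}
    && \text{(chain rule)}
\end{align}
```

Under these assumptions the cross-moment matrix has rank at most $k$, and its eigenvectors with nonzero eigenvalues lie in the column space of $B$, which is why they are natural candidates for the first-layer projection weights.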

preprint, 2020, arXiv

Model Robustness with Text Classification: Semantic-preserving adversarial attacks

We propose algorithms to create adversarial attacks to assess model robustness in text classification problems. They can be used to create white-box and black-box attacks while preserving the semantics and syntax of the original text. The attacks cause a significant number of prediction flips in the white-box setting, and the same rule-based approach can be used in the black-box setting. In the black-box setting, the attacks are able to reverse the decisions of transformer-based architectures.

preprint, 2020, arXiv

Supervised Machine Learning Techniques: An Overview with Applications to Banking

This article provides an overview of Supervised Machine Learning (SML) with a focus on applications to banking. The SML techniques covered include Bagging (Random Forest or RF), Boosting (Gradient Boosting Machine or GBM) and Neural Networks (NNs). We begin with an introduction to ML tasks and techniques. This is followed by a description of: i) tree-based ensemble algorithms including Bagging with RF and Boosting with GBMs, ii) Feedforward NNs, iii) a discussion of hyper-parameter optimization techniques, and iv) machine learning interpretability. The paper concludes with a comparison of the features of different ML algorithms. Examples taken from credit risk modeling in banking are used throughout the paper to illustrate the techniques and interpret the results of the algorithms.

preprint, 2020, arXiv

Surrogate Locally-Interpretable Models with Supervised Machine Learning Algorithms

Supervised Machine Learning (SML) algorithms, such as Gradient Boosting, Random Forest, and Neural Networks, have become popular in recent years due to their superior predictive performance over traditional statistical methods. However, their complexity makes the results hard to interpret without additional tools. There has been a lot of recent work in developing global and local diagnostics for interpreting SML models. In this paper, we propose a locally-interpretable model that takes the fitted ML response surface, partitions the predictor space using model-based regression trees, and fits interpretable main-effects models at each of the nodes. We adapt the algorithm to be efficient in dealing with high-dimensional predictors. While the main focus is on interpretability, the resulting surrogate model also has reasonably good predictive performance.

preprint, 2013, arXiv

Modelling time and vintage variability in retail credit portfolios: the decomposition approach

In this paper, we consider the problem of modelling historical data on retail credit portfolio performance, with a view to forecasting future performance, and facilitating strategic decision making. We consider a situation, common in practice, where accounts with common origination date (typically month) are aggregated into a single vintage for analysis, and the data for analysis consists of a time series of a univariate portfolio performance variable (for example, the proportion of defaulting accounts) for each vintage over successive time periods since origination. An invaluable management tool for understanding portfolio behaviour can be obtained by decomposing the data series nonparametrically into components of exogenous variability (E), maturity (time since origination; M) and vintage (V), referred to as an EMV model. For example, identification of a good macroeconomic model is the key to effective forecasting, particularly in applications such as stress testing, and identification of this can be facilitated by investigation of the macroeconomic component of an EMV decomposition. We show that care needs to be taken with such a decomposition, drawing parallels with the Age-Period-Cohort approach, common in demography, epidemiology and sociology. We develop a practical decomposition strategy, and illustrate our approach using data extracted from a credit card portfolio.
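The caution about the decomposition can be illustrated with the linear-trend ambiguity familiar from Age-Period-Cohort models. The sketch below assumes an additive EMV form and uses made-up component functions (neither is taken from the paper): because calendar time equals vintage plus maturity, a linear trend can be moved between the three components without changing any fitted value.

```python
# Hypothetical component functions on integer time indices, for illustration only.
def maturity_effect(m):
    return 2 * m


def exogenous_effect(t):
    return (t % 4) - 1


def vintage_effect(v):
    return 5 - v


def emv_value(f_m, f_e, f_v, vintage, maturity):
    """Additive EMV value for one (vintage, maturity) cell."""
    calendar_time = vintage + maturity  # the identity that causes the ambiguity
    return f_m(maturity) + f_e(calendar_time) + f_v(vintage)


C = 3  # arbitrary slope shifted between the components


def shifted_maturity(m):
    return maturity_effect(m) + C * m


def shifted_exogenous(t):
    return exogenous_effect(t) - C * t


def shifted_vintage(v):
    return vintage_effect(v) + C * v


grid = [(v, m) for v in range(6) for m in range(6)]
original = [emv_value(maturity_effect, exogenous_effect, vintage_effect, v, m)
            for v, m in grid]
shifted = [emv_value(shifted_maturity, shifted_exogenous, shifted_vintage, v, m)
           for v, m in grid]
identical = original == shifted  # same data, different component estimates
```

Both sets of components reproduce the data exactly, so the data alone cannot determine the linear trends; an additional constraint or outside information is needed before the exogenous component can be interpreted as a macroeconomic effect.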
