Researcher profile

Bao-Gang Hu

Bao-Gang Hu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
12works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

12 published item(s)

preprint2021arXiv

A design of human-like robust AI machines in object identification

This is a perspective paper inspired from the study of Turing Test proposed by A.M. Turing (23 June 1912 - 7 June 1954) in 1950. Following one important implication of Turing Test for enabling a machine with a human-like behavior or performance, we define human-like robustness (HLR) for AI machines. The objective of the new definition aims to enforce AI machines with HLR, including to evaluate them in terms of HLR. A specific task is discussed only on object identification, because it is the most common task for every person in daily life. Similar to the perspective, or design, position by Turing, we provide a solution of how to achieve HLR AI machines without constructing them and conducting real experiments. The solution should consists of three important features in the machines. The first feature of HLR machines is to utilize common sense from humans for realizing a causal inference. The second feature is to make a decision from a semantic space for having interpretations to the decision. The third feature is to include a "human-in-the-loop" setting for advancing HLR machines. We show an "identification game" using proposed design of HLR machines. The present paper shows an attempt to learn and explore further from Turing Test towards the design of human-like AI machines.

preprint2016arXiv

Locally Imposing Function for Generalized Constraint Neural Networks - A Study on Equality Constraints

This work is a further study on the Generalized Constraint Neural Network (GCNN) model [1], [2]. Two challenges are encountered in the study, that is, to embed any type of prior information and to select its imposing schemes. The work focuses on the second challenge and studies a new constraint imposing scheme for equality constraints. A new method called locally imposing function (LIF) is proposed to provide a local correction to the GCNN prediction function, which therefore falls within Locally Imposing Scheme (LIS). In comparison, the conventional Lagrange multiplier method is considered as Globally Imposing Scheme (GIS) because its added constraint term exhibits a global impact to its objective function. Two advantages are gained from LIS over GIS. First, LIS enables constraints to fire locally and explicitly in the domain only where they need on the prediction function. Second, constraints can be implemented within a network setting directly. We attempt to interpret several constraint methods graphically from a viewpoint of the locality principle. Numerical examples confirm the advantages of the proposed method. In solving boundary value problems with Dirichlet and Neumann constraints, the GCNN model with LIF is possible to achieve an exact satisfaction of the constraints.

preprint2016arXiv

Self-Paced Learning: an Implicit Regularization Perspective

Self-paced learning (SPL) mimics the cognitive mechanism of humans and animals that gradually learns from easy to hard samples. One key issue in SPL is to obtain better weighting strategy that is determined by minimizer function. Existing methods usually pursue this by artificially designing the explicit form of SPL regularizer. In this paper, we focus on the minimizer function, and study a group of new regularizer, named self-paced implicit regularizer that is deduced from robust loss function. Based on the convex conjugacy theory, the minimizer function for self-paced implicit regularizer can be directly learned from the latent loss function, while the analytic form of the regularizer can be even known. A general framework (named SPL-IR) for SPL is developed accordingly. We demonstrate that the learning procedure of SPL-IR is associated with latent robust loss functions, thus can provide some theoretical inspirations for its working mechanism. We further analyze the relation between SPL-IR and half-quadratic optimization. Finally, we implement SPL-IR to both supervised and unsupervised tasks, and experimental results corroborate our ideas and demonstrate the correctness and effectiveness of implicit regularizers.

preprint2015arXiv

Improving Image Restoration with Soft-Rounding

Several important classes of images such as text, barcode and pattern images have the property that pixels can only take a distinct subset of values. This knowledge can benefit the restoration of such images, but it has not been widely considered in current restoration methods. In this work, we describe an effective and efficient approach to incorporate the knowledge of distinct pixel values of the pristine images into the general regularized least squares restoration framework. We introduce a new regularizer that attains zero at the designated pixel values and becomes a quadratic penalty function in the intervals between them. When incorporated into the regularized least squares restoration framework, this regularizer leads to a simple and efficient step that resembles and extends the rounding operation, which we term as soft-rounding. We apply the soft-rounding enhanced solution to the restoration of binary text/barcode images and pattern images with multiple distinct pixel values. Experimental results show that soft-rounding enhanced restoration methods achieve significant improvement in both visual quality and quantitative measures (PSNR and SSIM). Furthermore, we show that this regularizer can also benefit the restoration of general natural images.

preprint2015arXiv

Information Theory and its Relation to Machine Learning

In this position paper, I first describe a new perspective on machine learning (ML) by four basic problems (or levels), namely, "What to learn?", "How to learn?", "What to evaluate?", and "What to adjust?". The paper stresses more on the first level of "What to learn?", or "Learning Target Selection". Towards this primary problem within the four levels, I briefly review the existing studies about the connection between information theoretical learning (ITL [1]) and machine learning. A theorem is given on the relation between the empirically-defined similarity measure and information measures. Finally, a conjecture is proposed for pursuing a unified mathematical interpretation to learning target selection.

preprint2014arXiv

A study on cost behaviors of binary classification measures in class-imbalanced problems

This work investigates into cost behaviors of binary classification measures in a background of class-imbalanced problems. Twelve performance measures are studied, such as F measure, G-means in terms of accuracy rates, and of recall and precision, balance error rate (BER), Matthews correlation coefficient (MCC), Kappa coefficient, etc. A new perspective is presented for those measures by revealing their cost functions with respect to the class imbalance ratio. Basically, they are described by four types of cost functions. The functions provides a theoretical understanding why some measures are suitable for dealing with class-imbalanced problems. Based on their cost functions, we are able to conclude that G-means of accuracy rates and BER are suitable measures because they show "proper" cost behaviors in terms of "a misclassification from a small class will cause a greater cost than that from a large class". On the contrary, F1 measure, G-means of recall and precision, MCC and Kappa coefficient measures do not produce such behaviors so that they are unsuitable to serve our goal in dealing with the problems properly.

preprint2014arXiv

Unsupervised Ranking of Multi-Attribute Objects Based on Principal Curves

Unsupervised ranking faces one critical challenge in evaluation applications, that is, no ground truth is available. When PageRank and its variants show a good solution in related subjects, they are applicable only for ranking from link-structure data. In this work, we focus on unsupervised ranking from multi-attribute data which is also common in evaluation tasks. To overcome the challenge, we propose five essential meta-rules for the design and assessment of unsupervised ranking approaches: scale and translation invariance, strict monotonicity, linear/nonlinear capacities, smoothness, and explicitness of parameter size. These meta-rules are regarded as high level knowledge for unsupervised ranking tasks. Inspired by the works in [8] and [14], we propose a ranking principal curve (RPC) model, which learns a one-dimensional manifold function to perform unsupervised ranking tasks on multi-attribute observations. Furthermore, the RPC is modeled to be a cubic Bézier curve with control points restricted in the interior of a hypercube, thereby complying with all the five meta-rules to infer a reasonable ranking list. With control points as the model parameters, one is able to understand the learned manifold and to interpret the ranking list semantically. Numerical experiments of the presented RPC model are conducted on two open datasets of different ranking applications. In comparison with the state-of-the-art approaches, the new model is able to show more reasonable ranking lists.

preprint2013arXiv

A New Approach of Deriving Bounds between Entropy and Error from Joint Distribution: Case Study for Binary Classifications

The existing upper and lower bounds between entropy and error are mostly derived through an inequality means without linking to joint distributions. In fact, from either theoretical or application viewpoint, there exists a need to achieve a complete set of interpretations to the bounds in relation to joint distributions. For this reason, in this work we propose a new approach of deriving the bounds between entropy and error from a joint distribution. The specific case study is given on binary classifications, which can justify the need of the proposed approach. Two basic types of classification errors are investigated, namely, the Bayesian and non-Bayesian errors. For both errors, we derive the closed-form expressions of upper bound and lower bound in relation to joint distributions. The solutions show that Fano's lower bound is an exact bound for any type of errors in a relation diagram of "Error Probability vs. Conditional Entropy". A new upper bound for the Bayesian error is derived with respect to the minimum prior probability, which is generally tighter than Kovalevskij's upper bound.

preprint2013arXiv

A New Strategy of Cost-Free Learning in the Class Imbalance Problem

In this work, we define cost-free learning (CFL) formally in comparison with cost-sensitive learning (CSL). The main difference between them is that a CFL approach seeks optimal classification results without requiring any cost information, even in the class imbalance problem. In fact, several CFL approaches exist in the related studies, such as sampling and some criteria-based pproaches. However, to our best knowledge, none of the existing CFL and CSL approaches are able to process the abstaining classifications properly when no information is given about errors and rejects. Based on information theory, we propose a novel CFL which seeks to maximize normalized mutual information of the targets and the decision outputs of classifiers. Using the strategy, we can deal with binary/multi-class classifications with/without abstaining. Significant features are observed from the new strategy. While the degree of class imbalance is changing, the proposed strategy is able to balance the errors and rejects accordingly and automatically. Another advantage of the strategy is its ability of deriving optimal rejection thresholds for abstaining classifications and the "equivalent" costs in binary classifications. The connection between rejection thresholds and ROC curve is explored. Empirical investigation is made on several benchmark data sets in comparison with other existing approaches. The classification results demonstrate a promising perspective of the strategy in machine learning.

preprint2012arXiv

Analytical Bounds between Entropy and Error Probability in Binary Classifications

The existing upper and lower bounds between entropy and error probability are mostly derived from the inequality of the entropy relations, which could introduce approximations into the analysis. We derive analytical bounds based on the closed-form solutions of conditional entropy without involving any approximation. Two basic types of classification errors are investigated in the context of binary classification problems, namely, Bayesian and non-Bayesian errors. We theoretically confirm that Fano's lower bound is an exact lower bound for any types of classifier in a relation diagram of "error probability vs. conditional entropy". The analytical upper bounds are achieved with respect to the minimum prior probability, which are tighter than Kovalevskij's upper bound.

preprint2012arXiv

What are the Differences between Bayesian Classifiers and Mutual-Information Classifiers?

In this study, both Bayesian classifiers and mutual information classifiers are examined for binary classifications with or without a reject option. The general decision rules in terms of distinctions on error types and reject types are derived for Bayesian classifiers. A formal analysis is conducted to reveal the parameter redundancy of cost terms when abstaining classifications are enforced. The redundancy implies an intrinsic problem of "non-consistency" for interpreting cost terms. If no data is given to the cost terms, we demonstrate the weakness of Bayesian classifiers in class-imbalanced classifications. On the contrary, mutual-information classifiers are able to provide an objective solution from the given data, which shows a reasonable balance among error types and reject types. Numerical examples of using two types of classifiers are given for confirming the theoretical differences, including the extremely-class-imbalanced cases. Finally, we briefly summarize the Bayesian classifiers and mutual-information classifiers in terms of their application advantages, respectively.

preprint2011arXiv

Information-Theoretic Measures for Objective Evaluation of Classifications

This work presents a systematic study of objective evaluations of abstaining classifications using Information-Theoretic Measures (ITMs). First, we define objective measures for which they do not depend on any free parameter. This definition provides technical simplicity for examining "objectivity" or "subjectivity" directly to classification evaluations. Second, we propose twenty four normalized ITMs, derived from either mutual information, divergence, or cross-entropy, for investigation. Contrary to conventional performance measures that apply empirical formulas based on users' intuitions or preferences, the ITMs are theoretically more sound for realizing objective evaluations of classifications. We apply them to distinguish "error types" and "reject types" in binary classifications without the need for input data of cost terms. Third, to better understand and select the ITMs, we suggest three desirable features for classification assessment measures, which appear more crucial and appealing from the viewpoint of classification applications. Using these features as "meta-measures", we can reveal the advantages and limitations of ITMs from a higher level of evaluation knowledge. Numerical examples are given to corroborate our claims and compare the differences among the proposed measures. The best measure is selected in terms of the meta-measures, and its specific properties regarding error types and reject types are analytically derived.