Researcher profile

Kay Chen Tan

Kay Chen Tan contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
20works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

20 published item(s)

preprint2026arXiv

A Systematic Survey on Large Language Models for Evolutionary Optimization: From Modeling to Solving

Large Language Models (LLMs) possess substantial reasoning capabilities and are increasingly applied to optimization tasks, particularly in synergy with evolutionary computation. However, while recent surveys have explored specific aspects of this domain, they lack an integrative perspective that connects problem modeling with solving workflows. To address this gap, we present a systematic review of recent developments and organize them within a structured framework. First, we classify existing research into two primary stages: LLMs for optimization modeling and LLMs for optimization solving. Second, we divide the latter into three paradigms based on the role of the LLM: stand-alone optimizers, low-level components embedded within algorithms, and high-level managers for algorithm selection and generation. Third, for each category, we analyze representative methods, distill technical challenges, and examine their interplay with traditional approaches. Finally, we review interdisciplinary applications across the natural sciences, engineering, and machine learning. Based on this analysis, we highlight key limitations and point toward future directions for developing self-evolving agentic ecosystems. An up-to-date collection of related literature is maintained at https://github.com/ishmael233/LLM4OPT.

preprint2026arXiv

Explainable Molecular Property Prediction: Aligning Chemical Concepts with Predictions via Language Models

Providing explainable molecular property predictions is critical for many scientific domains, such as drug discovery and material science. Though transformer-based language models have shown great potential in accurate molecular property prediction, they neither provide chemically meaningful explanations nor faithfully reveal the molecular structure-property relationships. In this work, we develop a framework for explainable molecular property prediction based on language models, dubbed as Lamole, which can provide chemical concepts-aligned explanations. We take a string-based molecular representation -- Group SELFIES -- as input tokens to pretrain and fine-tune our Lamole, as it provides chemically meaningful semantics. By disentangling the information flows of Lamole, we propose combining self-attention weights and gradients for better quantification of each chemically meaningful substructure's impact on the model's output. To make the explanations more faithfully respect the structure-property relationship, we then carefully craft a marginal loss to explicitly optimize the explanations to be able to align with the chemists' annotations. We bridge the manifold hypothesis with the elaborated marginal loss to prove that the loss can align the explanations with the tangent space of the data manifold, leading to concept-aligned explanations. Experimental results over six mutagenicity datasets and one hepatotoxicity dataset demonstrate Lamole can achieve comparable classification accuracy and boost the explanation accuracy by up to 14.3%, being the state-of-the-art in explainable molecular property prediction.

preprint2026arXiv

VAR-MATH: Probing True Mathematical Reasoning in LLMS via Symbolic Multi-Instance Benchmarks

Recent advances in reinforcement learning (RL) have led to substantial improvements in the mathematical reasoning abilities of LLMs, as measured by standard benchmarks. Yet these gains often persist even when models are trained with flawed signals, such as random or inverted rewards. This raises a fundamental question: do such improvements reflect genuine reasoning, or are they merely artifacts of overfitting to benchmark-specific patterns? To answer this question, we adopt an evaluation-centric perspective and highlight two critical shortcomings in existing protocols. First, benchmark contamination arises because test problems are publicly available, thereby increasing the risk of data leakage. Second, evaluation fragility results from reliance on single-instance assessments, which are sensitive to stochastic outputs and fail to capture reasoning consistency. These limitations suggest the need for a new evaluation paradigm that can probe reasoning ability beyond memorization and one-off success. As response, we propose VAR-MATH, a symbolic evaluation framework that converts fixed numerical problems into parameterized templates and requires models to solve multiple instantiations of each. This design enforces consistency across structurally equivalent variants, mitigates contamination, and enhances robustness through bootstrapped metrics. We apply VAR-MATH to transform three popular benchmarks, AMC23, AIME24, and AIME25, into their symbolic counterparts, VAR-AMC23, VAR-AIME24, and VAR-AIME25. Experimental results show substantial performance drops for RL-trained models on these variabilized benchmarks, especially for smaller models, with average declines of 47.9\% on AMC23, 58.8\% on AIME24, and 72.9\% on AIME25. These findings indicate that some existing RL methods rely on superficial heuristics and fail to generalize beyond specific numerical forms.

preprint2024arXiv

A Hybrid Neural Coding Approach for Pattern Recognition with Spiking Neural Networks

Recently, brain-inspired spiking neural networks (SNNs) have demonstrated promising capabilities in solving pattern recognition tasks. However, these SNNs are grounded on homogeneous neurons that utilize a uniform neural coding for information representation. Given that each neural coding scheme possesses its own merits and drawbacks, these SNNs encounter challenges in achieving optimal performance such as accuracy, response time, efficiency, and robustness, all of which are crucial for practical applications. In this study, we argue that SNN architectures should be holistically designed to incorporate heterogeneous coding schemes. As an initial exploration in this direction, we propose a hybrid neural coding and learning framework, which encompasses a neural coding zoo with diverse neural coding schemes discovered in neuroscience. Additionally, it incorporates a flexible neural coding assignment strategy to accommodate task-specific requirements, along with novel layer-wise learning methods to effectively implement hybrid coding SNNs. We demonstrate the superiority of the proposed framework on image classification and sound localization tasks. Specifically, the proposed hybrid coding SNNs achieve comparable accuracy to state-of-the-art SNNs, while exhibiting significantly reduced inference latency and energy consumption, as well as high noise robustness. This study yields valuable insights into hybrid neural coding designs, paving the way for developing high-performance neuromorphic systems.

preprint2024arXiv

Towards Multi-Objective High-Dimensional Feature Selection via Evolutionary Multitasking

Evolutionary Multitasking (EMT) paradigm, an emerging research topic in evolutionary computation, has been successfully applied in solving high-dimensional feature selection (FS) problems recently. However, existing EMT-based FS methods suffer from several limitations, such as a single mode of multitask generation, conducting the same generic evolutionary search for all tasks, relying on implicit transfer mechanisms through sole solution encodings, and employing single-objective transformation, which result in inadequate knowledge acquisition, exploitation, and transfer. To this end, this paper develops a novel EMT framework for multiobjective high-dimensional feature selection problems, namely MO-FSEMT. In particular, multiple auxiliary tasks are constructed by distinct formulation methods to provide diverse search spaces and information representations and then simultaneously addressed with the original task through a multi-slover-based multitask optimization scheme. Each task has an independent population with task-specific representations and is solved using separate evolutionary solvers with different biases and search preferences. A task-specific knowledge transfer mechanism is designed to leverage the advantage information of each task, enabling the discovery and effective transmission of high-quality solutions during the search process. Comprehensive experimental results demonstrate that our MO-FSEMT framework can achieve overall superior performance compared to the state-of-the-art FS methods on 26 datasets. Moreover, the ablation studies verify the contributions of different components of the proposed MO-FSEMT.

preprint2023arXiv

Differentiable Search of Accurate and Robust Architectures

Deep neural networks (DNNs) are found to be vulnerable to adversarial attacks, and various methods have been proposed for the defense. Among these methods, adversarial training has been drawing increasing attention because of its simplicity and effectiveness. However, the performance of the adversarial training is greatly limited by the architectures of target DNNs, which often makes the resulting DNNs with poor accuracy and unsatisfactory robustness. To address this problem, we propose DSARA to automatically search for the neural architectures that are accurate and robust after adversarial training. In particular, we design a novel cell-based search space specially for adversarial training, which improves the accuracy and the robustness upper bound of the searched architectures by carefully designing the placement of the cells and the proportional relationship of the filter numbers. Then we propose a two-stage search strategy to search for both accurate and robust neural architectures. At the first stage, the architecture parameters are optimized to minimize the adversarial loss, which makes full use of the effectiveness of the adversarial training in enhancing the robustness. At the second stage, the architecture parameters are optimized to minimize both the natural loss and the adversarial loss utilizing the proposed multi-objective adversarial training method, so that the searched neural architectures are both accurate and robust. We evaluate the proposed algorithm under natural data and various adversarial attacks, which reveals the superiority of the proposed method in terms of both accurate and robust architectures. We also conclude that accurate and robust neural architectures tend to deploy very different structures near the input and the output, which has great practical significance on both hand-crafting and automatically designing of accurate and robust neural architectures.

preprint2022arXiv

A Survey on Evolutionary Neural Architecture Search

Deep Neural Networks (DNNs) have achieved great success in many applications. The architectures of DNNs play a crucial role in their performance, which is usually manually designed with rich expertise. However, such a design process is labour intensive because of the trial-and-error process, and also not easy to realize due to the rare expertise in practice. Neural Architecture Search (NAS) is a type of technology that can design the architectures automatically. Among different methods to realize NAS, Evolutionary Computation (EC) methods have recently gained much attention and success. Unfortunately, there has not yet been a comprehensive summary of the EC-based NAS algorithms. This paper reviews over 200 papers of most recent EC-based NAS methods in light of the core components, to systematically discuss their design principles as well as justifications on the design. Furthermore, current challenges and issues are also discussed to identify future research in this emerging field.

preprint2022arXiv

Architecture Augmentation for Performance Predictor Based on Graph Isomorphism

Neural Architecture Search (NAS) can automatically design architectures for deep neural networks (DNNs) and has become one of the hottest research topics in the current machine learning community. However, NAS is often computationally expensive because a large number of DNNs require to be trained for obtaining performance during the search process. Performance predictors can greatly alleviate the prohibitive cost of NAS by directly predicting the performance of DNNs. However, building satisfactory performance predictors highly depends on enough trained DNN architectures, which are difficult to obtain in most scenarios. To solve this critical issue, we propose an effective DNN architecture augmentation method named GIAug in this paper. Specifically, we first propose a mechanism based on graph isomorphism, which has the merit of efficiently generating a factorial of $\boldsymbol n$ (i.e., $\boldsymbol n!$) diverse annotated architectures upon a single architecture having $\boldsymbol n$ nodes. In addition, we also design a generic method to encode the architectures into the form suitable to most prediction models. As a result, GIAug can be flexibly utilized by various existing performance predictors-based NAS algorithms. We perform extensive experiments on CIFAR-10 and ImageNet benchmark datasets on small-, medium- and large-scale search space. The experiments show that GIAug can significantly enhance the performance of most state-of-the-art peer predictors. In addition, GIAug can save three magnitude order of computation cost at most on ImageNet yet with similar performance when compared with state-of-the-art NAS algorithms.

preprint2022arXiv

Unravelling Distance-Dependent Inter-Site Interactions and Magnetic Transition Effects of Heteronuclear Single Atom Catalysts on Electrochemical Oxygen Reduction

Inter-site interactions between single atom catalysts (SACs) in the high loading regime are critical to tuning the catalytic performance. However, the understanding on such interactions and their distance dependent effects remains elusive, especially for the heteronuclear SACs. In this study, we reveal the effects of the distance-dependent inter-site interaction on the catalytic performance of SACs. Using the density functional theory calculations, we systematically investigate the heteronuclear iron and cobalt single atoms co-supported on the nitrogen-doped graphene (FeN4-C and CoN4-C) for oxygen reduction reaction (ORR). We find that as the distance between Fe and Co SACs decreases, FeN4-C exhibits a reduced catalytic activity, which can be mitigated by the presence of an axial hydroxyl ligand, whereas the activity of CoN4-C shows a volcano-like evolution with the optimum reached at the intermediate distance. We further unravel that the transition towards the high-spin state upon adsorption of ORR intermediate adsorbates is responsible for the decreased activity of both FeN4-C and CoN4-C at short inter-site distance. Such high-spin state transition is also found to significantly shift the linear relation between hydroxyl (*OH) and hydroperoxyl (*OOH) adsorbates. These findings not only shed light on the SAC-specific effect of the distance-dependent inter-site interaction between heteronuclear SACs, but also pave a way towards shifting the long-standing linear relations observed in multiple-electron chemical reactions.

preprint2021arXiv

Manifold Interpolation for Large-Scale Multi-Objective Optimization via Generative Adversarial Networks

Large-scale multiobjective optimization problems (LSMOPs) are characterized as involving hundreds or even thousands of decision variables and multiple conflicting objectives. An excellent algorithm for solving LSMOPs should find Pareto-optimal solutions with diversity and escape from local optima in the large-scale search space. Previous research has shown that these optimal solutions are uniformly distributed on the manifold structure in the low-dimensional space. However, traditional evolutionary algorithms for solving LSMOPs have some deficiencies in dealing with this structural manifold, resulting in poor diversity, local optima, and inefficient searches. In this work, a generative adversarial network (GAN)-based manifold interpolation framework is proposed to learn the manifold and generate high-quality solutions on this manifold, thereby improving the performance of evolutionary algorithms. We compare the proposed algorithm with several state-of-the-art algorithms on large-scale multiobjective benchmark functions. Experimental results have demonstrated the significant improvements achieved by this framework in solving LSMOPs.

preprint2021arXiv

Multi-Space Evolutionary Search for Large-Scale Optimization

In recent years, to improve the evolutionary algorithms used to solve optimization problems involving a large number of decision variables, many attempts have been made to simplify the problem solution space of a given problem for the evolutionary search. In the literature, the existing approaches can generally be categorized as decomposition-based methods and dimension-reduction-based methods. The former decomposes a large-scale problem into several smaller subproblems, while the latter transforms the original high-dimensional solution space into a low-dimensional space. However, it is worth noting that a given large-scale optimization problem may not always be decomposable, and it is also difficult to guarantee that the global optimum of the original problem is preserved in the reduced low-dimensional problem space. This paper thus proposes a new search paradigm, namely the multi-space evolutionary search, to enhance the existing evolutionary search methods for solving large-scale optimization problems. In contrast to existing approaches that perform an evolutionary search in a single search space, the proposed paradigm is designed to conduct a search in multiple solution spaces that are derived from the given problem, each possessing a unique landscape. The proposed paradigm makes no assumptions about the large-scale optimization problem of interest, such as that the problem is decomposable or that a certain relationship exists among the decision variables. To verify the efficacy of the proposed paradigm, comprehensive empirical studies in comparison to four state-of-the-art algorithms were conducted using the CEC2013 large-scale benchmark problems.

preprint2020arXiv

A Tandem Learning Rule for Effective Training and Rapid Inference of Deep Spiking Neural Networks

Spiking neural networks (SNNs) represent the most prominent biologically inspired computing model for neuromorphic computing (NC) architectures. However, due to the non-differentiable nature of spiking neuronal functions, the standard error back-propagation algorithm is not directly applicable to SNNs. In this work, we propose a tandem learning framework, that consists of an SNN and an Artificial Neural Network (ANN) coupled through weight sharing. The ANN is an auxiliary structure that facilitates the error back-propagation for the training of the SNN at the spike-train level. To this end, we consider the spike count as the discrete neural representation in the SNN, and design ANN neuronal activation function that can effectively approximate the spike count of the coupled SNN. The proposed tandem learning rule demonstrates competitive pattern recognition and regression capabilities on both the conventional frame-based and event-based vision datasets, with at least an order of magnitude reduced inference time and total synaptic operations over other state-of-the-art SNN implementations. Therefore, the proposed tandem learning rule offers a novel solution to training efficient, low latency, and high accuracy deep SNNs with low computing resources.

preprint2020arXiv

BiLO-CPDP: Bi-Level Programming for Automated Model Discovery in Cross-Project Defect Prediction

Cross-Project Defect Prediction (CPDP), which borrows data from similar projects by combining a transfer learner with a classifier, have emerged as a promising way to predict software defects when the available data about the target project is insufficient. How-ever, developing such a model is challenge because it is difficult to determine the right combination of transfer learner and classifier along with their optimal hyper-parameter settings. In this paper, we propose a tool, dubbedBiLO-CPDP, which is the first of its kind to formulate the automated CPDP model discovery from the perspective of bi-level programming. In particular, the bi-level programming proceeds the optimization with two nested levels in a hierarchical manner. Specifically, the upper-level optimization routine is designed to search for the right combination of transfer learner and classifier while the nested lower-level optimization routine aims to optimize the corresponding hyper-parameter settings.To evaluateBiLO-CPDP, we conduct experiments on 20 projects to compare it with a total of 21 existing CPDP techniques, along with its single-level optimization variant and Auto-Sklearn, a state-of-the-art automated machine learning tool. Empirical results show that BiLO-CPDP champions better prediction performance than all other 21 existing CPDP techniques on 70% of the projects, while be-ing overwhelmingly superior to Auto-Sklearn and its single-level optimization variant on all cases. Furthermore, the unique bi-level formalization inBiLO-CPDP also permits to allocate more budget to the upper-level, which significantly boosts the performance.

preprint2020arXiv

Constructing Accurate and Efficient Deep Spiking Neural Networks with Double-threshold and Augmented Schemes

Spiking neural networks (SNNs) are considered as a potential candidate to overcome current challenges such as the high-power consumption encountered by artificial neural networks (ANNs), however there is still a gap between them with respect to the recognition accuracy on practical tasks. A conversion strategy was thus introduced recently to bridge this gap by mapping a trained ANN to an SNN. However, it is still unclear that to what extent this obtained SNN can benefit both the accuracy advantage from ANN and high efficiency from the spike-based paradigm of computation. In this paper, we propose two new conversion methods, namely TerMapping and AugMapping. The TerMapping is a straightforward extension of a typical threshold-balancing method with a double-threshold scheme, while the AugMapping additionally incorporates a new scheme of augmented spike that employs a spike coefficient to carry the number of typical all-or-nothing spikes occurring at a time step. We examine the performance of our methods based on MNIST, Fashion-MNIST and CIFAR10 datasets. The results show that the proposed double-threshold scheme can effectively improve accuracies of the converted SNNs. More importantly, the proposed AugMapping is more advantageous for constructing accurate, fast and efficient deep SNNs as compared to other state-of-the-art approaches. Our study therefore provides new approaches for further integration of advanced techniques in ANNs to improve the performance of SNNs, which could be of great merit to applied developments with spike-based neuromorphic computing.

preprint2020arXiv

Evolutionary Multi-Objective Optimization Driven by Generative Adversarial Networks

Recently, more and more works have proposed to drive evolutionary algorithms using machine learning models.Usually, the performance of such model based evolutionary algorithms is highly dependent on the training qualities of the adopted models.Since it usually requires a certain amount of data (i.e. the candidate solutions generated by the algorithms) for model training, the performance deteriorates rapidly with the increase of the problem scales, due to the curse of dimensionality.To address this issue, we propose a multi-objective evolutionary algorithm driven by the generative adversarial networks (GANs).At each generation of the proposed algorithm, the parent solutions are first classified into \emph{real} and \emph{fake} samples to train the GANs; then the offspring solutions are sampled by the trained GANs.Thanks to the powerful generative ability of the GANs, our proposed algorithm is capable of generating promising offspring solutions in high-dimensional decision space with limited training data.The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables.Experimental results on these test problems demonstrate the effectiveness of the proposed algorithm.

preprint2020arXiv

Evolutionary Multiobjective Optimization Driven by Generative Adversarial Networks (GANs)

Recently, increasing works have proposed to drive evolutionary algorithms using machine learning models. Usually, the performance of such model based evolutionary algorithms is highly dependent on the training qualities of the adopted models. Since it usually requires a certain amount of data (i.e. the candidate solutions generated by the algorithms) for model training, the performance deteriorates rapidly with the increase of the problem scales, due to the curse of dimensionality. To address this issue, we propose a multi-objective evolutionary algorithm driven by the generative adversarial networks (GANs). At each generation of the proposed algorithm, the parent solutions are first classified into real and fake samples to train the GANs; then the offspring solutions are sampled by the trained GANs. Thanks to the powerful generative ability of the GANs, our proposed algorithm is capable of generating promising offspring solutions in high-dimensional decision space with limited training data. The proposed algorithm is tested on 10 benchmark problems with up to 200 decision variables. Experimental results on these test problems demonstrate the effectiveness of the proposed algorithm.

preprint2020arXiv

Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks

Spiking neural networks (SNNs) have shown clear advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency, due to their event-driven nature and sparse communication. However, the training of deep SNNs is not straightforward. In this paper, we propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition, which is referred to as progressive tandem learning of deep SNNs. By studying the equivalence between ANNs and SNNs in the discrete representation space, a primitive network conversion method is introduced that takes full advantage of spike count to approximate the activation value of analog neurons. To compensate for the approximation errors arising from the primitive network conversion, we further introduce a layer-wise learning method with an adaptive training scheduler to fine-tune the network weights. The progressive tandem learning framework also allows hardware constraints, such as limited weight precision and fan-in connections, to be progressively imposed during training. The SNNs thus trained have demonstrated remarkable classification and regression capabilities on large-scale object recognition, image reconstruction, and speech separation tasks, while requiring at least an order of magnitude reduced inference time and synaptic operations than other state-of-the-art SNN implementations. It, therefore, opens up a myriad of opportunities for pervasive mobile and embedded devices with a limited power budget.

preprint2020arXiv

Synaptic Learning with Augmented Spikes

Traditional neuron models use analog values for information representation and computation, while all-or-nothing spikes are employed in the spiking ones. With a more brain-like processing paradigm, spiking neurons are more promising for improvements on efficiency and computational capability. They extend the computation of traditional neurons with an additional dimension of time carried by all-or-nothing spikes. Could one benefit from both the accuracy of analog values and the time-processing capability of spikes? In this paper, we introduce a concept of augmented spikes to carry complementary information with spike coefficients in addition to spike latencies. New augmented spiking neuron model and synaptic learning rules are proposed to process and learn patterns of augmented spikes. We provide systematic insight into the properties and characteristics of our methods, including classification of augmented spike patterns, learning capacity, construction of causality, feature detection, robustness and applicability to practical tasks such as acoustic and visual pattern recognition. The remarkable results highlight the effectiveness and potential merits of our methods. Importantly, our augmented approaches are versatile and can be easily generalized to other spike-based systems, contributing to a potential development for them including neuromorphic computing.

preprint2020arXiv

Towards Efficient Processing and Learning with Spikes: New Approaches for Multi-Spike Learning

Spikes are the currency in central nervous systems for information transmission and processing. They are also believed to play an essential role in low-power consumption of the biological systems, whose efficiency attracts increasing attentions to the field of neuromorphic computing. However, efficient processing and learning of discrete spikes still remains as a challenging problem. In this paper, we make our contributions towards this direction. A simplified spiking neuron model is firstly introduced with effects of both synaptic input and firing output on membrane potential being modeled with an impulse function. An event-driven scheme is then presented to further improve the processing efficiency. Based on the neuron model, we propose two new multi-spike learning rules which demonstrate better performance over other baselines on various tasks including association, classification, feature detection. In addition to efficiency, our learning rules demonstrate a high robustness against strong noise of different types. They can also be generalized to different spike coding schemes for the classification task, and notably single neuron is capable of solving multi-category classifications with our learning rules. In the feature detection task, we re-examine the ability of unsupervised STDP with its limitations being presented, and find a new phenomenon of losing selectivity. In contrast, our proposed learning rules can reliably solve the task over a wide range of conditions without specific constraints being applied. Moreover, our rules can not only detect features but also discriminate them. The improved performance of our methods would contribute to neuromorphic computing as a preferable choice.

preprint2020arXiv

Understanding the Automated Parameter Optimization on Transfer Learning for CPDP: An Empirical Study

Data-driven defect prediction has become increasingly important in software engineering process. Since it is not uncommon that data from a software project is insufficient for training a reliable defect prediction model, transfer learning that borrows data/knowledge from other projects to facilitate the model building at the current project, namely cross-project defect prediction (CPDP), is naturally plausible. Most CPDP techniques involve two major steps, i.e., transfer learning and classification, each of which has at least one parameter to be tuned to achieve their optimal performance. This practice fits well with the purpose of automated parameter optimization. However, there is a lack of thorough understanding about what are the impacts of automated parameter optimization on various CPDP techniques. In this paper, we present the first empirical study that looks into such impacts on 62 CPDP techniques, 13 of which are chosen from the existing CPDP literature while the other 49 ones have not been explored before. We build defect prediction models over 20 real-world software projects that are of different scales and characteristics. Our findings demonstrate that: (1) Automated parameter optimization substantially improves the defect prediction performance of 77\% CPDP techniques with a manageable computational cost. Thus more efforts on this aspect are required in future CPDP studies. (2) Transfer learning is of ultimate importance in CPDP. Given a tight computational budget, it is more cost-effective to focus on optimizing the parameter configuration of transfer learning algorithms (3) The research on CPDP is far from mature where it is "not difficult" to find a better alternative by making a combination of existing transfer learning and classification techniques. This finding provides important insights about the future design of CPDP techniques.