Source author record

Shinichi Shirakawa

Shinichi Shirakawa appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Neural and Evolutionary Computing Machine Learning math.OC Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

6works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

CatCMA with Margin for Single- and Multi-Objective Mixed-Variable Black-Box Optimization

This study focuses on mixed-variable black-box optimization (MV-BBO), addressing continuous, integer, and categorical variables. Many real-world MV-BBO problems involve dependencies among these different types of variables, requiring efficient methods to optimize them simultaneously. Recently, stochastic optimization methods leveraging the mechanism of the covariance matrix adaptation evolution strategy have shown promising results in mixed-integer or mixed-category optimization. However, such methods cannot handle the three types of variables simultaneously. In this study, we propose CatCMA with Margin (CatCMAwM), a stochastic optimization method for MV-BBO that jointly optimizes continuous, integer, and categorical variables. CatCMAwM is developed by incorporating novel integer handling into CatCMA, a mixed-category black-box optimization method employing a joint distribution of multivariate Gaussian and categorical distributions. The proposed integer handling is carefully designed by reviewing existing integer handling and following the design principles of CatCMA. Furthermore, we extend CatCMAwM to multi-objective MV-BBO by instantiating it within the Sofomore framework, obtaining a multi-objective optimizer termed COMO-CatCMA with Margin (COMO-CatCMAwM). Numerical experiments on single-objective MV-BBO problems show that CatCMAwM effectively handles the three types of variables, outperforming state-of-the-art Bayesian optimization methods and baselines that simply incorporate existing integer handlings into CatCMA. Moreover, on bi-objective MV-BBO benchmarks, COMO-CatCMAwM achieves competitive or superior hypervolume compared to representative baselines.

preprint2023arXiv

CMA-ES with Margin: Lower-Bounding Marginal Probability for Mixed-Integer Black-Box Optimization

This study targets the mixed-integer black-box optimization (MI-BBO) problem where continuous and integer variables should be optimized simultaneously. The CMA-ES, our focus in this study, is a population-based stochastic search method that samples solution candidates from a multivariate Gaussian distribution (MGD), which shows excellent performance in continuous BBO. The parameters of MGD, mean and (co)variance, are updated based on the evaluation value of candidate solutions in the CMA-ES. If the CMA-ES is applied to the MI-BBO with straightforward discretization, however, the variance corresponding to the integer variables becomes much smaller than the granularity of the discretization before reaching the optimal solution, which leads to the stagnation of the optimization. In particular, when binary variables are included in the problem, this stagnation more likely occurs because the granularity of the discretization becomes wider, and the existing modification to the CMA-ES does not address this stagnation. To overcome these limitations, we propose a simple modification of the CMA-ES based on lower-bounding the marginal probabilities associated with the generation of integer variables in the MGD. The numerical experiments on the MI-BBO benchmark problems demonstrate the efficiency and robustness of the proposed method.

preprint2022arXiv

A Two-phase Framework with a Bézier Simplex-based Interpolation Method for Computationally Expensive Multi-objective Optimization

This paper proposes a two-phase framework with a Bézier simplex-based interpolation method (TPB) for computationally expensive multi-objective optimization. The first phase in TPB aims to approximate a few Pareto optimal solutions by optimizing a sequence of single-objective scalar problems. The first phase in TPB can fully exploit a state-of-the-art single-objective derivative-free optimizer. The second phase in TPB utilizes a Bézier simplex model to interpolate the solutions obtained in the first phase. The second phase in TPB fully exploits the fact that a Bézier simplex model can approximate the Pareto optimal solution set by exploiting its simplex structure when a given problem is simplicial. We investigate the performance of TPB on the 55 bi-objective BBOB problems. The results show that TPB performs significantly better than HMO-CMA-ES and some state-of-the-art meta-model-based optimizers.

preprint2022arXiv

Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling

Neural architecture search (NAS) aims to automate architecture design processes and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures with low computational resources. Although ordinary NAS methods result in tremendous computational costs owing to the repetition of model training, one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search process, has been reported to result in a lower search cost. This study focuses on the architecture complexity-aware one-shot NAS that optimizes the objective function composed of the weighted sum of two metrics, such as the predictive performance and number of parameters. In existing methods, the architecture search process must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims at reducing the search cost associated with finding multiple architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution using the samples obtained from multiple distributions based on importance sampling. The proposed method allows us to obtain multiple architectures with different complexities in a single architecture search, resulting in reducing the search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIAFR-10 and ImageNet datasets. Consequently, compared with baseline methods, the proposed method finds multiple architectures with varying complexities while requiring less computational effort.

preprint2022arXiv

Neural Architecture Search for Improving Latency-Accuracy Trade-off in Split Computing

This paper proposes a neural architecture search (NAS) method for split computing. Split computing is an emerging machine-learning inference technique that addresses the privacy and latency challenges of deploying deep learning in IoT systems. In split computing, neural network models are separated and cooperatively processed using edge servers and IoT devices via networks. Thus, the architecture of the neural network model significantly impacts the communication payload size, model accuracy, and computational load. In this paper, we address the challenge of optimizing neural network architecture for split computing. To this end, we proposed NASC, which jointly explores optimal model architecture and a split point to achieve higher accuracy while meeting latency requirements (i.e., smaller total latency of computation and communication than a certain threshold). NASC employs a one-shot NAS that does not require repeating model training for a computationally efficient architecture search. Our performance evaluation using hardware (HW)-NAS-Bench of benchmark data demonstrates that the proposed NASC can improve the ``communication latency and model accuracy" trade-off, i.e., reduce the latency by approximately 40-60% from the baseline, with slight accuracy degradation.

preprint2019arXiv

Controlling Model Complexity in Probabilistic Model-Based Dynamic Optimization of Neural Network Structures

A method of simultaneously optimizing both the structure of neural networks and the connection weights in a single training loop can reduce the enormous computational cost of neural architecture search. We focus on the probabilistic model-based dynamic neural network structure optimization that considers the probability distribution of structure parameters and simultaneously optimizes both the distribution parameters and connection weights based on gradient methods. Since the existing algorithm searches for the structures that only minimize the training loss, this method might find overly complicated structures. In this paper, we propose the introduction of a penalty term to control the model complexity of obtained structures. We formulate a penalty term using the number of weights or units and derive its analytical natural gradient. The proposed method minimizes the objective function injected the penalty term based on the stochastic gradient descent. We apply the proposed method in the unit selection of a fully-connected neural network and the connection selection of a convolutional neural network. The experimental results show that the proposed method can control model complexity while maintaining performance.

Shinichi Shirakawa

What is connected

Connect this record

See the researcher in context

Building this map preview

6 published item(s)

CatCMA with Margin for Single- and Multi-Objective Mixed-Variable Black-Box Optimization

CMA-ES with Margin: Lower-Bounding Marginal Probability for Mixed-Integer Black-Box Optimization

A Two-phase Framework with a Bézier Simplex-based Interpolation Method for Computationally Expensive Multi-objective Optimization

Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling

Neural Architecture Search for Improving Latency-Accuracy Trade-off in Split Computing

Controlling Model Complexity in Probabilistic Model-Based Dynamic Optimization of Neural Network Structures