Source author record

Andrea Lodi

Andrea Lodi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.OC Machine Learning Artificial Intelligence eess.SP Neural and Evolutionary Computing Computer Vision Cryptography and Security Discrete Mathematics eess.SY Information Theory math.IT Systems and Control

Catalog footprint

What is connected

19works

12topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Solving Max-Cut to Global Optimality via Feasibility-Preserving Graph Neural Networks

Exact solution of hard combinatorial optimization problems often relies on strong convex relaxations, but solving these relaxations repeatedly inside a branch-and-bound algorithm can be prohibitively expensive. Hence, we consider this challenge for Max-Cut, where branch and bound commonly uses semidefinite programming (SDP) relaxations to bound subproblems. We propose a Max-Cut-specific graph neural network that serves as a principled, lightweight neural proxy for these SDP solvers and can be plugged directly into an exact branch-and-bound framework. The proposed architecture has update steps of complexity $\mathcal{O}(n^2 + ne)$, and predicts both primal- and dual-feasible SDP solutions. The primal SDP solutions yield feasible Max-Cut solutions via the Goemans--Williamson algorithm. In addition, it is trained in a self-supervised fashion without requiring solved SDP relaxations as labels. Empirically, we show that our architecture can substantially reduce the cost of bounding in exact Max-Cut solving by up to $10.6 \times$ compared with using the state-of-the-art SDP solver Mosek. Our work highlights the potential of learned, validity-preserving surrogates for accelerating exact optimization over structured convex relaxations.

preprint2023arXiv

Explainable prediction of Qcodes for NOTAMs using column generation

A NOtice To AirMen (NOTAM) contains important flight route related information. To search and filter them, NOTAMs are grouped into categories called QCodes. In this paper, we develop a tool to predict, with some explanations, a Qcode for a NOTAM. We present a way to extend the interpretable binary classification using column generation proposed in Dash, Gunluk, and Wei (2018) to a multiclass text classification method. We describe the techniques used to tackle the issues related to one vs-rest classification, such as multiple outputs and class imbalances. Furthermore, we introduce some heuristics, including the use of a CP-SAT solver for the subproblems, to reduce the training time. Finally, we show that our approach compares favorably with state-of-the-art machine learning algorithms like Linear SVM and small neural networks while adding the needed interpretability component.

preprint2022arXiv

A Stochastic Proximal Method for Nonsmooth Regularized Finite Sum Optimization

We consider the problem of training a deep neural network with nonsmooth regularization to retrieve a sparse and efficient sub-structure. Our regularizer is only assumed to be lower semi-continuous and prox-bounded. We combine an adaptive quadratic regularization approach with proximal stochastic gradient principles to derive a new solver, called SR2, whose convergence and worst-case complexity are established without knowledge or approximation of the gradient's Lipschitz constant. We formulate a stopping criteria that ensures an appropriate first-order stationarity measure converges to zero under certain conditions. We establish a worst-case iteration complexity of $\mathcal{O}(ε^{-2})$ that matches those of related methods like ProxGEN, where the learning rate is assumed to be related to the Lipschitz constant. Our experiments on network instances trained on CIFAR-10 and CIFAR-100 with $\ell_1$ and $\ell_0$ regularizations show that SR2 consistently achieves higher sparsity and accuracy than related methods such as ProxGEN and ProxSGD.

preprint2022arXiv

Cardinality Minimization, Constraints, and Regularization: A Survey

We survey optimization problems that involve the cardinality of variable vectors in constraints or the objective function. We provide a unified viewpoint on the general problem classes and models, and give concrete examples from diverse application fields such as signal and image processing, portfolio selection, or machine learning. The paper discusses general-purpose modeling techniques and broadly applicable as well as problem-specific exact and heuristic solution approaches. While our perspective is that of mathematical optimization, a main goal of this work is to reach out to and build bridges between the different communities in which cardinality optimization problems are frequently encountered. In particular, we highlight that modern mixed-integer programming, which is often regarded as impractical due to commonly unsatisfactory behavior of black-box solvers applied to generic problem formulations, can in fact produce provably high-quality or even optimal solutions for cardinality optimization problems, even in large-scale real-world settings. Achieving such performance typically draws on the merits of problem-specific knowledge that may stem from different fields of application and, e.g., shed light on structural properties of a model or its solutions, or lead to the development of efficient heuristics; we also provide some illustrative examples.

preprint2022arXiv

Deep Neural Networks pruning via the Structured Perspective Regularization

In Machine Learning, Artificial Neural Networks (ANNs) are a very powerful tool, broadly used in many applications. Often, the selected (deep) architectures include many layers, and therefore a large amount of parameters, which makes training, storage and inference expensive. This motivated a stream of research about compressing the original networks into smaller ones without excessively sacrificing performances. Among the many proposed compression approaches, one of the most popular is \emph{pruning}, whereby entire elements of the ANN (links, nodes, channels, \ldots) and the corresponding weights are deleted. Since the nature of the problem is inherently combinatorial (what elements to prune and what not), we propose a new pruning method based on Operational Research tools. We start from a natural Mixed-Integer-Programming model for the problem, and we use the Perspective Reformulation technique to strengthen its continuous relaxation. Projecting away the indicator variables from this reformulation yields a new regularization term, which we call the Structured Perspective Regularization, that leads to structured pruning of the initial architecture. We test our method on some ResNet architectures applied to CIFAR-10, CIFAR-100 and ImageNet datasets, obtaining competitive performances w.r.t.~the state of the art for structured pruning.

preprint2022arXiv

Lookback for Learning to Branch

The expressive and computationally inexpensive bipartite Graph Neural Networks (GNN) have been shown to be an important component of deep learning based Mixed-Integer Linear Program (MILP) solvers. Recent works have demonstrated the effectiveness of such GNNs in replacing the branching (variable selection) heuristic in branch-and-bound (B&B) solvers. These GNNs are trained, offline and on a collection of MILPs, to imitate a very good but computationally expensive branching heuristic, strong branching. Given that B&B results in a tree of sub-MILPs, we ask (a) whether there are strong dependencies exhibited by the target heuristic among the neighboring nodes of the B&B tree, and (b) if so, whether we can incorporate them in our training procedure. Specifically, we find that with the strong branching heuristic, a child node's best choice was often the parent's second-best choice. We call this the "lookback" phenomenon. Surprisingly, the typical branching GNN of Gasse et al. (2019) often misses this simple "answer". To imitate the target behavior more closely by incorporating the lookback phenomenon in GNNs, we propose two methods: (a) target smoothing for the standard cross-entropy loss function, and (b) adding a Parent-as-Target (PAT) Lookback regularizer term. Finally, we propose a model selection framework to incorporate harder-to-formulate objectives such as solving time in the final models. Through extensive experimentation on standard benchmark instances, we show that our proposal results in up to 22% decrease in the size of the B&B tree and up to 15% improvement in the solving times.

preprint2022arXiv

Machine-learning-based arc selection for constrained shortest path problems in column generation

Column generation is an iterative method used to solve a variety of optimization problems. It decomposes the problem into two parts: a master problem, and one or more pricing problems (PP). The total computing time taken by the method is divided between these two parts. In routing or scheduling applications, the problems are mostly defined on a network, and the PP is usually an NP-hard shortest path problem with resource constraints. In this work, we propose a new heuristic pricing algorithm based on machine learning. By taking advantage of the data collected during previous executions, the objective is to reduce the size of the network and accelerate the PP, keeping only the arcs that have a high chance to be part of the linear relaxation solution. The method has been applied to two specific problems: the vehicle and crew scheduling problem in public transit and the vehicle routing problem with time windows. Reductions in computational time of up to 40% can be obtained.

preprint2022arXiv

MIP-GNN: A Data-Driven Framework for Guiding Combinatorial Solvers

Mixed-integer programming (MIP) technology offers a generic way of formulating and solving combinatorial optimization problems. While generally reliable, state-of-the-art MIP solvers base many crucial decisions on hand-crafted heuristics, largely ignoring common patterns within a given instance distribution of the problem of interest. Here, we propose MIP-GNN, a general framework for enhancing such solvers with data-driven insights. By encoding the variable-constraint interactions of a given mixed-integer linear program (MILP) as a bipartite graph, we leverage state-of-the-art graph neural network architectures to predict variable biases, i.e., component-wise averages of (near) optimal solutions, indicating how likely a variable will be set to 0 or 1 in (near) optimal solutions of binary MILPs. In turn, the predicted biases stemming from a single, once-trained model are used to guide the solver, replacing heuristic components. We integrate MIP-GNN into a state-of-the-art MIP solver, applying it to tasks such as node selection and warm-starting, showing significant improvements compared to the default setting of the solver on two classes of challenging binary MILPs.

preprint2022arXiv

Revisiting local branching with a machine learning lens

Finding high-quality solutions to mixed-integer linear programming problems (MILPs) is of great importance for many practical applications. In this respect, the refinement heuristic local branching (LB) has been proposed to produce improving solutions and has been highly influential for the development of local search methods in MILP. The algorithm iteratively explores a sequence of solution neighborhoods defined by the so-called local branching constraint, namely, a linear inequality limiting the distance from a reference solution. For a LB algorithm, the choice of the neighborhood size is critical to performance. In this work, we study the relation between the size of the search neighborhood and the behavior of the underlying LB algorithm, and we devise a leaning based framework for predicting the best size for the specific instance to be solved. Furthermore, we have also investigated the relation between the time limit for exploring the LB neighborhood and the actual performance of LB scheme, and devised a strategy for adapting the time limit. We computationally show that the neighborhood size and time limit can indeed be learned, leading to improved performances and that the overall algorithm generalizes well both with respect to the instance size and, remarkably, across instances.

preprint2022arXiv

The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning as a new approach for solving combinatorial problems, either directly as solvers or by enhancing exact solvers. Based on this context, the ML4CO aims at improving state-of-the-art combinatorial optimization solvers by replacing key heuristic components. The competition featured three challenging tasks: finding the best feasible solution, producing the tightest optimality certificate, and giving an appropriate solver configuration. Three realistic datasets were considered: balanced item placement, workload apportionment, and maritime inventory routing. This last dataset was kept anonymous for the contestants.

preprint2021arXiv

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

This paper offers a methodological contribution at the intersection of machine learning and operations research. Namely, we propose a methodology to quickly predict tactical solutions to a given operational problem. In this context, the tactical solution is less detailed than the operational one but it has to be computed in very short time and under imperfect information. The problem is of importance in various applications where tactical and operational planning problems are interrelated and information about the operational problem is revealed over time. This is for instance the case in certain capacity planning and demand management systems. We formulate the problem as a two-stage optimal prediction stochastic program whose solution we predict with a supervised machine learning algorithm. The training data set consists of a large number of deterministic (second stage) problems generated by controlled probabilistic sampling. The labels are computed based on solutions to the deterministic problems (solved independently and offline) employing appropriate aggregation and subselection methods to address uncertainty. Results on our motivating application in load planning for rail transportation show that deep learning algorithms produce highly accurate predictions in very short computing time (milliseconds or less). The prediction accuracy is comparable to solutions computed by sample average approximation of the stochastic program.

preprint2021arXiv

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

This paper offers a methodological contribution at the intersection of machine learning and operations research. Namely, we propose a methodology to quickly predict expected tactical descriptions of operational solutions (TDOSs). The problem we address occurs in the context of two-stage stochastic programming where the second stage is demanding computationally. We aim to predict at a high speed the expected TDOS associated with the second stage problem, conditionally on the first stage variables. This may be used in support of the solution to the overall two-stage problem by avoiding the online generation of multiple second stage scenarios and solutions. We formulate the tactical prediction problem as a stochastic optimal prediction program, whose solution we approximate with supervised machine learning. The training dataset consists of a large number of deterministic operational problems generated by controlled probabilistic sampling. The labels are computed based on solutions to these problems (solved independently and offline), employing appropriate aggregation and subselection methods to address uncertainty. Results on our motivating application on load planning for rail transportation show that deep learning models produce accurate predictions in very short computing time (milliseconds or less). The predictive accuracy is close to the lower bounds calculated based on sample average approximation of the stochastic prediction programs.

preprint2021arXiv

Predicting the probability distribution of bus travel time to move towards reliable planning of public transport services

An important aspect of the quality of a public transport service is its reliability, which is defined as the invariability of the service attributes. Preventive measures taken during planning can reduce risks of unreliability throughout operations. In order to tackle reliability during the service planning phase, a key piece of information is the long-term prediction of the density of the travel time, which conveys the uncertainty of travel times. We introduce a reliable approach to one of the problems of service planning in public transport, namely the Multiple Depot Vehicle Scheduling Problem (MDVSP), which takes as input a set of trips and the probability density function (p.d.f.) of the travel time of each trip in order to output delay-tolerant vehicle schedules. This work empirically compares probabilistic models for the prediction of the conditional p.d.f. of the travel time, as a first step towards reliable MDVSP solutions. Two types of probabilistic models, namely similarity-based density estimation models and a smoothed Logistic Regression for probabilistic classification model, are compared on a dataset of more than 41,000 trips and 50 bus routes of the city of Montréal. The result of a vast majority of probabilistic models outperforms that of a Random Forests model, which is not inherently probabilistic, thus highlighting the added value of modeling the conditional p.d.f. of the travel time with probabilistic models. A similarity-based density estimation model using a $k$ Nearest Neighbors method and a Kernel Density Estimation predicted the best estimate of the true conditional p.d.f. on this dataset.

preprint2021arXiv

The Covering-Assignment Problem for Swarm-powered Ad-hoc Clouds: A Distributed 3D Mapping Use-case

The popularity of drones is rapidly increasing across the different sectors of the economy. Aerial capabilities and relatively low costs make drones the perfect solution to improve the efficiency of those operations that are typically carried out by humans (e.g., building inspection, photo collection). The potential of drone applications can be pushed even further when they are operated in fleets and in a fully autonomous manner, acting de facto as a drone swarm. Besides automating field operations, a drone swarm can serve as an ad-hoc cloud infrastructure built on top of computing and storage resources available across the swarm members and other connected elements. Even in the absence of Internet connectivity, this cloud can serve the workloads generated by the swarm members themselves, as well as by the field agents operating within the area of interest. By considering the practical example of a swarm-powered 3D reconstruction application, we present a new optimization problem for the efficient generation and execution, on top of swarm-powered ad-hoc cloud infrastructure, of multi-node computing workloads subject to data geolocation and clustering constraints. The objective is the minimization of the overall computing times, including both networking delays caused by the inter-drone data transmission and computation delays. We prove that the problem is NP-hard and present two combinatorial formulations to model it. Computational results on the solution of the formulations show that one of them can be used to solve, within the configured time-limit, more than 50% of the considered real-world instances involving up to two hundred images and six drones.

preprint2020arXiv

Change Point Detection by Cross-Entropy Maximization

Many offline unsupervised change point detection algorithms rely on minimizing a penalized sum of segment-wise costs. We extend this framework by proposing to minimize a sum of discrepancies between segments. In particular, we propose to select the change points so as to maximize the cross-entropy between successive segments, balanced by a penalty for introducing new change points. We propose a dynamic programming algorithm to solve this problem and analyze its complexity. Experiments on two challenging datasets demonstrate the advantages of our method compared to three state-of-the-art approaches.

preprint2020arXiv

Design and implementation of a modular interior-point solver for linear optimization

This paper introduces the algorithmic design and implementation of Tulip, an open-source interior-point solver for linear optimization. It implements a regularized homogeneous interior-point algorithm with multiple centrality corrections, and therefore handles unbounded and infeasible problems. The solver is written in Julia, thus allowing for a flexible and efficient implementation: Tulip's algorithmic framework is fully disentangled from linear algebra implementations and from a model's arithmetic. In particular, this allows to seamlessly integrate specialized routines for structured problems. Extensive computational results are reported. We find that Tulip is competitive with open-source interior-point solvers on the H. Mittelmann's benchmark of barrier linear programming solvers. Furthermore, we design specialized linear algebra routines for structured master problems in the context of Dantzig-Wolfe decomposition. These routines yield a tenfold speedup on large and dense instances that arise in power systems operation and two-stage stochastic programming, thereby outperforming state-of-the-art commercial interior point method solvers. Finally, we illustrate Tulip's ability to use different levels of arithmetic precision by solving problems in extended precision.

preprint2020arXiv

Disjunctive cuts in Mixed-Integer Conic Optimization

This paper studies disjunctive cutting planes in Mixed-Integer Conic Programming. Building on conic duality, we formulate a cut-generating conic program for separating disjunctive cuts, and investigate the impact of the normalization condition on its resolution. In particular, we show that a careful selection of normalization guarantees its solvability and conic strong duality. Then, we highlight the shortcomings of separating conic-infeasible points in an outer-approximation context, and propose conic extensions to the classical lifting and monoidal strengthening procedures. Finally, we assess the computational behavior of various normalization conditions in terms of gap closed, computing time and cut sparsity. In the process, we show that our approach is competitive with the internal lift-and-project cuts of a state-of-the-art solver.

preprint2020arXiv

Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon

This paper surveys the recent attempts, both from the machine learning and operations research communities, at leveraging machine learning to solve combinatorial optimization problems. Given the hard nature of these problems, state-of-the-art algorithms rely on handcrafted heuristics for making decisions that are otherwise too expensive to compute or mathematically not well defined. Thus, machine learning looks like a natural candidate to make such decisions in a more principled and optimized way. We advocate for pushing further the integration of machine learning and combinatorial optimization and detail a methodology to do so. A main point of the paper is seeing generic optimization problems as data points and inquiring what is the relevant distribution of problems to use for learning on a given task.

preprint2020arXiv

Reinforcement Learning Based Penetration Testing of a Microgrid Control Algorithm

Microgrids (MGs) are small-scale power systems which interconnect distributed energy resources and loads within clearly defined regions. However, the digital infrastructure used in an MG to relay sensory information and perform control commands can potentially be compromised due to a cyberattack from a capable adversary. An MG operator is interested in knowing the inherent vulnerabilities in their system and should regularly perform Penetration Testing (PT) activities to prepare for such an event. PT generally involves looking for defensive coverage blindspots in software and hardware infrastructure, however the logic in control algorithms which act upon sensory information should also be considered in PT activities. This paper demonstrates a case study of PT for an MG control algorithm by using Reinforcement Learning (RL) to uncover malicious input which compromises the effectiveness of the controller. Through trial-and-error episodic interactions with a simulated MG, we train an RL agent to find malicious input which reduces the effectiveness of the MG controller.

Andrea Lodi

What is connected

Connect this record

See the researcher in context

Building this map preview

19 published item(s)

Solving Max-Cut to Global Optimality via Feasibility-Preserving Graph Neural Networks

Explainable prediction of Qcodes for NOTAMs using column generation

A Stochastic Proximal Method for Nonsmooth Regularized Finite Sum Optimization

Cardinality Minimization, Constraints, and Regularization: A Survey

Deep Neural Networks pruning via the Structured Perspective Regularization

Lookback for Learning to Branch

Machine-learning-based arc selection for constrained shortest path problems in column generation

MIP-GNN: A Data-Driven Framework for Guiding Combinatorial Solvers

Revisiting local branching with a machine learning lens

The Machine Learning for Combinatorial Optimization Competition (ML4CO): Results and Insights

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

Predicting the probability distribution of bus travel time to move towards reliable planning of public transport services

The Covering-Assignment Problem for Swarm-powered Ad-hoc Clouds: A Distributed 3D Mapping Use-case

Change Point Detection by Cross-Entropy Maximization

Design and implementation of a modular interior-point solver for linear optimization

Disjunctive cuts in Mixed-Integer Conic Optimization

Machine Learning for Combinatorial Optimization: a Methodological Tour d'Horizon

Reinforcement Learning Based Penetration Testing of a Microgrid Control Algorithm