Source author record

Marco Molinaro

Marco Molinaro appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


math.OC, Data Structures and Algorithms, astro-ph.SR, Machine Learning, astro-ph.IM, astro-ph.EP, Artificial Intelligence, astro-ph.CO, Computational Complexity, Discrete Mathematics, math.CO

Catalog footprint

What is connected

32 works

11 topics

4 close collaborators


preprint, 2026, arXiv

Online Rack Placement in Large-Scale Data Centers: Online Sampling Optimization and Deployment

This paper optimizes the configuration of large-scale data centers toward cost-effective, reliable and sustainable cloud supply chains. The problem involves placing incoming racks of servers within a data center to maximize demand coverage given space, power and cooling restrictions. We formulate an online integer optimization model to support rack placement decisions. We propose a tractable online sampling optimization (OSO) approach to multi-stage stochastic optimization, which approximates unknown parameters with a sample path and re-optimizes decisions dynamically. We prove that OSO achieves a strong competitive ratio in canonical online resource allocation problems and sublinear regret in the online batched bin packing problem. Theoretical and computational results show it can outperform mean-based certainty-equivalent resolving heuristics. Our algorithm has been packaged into a software solution deployed across Microsoft's data centers, contributing an interactive decision-making process at the human-machine interface. Using deployment data, econometric tests suggest that adoption of the solution has a negative and statistically significant impact on power stranding, estimated at 1-3 percentage points. At the scale of cloud computing, these improvements in data center performance result in significant cost savings and environmental benefits.

preprint, 2026, arXiv

Online Scheduling for LLM Inference with KV Cache Constraints

Large Language Model (LLM) inference, where a trained model generates text one word at a time in response to user prompts, is a computationally intensive process requiring efficient scheduling to optimize latency and resource utilization. A key challenge in LLM inference is the management of the Key-Value (KV) cache, which reduces redundant computations but introduces memory constraints. In this work, we model LLM inference with KV cache constraints theoretically and propose a novel batching and scheduling algorithm that minimizes inference latency while effectively managing the KV cache's memory. More specifically, we make the following contributions. First, to evaluate the performance of online algorithms for scheduling in LLM inference, we introduce a hindsight optimal benchmark, formulated as an integer program that computes the minimum total inference latency under full future information. Second, we prove that no deterministic online algorithm can achieve a constant competitive ratio when the arrival process is arbitrary. Third, motivated by the computational intractability of solving the integer program at scale, we propose a polynomial-time online scheduling algorithm and show that under certain conditions it can achieve a constant competitive ratio. We also demonstrate our algorithm's strong empirical performance by comparing it to the hindsight optimal in a synthetic dataset. Finally, we conduct empirical evaluations on a real-world public LLM inference dataset, simulating the Llama2-70B model on A100 GPUs, and show that our algorithm significantly outperforms the benchmark algorithms. Overall, our results offer a path toward more sustainable and cost-effective LLM deployment.

preprint, 2026, arXiv

OptiMind: Teaching LLMs to Think Like Optimization Experts

Mathematical programming -- the task of expressing operations and decision-making problems in precise mathematical language -- is fundamental across domains, yet remains a skill-intensive process requiring operations research expertise. Recent advances in large language models for complex reasoning have spurred interest in automating this task, translating natural language into executable optimization models. Current approaches, however, achieve limited accuracy, hindered by scarce and noisy training data without leveraging domain knowledge. In this work, we systematically integrate optimization expertise to improve formulation accuracy for mixed-integer linear programming, a key family of mathematical programs. Our OptiMind framework leverages semi-automated, class-based error analysis to guide both training and inference, explicitly preventing common mistakes within each optimization class. Our resulting fine-tuned LLM significantly improves formulation accuracy by 20.7% across multiple optimization benchmarks, with consistent gains under test-time scaling methods such as self-consistency and multi-turn feedback, enabling further progress toward robust LLM-assisted optimization formulation.

preprint, 2026, arXiv

Sample Complexity of Stochastic Optimization with Integer Variables

We establish sample complexity results for stochastic optimization over the integers, especially with a view to understanding the complexity with respect to the corresponding continuous optimization problem. We show that integer optimization can sometimes require strictly more samples and sometimes a strictly smaller number of samples, depending on the structure of the objective and constraints. 1. For Lipschitz objectives over subsets of the $\ell_\infty$ ball, the statistical complexity of general stochastic mixed-integer, nonlinear, nonconvex optimization is exactly the same as stochastic linear optimization with just bound constraints. 2. For Lipschitz objectives over subsets of the $\ell_2$ ball, we show that integer optimization can require a strictly *smaller* sample size compared to the continuous setting in a certain regime. To get to this result, we also establish tight sample complexity results for nonconvex continuous stochastic optimization which, to the best of our knowledge, do not appear in prior work. 3. For strongly convex, smooth objectives, integer optimization has high statistical complexity compared to the continuous setting. In particular, we show that integer optimization requires $Ω(1/ε^2)$ samples to report an $ε$-approximate solution, compared to the well-known $O(1/ε)$ sample complexity from the continuous optimization literature.

preprint, 2022, arXiv

IBIS-A: The IBIS data Archive. High resolution observations of the solar photosphere and chromosphere with contextual data

The IBIS data Archive (IBIS-A) stores data acquired with the Interferometric BIdimensional Spectropolarimeter (IBIS), which was operated at the Dunn Solar Telescope of the US National Solar Observatory from 2003 to 2019. The instrument provided series of high-resolution narrowband spectropolarimetric imaging observations of the photosphere and chromosphere in the range 5800-8600 Å and co-temporal broadband observations in the same spectral range and with the same field of view as the polarimetric data. We present the data currently stored in IBIS-A, as well as the interface used to explore such data and facilitate its scientific exploitation. To this end we also describe the use of IBIS-A data in recent and ongoing studies relevant to solar physics and space weather research. IBIS-A includes raw and calibrated observations, as well as science-ready data. The latter comprise maps of the circular, linear, and net circular polarization, and of the magnetic and velocity fields derived for a significant fraction of the series available in the archive. IBIS-A furthermore contains links to observations complementary to the IBIS data, such as co-temporal high-resolution observations of the solar atmosphere available from the instruments onboard the Hinode and IRIS satellites, and full-disc multiband images from INAF solar telescopes. IBIS-A currently consists of 30 TB of data taken with IBIS during 28 observing campaigns performed in 2008 and from 2012 to 2019 on 159 days. Metadata and movies of each calibrated and science-ready series are also available to help users evaluate observing conditions. IBIS-A represents a unique resource for investigating the plasma processes in the solar atmosphere and the solar origin of space weather events.

preprint, 2022, arXiv

Lipschitz Selectors may not Yield Competitive Algorithms for Convex Body Chasing

The current best algorithms for the convex body chasing problem in online algorithms use the notion of the Steiner point of a convex set. In particular, the algorithm which always moves to the Steiner point of the request set is $O(d)$ competitive for nested convex body chasing, and this is optimal among memoryless algorithms [Bubeck et al. 2020]. A memoryless algorithm coincides with the notion of a selector in functional analysis. The Steiner point is noted for being Lipschitz with respect to the Hausdorff metric, and for achieving the minimal Lipschitz constant possible. It is natural to ask whether every selector with this Lipschitz property yields a competitive algorithm for nested convex body chasing. We answer this question in the negative by exhibiting a selector which yields a non-competitive algorithm for nested convex body chasing but is Lipschitz with respect to Hausdorff distance. Furthermore, we show that being Lipschitz with respect to an $L_p$-type analog to the Hausdorff distance is sufficient to guarantee competitiveness if and only if $p=1$.

preprint, 2022, arXiv

Lower Bounds on the Size of General Branch-and-Bound Trees

A general branch-and-bound tree is a branch-and-bound tree which is allowed to use general disjunctions of the form $π^{\top} x \leq π_0 \,\vee\, π^{\top}x \geq π_0 + 1$, where $π$ is an integer vector and $π_0$ is an integer scalar, to create child nodes. We construct a packing instance, a set covering instance, and a Traveling Salesman Problem instance, such that any general branch-and-bound tree that solves these instances must be of exponential size. We also verify that an exponential lower bound on the size of general branch-and-bound trees persists when we add Gaussian noise to the coefficients of the cross polytope, thus showing that a polynomial-size "smoothed analysis" upper bound is not possible. The results in this paper can be viewed as the branch-and-bound analog of the seminal paper by Chvátal et al. (1989), who proved lower bounds for the Chvátal-Gomory rank.

preprint, 2022, arXiv

Online Demand Scheduling with Failovers

Motivated by cloud computing applications, we study the problem of how to optimally deploy new hardware subject to both power and robustness constraints. To model the situation observed in large-scale data centers, we introduce the Online Demand Scheduling with Failover problem. There are $m$ identical devices with capacity constraints. Demands come one-by-one and, to be robust against a device failure, need to be assigned to a pair of devices. When a device fails (in a failover scenario), each demand assigned to it is rerouted to its paired device (which may now run at increased capacity). The goal is to assign demands to the devices to maximize the total utilization subject to both the normal capacity constraints as well as these novel failover constraints. These latter constraints introduce new decision tradeoffs not present in classic assignment problems such as the Multiple Knapsack problem and AdWords. In the worst-case model, we design a deterministic $\approx \frac{1}{2}$-competitive algorithm, and show this is essentially tight. To circumvent this constant-factor loss, which in the context of big cloud providers represents substantial capital losses, we consider the stochastic arrival model, where all demands come i.i.d. from an unknown distribution. In this model we design an algorithm that achieves a sub-linear additive regret (i.e. as OPT or $m$ increases, the multiplicative competitive ratio goes to $1$). This requires a combination of different techniques, including a configuration LP with a non-trivial post-processing step and an online monotone matching procedure introduced by Rhee and Talagrand.

preprint, 2022, arXiv

Solving sparse principal component analysis with global support

Sparse principal component analysis with global support (SPCAgs) is the problem of finding the top-$r$ leading principal components such that all these principal components are linear combinations of a common subset of at most $k$ variables. SPCAgs is a popular dimension reduction tool in statistics that enhances interpretability compared to regular principal component analysis (PCA). Methods for solving SPCAgs in the literature are either greedy heuristics (in the special case of $r = 1$) with guarantees under restrictive statistical models or algorithms with stationary point convergence for some regularized reformulation of SPCAgs. Crucially, none of the existing computational methods can efficiently guarantee the quality of the solutions obtained by comparing them against dual bounds. In this work, we first propose a convex relaxation based on operator norms that provably approximates the feasible region of SPCAgs within a $c_1 + c_2 \sqrt{\log r} = O(\sqrt{\log r})$ factor for some constants $c_1, c_2$. To prove this result, we use a novel random sparsification procedure that uses the Pietsch-Grothendieck factorization theorem and may be of independent interest. We also propose a simpler relaxation that is second-order cone representable and gives a $(2\sqrt{r})$-approximation for the feasible region. Using these relaxations, we then propose a convex integer program that provides a dual bound for the optimal value of SPCAgs. Moreover, it also has worst-case guarantees: it is within a multiplicative/additive factor of the original optimal value, and the multiplicative factor is $O(\log r)$ or $O(r)$ depending on the relaxation used. Finally, we conduct computational experiments that show that our convex integer program provides, within a reasonable time, good upper bounds that are typically significantly better than the natural baselines.

preprint, 2022, arXiv

Time-Constrained Learning

Consider a scenario in which we have a huge labeled dataset ${\cal D}$ and a limited time to train some given learner using ${\cal D}$. Since we may not be able to use the whole dataset, how should we proceed? Questions of this nature motivate the definition of the Time-Constrained Learning Task (TCL): Given a dataset ${\cal D}$ sampled from an unknown distribution $μ$, a learner ${\cal L}$ and a time limit $T$, the goal is to obtain in at most $T$ units of time the classification model with highest possible accuracy w.r.t. $μ$, among those that can be built by ${\cal L}$ using the dataset ${\cal D}$. We propose TCT, an algorithm for the TCL task designed based on principles from Machine Teaching. We present an experimental study involving 5 different Learners and 20 datasets where we show that TCT consistently outperforms two other algorithms: the first is a Teacher for black-box learners proposed in [Dasgupta et al., ICML 19] and the second is a natural adaptation of random sampling for the TCL setting. We also compare TCT with Stochastic Gradient Descent training -- our method is again consistently better. While our work is primarily practical, we also show that a stripped-down version of TCT has provable guarantees. Under reasonable assumptions, the time our algorithm takes to achieve a certain accuracy is never much bigger than the time it takes the batch teacher (which sends a single batch of examples) to achieve similar accuracy, and in some cases it is almost exponentially better.

preprint, 2021, arXiv

Spreading the word -- current status of VO tutorials and schools

With some telescopes standing still, now more than ever simple access to archival data is vital for astronomers and they need to know how to go about it. Within European Virtual Observatory (VO) projects, such as AIDA (2008-2010), ICE (2010-2012), CoSADIE (2013-2015), ASTERICS (2015-2018) and ESCAPE (since 2019), we have been offering Virtual Observatory schools for many years. The aims of these schools are twofold: teaching (early career) researchers about the functionalities and possibilities within the Virtual Observatory and collecting feedback from the astronomical community. In addition to the VO schools on the European level, different national teams have also put effort into VO dissemination. The team at the Centre de Données astronomiques de Strasbourg (CDS) started to explore more and new ways to interact with the community: a series of blog posts on AstroBetter.com or a lunch time session at the virtual EAS meeting 2020. The Spanish VO has conducted virtual VO schools. GAVO has supported online archive workshops and maintains their Virtual Observatory Text Treasures. In this paper, we present the different formats in more detail, and report on the resulting interaction with the community as well as the estimated reach.

preprint, 2020, arXiv

Exo-MerCat: a merged exoplanet catalog with Virtual Observatory connection

The heterogeneity of papers dealing with the discovery and characterization of exoplanets makes every attempt to maintain a uniform exoplanet catalog almost impossible. Four sources currently available online (NASA Exoplanet Archive, Exoplanet Orbit Database, Exoplanet Encyclopaedia, and Open Exoplanet Catalogue) are commonly used by the community, but they can hardly be compared, due to discrepancies in notations and selection criteria. Exo-MerCat is a Python code that collects and selects the most precise measurement for all interesting planetary and orbital parameters contained in the four databases, accounting for the presence of multiple aliases for the same target. It can also download information about the host star through Virtual Observatory ConeSearch connections to the major archives such as SIMBAD and those available in VizieR. A Graphical User Interface is provided to filter data based on the user's constraints and generate automatic plots that are commonly used in the exoplanetary community. With Exo-MerCat, we retrieved a unique catalog that merges information from the four main databases, standardizing the output and handling notation differences. With the available data, Exo-MerCat corrects as many as possible of the issues that prevent a direct correspondence between multiple items in the four databases. The catalog is available as a VO resource for everyone to use and it is periodically updated, according to the update rates of the source catalogs.

preprint, 2020, arXiv

Knapsack Secretary with Bursty Adversary

The random-order or secretary model is one of the most popular beyond-worst-case models for online algorithms. While it avoids the pessimism of the traditional adversarial model, in practice we cannot expect the input to be presented in perfectly random order. This has motivated research on "best of both worlds" (algorithms with good performance on both purely stochastic and purely adversarial inputs), or even better, on inputs that are a mix of both stochastic and adversarial parts. Unfortunately the latter seems much harder to achieve and very few results of this type are known. Towards advancing our understanding of designing such robust algorithms, we propose a random-order model with bursts of adversarial time steps. The assumption of burstiness of unexpected patterns is reasonable in many contexts, since changes (e.g. a spike in demand for a good) are often triggered by a common external event. We then consider the Knapsack Secretary problem in this model: there is a knapsack of size $k$ (e.g., available quantity of a good), and in each of the $n$ time steps an item comes with its value and size in $[0,1]$ and the algorithm needs to make an irrevocable decision whether to accept or reject the item. We design an algorithm that gives an approximation of $1 - \tilde{O}(Γ/k)$ when the adversarial time steps can be covered by $Γ\ge \sqrt{k}$ intervals of size $\tilde{O}(\frac{n}{k})$. In particular, setting $Γ= \sqrt{k}$ gives a $(1 - O(\frac{\ln^2 k}{\sqrt{k}}))$-approximation that is resistant to up to a $\frac{\ln^2 k}{\sqrt{k}}$-fraction of the items being adversarial, which is almost optimal even in the absence of adversarial items. Also, setting $Γ= \tilde{Ω}(k)$ gives a constant approximation that is resistant to up to a constant fraction of items being adversarial.

preprint, 2020, arXiv

Sparse PSD approximation of the PSD cone

While semidefinite programming (SDP) problems are polynomially solvable in theory, it is often difficult to solve large SDP instances in practice. One technique to address this issue is to relax the global positive-semidefiniteness (PSD) constraint and only enforce PSD-ness on smaller $k\times k$ principal submatrices; we call this the sparse SDP relaxation. Surprisingly, it has been observed empirically that in some cases this approach appears to produce bounds that are close to the optimal objective function value of the original SDP. In this paper, we formally attempt to compare the strength of the sparse SDP relaxation vis-à-vis the original SDP from a theoretical perspective. In order to simplify the question, we arrive at a data independent version of it, where we compare the sizes of the SDP cone and the $k$-PSD closure, which is the cone of matrices where PSD-ness is enforced on all $k\times k$ principal submatrices. In particular, we investigate the question of how far a matrix of unit Frobenius norm in the $k$-PSD closure can be from the SDP cone. We provide two incomparable upper bounds on this farthest distance as a function of $k$ and $n$. We also provide matching lower bounds, which show that the upper bounds are tight within a constant in different regimes of $k$ and $n$. Other than linear algebra techniques, we extensively use probabilistic methods to arrive at these bounds. One of the lower bounds is obtained by observing a connection between matrices in the $k$-PSD closure and matrices satisfying the restricted isometry property (RIP).
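
To make the definition of the $k$-PSD closure concrete, here is a minimal Python sketch (not code from the paper). It relies on the standard fact that a symmetric matrix is PSD exactly when all of its principal minors are nonnegative, so membership in the $k$-PSD closure reduces to checking principal minors of order at most $k$. The function names `det` and `in_k_psd_closure`, the tolerance `tol`, and the example matrix `A` are illustrative assumptions.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (only for tiny matrices)."""
    n = len(m)
    if n == 0:
        return 1
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def in_k_psd_closure(m, k, tol=1e-9):
    """True if every k x k principal submatrix of the symmetric matrix m is PSD.

    A symmetric matrix is PSD iff all its principal minors are nonnegative,
    so it is enough to check the principal minors of m of order <= k.
    """
    n = len(m)
    for size in range(1, k + 1):
        for idx in combinations(range(n), size):
            sub = [[m[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True


# Hypothetical example: every 2x2 principal submatrix is PSD,
# but x = (1, 1, 1) gives x^T A x = -3, so A itself is not PSD.
A = [[1, -1, -1],
     [-1, 1, -1],
     [-1, -1, 1]]
print(in_k_psd_closure(A, 2))
print(in_k_psd_closure(A, 3))  # k = n is the ordinary PSD test
```

The brute-force enumeration is exponential in $k$ and is meant only to illustrate the gap between the $k$-PSD closure and the PSD cone on a small instance.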

preprint, 2020, arXiv

The GAPS Programme at TNG XXVIII -- A pair of hot-Neptunes orbiting the young star TOI-942

Both young stars and multi-planet systems are primary objects that allow us to study, understand and constrain planetary formation and evolution theories. We validate the physical nature of two Neptune-type planets transiting TOI-942 (TYC 5909-319-1), a previously unacknowledged young star ($50^{+30}_{-20}$ Myr) observed by the TESS space mission in Sector 5. Thanks to a comprehensive stellar characterization, TESS light curve modelling and precise radial-velocity measurements, we validated the planetary nature of the TESS candidate and detected an additional transiting planet in the system on a larger orbit. From photometric and spectroscopic observations we performed an exhaustive stellar characterization and derived the main stellar parameters. TOI-942 is a relatively active K2.5V star ($\log R'_{\rm HK} = -4.17 \pm 0.01$) with rotation period $P_{\rm rot} = 3.39 \pm 0.01$ days, a projected rotation velocity $v \sin i = 13.8 \pm 0.5$ km/s and a radius of ~0.9 $R_\odot$. We found that the inner planet, TOI-942b, has an orbital period $P_b = 4.3263 \pm 0.0011$ days, a radius $R_b = 4.242^{+0.376}_{-0.313}$ $R_\oplus$ and a mass upper limit of 16 $M_\oplus$ at 1-sigma confidence level. The outer planet, TOI-942c, has an orbital period $P_c = 10.1605^{+0.0056}_{-0.0053}$ days, a radius $R_c = 4.793^{+0.410}_{-0.351}$ $R_\oplus$ and a mass upper limit of 37 $M_\oplus$ at 1-sigma confidence level.

preprint, 2016, arXiv

Aggregation-based cutting-planes for packing and covering integer programs

In this paper, we study the strength of Chvátal-Gomory (CG) cuts and more generally aggregation cuts for packing and covering integer programs (IPs). Aggregation cuts are obtained as follows: Given an IP formulation, we first generate a single implied inequality using aggregation of the original constraints, then obtain the integer hull of the set defined by this single inequality with variable bounds, and finally use the inequalities describing the integer hull as cutting-planes. Our first main result is to show that for packing and covering IPs, the CG and aggregation closures can be 2-approximated by simply generating the respective closures for each of the original formulation constraints, without using any aggregations. On the other hand, we use computational experiments to show that aggregation cuts can be arbitrarily stronger than cuts from individual constraints for general IPs. The proof of the above stated results for the case of covering IPs with bounds requires the development of some new structural results, which may be of independent interest. Finally, we examine the strength of cuts based on k different aggregation inequalities simultaneously, the so-called multi-row cuts, and show that every packing or covering IP with a large integrality gap also has a large k-aggregation closure rank. In particular, this rank is always at least of the order of the logarithm of the integrality gap.
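
As a small illustration of the aggregate-then-round idea behind CG cuts, the sketch below derives a textbook Chvátal-Gomory cut for a packing system $Ax \le b$ with integer $x \ge 0$: aggregate the rows with nonnegative multipliers $u$, then round the coefficients and the right-hand side down. This is the standard construction rather than the paper's full aggregation-cut procedure (which takes the integer hull of the aggregated inequality); the function name `cg_cut` and the example data are assumptions.

```python
from fractions import Fraction
from math import floor


def cg_cut(A, b, u):
    """Return (coeffs, rhs) of the CG cut floor(u^T A) x <= floor(u^T b).

    Valid when x >= 0 is integer, Ax <= b, and all multipliers u are nonnegative.
    Fractions are used so that the rounding is exact.
    """
    if any(ui < 0 for ui in u):
        raise ValueError("multipliers must be nonnegative")
    rows, cols = len(A), len(A[0])
    agg = [sum(Fraction(u[i]) * A[i][j] for i in range(rows)) for j in range(cols)]
    rhs = sum(Fraction(u[i]) * b[i] for i in range(rows))
    return [floor(c) for c in agg], floor(rhs)


half = Fraction(1, 2)

# Single constraint 2*x1 + 2*x2 <= 3 scaled by 1/2 gives x1 + x2 <= 1.
print(cg_cut([[2, 2]], [3], [half]))

# Aggregating the three edge constraints of a triangle gives x1 + x2 + x3 <= 1.
triangle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(cg_cut(triangle, [1, 1, 1], [half, half, half]))
```

The second call is a case where aggregation across several constraints yields a cut that no single original constraint produces.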

preprint, 2016, arXiv

Analysis of Sparse Cutting-planes for Sparse MILPs with Applications to Stochastic MILPs

In this paper, we present an analysis of the strength of sparse cutting-planes for mixed integer linear programs (MILP) with sparse formulations. We examine three kinds of problems: packing problems, covering problems, and more general MILPs with the only assumption that the objective function is non-negative. Given a MILP instance of one of these three types, assume that we decide on the support of cutting-planes to be used and the strongest inequalities on these supports are added to the linear programming relaxation. Call the optimal objective function value of the linear programming relaxation together with these cuts $z^{cut}$. We present bounds on the ratio of $z^{cut}$ and the optimal objective function value of the MILP that depend only on the sparsity structure of the constraint matrix and the support of sparse cuts selected, that is, these bounds are completely data independent. These results also shed light on the strength of scenario-specific cuts for two stage stochastic MILPs.

preprint, 2016, arXiv

Improving the Randomization Step in Feasibility Pump

Feasibility pump (FP) is a successful primal heuristic for mixed-integer linear programs (MILP). The algorithm consists of three main components: rounding a fractional solution to a mixed-integer one, projection of infeasible solutions to the LP relaxation, and a randomization step used when the algorithm stalls. While many generalizations and improvements to the original Feasibility Pump have been proposed, they mainly focus on the rounding and projection steps. We start a more in-depth study of the randomization step in Feasibility Pump. For that, we propose a new randomization step based on the WalkSAT algorithm for solving SAT instances. First, we provide theoretical analyses that show the potential of this randomization step; to the best of our knowledge, this is the first time any theoretical analysis of the running time of Feasibility Pump or its variants has been conducted. Moreover, we also conduct computational experiments incorporating the proposed modification into a state-of-the-art Feasibility Pump code that reinforce the practical value of the new randomization step.
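
For readers unfamiliar with WalkSAT, the following is a minimal sketch of the classical algorithm on plain SAT instances. It is not the paper's randomization step for Feasibility Pump. The function name `walksat`, the noise parameter `p`, the flip budget, the seed, and the example formula are all illustrative assumptions.

```python
import random


def walksat(clauses, n_vars, p=0.5, max_flips=10_000, seed=0):
    """Local search for a satisfying assignment of a CNF formula.

    clauses: list of clauses, each a list of nonzero ints in DIMACS style
    (literal v means variable v is True, -v means variable v is False).
    Returns a dict {variable: bool}, or None if no solution was found.
    """
    rng = random.Random(seed)
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assign[abs(lit)] == (lit > 0) for lit in clause)

    def unsat_after_flip(var):
        # Number of unsatisfied clauses if `var` were flipped.
        assign[var] = not assign[var]
        count = sum(not satisfied(c) for c in clauses)
        assign[var] = not assign[var]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)
        if rng.random() < p:
            var = abs(rng.choice(clause))  # random-walk move
        else:
            var = min((abs(lit) for lit in clause), key=unsat_after_flip)  # greedy move
        assign[var] = not assign[var]
    return assign if all(satisfied(c) for c in clauses) else None


# Hypothetical formula: (x1 or x2) and (not x1 or x3) and (not x2 or not x3) and (x1 or not x3)
clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
solution = walksat(clauses, n_vars=3)
print(solution)
```

The mix of random-walk and greedy flips is what lets the search escape local minima, which is the role a randomization step plays when an iterative heuristic stalls.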

preprint, 2016, arXiv

Online and Random-order Load Balancing Simultaneously

We consider the problem of online load balancing under $\ell_p$-norms: sequential jobs need to be assigned to one of the machines and the goal is to minimize the $\ell_p$-norm of the machine loads. This generalizes the classical problem of scheduling for makespan minimization (case $\ell_\infty$) and has been thoroughly studied. However, despite the recent push for beyond worst-case analyses, no such results are known for this problem. In this paper we provide algorithms with simultaneous guarantees for the worst-case model as well as for the random-order (i.e. secretary) model, where an arbitrary set of jobs comes in random order. First, we show that the greedy algorithm (with restart), known to have optimal $O(p)$ worst-case guarantee, also has a (typically) improved random-order guarantee. However, the behavior of this algorithm in the random-order model degrades with $p$. We then propose algorithm SIMULTANEOUSLB that has simultaneously optimal guarantees (within constants) in both worst-case and random-order models. In particular, the random-order guarantee of SIMULTANEOUSLB improves as $p$ increases. One of the main components is a new algorithm with improved regret for Online Linear Optimization (OLO) over the non-negative vectors in the $\ell_q$ ball. Interestingly, this OLO algorithm is also used to prove a purely probabilistic inequality that controls the correlations arising in the random-order model, a common source of difficulty for the analysis. Another important component used in both SIMULTANEOUSLB and our OLO algorithm is a smoothing of the $\ell_p$-norm that may be of independent interest. This smoothness property allows us to see algorithm SIMULTANEOUSLB as essentially a greedy one in the worst-case model and as a primal-dual one in the random-order model, which is instrumental for its simultaneous guarantees.
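
The plain greedy rule mentioned in the abstract can be sketched as follows: each arriving job goes to the machine where it causes the smallest increase in the sum of $p$-th powers of the loads. This sketch omits the restart mechanism and does not reproduce SIMULTANEOUSLB. The input format (each job given as a list with one load per machine), the function name `greedy_lp_assign`, and the example data are assumptions made for illustration.

```python
def greedy_lp_assign(jobs, m, p):
    """Assign jobs online, greedily minimizing the l_p norm of the machine loads.

    jobs: list of jobs; job[i] is the (assumed) load the job adds to machine i.
    Returns (assignment, final l_p norm of the load vector).
    """
    loads = [0.0] * m
    assignment = []
    for job in jobs:
        # Increase in sum_i loads[i]**p if the job is placed on machine i.
        best = min(range(m), key=lambda i: (loads[i] + job[i]) ** p - loads[i] ** p)
        loads[best] += job[best]
        assignment.append(best)
    norm = sum(x ** p for x in loads) ** (1.0 / p)
    return assignment, norm


# Hypothetical instance with two machines and p = 2.
jobs = [[1, 2], [1, 2], [3, 1]]
assignment, norm = greedy_lp_assign(jobs, m=2, p=2)
print(assignment, norm)  # machine loads end up as [2, 1]
```

Each decision is irrevocable and uses only the loads seen so far, which is what makes the rule an online algorithm.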

preprint, 2016, arXiv

VIALACTEA knowledge base homogenizing access to Milky Way data

The VIALACTEA project has a work package dedicated to Tools and Infrastructure and, inside it, a task for the Database and Virtual Observatory Infrastructure. This task aims at providing an infrastructure to store all the resources needed by the more specifically scientific work packages of the project. This infrastructure combines storage facilities, relational databases and web services on top of them, and has taken, as a whole, the name of VIALACTEA Knowledge Base (VLKB). This contribution illustrates the current status of the VLKB. It details the set of data resources put together; describes the database that allows data discovery through VO-inspired metadata maintenance; and illustrates the discovery, cutout and access services built on top of the former two for users to exploit the data content.

preprint, 2015, arXiv

Advanced Environment for Knowledge Discovery in the VIALACTEA Project

The VIALACTEA project aims at building a predictive model of star formation in our galaxy. We present the innovative integrated framework and the main technologies and methodologies to reach this ambitious goal.

preprint, 2015, arXiv

How the Experts Algorithm Can Help Solve LPs Online

We consider the problem of solving packing/covering LPs online, when the columns of the constraint matrix are presented in random order. This problem has received much attention and the main focus is to figure out how large the right-hand sides of the LPs have to be (compared to the entries on the left-hand side of the constraints) to allow $(1+ε)$-approximations online. It is known that the right-hand sides have to be $Ω(ε^{-2} \log m)$ times the left-hand sides, where $m$ is the number of constraints. In this paper we give a primal-dual algorithm that achieves this bound for mixed packing/covering LPs. Our algorithms construct dual solutions using a regret-minimizing online learning algorithm in a black-box fashion, and use them to construct primal solutions. The adversarial guarantee that holds for the constructed duals helps us to take care of most of the correlations that arise in the algorithm; the remaining correlations are handled via martingale concentration and maximal inequalities. These ideas lead to conceptually simple and modular algorithms, which we hope will be useful in other contexts.
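
The abstract treats the regret-minimizing learner as a black box. One standard choice of such a learner is the Hedge (multiplicative weights) experts algorithm, sketched below on its own; how the paper maps experts to dual solutions is not shown here. The function name `hedge`, the learning rate, and the synthetic loss sequence are assumptions.

```python
import math


def hedge(loss_rounds, eta):
    """Run Hedge on a sequence of loss vectors with entries in [0, 1].

    loss_rounds: list of lists, one loss per expert in each round.
    Returns (expected loss of Hedge, total loss of the best expert in hindsight).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n
    totals = [0.0] * n
    alg_loss = 0.0
    for losses in loss_rounds:
        z = sum(weights)
        probs = [w / z for w in weights]
        alg_loss += sum(pr * l for pr, l in zip(probs, losses))
        # Multiplicative update: experts with higher loss lose weight faster.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
        totals = [t + l for t, l in zip(totals, losses)]
    return alg_loss, min(totals)


# Synthetic example: expert 0 is wrong every third round, expert 1 otherwise.
T, n = 90, 2
rounds = [[1.0, 0.0] if t % 3 == 0 else [0.0, 1.0] for t in range(T)]
eta = math.sqrt(8 * math.log(n) / T)
alg_loss, best_loss = hedge(rounds, eta)
regret_bound = math.log(n) / eta + eta * T / 8
print(alg_loss - best_loss, regret_bound)
```

The printed regret stays below the classical bound $\ln(n)/\eta + \eta T/8$, which holds for any loss sequence in $[0,1]$; an adversarial guarantee of this kind is what the abstract refers to for the constructed duals.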

preprint2014arXiv

How Good Are Sparse Cutting-Planes?

Sparse cutting-planes are often the ones used in mixed-integer programming (MIP) solvers, since they help in solving the linear programs encountered during branch-&-bound more efficiently. However, how well can we approximate the integer hull by just using sparse cutting-planes? In order to understand this question better, given a polytope $P$ (e.g. the integer hull of a MIP), let $P^k$ be its best approximation using cuts with at most $k$ non-zero coefficients. We consider $d(P, P^k) = \max_{x \in P^k} \left(\min_{y \in P} \| x - y\|\right)$ as a measure of the quality of sparse cuts. In our first result, we present general upper bounds on $d(P, P^k)$ which depend on the number of vertices in the polytope and exhibit three phases as $k$ increases. Our bounds imply that if $P$ has polynomially many vertices, using half sparsity already approximates it very well. Second, we present a lower bound on $d(P, P^k)$ for random polytopes that shows that the upper bounds are quite tight. Third, we show that for a class of hard packing IPs, sparse cutting-planes do not approximate the integer hull well, that is $d(P, P^k)$ is large for such instances unless $k$ is very close to $n$. Finally, we show that using sparse cutting-planes in extended formulations is at least as good as using them in the original polyhedron, and give an example where the former is actually much better.

preprint2014arXiv

Mixed-integer Quadratic Programming is in NP

Mixed-integer quadratic programming is the problem of optimizing a quadratic function over points in a polyhedral set where some of the components are restricted to be integral. In this paper, we prove that the decision version of mixed-integer quadratic programming is in NP, thereby showing that it is NP-complete. This is established by showing that if the decision version of mixed-integer quadratic programming is feasible, then there exists a solution of polynomial size. This result generalizes and unifies classical results that quadratic programming is in NP and integer linear programming is in NP.

preprint2014arXiv

Some lower bounds on sparse outer approximations of polytopes

Motivated by the need to better understand the properties of sparse cutting-planes used in mixed integer programming solvers, the paper [2] studied the idealized problem of how well a polytope is approximated by the use of sparse valid inequalities. As an extension to this work, we study the following less idealized questions in this paper: (1) Are there integer programs, such that sparse inequalities do not approximate the integer hull well even when added to a linear programming relaxation? (2) Are there polytopes, where the quality of approximation by sparse inequalities cannot be significantly improved by adding a budgeted number of arbitrary (possibly dense) valid inequalities? (3) Are there polytopes that are difficult to approximate under every rotation? (4) Are there polytopes that are difficult to approximate in all directions using sparse inequalities? We answer each of the above questions in the positive.

preprint2013arXiv

A statistical analysis of circumstellar material in Type Ia supernovae

A key tracer of the elusive progenitor systems of Type Ia supernovae (SNe Ia) is the detection of narrow blueshifted time-varying Na I D absorption lines, interpreted as evidence of circumstellar material (CSM) surrounding the progenitor system. The origin of this material is controversial, but the simplest explanation is that it results from previous mass loss in a system containing a white dwarf and a non-degenerate companion star. We present new single-epoch intermediate-resolution spectra of 17 low-redshift SNe Ia taken with XShooter on the ESO Very Large Telescope. Combining this sample with events from the literature, we confirm an excess (~20 per cent) of SNe Ia displaying blueshifted narrow Na I D absorption features compared to non-blueshifted Na I D features. The host galaxies of SNe Ia displaying blueshifted absorption profiles are skewed towards later-type galaxies, compared to SNe Ia that show no Na I D absorption, and SNe Ia displaying blueshifted narrow Na I D absorption features have broader light curves. The strength of the Na I D absorption is stronger in SNe Ia displaying blueshifted Na I D absorption features than those without blueshifted features, and the strength of the blueshifted Na I D is correlated with the B-V colour of the SN at maximum light. This strongly suggests the absorbing material is local to the SN. In the context of the progenitor systems of SNe Ia, we discuss the significance of these findings and other recent observational evidence on the nature of SN Ia progenitors. We present a summary that suggests there are at least two distinct populations of normal, cosmologically useful SNe Ia.

preprint2013arXiv

SN 2009ip á la PESSTO: No evidence for core-collapse yet

We present observations of the interacting transient SN 2009ip, from the start of the outburst in October 2012 until the end of the 2012 observing season. The transient reached a peak of $M_V = -17.7$ mag before fading rapidly, with a total integrated luminosity of $1.9\times10^{49}$ erg over the period of August-December 2012. The optical and near infrared spectra are dominated by narrow emission lines, signaling a dense circumstellar environment, together with multiple components of broad emission and absorption in H and He at velocities between $0.5$-$1.2\times10^4$ km s$^{-1}$. We see no evidence for nucleosynthesized material in SN 2009ip, even in late-time pseudo-nebular spectra. We set a limit of $<0.02$ M$_{\odot}$ on the mass of any synthesized $^{56}$Ni from the late time lightcurve. A simple model for the narrow Balmer lines is presented, and used to derive number densities for the circumstellar medium of between $\sim 10^{9}$-$10^{10}$ cm$^{-3}$. Our near-infrared data does not show any excess at longer wavelengths. Our last data, taken in December 2012, shows that SN 2009ip has spectroscopically evolved to something quite similar to its appearance in late 2009, albeit with higher velocities. It is possible that neither of the eruptive and high luminosity events of SN 2009ip was induced by a core-collapse. We show that the peak and total integrated luminosity can be due to the efficient conversion of kinetic energy from colliding ejecta, and that around $0.05$-$0.1$ M$_{\odot}$ of material moving at $0.5$-$1\times10^4$ km s$^{-1}$ could comfortably produce the observed luminosity. The ejection of multiple shells, lack of evidence for nucleosynthesized elements and broad nebular lines are all consistent with the pulsational-pair instability scenario. In this case the progenitor star may still exist, and will be observed after the current outburst fades.
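
The energy budget quoted in this abstract can be checked with a back-of-the-envelope kinetic energy calculation, E = (1/2) m v^2. The Python sketch below is only that arithmetic: the function name and the rounded solar mass value (1.989e33 g) are assumptions of this example, and the fraction of kinetic energy actually converted to radiation is not modelled.

```python
M_SUN_G = 1.989e33   # solar mass in grams (rounded value, assumed here)
KM_TO_CM = 1.0e5     # centimetres per kilometre


def kinetic_energy_erg(mass_msun, velocity_km_s):
    """Kinetic energy in erg of ejecta with the given mass and bulk velocity."""
    mass_g = mass_msun * M_SUN_G
    velocity_cm_s = velocity_km_s * KM_TO_CM
    return 0.5 * mass_g * velocity_cm_s ** 2


if __name__ == "__main__":
    low = kinetic_energy_erg(0.05, 0.5e4)   # 0.05 M_sun at 5,000 km/s
    high = kinetic_energy_erg(0.1, 1.0e4)   # 0.1 M_sun at 10,000 km/s
    print(f"low end:  {low:.2e} erg")
    print(f"high end: {high:.2e} erg")
```

The two corners of the quoted mass and velocity ranges give roughly 1.2e49 erg and 9.9e49 erg, so the integrated 1.9e49 erg reported above lies inside that range.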

preprint2012arXiv

Geometry of Online Packing Linear Programs

We consider packing LP's with $m$ rows where all constraint coefficients are normalized to be in the unit interval. The $n$ columns arrive in random order and the goal is to set the corresponding decision variables irrevocably when they arrive so as to obtain a feasible solution maximizing the expected reward. Previous $(1 - ε)$-competitive algorithms require the right-hand side of the LP to be $Ω((m/ε^2) \log (n/ε))$, a bound that worsens with the number of columns and rows. However, the dependence on the number of columns is not required in the single-row case and known lower bounds for the general case are also independent of $n$. Our goal is to understand whether the dependence on $n$ is required in the multi-row case, making it fundamentally harder than the single-row version. We refute this by exhibiting an algorithm which is $(1 - ε)$-competitive as long as the right-hand sides are $Ω((m^2/ε^2) \log (m/ε))$. Our techniques refine previous PAC-learning based approaches which interpret the online decisions as linear classifications of the columns based on sampled dual prices. The key ingredient of our improvement comes from a non-standard covering argument together with the realization that only when the columns of the LP belong to few 1-d subspaces we can obtain small such covers; bounding the size of the cover constructed also relies on the geometry of linear classifiers. General packing LP's are handled by perturbing the input columns, which can be seen as making the learning problem more robust.

preprint2011arXiv

A (k+1)-Slope Theorem for the k-Dimensional Infinite Group Relaxation

We prove that any minimal valid function for the k-dimensional infinite group relaxation that is piecewise linear with at most k+1 slopes and does not factor through a linear map with non-trivial kernel is extreme. This generalizes a theorem of Gomory and Johnson for k=1, and Cornuejols and Molinaro for k=2.

preprint2011arXiv

Approximation Algorithms for Correlated Knapsacks and Non-Martingale Bandits

In the stochastic knapsack problem, we are given a knapsack of size B, and a set of jobs whose sizes and rewards are drawn from a known probability distribution. However, we know the actual size and reward only when the job completes. How should we schedule jobs to maximize the expected total reward? We know O(1)-approximations when we assume that (i) rewards and sizes are independent random variables, and (ii) we cannot prematurely cancel jobs. What can we say when either or both of these assumptions are changed? The stochastic knapsack problem is of interest in its own right, but techniques developed for it are applicable to other stochastic packing problems. Indeed, ideas for this problem have been useful for budgeted learning problems, where one is given several arms which evolve in a specified stochastic fashion with each pull, and the goal is to pull the arms a total of B times to maximize the reward obtained. Much recent work on this problem focuses on the case when the evolution of the arms follows a martingale, i.e., when the expected reward from the future is the same as the reward at the current state. What can we say when the rewards do not form a martingale? In this paper, we give constant-factor approximation algorithms for the stochastic knapsack problem with correlations and/or cancellations, and also for budgeted learning problems where the martingale condition is not satisfied. Indeed, we can show that previously proposed LP relaxations have large integrality gaps. We propose new time-indexed LP relaxations, and convert the fractional solutions into distributions over strategies, and then use the LP values and the time ordering information from these strategies to devise a randomized adaptive scheduling algorithm. We hope our LP formulation and decomposition methods may provide a new way to address other correlated bandit problems with more general contexts.

preprint2011arXiv

The Query-commit Problem

In the query-commit problem we are given a graph where edges have distinct probabilities of existing. It is possible to query the edges of the graph, and if the queried edge exists then its endpoints are irrevocably matched. The goal is to find a querying strategy which maximizes the expected size of the matching obtained. This stochastic matching setup is motivated by applications in kidney exchanges and online dating. In this paper we address the query-commit problem from both theoretical and experimental perspectives. First, we show that a simple class of edges can be queried without compromising the optimality of the strategy. This property is then used to obtain in polynomial time an optimal querying strategy when the input graph is sparse. Next we turn our attention to the kidney exchange application, focusing on instances modeled over real data from existing exchange programs. We prove that, as the number of nodes grows, almost every instance admits a strategy which matches almost all nodes. This result supports the intuition that more exchanges are possible on a larger pool of patients/donors and gives theoretical justification for unifying the existing exchange programs. Finally, we evaluate experimentally different querying strategies over kidney exchange instances. We show that even very simple heuristics perform fairly well, being within 1.5% of an optimal clairvoyant strategy that knows in advance the edges in the graph. In such a time-sensitive application, this result motivates the use of committing strategies.
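
To make the query-commit setting concrete, here is a minimal Python sketch of one simple committing heuristic: query edges in decreasing order of existence probability and skip any edge with an already matched endpoint. The abstract does not say which heuristics were evaluated, so this particular rule, the function name `greedy_query_commit` and the toy graph are assumptions of the example.

```python
import random


def greedy_query_commit(edges, rng):
    """Simulate one run of a greedy query-commit heuristic.

    edges: list of (u, v, p) where p is the probability that edge (u, v) exists.
    rng:   a random.Random instance used to sample edge existence.
    Returns the list of matched pairs (u, v).
    """
    matched = set()
    matching = []
    # Most likely edges first; sorted() is stable, so ties keep input order.
    for u, v, p in sorted(edges, key=lambda e: -e[2]):
        if u in matched or v in matched:
            continue                      # an endpoint is already committed
        if rng.random() < p:              # query: the edge turns out to exist
            matched.update((u, v))        # commit irrevocably
            matching.append((u, v))
    return matching


if __name__ == "__main__":
    # Hypothetical toy instance on four nodes.
    toy = [("a", "b", 0.9), ("b", "c", 0.5), ("c", "d", 0.7)]
    print(greedy_query_commit(toy, random.Random(0)))
```

Averaging the matching size over many seeded runs gives an estimate of the expected value that such a strategy achieves on a given instance.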

preprint2010arXiv

Capacitated Vehicle Routing with Non-Uniform Speeds

The capacitated vehicle routing problem (CVRP) involves distributing (identical) items from a depot to a set of demand locations, using a single capacitated vehicle. We study a generalization of this problem to the setting of multiple vehicles having non-uniform speeds (that we call Heterogenous CVRP), and present a constant-factor approximation algorithm. The technical heart of our result lies in achieving a constant approximation to the following TSP variant (called Heterogenous TSP). Given a metric denoting distances between vertices, a depot r containing k vehicles with possibly different speeds, the goal is to find a tour for each vehicle (starting and ending at r), so that every vertex is covered in some tour and the maximum completion time is minimized. This problem is precisely Heterogenous CVRP when vehicles are uncapacitated. The presence of non-uniform speeds introduces difficulties for employing standard tour-splitting techniques. In order to get a better understanding of this technique in our context, we appeal to ideas from the 2-approximation of Lenstra et al. for scheduling on parallel machines. This motivates the introduction of a new approximate MST construction called Level-Prim, which is related to Light Approximate Shortest-path Trees. The last component of our algorithm involves partitioning the Level-Prim tree and matching the resulting parts to vehicles. This decomposition is more subtle than usual since now we need to enforce correlation between the size of the parts and their distances to the depot.
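
The objective of the Heterogenous TSP variant described above (minimize the maximum completion time over vehicles with different speeds) can be written down in a few lines. The Python sketch below only evaluates that objective for given tours; it does not implement Level-Prim or the approximation algorithm, and the function names and the toy distance matrix are assumptions of this example.

```python
def tour_length(tour, dist):
    """Total distance of a tour given as a vertex sequence, e.g. [0, 2, 0]."""
    return sum(dist[a][b] for a, b in zip(tour, tour[1:]))


def makespan(tours, speeds, dist):
    """Maximum completion time: each vehicle needs (tour length / speed)."""
    return max(tour_length(t, dist) / s for t, s in zip(tours, speeds))


if __name__ == "__main__":
    # Hypothetical metric on 3 vertices; vertex 0 is the depot r.
    dist = [
        [0, 2, 3],
        [2, 0, 4],
        [3, 4, 0],
    ]
    speeds = [1.0, 2.0]
    # Sending the fast vehicle to the farther vertex gives a smaller makespan.
    print(makespan([[0, 1, 0], [0, 2, 0]], speeds, dist))
    print(makespan([[0, 2, 0], [0, 1, 0]], speeds, dist))
```

Even on this tiny instance the assignment of parts to vehicles matters, which hints at why the sizes of the parts and their distances to the depot have to be matched to vehicle speeds.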
