Source author record

Borivoje Nikolic

Borivoje Nikolic appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Artificial Intelligence · eess.SP · Information Theory · math.IT · Networking and Internet Architecture

Catalog footprint

What is connected

3 works

5 topics

4 close collaborators

preprint · 2026 · arXiv

Rethinking Network Topologies for Cost-Effective Mixture-of-Experts LLM Serving

Mixture-of-experts (MoE) architectures have turned LLM serving into a cluster-scale workload in which communication consumes a considerable portion of LLM serving runtime. This has prompted industry to invest heavily in expensive high-bandwidth scale-up networks. We question whether such costly infrastructure is strictly necessary. We present the first systematic cross-layer analysis of network cost-effectiveness for MoE LLM serving, comparing four representative XPU (e.g., GPU/TPU) topologies (scale-up, scale-out, 3D torus, and 3D full-mesh). We find that lower-cost switchless topologies are more cost-effective than the scale-up topology across all serving scenarios explored, improving cost-effectiveness by 20.6-56.2%. In particular, the 3D full-mesh topology is Pareto-optimal in terms of the performance-cost tradeoff. We also find that current scale-up link bandwidths are over-provisioned: reducing the link bandwidth improves throughput per cost by up to 27%. A forward-looking analysis of upcoming GPU generations indicates that the cost-performance advantage of switchless networks will likely persist.
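To make the abstract's terms concrete, the following is a minimal sketch of how a Pareto-optimal set and a throughput-per-cost ranking could be computed for a handful of candidate topologies. It is not the paper's method or data: the names `Topology`, `pareto_front`, and `throughput_per_cost` are hypothetical, and every number is a placeholder chosen only for illustration.

```python
from typing import NamedTuple


class Topology(NamedTuple):
    name: str
    throughput: float  # hypothetical value, arbitrary units
    cost: float        # hypothetical value, arbitrary units


def pareto_front(options):
    """Return the options that no other option dominates.

    Option b dominates option a when b is at least as good on both
    axes (higher throughput, lower cost) and strictly better on one.
    """
    front = []
    for a in options:
        dominated = any(
            b.throughput >= a.throughput
            and b.cost <= a.cost
            and (b.throughput > a.throughput or b.cost < a.cost)
            for b in options
        )
        if not dominated:
            front.append(a)
    return front


def throughput_per_cost(t):
    """Cost-effectiveness metric: throughput divided by cost."""
    return t.throughput / t.cost


# Placeholder numbers for illustration only; NOT taken from the paper.
options = [
    Topology("topo_a", 100.0, 10.0),
    Topology("topo_b", 90.0, 6.0),
    Topology("topo_c", 80.0, 7.0),
    Topology("topo_d", 50.0, 4.0),
]

front = pareto_front(options)
best = max(options, key=throughput_per_cost)
print([t.name for t in front], best.name)
```

With these placeholder values, `topo_c` drops out because `topo_b` delivers more throughput at lower cost. The remaining three options each represent a different tradeoff between throughput and cost.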

preprint · 2020 · arXiv

AutoCkt: Deep Reinforcement Learning of Analog Circuit Designs

Domain specialization under energy constraints in deeply-scaled CMOS has been driving the need for agile development of Systems on a Chip (SoCs). While digital subsystems have design flows that are conducive to rapid iterations from specification to layout, analog and mixed-signal modules face the challenge of a long human-in-the-middle iteration loop that requires expert intuition to verify that post-layout circuit parameters meet the original design specification. Existing automated solutions that optimize circuit parameters for a given target design specification are limited: they are schematic-only, inaccurate, sample-inefficient, or not generalizable. This work presents AutoCkt, a machine learning optimization framework trained using deep reinforcement learning that not only finds post-layout circuit parameters for a given target specification, but also gains knowledge about the entire design space through a sparse subsampling technique. Our results show that for multiple circuit topologies, AutoCkt is able to converge and meet all target specifications on at least 96.3% of tested design goals in schematic simulation, on average 40X faster than a traditional genetic algorithm. Using the Berkeley Analog Generator, AutoCkt is able to design 40 LVS-passed operational amplifiers in 68 hours, 9.6X faster than the state-of-the-art when considering layout parasitics.

preprint · 2012 · arXiv

Coding and System Design for Quantize-Map-and-Forward Relaying

In this paper, we develop a low-complexity coding scheme and system design framework for the half-duplex relay channel based on the Quantize-Map-and-Forward (QMF) relaying scheme. The proposed framework allows linear-complexity operations at all network terminals. We propose the use of binary LDPC codes for encoding at the source and LDGM codes for mapping at the relay. We express joint decoding at the destination as a belief propagation algorithm over a factor graph. This graph has the LDPC and LDGM codes as subgraphs connected via probabilistic constraints that model the QMF relay operations. We show that this coding framework extends naturally to the high-SNR regime using bit-interleaved coded modulation (BICM). We develop density evolution analysis tools for this factor graph and demonstrate the design of practical codes for the half-duplex relay channel that perform within 1 dB of the information-theoretic QMF threshold.