Source author record

Yijun Li

Yijun Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Machine Learning, math-ph, math.MP, Genomics, hep-th, Artificial Intelligence, astro-ph.CO, eess.IV, gr-qc, Graphics, math.OC

Catalog footprint

What is connected

13 works

12 topics

4 close collaborators

preprint · 2026 · arXiv

DecisionLLM: Large Language Models for Long Sequence Decision Exploration

Long-sequence decision-making, which is usually addressed through reinforcement learning (RL), is a critical component for optimizing strategic operations in dynamic environments, such as real-time bidding in computational advertising. The Decision Transformer (DT) introduced a powerful paradigm by framing RL as an autoregressive sequence modeling problem. Concurrently, Large Language Models (LLMs) have demonstrated remarkable success in complex reasoning and planning tasks. This raises the question of whether LLMs, which share the same Transformer foundation but operate at a much larger scale, can unlock new levels of performance in long-horizon sequential decision-making problems. This work investigates the application of LLMs to offline decision-making tasks. A fundamental challenge in this domain is the LLMs' inherent inability to interpret continuous values, as they lack a native understanding of numerical magnitude and order when values are represented as text strings. To address this, we propose treating trajectories as a distinct modality. By learning to align trajectory data with natural language task descriptions, our model can autoregressively predict future decisions within a cohesive framework we term DecisionLLM. We establish a set of scaling laws governing this paradigm, demonstrating that performance hinges on three factors: model scale, data volume, and data quality. In offline experimental benchmarks and bidding scenarios, DecisionLLM achieves strong performance. Specifically, DecisionLLM-3B outperforms the traditional Decision Transformer (DT) by 69.4 on Maze2D umaze-v1 and by 0.085 on AuctionNet. It extends the AIGB paradigm and points to promising directions for future exploration in online bidding.
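The sequence-modeling framing attributed to the Decision Transformer above can be illustrated with a minimal sketch. It assumes the commonly described DT layout, in which each timestep contributes a return-to-go, a state, and an action to one flat sequence. The function names `returns_to_go` and `to_sequence` are hypothetical, and this is not DecisionLLM's own pipeline, which the abstract says treats trajectories as a distinct modality.

```python
from typing import Any, List, Sequence, Tuple


def returns_to_go(rewards: Sequence[float]) -> List[float]:
    """Suffix sums: rtg[t] is the total reward collected from step t onward."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step adds its own reward to the future total.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


def to_sequence(
    states: Sequence[Any], actions: Sequence[Any], rewards: Sequence[float]
) -> List[Tuple[str, Any]]:
    """Interleave (return-to-go, state, action) per timestep into one flat list.

    The string tags are illustrative labels only; a real model would embed
    each element instead of tagging it.
    """
    sequence: List[Tuple[str, Any]] = []
    for rtg, state, action in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("rtg", rtg), ("state", state), ("action", action)])
    return sequence


if __name__ == "__main__":
    for token in to_sequence(["s0", "s1"], ["a0", "a1"], [1.0, 2.0]):
        print(token)
```

An autoregressive model trained on such sequences predicts each action from the preceding tokens, which is why conditioning on a desired return-to-go can steer the predicted decisions.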

preprint · 2022 · arXiv

3D-FM GAN: Towards 3D-Controllable Face Manipulation

3D-controllable portrait synthesis has advanced significantly, thanks to breakthroughs in generative adversarial networks (GANs). However, it is still challenging to manipulate existing face images with precise 3D control. While concatenating GAN inversion and a 3D-aware, noise-to-image GAN is a straightforward solution, it is inefficient and may lead to a noticeable drop in editing quality. To fill this gap, we propose 3D-FM GAN, a novel conditional GAN framework that is designed specifically for 3D-controllable face manipulation and does not require any tuning after the end-to-end learning phase. By carefully encoding both the input face image and a physically-based rendering of 3D edits into a StyleGAN's latent spaces, our image generator provides high-quality, identity-preserved, 3D-controllable face manipulation. To effectively learn such a novel framework, we develop two essential training strategies and a novel multiplicative co-modulation architecture that improves significantly upon naive schemes. With extensive evaluations, we show that our method outperforms the prior art on various tasks, with better editability, stronger identity preservation, and higher photo-realism. In addition, we demonstrate better generalizability of our design on large pose editing and out-of-domain images.

preprint · 2022 · arXiv

Computational Methods for Single-Cell Multi-Omics Integration and Alignment

Recently developed technologies to generate single-cell genomic data have made a revolutionary impact in the field of biology. Multi-omics assays offer even greater opportunities to understand cellular states and biological processes. However, the problem of integrating different -omics data with very different dimensionality and statistical properties remains quite challenging. A growing body of computational tools is being developed for this task, leveraging ideas ranging from machine translation to the theory of networks and representing a new frontier on the interface of biology and data science. Our goal in this review paper is to provide a comprehensive, up-to-date survey of computational techniques for the integration of multi-omics and alignment of multiple modalities of genomics data in single-cell research.

preprint · 2022 · arXiv

Emerging Artificial Intelligence Applications in Spatial Transcriptomics Analysis

Spatial transcriptomics (ST) has advanced significantly in the last few years. Such advancement comes with the urgent need for novel computational methods to handle the unique challenges of ST data analysis. Many artificial intelligence (AI) methods have been developed to utilize various machine learning and deep learning techniques for computational ST analysis. This review provides a comprehensive and up-to-date survey of current AI methods for ST analysis.

preprint · 2022 · arXiv

Self-adaptive randomized constructive heuristics for the multi-item capacitated lot sizing problem

The Capacitated Lot-Sizing Problem (CLSP) and its variants are important and challenging optimization problems. Constructive heuristics are known to be the most intuitive and fastest methods for finding good feasible solutions for CLSPs and are therefore often used as subroutines in building more sophisticated exact or metaheuristic approaches. Classical constructive heuristics, such as period-by-period heuristics and lot elimination heuristics, are widely used by researchers. This paper introduces four perturbation strategies to the period-by-period and lot elimination heuristics to further improve the solution quality. We propose a new procedure to automatically adjust the parameters of the randomized period-by-period (RPP) heuristics. The procedure is proved to offer better solutions with reduced computation times by improving the time-consuming parameter tuning phase. Combinations of the self-adaptive RPP heuristics with Tabu search and lot elimination heuristics are tested and found to be effective. Computational experiments provided high-quality solutions with a 0.88% average optimality gap on benchmark instances of 12 periods and 12 items, and an optimality gap within 1.2% for the instances with 24 periods and 24 items.

preprint · 2022 · arXiv

Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing

Existing GAN inversion and editing methods work well for aligned objects with a clean background, such as portraits and animal faces, but often struggle with more difficult categories with complex scene layouts and object occlusions, such as cars, animals, and outdoor images. We propose a new method to invert and edit such complex images in the latent space of GANs, such as StyleGAN2. Our key idea is to explore inversion with a collection of layers, spatially adapting the inversion process to the difficulty of the image. We learn to predict the "invertibility" of different image segments and project each segment into a latent layer. Easier regions can be inverted into an earlier layer in the generator's latent space, while more challenging regions can be inverted into a later feature space. Experiments show that our method obtains better inversion results than recent approaches on complex categories, while maintaining downstream editability. Please refer to our project page at https://www.cs.cmu.edu/~SAMInversion.

preprint · 2020 · arXiv

Collaborative Distillation for Ultra-Resolution Universal Style Transfer

Universal style transfer methods typically leverage rich representations from deep Convolutional Neural Network (CNN) models (e.g., VGG-19) pre-trained on large collections of images. Despite their effectiveness, their application to ultra-resolution images is heavily constrained by the large model size given limited memory. In this work, we present a new knowledge distillation method (named Collaborative Distillation) for encoder-decoder based neural style transfer to reduce the convolutional filters. The main idea is underpinned by a finding that the encoder-decoder pairs construct an exclusive collaborative relationship, which is regarded as a new kind of knowledge for style transfer models. Moreover, to overcome the feature size mismatch when applying collaborative distillation, a linear embedding loss is introduced to drive the student network to learn a linear embedding of the teacher's features. Extensive experiments show the effectiveness of our method when applied to different universal style transfer approaches (WCT and AdaIN), even if the model size is reduced by 15.5 times. In particular, on WCT with the compressed models, we achieve ultra-resolution (over 40 megapixels) universal style transfer on a 12GB GPU for the first time. Further experiments on an optimization-based stylization scheme show the generality of our algorithm on different stylization paradigms. Our code and trained models are available at https://github.com/mingsun-tse/collaborative-distillation.

preprint · 2020 · arXiv

Learning to Caricature via Semantic Shape Transform

Caricature is an artistic drawing created to abstract or exaggerate facial features of a person. Rendering visually pleasing caricatures is a difficult task that requires professional skills, and thus it is of great interest to design a method to automatically generate such drawings. To deal with large shape changes, we propose an algorithm based on a semantic shape transform to produce diverse and plausible shape exaggerations. Specifically, we predict pixel-wise semantic correspondences and perform image warping on the input photo to achieve dense shape transformation. We show that the proposed framework is able to render visually pleasing shape exaggerations while maintaining their facial structures. In addition, our model allows users to manipulate the shape via the semantic map. We demonstrate the effectiveness of our approach on a large photograph-caricature benchmark dataset with comparisons to the state-of-the-art methods.

preprint · 2020 · arXiv

Modeling Artistic Workflows for Image Generation and Editing

People often create art by following an artistic workflow involving multiple stages that inform the overall design. If an artist wishes to modify an earlier decision, significant work may be required to propagate this new decision forward to the final artwork. Motivated by the above observations, we propose a generative model that follows a given artistic workflow, enabling both multi-stage image generation and multi-stage image editing of an existing piece of art. Furthermore, for the editing scenario, we introduce an optimization process along with learning-based regularization to ensure the edited image produced by the model closely aligns with the originally provided image. Qualitative and quantitative results on three different artistic datasets demonstrate the effectiveness of the proposed framework on both image generation and editing tasks.

preprint · 2015 · arXiv

Robust High Quality Image Guided Depth Upsampling

A Time-of-Flight (ToF) depth-sensing camera can obtain depth maps at a high frame rate. However, its low resolution and sensitivity to noise are always a concern. A popular solution is upsampling the obtained noisy low-resolution depth map with the guidance of the companion high-resolution color image. However, due to the constraints in the existing upsampling models, the high-resolution depth map obtained in this way may suffer from either texture copy artifacts or blur of depth discontinuity. In this paper, a novel optimization framework is proposed with a new data term and smoothness term. Comprehensive experiments using both synthetic data and real data show that the proposed method effectively tackles the problem of texture copy artifacts and blur of depth discontinuity. It also demonstrates sufficient robustness to noise. Moreover, a data-driven scheme is proposed to adaptively estimate the parameter in the upsampling optimization framework. The encouraging performance is maintained even in the case of large upsampling factors, e.g., $8\times$ and $16\times$.

preprint · 2014 · arXiv

Exact Kink Solitons in Skyrme Crystals

We present an explicit integration of the kink soliton equation obtained in a recent interesting study of the classical Skyrme model where the field configurations are of a generalized hedgehog form which is of a domain-wall type. We also show that in such a reduced one-dimensional setting the first-order and second-order equations are equivalent. Consequently, in such a context, all finite-energy solitons are BPS type and precisely known.

preprint · 2014 · arXiv

Friedmann's Equations in All Dimensions and Chebyshev's Theorem

This short but systematic work demonstrates a link between Chebyshev's theorem and the explicit integration in cosmological time $t$ and conformal time $η$ of the Friedmann equations in all dimensions and with an arbitrary cosmological constant $Λ$. More precisely, it is shown that for spatially flat universes an explicit integration in $t$ may always be carried out, and that, in the non-flat situation and when $Λ$ is zero and the ratio $w$ of the pressure and energy density in the barotropic equation of state of the perfect-fluid universe is rational, an explicit integration may be carried out if and only if the dimension $n$ of space and $w$ obey some specific relations among an infinite family. The situation for explicit integration in $η$ is complementary to that in $t$. More precisely, it is shown in the flat-universe case with $Λ\neq0$ that an explicit integration in $η$ can be carried out if and only if $w$ and $n$ obey similar relations among a well-defined family which we specify, and that, when $Λ=0$, an explicit integration can always be carried out whether the space is flat, closed, or open. We also show that our method may be used to study more realistic cosmological situations when the equation of state is nonlinear.
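For orientation, the Chebyshev theorem named in the abstract above is usually stated as follows. This is the standard textbook form; the symbols $p$, $q$, $r$, $a$, $b$ are generic and are not taken from the paper.

$$
\int x^{p}\,(a + b\,x^{r})^{q}\,dx, \qquad a, b \neq 0, \quad p, q, r \in \mathbb{Q}, \quad r \neq 0,
$$

is expressible in elementary functions if and only if at least one of

$$
\frac{p+1}{r}, \qquad q, \qquad \frac{p+1}{r} + q
$$

is an integer. As a simple illustration, assuming the usual perfect-fluid scaling $\rho \propto a^{-n(1+w)}$ for the scale factor $a(t)$ in $n$ spatial dimensions, the spatially flat case with $Λ = 0$ and $w \neq -1$ integrates directly to

$$
a(t) \propto t^{\frac{2}{n(1+w)}},
$$

which reduces to the familiar $a \propto t^{2/3}$ for dust ($w = 0$) in $n = 3$.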

preprint · 2012 · arXiv

Exact kink solitons in a monopole confinement problem

We explicitly construct all kink solitons arising in the recent study of Auzzi, Bolognesi, and Shifman of a monopole confinement problem in ${\cal N}=2$ supersymmetric QCD. In particular, we show that all finite-energy kink solitons must be BPS.
