Source author record

Jinyoung Park

Jinyoung Park appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Topics: math.CO, Machine Learning, Artificial Intelligence, Computer Vision, Data Structures and Algorithms

Catalog footprint

What is connected

7 works

5 topics

4 close collaborators

Preprint · 2026 · arXiv

A refined graph container lemma and applications to the hard-core model on bipartite expanders

We establish a refined version of a graph container lemma due to Galvin and discuss several applications related to the hard-core model on bipartite expander graphs. Given a graph $G$ and $λ>0$, the hard-core model on $G$ at activity $λ$ is the probability distribution $μ_{G,λ}$ on independent sets in $G$ given by $μ_{G,λ}(I)\propto λ^{|I|}$. As one of our main applications, we show that the hard-core model at activity $λ$ on the hypercube $Q_d$ exhibits a `structured phase' for $λ= Ω( \log^2 d/d^{1/2})$ in the following sense: in a typical sample from $μ_{Q_d,λ}$, most vertices are contained in one side of the bipartition of $Q_d$. This improves upon a result of Galvin which establishes the same for $λ=Ω(\log d/ d^{1/3})$. As another application, we establish a fully polynomial-time approximation scheme (FPTAS) for the hard-core model on a $d$-regular bipartite $α$-expander, with $α>0$ fixed, when $λ= Ω( \log^2 d/d^{1/2})$. This improves upon the bound $λ=Ω(\log d/ d^{1/4})$ due to the first author, Perkins and Potukuchi. We discuss similar improvements to results of Galvin-Tetali, Balogh-Garcia-Li and Kronenberg-Spinka.
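
To make the definition $μ_{G,λ}(I)\propto λ^{|I|}$ concrete, here is a minimal brute-force sketch that enumerates the independent sets of a very small hypercube $Q_d$ and normalizes the weights. The function names (`hypercube`, `independent_sets`, `hard_core_distribution`) are illustrative assumptions, not taken from the paper, and exhaustive enumeration is only feasible for tiny $d$; it says nothing about the container method or the FPTAS described in the abstract.

```python
from itertools import combinations, product


def hypercube(d):
    """Return (vertices, edges) of Q_d; vertices are 0/1 tuples of length d."""
    vertices = list(product((0, 1), repeat=d))
    # Two vertices are adjacent when they differ in exactly one coordinate.
    edges = [(u, v) for u, v in combinations(vertices, 2)
             if sum(a != b for a, b in zip(u, v)) == 1]
    return vertices, edges


def independent_sets(vertices, edges):
    """Yield every independent set (as a tuple of vertices) by brute force."""
    adjacent = set(edges) | {(v, u) for u, v in edges}
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if all((u, v) not in adjacent for u, v in combinations(subset, 2)):
                yield subset


def hard_core_distribution(vertices, edges, lam):
    """Return (distribution, Z): mu(I) = lam**|I| / Z over independent sets I."""
    weights = {I: lam ** len(I) for I in independent_sets(vertices, edges)}
    Z = sum(weights.values())  # the partition function
    return {I: w / Z for I, w in weights.items()}, Z


if __name__ == "__main__":
    vertices, edges = hypercube(2)  # Q_2 is the 4-cycle
    dist, Z = hard_core_distribution(vertices, edges, lam=2.0)
    print(Z)  # 1 + 4*lam + 2*lam**2 for the 4-cycle
```

For $Q_2$ the independent sets are the empty set, the four single vertices, and the two pairs of opposite corners, so the partition function is $1 + 4λ + 2λ^2$.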

Preprint · 2026 · arXiv

Improving Large Molecular Language Model via Relation-aware Multimodal Collaboration

Large language models (LLMs) have demonstrated their instruction-following capabilities and achieved powerful performance on various tasks. Inspired by their success, recent works in the molecular domain have led to the development of large molecular language models (LMLMs) that integrate 1D molecular strings or 2D molecular graphs into the language models. However, existing LMLMs often suffer from hallucination and limited robustness, largely due to inadequate integration of diverse molecular modalities such as 1D sequences, 2D molecular graphs, and 3D conformations. To address these limitations, we propose CoLLaMo, a large language model-based molecular assistant equipped with a multi-level molecular modality-collaborative projector. The relation-aware modality-collaborative attention mechanism in the projector facilitates fine-grained and relation-guided information exchange between atoms by incorporating 2D structural and 3D spatial relations. Furthermore, we present new molecule-centric automatic measurements, including a hallucination assessment metric and a GPT-based caption quality evaluation, to address the limitations of generic token-based evaluation metrics (e.g., BLEU) widely used in assessing the molecular comprehension of LMLMs. Our extensive experiments demonstrate that CoLLaMo enhances the molecular modality generalization capabilities of LMLMs, achieving the best performance on multiple tasks, including molecule captioning, computed property QA, descriptive property QA, motif counting, and IUPAC name prediction.

Preprint · 2026 · arXiv

On the number of antichains in $\{0,1,2\}^n$

We provide precise asymptotics for the number of antichains in the poset $\{0,1,2\}^n$, answering a question of Sapozhenko. Finding improved estimates for this number was also a problem suggested by Noel, Scott, and Sudakov, who obtained asymptotics for the logarithm of the number. Key ingredients for the proof include a graph-container lemma to bound the number of expanding sets in a class of irregular graphs, isoperimetric inequalities for generalizations of the Boolean lattice, and methods from statistical physics based on the cluster expansion.
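
As a small illustration of the quantity being estimated, the following sketch counts antichains in $\{0,1,2\}^n$ by exhaustive search under the coordinatewise order. It is an assumption-laden toy (the name `count_antichains` is hypothetical) that is only practical for very small $n$; the paper's asymptotics rely on container and cluster-expansion arguments, not on enumeration.

```python
from itertools import product


def count_antichains(n):
    """Count antichains (including the empty one) in {0,1,2}^n by brute force."""
    points = list(product(range(3), repeat=n))

    def comparable(x, y):
        # x <= y or y <= x in the coordinatewise (product) order.
        return (all(a <= b for a, b in zip(x, y))
                or all(a >= b for a, b in zip(x, y)))

    def extend(start, chosen):
        total = 1  # counts the antichain `chosen` itself
        for i in range(start, len(points)):
            p = points[i]
            if all(not comparable(p, q) for q in chosen):
                total += extend(i + 1, chosen + [p])
        return total

    return extend(0, [])


if __name__ == "__main__":
    for n in (1, 2):
        print(n, count_antichains(n))
```

For $n=1$ the poset is a three-element chain, so the only antichains are the empty set and the three singletons; for $n=2$ the antichains correspond to down-sets of a $3\times 3$ grid, of which there are $\binom{6}{3}=20$.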

Preprint · 2022 · arXiv

Explore-And-Match: Bridging Proposal-Based and Proposal-Free With Transformer for Sentence Grounding in Videos

Natural Language Video Grounding (NLVG) aims to localize time segments in an untrimmed video according to sentence queries. In this work, we present a new paradigm named Explore-And-Match for NLVG that seamlessly unifies the strengths of two streams of NLVG methods: proposal-free and proposal-based; the former explores the search space to find time segments directly, and the latter matches the predefined time segments with ground truths. To achieve this, we formulate NLVG as a set prediction problem and design an end-to-end trainable Language Video Transformer (LVTR) that enjoys two favorable properties: rich contextualization power and parallel decoding. We train LVTR with two losses. First, a temporal localization loss allows time segments of all queries to regress targets (explore). Second, a set guidance loss couples every query with its respective target (match). To our surprise, we found that the training schedule shows a divide-and-conquer-like pattern: time segments are first diversified regardless of the target, then coupled with each target, and fine-tuned to the target again. Moreover, LVTR is highly efficient and effective: it infers faster than previous baselines (by 2X or more) and achieves competitive results on two NLVG benchmarks (ActivityCaptions and Charades-STA). Code is available at https://github.com/sangminwoo/Explore-And-Match.

Preprint · 2022 · arXiv

Metropolis-Hastings Data Augmentation for Graph Neural Networks

Graph Neural Networks (GNNs) often suffer from weak generalization due to sparsely labeled data despite their promising results on various graph-based tasks. Data augmentation is a prevalent remedy to improve the generalization ability of models in many domains. However, due to the non-Euclidean nature of data space and the dependencies between samples, designing effective augmentation on graphs is challenging. In this paper, we propose a novel framework, Metropolis-Hastings Data Augmentation (MH-Aug), that draws augmented graphs from an explicit target distribution for semi-supervised learning. MH-Aug produces a sequence of augmented graphs from the target distribution, which enables flexible control of the strength and diversity of augmentation. Since direct sampling from the complex target distribution is challenging, we adopt the Metropolis-Hastings algorithm to obtain the augmented samples. We also propose a simple and effective semi-supervised learning strategy with generated samples from MH-Aug. Our extensive experiments demonstrate that MH-Aug can generate a sequence of samples according to the target distribution to significantly improve the performance of GNNs.

Preprint · 2021 · arXiv

Self-supervised Auxiliary Learning with Meta-paths for Heterogeneous Graphs

Graph neural networks have shown superior performance in a wide range of applications, providing a powerful representation of graph-structured data. Recent works show that the representation can be further improved by auxiliary tasks. However, auxiliary tasks for heterogeneous graphs, which contain rich semantic information with various types of nodes and edges, have been less explored in the literature. In this paper, to learn graph neural networks on heterogeneous graphs, we propose a novel self-supervised auxiliary learning method using meta-paths, which are composite relations of multiple edge types. Our proposed method learns to learn a primary task by predicting meta-paths as auxiliary tasks. This can be viewed as a type of meta-learning. The proposed method can identify an effective combination of auxiliary tasks and automatically balance them to improve the primary task. Our method can be applied to any graph neural network in a plug-in manner without manual labeling or additional data. The experiments demonstrate that the proposed method consistently improves the performance of link prediction and node classification on heterogeneous graphs.

Preprint · 2020 · arXiv

Tuza's Conjecture for random graphs

A celebrated conjecture of Zs. Tuza says that in any (finite) graph, the minimum size of a cover of triangles by edges is at most twice the maximum size of a set of edge-disjoint triangles. Resolving a recent question of Bennett, Dudek, and Zerbib, we show that this is true for random graphs; more precisely: \[ \mbox{for any $p=p(n)$, $\mathbb P(\mbox{$G_{n,p}$ satisfies Tuza's Conjecture})\rightarrow 1 $ (as $n\rightarrow\infty$).} \]
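
The two quantities in the conjecture can be computed directly on tiny graphs. The sketch below is a brute-force illustration only: the helper names (`triangles`, `packing_number`, `cover_number`, `gnp`) are hypothetical, the search is exponential, and it is unrelated to the probabilistic argument in the paper. It computes the minimum edge cover of triangles ($τ$) and the maximum number of edge-disjoint triangles ($ν$), so that the inequality $τ \le 2ν$ can be inspected on small examples.

```python
import random
from itertools import combinations


def triangles(n, edges):
    """List the triangles of a graph on vertices 0..n-1 as sets of 3 edges."""
    edge_set = {frozenset(e) for e in edges}
    result = []
    for t in combinations(range(n), 3):
        sides = {frozenset(p) for p in combinations(t, 2)}
        if sides <= edge_set:
            result.append(sides)
    return result


def packing_number(tris):
    """nu: maximum number of pairwise edge-disjoint triangles (exhaustive)."""
    best = 0

    def search(start, used, count):
        nonlocal best
        best = max(best, count)
        for j in range(start, len(tris)):
            if not (tris[j] & used):
                search(j + 1, used | tris[j], count + 1)

    search(0, set(), 0)
    return best


def cover_number(tris, edges):
    """tau: minimum number of edges meeting every triangle (exhaustive)."""
    edge_list = [frozenset(e) for e in edges]
    for k in range(len(edge_list) + 1):
        for cover in combinations(edge_list, k):
            chosen = set(cover)
            if all(t & chosen for t in tris):
                return k


def gnp(n, p, seed=0):
    """Sample the edge list of G_{n,p} with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [e for e in combinations(range(n), 2) if rng.random() < p]


if __name__ == "__main__":
    k4 = list(combinations(range(4), 2))
    tris = triangles(4, k4)
    print(cover_number(tris, k4), packing_number(tris))  # tau and nu for K_4
```

On $K_4$ any two triangles share an edge, so $ν = 1$, while two disjoint edges meet all four triangles, so $τ = 2$; the bound $τ \le 2ν$ is tight there. Passing `gnp(n, p)` as the edge list lets the same functions be tried on small random graphs.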
