Source author record

Seungwon Jeong

Seungwon Jeong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Artificial Intelligence Biological Physics Computation and Language econ.GN Machine Learning physics.optics q-fin.EC Social and Information Networks

Catalog footprint

What is connected

4works

8topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Machine Unlearning for Masked Diffusion Language Models

Recent masked diffusion language models (MDLMs), such as LLaDA and Dream, have achieved performance comparable to autoregressive large language models. Unlike autoregressive models, which generate text sequentially, MDLMs generate text by iteratively denoising masked positions in parallel. During fine-tuning, MDLMs learn to recover responses from masked response states conditioned on a prompt, thereby shifting their predictions from a prompt-masked unconditional distribution toward a prompt-conditional distribution. Despite this distinct generative and fine-tuning mechanism, machine unlearning for MDLMs remains largely unexplored. In this paper, we propose Masked Diffusion Unlearning (MDU), the first unlearning framework for MDLMs, by revisiting the process of learning specific knowledge in terms of diffusion. Specifically, MDU minimizes a forward KL divergence from the prompt-conditional prediction to a prompt-masked unconditional anchor at every masked response position, with a temperature scaling parameter to control the privacy-utility trade-off. Our empirical results on standard benchmarks and MDLM backbones show that MDU achieves high unlearning performance compared to existing LLM unlearning methods. Code is available at https://github.com/leegeoru/MDU.

preprint2022arXiv

Procurements with Bidder Asymmetry in Cost and Risk-Aversion

We propose an empirical method to analyze data from first-price procurements where bidders are asymmetric in their risk-aversion (CRRA) coefficients and distributions of private costs. Our Bayesian approach evaluates the likelihood by solving type-symmetric equilibria using the boundary-value method and integrates out unobserved heterogeneity through data augmentation. We study a new dataset from Russian government procurements focusing on the category of printing papers. We find that there is no unobserved heterogeneity (presumably because the job is routine), but bidders are highly asymmetric in their cost and risk-aversion. Our counterfactual study shows that choosing a type-specific cost-minimizing reserve price marginally reduces the procurement cost; however, inviting one more bidder substantially reduces the cost, by at least 5.5%. Furthermore, incorrectly imposing risk-neutrality would severely mislead inference and policy recommendations, but the bias from imposing homogeneity in risk-aversion is small.

preprint2020arXiv

Posting Bot Detection on Blockchain-based Social Media Platform using Machine Learning Techniques

Steemit is a blockchain-based social media platform, where authors can get author rewards in the form of cryptocurrencies called STEEM and SBD (Steem Blockchain Dollars) if their posts are upvoted. Interestingly, curators (or voters) can also get rewards by voting others' posts, which is called a curation reward. A reward is proportional to a curator's STEEM stakes. Throughout this process, Steemit hopes "good" content will be automatically discovered by users in a decentralized way, which is known as the Proof-of-Brain (PoB). However, there are many bot accounts programmed to post automatically and get rewards, which discourages real human users from creating good content. We call this type of bot a posting bot. While there are many papers that studied bots on traditional centralized social media platforms such as Facebook and Twitter, we are the first to study posting bots on a blockchain-based social media platform. Compared with the bot detection on the usual social media platforms, the features we created have an advantage that posting bots can be detected without limiting the number or length of posts. We can extract the features of posts by clustering distances between blog data or replies. These features are obtained from the Minimum Average Cluster from Clustering Distance between Frequent words and Articles (MAC-CDFA), which is not used in any of the previous social media research. Based on the enriched features, we enhanced the quality of classification tasks. Comparing the F1-scores, the features we created outperformed the features used for bot detection on Facebook and Twitter.

preprint2016arXiv

Simultaneous suppression of scattering and aberration for ultra-high resolution imaging deep within scattering media

Thick biological tissues give rise to not only the scattering of incoming light waves, but also aberrations of the remaining unscattered waves. Due to the inability of existing optical imaging methodologies to overcome both of these problems simultaneously, imaging depth at the sub- micron spatial resolution has remained extremely shallow. Here we present an experimental approach for identifying and eliminating aberrations even in the presence of strong multiple light scattering. For time-gated complex-field maps of reflected waves taken over various illumination channels, we identify two sets of aberration correction maps, one for the illumination path and one for the reflection path, that can preferentially accumulate the unscattered signal waves over the multiple-scattered waves. By performing closed-loop optimization for forward and phase- conjugation processes, we demonstrated a spatial resolution of 600 nm up to the unprecedented imaging depth of 7 scattering mean free paths.