Source author record

Jinkyu Kim

Jinkyu Kim appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher. Unclaimed source record

Machine Learning, Computer Vision, Robotics, math-ph, math.MP, Artificial Intelligence, Computation and Language, math.DS, physics.class-ph, Quantitative Methods

Catalog footprint

What is connected

12 works

10 topics

4 close collaborators


preprint, 2026, arXiv

Causality-Aware End-to-End Autonomous Driving via Ego-Centric Joint Scene Modeling

End-to-end autonomous driving, which bypasses traditional modular pipelines by directly predicting future trajectories from sensor inputs, has recently achieved substantial progress. However, existing methods often overlook the causal inter-dependencies in ego-vehicle planning, ignoring the reciprocal relations between the ego vehicle and surrounding agents. This causal oversight leads to inconsistent and unreliable trajectory predictions, especially in interaction-critical scenarios where ego decisions and neighboring agent behaviors must be reasoned about jointly. To address this limitation, we propose CaAD, a Causality-aware end-to-end Autonomous Driving framework that captures these dependencies within a shared latent scene representation. First, we propose an ego-centric joint-causal modeling module that builds on the marginal prediction branch, and learns causal dependencies between the ego vehicle and interaction-relevant agents. Second, we employ a causality-aware policy alignment stage implemented with joint-mode embeddings to align the stochastic ego policy with planning-oriented closed-loop feedback computed from surrounding traffic and map context. On the Bench2Drive and NAVSIM benchmarks, CaAD demonstrates strong closed-loop planning performance, achieving a Driving Score of 87.53 and Success Rate of 71.81 on Bench2Drive, and a PDMS of 91.1 on NAVSIM. The project page is available at https://moonseokha.github.io/CaAD/.

preprint, 2022, arXiv

An Embedding-Dynamic Approach to Self-supervised Learning

A number of recent self-supervised learning methods have shown impressive performance on image classification and other tasks. A somewhat bewildering variety of techniques have been used, not always with a clear understanding of the reasons for their benefits, especially when used in combination. Here we treat the embeddings of images as point particles and consider model optimization as a dynamic process on this system of particles. Our dynamic model combines an attractive force for similar images, a locally dispersive force to avoid local collapse, and a global dispersive force to achieve a globally-homogeneous distribution of particles. The dynamic perspective highlights the advantage of using a delayed-parameter image embedding (a la BYOL) together with multiple views of the same image. It also uses a purely-dynamic local dispersive force (Brownian motion) that shows improved performance over other methods and does not require knowledge of other particle coordinates. The method is called MSBReg, which stands for (i) a Multiview centroid loss, which applies an attractive force to pull different image view embeddings toward their centroid, (ii) a Singular value loss, which pushes the particle system toward spatially homogeneous density, and (iii) a Brownian diffusive loss. We evaluate downstream classification performance of MSBReg on ImageNet as well as transfer learning tasks including fine-grained classification, multi-class object classification, object detection, and instance segmentation. In addition, we show that applying our regularization term to other methods further improves their performance and stabilizes training by preventing mode collapse.
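The attractive force described in this abstract can be illustrated with a small sketch. It assumes the multiview centroid loss is the mean squared distance from each view embedding to the centroid of the views; the abstract does not give the exact formula, and the function names below are hypothetical, not taken from the MSBReg code.

```python
from typing import List, Sequence


def centroid(views: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of the view embeddings of one image."""
    n = len(views)
    dim = len(views[0])
    return [sum(v[d] for v in views) / n for d in range(dim)]


def multiview_centroid_loss(views: Sequence[Sequence[float]]) -> float:
    """Mean squared distance from each view to the centroid (assumed form)."""
    c = centroid(views)
    total = 0.0
    for v in views:
        total += sum((vi - ci) ** 2 for vi, ci in zip(v, c))
    return total / len(views)


def attractive_step(views: Sequence[Sequence[float]], lr: float = 0.5) -> List[List[float]]:
    """Move every view a fraction `lr` of the way toward the centroid."""
    c = centroid(views)
    return [[vi - lr * (vi - ci) for vi, ci in zip(v, c)] for v in views]


if __name__ == "__main__":
    views = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # three views of one image
    print(multiview_centroid_loss(views))
    print(multiview_centroid_loss(attractive_step(views)))
```

Each attractive step leaves the centroid unchanged and shrinks every view's distance to it, so the loss decreases. The singular value and Brownian terms are not sketched here.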

preprint, 2022, arXiv

Grounding Visual Representations with Texts for Domain Generalization

Reducing the representational discrepancy between source and target domains is key to maximizing model generalization. In this work, we advocate for leveraging natural language supervision for the domain generalization task. We introduce two modules to ground visual representations with texts containing typical reasoning of humans: (1) Visual and Textual Joint Embedder and (2) Textual Explanation Generator. The former learns the image-text joint embedding space where we can ground high-level class-discriminative information into the model. The latter leverages an explainable model and generates explanations justifying the rationale behind its decision. To the best of our knowledge, this is the first work to leverage the vision-and-language cross-modality approach for the domain generalization task. Our experiments with a newly created CUB-DG benchmark dataset demonstrate that cross-modality supervision can be successfully used to ground domain-invariant visual representations and improve model generalization. Furthermore, in the large-scale DomainBed benchmark, our proposed method achieves state-of-the-art results and ranks 1st in average performance for five multi-domain datasets. The dataset and code are available at https://github.com/mswzeus/GVRT.

preprint, 2022, arXiv

Multi-Level Branched Regularization for Federated Learning

A critical challenge of federated learning is data heterogeneity and imbalance across clients, which leads to inconsistency between local networks and unstable convergence of global models. To alleviate these limitations, we propose a novel architectural regularization technique that constructs multiple auxiliary branches in each local model by grafting local and global subnetworks at several different levels, and that learns representations of the main pathway in the local model that are congruent with the auxiliary hybrid pathways via online knowledge distillation. The proposed technique effectively makes the global model more robust even in the non-iid setting and is conveniently applicable to various federated learning frameworks without incurring extra communication costs. We perform comprehensive empirical studies and demonstrate remarkable performance gains in terms of accuracy and efficiency compared to existing methods. The source code is available at our project page.

preprint, 2022, arXiv

Occupancy Flow Fields for Motion Forecasting in Autonomous Driving

We propose Occupancy Flow Fields, a new representation for motion forecasting of multiple agents, an important task in autonomous driving. Our representation is a spatio-temporal grid with each grid cell containing both the probability of the cell being occupied by any agent, and a two-dimensional flow vector representing the direction and magnitude of the motion in that cell. Our method successfully mitigates shortcomings of the two most commonly-used representations for motion forecasting: trajectory sets and occupancy grids. Although occupancy grids efficiently represent the probabilistic location of many agents jointly, they do not capture agent motion and lose the agent identities. To this end, we propose a deep learning architecture that generates Occupancy Flow Fields with the help of a new flow trace loss that establishes consistency between the occupancy and flow predictions. We demonstrate the effectiveness of our approach using three metrics on occupancy prediction, motion estimation, and agent ID recovery. In addition, we introduce the problem of predicting speculative agents, which are currently-occluded agents that may appear in the future through dis-occlusion or by entering the field of view. We report experimental results on a large in-house autonomous driving dataset and the public INTERACTION dataset, and show that our model outperforms state-of-the-art models.
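The representation described in this abstract can be pictured with a minimal data-structure sketch. Everything below is an illustrative assumption: the `Cell` and `warp_forward` names are hypothetical, the flow is taken to point forward in time and to be measured in cells per step, and the warp is a nearest-cell toy rather than the paper's flow trace loss.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cell:
    occupancy: float           # probability that any agent occupies the cell
    flow: Tuple[float, float]  # (dx, dy) motion, in cells per time step (assumed unit)


Grid = List[List[Cell]]  # one time step, indexed as grid[y][x]


def empty_grid(width: int, height: int) -> Grid:
    """Grid with zero occupancy and zero flow everywhere."""
    return [[Cell(0.0, (0.0, 0.0)) for _ in range(width)] for _ in range(height)]


def warp_forward(grid: Grid) -> List[List[float]]:
    """Carry each cell's occupancy along its flow vector to the nearest cell.

    The result is the occupancy the flow implies for the next time step,
    which could then be compared against the predicted next-step occupancy.
    """
    height, width = len(grid), len(grid[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            cell = grid[y][x]
            tx = x + round(cell.flow[0])
            ty = y + round(cell.flow[1])
            if 0 <= tx < width and 0 <= ty < height:
                out[ty][tx] = max(out[ty][tx], cell.occupancy)
    return out


if __name__ == "__main__":
    grid = empty_grid(3, 3)
    grid[1][1] = Cell(0.9, (1.0, 0.0))  # one agent moving one cell in +x
    for row in warp_forward(grid):
        print(row)
```

Storing flow next to occupancy is what lets the grid express motion, which a plain occupancy grid cannot.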

preprint, 2022, arXiv

StopNet: Scalable Trajectory and Occupancy Prediction for Urban Autonomous Driving

We introduce a motion forecasting (behavior prediction) method that meets the latency requirements for autonomous driving in dense urban environments without sacrificing accuracy. A whole-scene sparse input representation allows StopNet to scale to predicting trajectories for hundreds of road agents with reliable latency. In addition to predicting trajectories, our scene encoder lends itself to predicting whole-scene probabilistic occupancy grids, a complementary output representation suitable for busy urban environments. Occupancy grids allow the AV to reason collectively about the behavior of groups of agents without processing their individual trajectories. We demonstrate the effectiveness of our sparse input representation and our model in terms of computation and accuracy over three datasets. We further show that co-training consistent trajectory and occupancy predictions improves upon state-of-the-art performance under standard metrics.

preprint, 2020, arXiv

Attentional Bottleneck: Towards an Interpretable Deep Driving Network

Deep neural networks are a key component of behavior prediction and motion generation for self-driving cars. One of their main drawbacks is a lack of transparency: they should provide easy to interpret rationales for what triggers certain behaviors. We propose an architecture called Attentional Bottleneck with the goal of improving transparency. Our key idea is to combine visual attention, which identifies what aspects of the input the model is using, with an information bottleneck that enables the model to only use aspects of the input which are important. This not only provides sparse and interpretable attention maps (e.g. focusing only on specific vehicles in the scene), but it adds this transparency at no cost to model accuracy. In fact, we find slight improvements in accuracy when applying Attentional Bottleneck to the ChauffeurNet model, whereas we find that the accuracy deteriorates with a traditional visual attention model.

preprint, 2013, arXiv

Higher order temporal finite element methods through mixed formulations

The EHP (extended Hamilton's principle) and the MCAP (mixed convolved action principle) provide a new, rigorous weak variational formalism for a broad range of initial boundary value problems in mathematical physics and mechanics. Both approaches utilize the mixed formulation and lead to the development of various space-time finite element methods. In this paper, their potential when adopting temporally higher order approximations is investigated. The classical single-degree-of-freedom dynamical systems are primarily considered to validate and to investigate the performance of the numerical algorithms developed from both formulations. For the undamped system, all the algorithms are found to be symplectic and unconditionally stable with respect to the time step. On the other hand, for the damped system, the approach is shown to be robust and accurate, with good convergence characteristics.
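To make "symplectic and unconditionally stable" concrete for an undamped single-degree-of-freedom system, the sketch below integrates x'' + omega^2 x = 0 with the implicit midpoint rule. This is a standard integrator swapped in for illustration, not the temporal finite element algorithms derived from the EHP or MCAP. The function names and parameter values are assumptions.

```python
from typing import Tuple


def midpoint_step(x: float, v: float, omega2: float, h: float) -> Tuple[float, float]:
    """One implicit midpoint step for x'' + omega2 * x = 0, solved in closed form."""
    ab = 0.25 * h * h * omega2
    v_new = ((1.0 - ab) * v - h * omega2 * x) / (1.0 + ab)
    x_new = x + 0.5 * h * (v + v_new)
    return x_new, v_new


def energy(x: float, v: float, omega2: float) -> float:
    """Total energy per unit mass: kinetic plus elastic."""
    return 0.5 * v * v + 0.5 * omega2 * x * x


def simulate(x0: float, v0: float, omega2: float, h: float, steps: int) -> Tuple[float, float]:
    """Advance the oscillator `steps` times and return the final state."""
    x, v = x0, v0
    for _ in range(steps):
        x, v = midpoint_step(x, v, omega2, h)
    return x, v


if __name__ == "__main__":
    omega2, h = 4.0, 5.0  # deliberately large time step
    x, v = simulate(1.0, 0.0, omega2, h, 1000)
    print(energy(1.0, 0.0, omega2), energy(x, v, omega2))
```

Even with a time step far larger than the oscillation period, the energy neither grows nor decays. A large step still costs accuracy in phase.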

preprint, 2012, arXiv

Extended Hamilton's principle

Hamilton's principle is extended so that its initial conditions are compatible with the strong form. To exploit a number of computational and theoretical benefits for dynamical systems, the mixed variational formulation is preferred for systems other than particle systems. With this formulation and Rayleigh's dissipation function, all the pertinent initial and boundary conditions can be obtained for both conservative and non-conservative dynamical systems. Based on this extended framework of Hamilton's principle, a numerical method for representative lumped-parameter models is also developed by applying Galerkin's method in the time domain, and its numerical properties and simulation results are discussed.

preprint, 2012, arXiv

Multiplying decomposition of stress/strain, constitutive/compliance relations, and strain energy

To account for phenomenological theories and a set of invariants, stress and strain are usually decomposed into a pair of pressure and deviatoric stress and a pair of volumetric strain and deviatoric strain. However, the conventional decomposition method focuses only on individual stress and strain, so it cannot be directly applied to formulations in either the Finite Element Method (FEM) or the Boundary Element Method (BEM). In this paper, a simpler, more general, and widely applicable decomposition is suggested. The new decomposition method applies multiplying decomposition tensors or matrices not only to stress and strain but also to the constitutive and compliance relations. With this, we also show its practical usage in FEM and BEM in terms of tensors and matrices.
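The conventional decomposition that this abstract starts from can be sketched for a 3x3 stress tensor as follows. It shows only the standard split into pressure and deviatoric stress, not the multiplying decomposition the paper proposes. The sign convention (pressure positive in compression) and the function name are assumptions.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def decompose_stress(sigma: Matrix) -> Tuple[float, Matrix]:
    """Split a 3x3 stress tensor into pressure and deviatoric stress."""
    mean = (sigma[0][0] + sigma[1][1] + sigma[2][2]) / 3.0
    pressure = -mean  # assumed convention: compression is positive pressure
    deviatoric = [
        [sigma[i][j] - (mean if i == j else 0.0) for j in range(3)]
        for i in range(3)
    ]
    return pressure, deviatoric


if __name__ == "__main__":
    sigma = [[10.0, 2.0, 0.0],
             [2.0, -4.0, 1.0],
             [0.0, 1.0, 3.0]]
    p, s = decompose_stress(sigma)
    print(p)
    for row in s:
        print(row)
```

The deviatoric part is trace-free by construction, and the shear components pass through unchanged.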

preprint, 2011, arXiv

HiTRACE: High-throughput robust analysis for capillary electrophoresis

Motivation: Capillary electrophoresis (CE) of nucleic acids is a workhorse technology underlying high-throughput genome analysis and large-scale chemical mapping for nucleic acid structural inference. Despite the wide availability of CE-based instruments, there remain challenges in leveraging their full power for quantitative analysis of RNA and DNA structure, thermodynamics, and kinetics. In particular, the slow rate and poor automation of available analysis tools have bottlenecked a new generation of studies involving hundreds of CE profiles per experiment. Results: We propose a computational method called high-throughput robust analysis for capillary electrophoresis (HiTRACE) to automate the key tasks in large-scale nucleic acid CE analysis, including the profile alignment that has heretofore been a rate-limiting step in the highest throughput experiments. We illustrate the application of HiTRACE on thirteen data sets representing four different RNAs, three chemical modification strategies, and up to 480 single mutant variants; the largest data sets each include 87,360 bands. By applying a series of robust dynamic programming algorithms, HiTRACE outperforms prior tools in terms of alignment and fitting quality, as assessed by measures including the correlation of quantified band intensities between replicate data sets. Furthermore, while the smallest of these data sets required 7 to 10 hours of manual intervention using prior approaches, HiTRACE quantitation of even the largest data sets herein was achieved in 3 to 12 minutes. The HiTRACE method therefore resolves a critical barrier to the efficient and accurate analysis of nucleic acid structure in experiments involving tens of thousands of electrophoretic bands.
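As a rough picture of how dynamic programming can align two intensity profiles, the sketch below computes a classic dynamic time warping cost. This is a generic textbook technique chosen for illustration. The abstract does not specify HiTRACE's own algorithms, and the function name and sample profiles are hypothetical.

```python
from typing import List, Sequence


def dtw_cost(a: Sequence[float], b: Sequence[float]) -> float:
    """Minimum cumulative |a[i] - b[j]| over monotone alignments of a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b position repeats
                cost[i][j - 1],      # b advances, a position repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    reference = [1.0, 2.0, 3.0]
    stretched = [1.0, 2.0, 2.0, 3.0]  # same profile with one band repeated
    print(dtw_cost(reference, stretched))
```

The table is filled in O(n * m) time, which is what makes automated alignment of many profiles feasible compared with manual adjustment.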

preprint, 2011, arXiv

The mixed convolved action

A series of stationary principles are developed for dynamical systems by formulating the concept of mixed convolved action, which is written in terms of mixed variables, using temporal convolutions and fractional derivatives. Dynamical systems with discrete and continuous spatial representations are considered as initial applications. In each case, a single scalar functional provides the governing differential equations, along with all the pertinent initial and boundary conditions, as the Euler-Lagrange equations emanating from the stationarity of this mixed convolved action. Both conservative and non-conservative processes can be considered within a common framework, thus resolving a long-standing limitation of variational approaches for dynamical systems. Several results in fractional calculus also are developed.
