Source author record

Youngjun Lee

Youngjun Lee appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

math.NA · Numerical Analysis · astro-ph.HE · Machine Learning · physics.comp-ph · Artificial Intelligence · Databases · Distributed, Parallel, and Cluster Computing

Catalog footprint

What is connected

5 works

8 topics

4 close collaborators

preprint · 2024 · arXiv

CG-Kit: Code Generation Toolkit for Performant and Maintainable Variants of Source Code Applied to Flash-X Hydrodynamics Simulations

CG-Kit is a new code generation toolkit that we propose as a solution for portability and maintainability for scientific computing applications. The development of CG-Kit is rooted in the urgent need created by the shifting landscape of high-performance computing platforms and the algorithmic complexities of a particular large-scale multiphysics application: Flash-X. This combination leads to unique challenges including handling an existing large code base in Fortran and/or C/C++, subdivision of code into a great variety of units supporting a wide range of physics and numerical methods, different parallelization techniques for distributed- and shared-memory systems and accelerator devices, and heterogeneity of computing platforms requiring coexisting variants of parallel algorithms. The challenges demand that developers determine custom abstractions and granularity for code generation. CG-Kit tackles this with standalone tools that can be combined into highly specific and, we argue, highly effective portability and maintainability tool chains. Here we present the design of our new tools: parametrized source trees, control flow graphs, and recipes. The tools are implemented in Python. Although the tools are agnostic to the programming language of the source code, we focus on C/C++ and Fortran. Code generation experiments demonstrate the generation of variants of parallel algorithms: first, multithreaded variants of the basic AXPY operation (scalar-vector multiplication and vector-vector addition) to introduce the application of CG-Kit tool chains; and second, variants of parallel algorithms within a hydrodynamics solver, called Spark, from Flash-X that operates on block-structured adaptive meshes. In summary, code generated by CG-Kit achieves a reduction of over 60% in the original C/C++/Fortran source code.
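
For readers unfamiliar with the kernel named in the abstract, AXPY computes y <- a*x + y elementwise. The following is a minimal Python sketch of a serial variant and a chunked, thread-based variant of that kernel. It is not CG-Kit output and does not use any CG-Kit API; the function names `axpy_serial` and `axpy_chunked` and the `num_workers` parameter are hypothetical, chosen only to illustrate what "variants of the same parallel algorithm" can look like.

```python
from concurrent.futures import ThreadPoolExecutor


def axpy_serial(a, x, y):
    """Return a*x + y elementwise for two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


def axpy_chunked(a, x, y, num_workers=2):
    """Same result as axpy_serial, computed on contiguous chunks by a thread pool.

    Hypothetical variant for illustration only: each worker handles one
    contiguous index range, and the partial results are concatenated in order.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    chunk = max(1, -(-n // num_workers))  # ceiling division, at least 1
    bounds = [(lo, min(lo + chunk, n)) for lo in range(0, n, chunk)]

    def work(bound):
        lo, hi = bound
        return axpy_serial(a, x[lo:hi], y[lo:hi])

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        parts = list(pool.map(work, bounds))
    return [value for part in parts for value in part]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]
    y = [10.0, 20.0, 30.0]
    print(axpy_serial(2.0, x, y))
    print(axpy_chunked(2.0, x, y, num_workers=2))
```

Both variants share the same arithmetic and differ only in how the index range is partitioned, which is the kind of duplication a code generator aims to factor out. In standard CPython the threaded variant is illustrative rather than faster, because pure-Python loops are serialized by the global interpreter lock.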

preprint · 2023 · arXiv

Meta-Query-Net: Resolving Purity-Informativeness Dilemma in Open-set Active Learning

Unlabeled examples awaiting annotation inevitably contain open-set noise. A few active learning studies have attempted to deal with this open-set noise for sample selection by filtering out the noisy examples. However, because focusing on the purity of examples in a query set leads to overlooking the informativeness of the examples, the best balance of purity and informativeness remains an important question. In this paper, to solve this purity-informativeness dilemma in open-set active learning, we propose a novel Meta-Query-Net (MQ-Net) that adaptively finds the best balance between the two factors. Specifically, by leveraging the multi-round property of active learning, we train MQ-Net using a query set without an additional validation set. Furthermore, a clear dominance relationship between unlabeled examples is effectively captured by MQ-Net through a novel skyline regularization. Extensive experiments on multiple open-set active learning scenarios demonstrate that the proposed MQ-Net achieves a 20.14% improvement in accuracy compared with the state-of-the-art methods.
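
The "dominance relationship" mentioned in the abstract can be illustrated with the generic skyline (Pareto dominance) notion from the database literature. The sketch below assumes each unlabeled example carries a hypothetical (purity, informativeness) score pair where higher is better. It shows only the dominance test and the resulting skyline set; it is not the paper's skyline regularization or the MQ-Net training procedure.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere."""
    at_least_as_good = all(ai >= bi for ai, bi in zip(a, b))
    strictly_better = any(ai > bi for ai, bi in zip(a, b))
    return at_least_as_good and strictly_better


def skyline(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (purity, informativeness) scores for four unlabeled examples.
    scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.2, 0.9)]
    print(skyline(scores))
```

In this toy data, (0.4, 0.4) is dominated by (0.5, 0.5), while the remaining three examples are mutually incomparable: each trades purity against informativeness differently, which is where a learned balance between the two factors becomes necessary.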

preprint · 2022 · arXiv

Adaptive Model Pooling for Online Deep Anomaly Detection from a Complex Evolving Data Stream

Online anomaly detection from a data stream is critical for the safety and security of many applications but is facing severe challenges due to complex and evolving data streams from IoT devices and cloud-based infrastructures. Unfortunately, existing approaches fall short of these challenges; online anomaly detection methods bear the burden of handling the complexity while offline deep anomaly detection methods suffer from the evolving data distribution. This paper presents a framework for online deep anomaly detection, ARCUS, which can be instantiated with any autoencoder-based deep anomaly detection method. It handles the complex and evolving data streams using an adaptive model pooling approach with two novel techniques: concept-driven inference and drift-aware model pool update; the former detects anomalies with a combination of models most appropriate for the complexity, and the latter adapts the model pool dynamically to fit the evolving data streams. In comprehensive experiments with ten data sets which are both high-dimensional and concept-drifted, ARCUS improved the anomaly detection accuracy of the streaming variants of state-of-the-art autoencoder-based methods and that of the state-of-the-art streaming anomaly detection methods by up to 22% and 37%, respectively.

preprint · 2021 · arXiv

A recursive system-free single-step temporal discretization method for finite difference methods

Single-stage or single-step high-order temporal discretizations of partial differential equations (PDEs) have shown great promise in delivering high-order accuracy in time with efficient use of computational resources. There has been much success in developing such methods for finite volume method (FVM) discretizations of PDEs. The Picard Integral formulation (PIF) has recently made such single-stage temporal methods accessible for finite difference method (FDM) discretizations. PIF methods rely on the so-called Lax-Wendroff procedures to tightly couple spatial and temporal derivatives through the governing PDE system to construct high-order Taylor series expansions in time. Going to higher than third order in time requires the calculation of Jacobian-like derivative tensor-vector contractions of increasingly high degree, greatly adding to the complexity of such schemes. To that end, we present in this paper a method for calculating these tensor contractions through a recursive application of a discrete Jacobian operator that readily and efficiently computes the needed contractions entirely agnostic of the PDE system being solved.
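
One generic way to picture a "recursive application of a discrete Jacobian operator" is nested central differencing of the flux function: differencing once along a direction v approximates the Jacobian-vector product, and differencing that result along a second direction w approximates the Hessian contracted with v and w. The sketch below is an illustration under that assumption, not the operator defined in the paper. The function name `contraction`, the step size `eps`, and the two-component `toy_flux` are all hypothetical.

```python
def contraction(flux, u, directions, eps=1e-3):
    """Approximate the derivative tensor of flux at u contracted with each direction.

    With no directions this is flux(u); with one it approximates the
    Jacobian-vector product; with two, the Hessian contraction; and so on.
    Each level is a central difference of the level below, so the flux is
    treated as a black box and no analytic Jacobian is needed.
    """
    if not directions:
        return flux(u)
    *rest, w = directions
    u_plus = [ui + eps * wi for ui, wi in zip(u, w)]
    u_minus = [ui - eps * wi for ui, wi in zip(u, w)]
    f_plus = contraction(flux, u_plus, rest, eps)
    f_minus = contraction(flux, u_minus, rest, eps)
    return [(a - b) / (2.0 * eps) for a, b in zip(f_plus, f_minus)]


def toy_flux(u):
    """Hypothetical two-component quadratic flux, used only for this example."""
    return [0.5 * u[0] ** 2, u[0] * u[1]]


if __name__ == "__main__":
    u = [2.0, 3.0]
    v = [1.0, 1.0]
    w = [1.0, 2.0]
    print(contraction(toy_flux, u, [v]))     # approximates J(u) v
    print(contraction(toy_flux, u, [v, w]))  # approximates H(u)[v, w]
```

For this toy flux the exact Jacobian-vector product at u = (2, 3) with v = (1, 1) is (2, 5), and the exact Hessian contraction with v and w = (1, 2) is (1, 3), so the approximations can be checked by hand. The cost doubles with each added direction, and the choice of `eps` trades truncation error against round-off.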

preprint · 2020 · arXiv

A single-step third-order temporal discretization with Jacobian-free and Hessian-free formulations for finite difference methods

Discrete updates of numerical partial differential equations (PDEs) rely on two branches of temporal integration. The first branch is the widely-adopted, traditionally popular approach of the method-of-lines (MOL) formulation, in which multi-stage Runge-Kutta (RK) methods have shown great success in solving ordinary differential equations (ODEs) at high-order accuracy. The clear separation between the temporal and the spatial discretizations of the governing PDEs makes the RK methods highly adaptable. In contrast, the second branch of formulation using the so-called Lax-Wendroff procedure escalates the use of tight couplings between the spatial and temporal derivatives to construct high-order approximations of temporal advancements in the Taylor series expansions. In the last two decades, modern numerical methods have explored the second route extensively and have proposed a set of computationally efficient single-stage, single-step high-order accurate algorithms. In this paper, we present an algorithmic extension of the method called the Picard integration formulation (PIF) that belongs to the second branch of the temporal updates. The extension presented in this paper furnishes ease of calculating the Jacobian and Hessian terms necessary for third-order accuracy in time.
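
To make the role of the Jacobian and Hessian terms concrete, the following is a sketch of a third-order, single-step update for a one-dimensional conservation law, written in the time-averaged-flux form commonly associated with Picard integral formulations. The notation (U for the conserved state, F for the flux, F^avg for the time-averaged flux) is assumed here for illustration and may differ from the paper's.

```latex
% Governing system (assumed 1D form): \partial_t U + \partial_x F(U) = 0
\begin{aligned}
U^{n+1} &= U^{n} - \Delta t \,\partial_x F^{\mathrm{avg}}, \\
F^{\mathrm{avg}} &= \frac{1}{\Delta t}\int_{t^{n}}^{t^{n+1}} F\,dt
  \;\approx\; F + \frac{\Delta t}{2}\,F_t + \frac{\Delta t^{2}}{6}\,F_{tt}, \\
F_t &= \frac{\partial F}{\partial U}\,U_t, \\
F_{tt} &= \frac{\partial^{2} F}{\partial U^{2}}\,[U_t, U_t]
  + \frac{\partial F}{\partial U}\,U_{tt}, \\
U_t &= -\partial_x F, \qquad U_{tt} = -\partial_x F_t .
\end{aligned}
```

The first line is exact and follows from integrating the governing equation over one time step. The second line is a Taylor expansion of F in time, integrated term by term. The Lax-Wendroff procedure appears in the last line, where time derivatives of U are replaced by spatial derivatives through the governing equation. The Jacobian enters through F_t and the Hessian through F_tt, which is why third-order accuracy in time requires both.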