Researcher profile

Jin Song Dong

Jin Song Dong contributes to research discovery and scholarly infrastructure.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
11 works
0 followers
11 topics
4 close collaborators

Published work

11 published items

Preprint, 2026, arXiv

CAMAL: Improving Attention Alignment and Faithfulness with Segmentation Masks

Many vision datasets now provide segmentation masks in addition to annotated images to support a wide range of tasks. In this work, we propose Class Activation Map Attention Learning (CAMAL), an efficient and scalable method that utilizes segmentation masks to improve attention alignment and faithfulness in vision models. Specifically, attention alignment refers to the degree to which a model's attention aligns with ground-truth discriminative regions, while attention faithfulness refers to the degree to which a model's attention influences its decision. Improving both attention alignment and faithfulness is essential for ensuring that model attention is both spatially accurate and causally meaningful. To improve attention alignment and faithfulness in vision models, CAMAL first extracts the model's attention for each image during training and then compares the attention to ground-truth discriminative regions obtained from the corresponding segmentation masks. CAMAL then acts as an auxiliary regularizer, encouraging attention that aligns with ground-truth discriminative regions, while suppressing attention elsewhere. We evaluated CAMAL across two learning paradigms -- Deep Learning (DL) and Deep Reinforcement Learning (DRL) -- and observed consistent, significant improvements in both attention alignment and faithfulness. In particular, CAMAL yields statistically significant gains in attention alignment across all settings, and improves attention faithfulness by over 35% compared to recent work. Moreover, we show that improved attention alignment and faithfulness enhance explainability, while yielding improved or comparable generalization performance without increasing inference cost. These findings demonstrate that the spatial information contained within segmentation masks can be effectively leveraged to guide model attention across learning tasks.
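The auxiliary-regularizer idea in the abstract above can be illustrated with a minimal sketch. This is not CAMAL's actual loss: it assumes the attention map and the segmentation mask are same-sized 2D grids, and the names `attention_alignment_penalty`, `total_loss` and the weight `lam` are hypothetical.

```python
def attention_alignment_penalty(attention, mask):
    """Hypothetical penalty: fraction of attention mass falling outside the mask.

    attention: 2D list of non-negative floats (model attention per pixel).
    mask: 2D list of 0/1 values (1 = ground-truth discriminative region).
    Returns a value in [0, 1]; 0 means all attention lies inside the mask.
    """
    total = sum(sum(row) for row in attention)
    if total == 0:
        return 0.0  # no attention at all, nothing to penalize
    outside = sum(
        a * (1 - m)
        for att_row, mask_row in zip(attention, mask)
        for a, m in zip(att_row, mask_row)
    )
    return outside / total


def total_loss(task_loss, attention, mask, lam=0.5):
    """Task loss plus the weighted auxiliary penalty (lam is an assumed weight)."""
    return task_loss + lam * attention_alignment_penalty(attention, mask)


# Toy example: 80% of the attention mass lies inside the masked region.
attention = [[0.1, 0.4], [0.4, 0.1]]
mask = [[0, 1], [1, 0]]
penalty = attention_alignment_penalty(attention, mask)
```

Minimizing such a term during training would push attention toward the masked region while leaving inference unchanged, which is consistent with the abstract's claim of no added inference cost.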

Preprint, 2026, arXiv

CuBridge: An LLM-Based Framework for Understanding and Reconstructing High-Performance Attention Kernels

Efficient CUDA implementations of attention mechanisms are critical to modern deep learning systems, yet supporting diverse and evolving attention variants remains challenging. Existing frameworks and compilers trade performance for flexibility, while expert-written kernels achieve high efficiency but are difficult to adapt. Recent work explores large language models (LLMs) for GPU kernel generation, but prior studies report unstable correctness and significant performance gaps for complex operators such as attention. We present CuBridge, an LLM-based framework that adapts expert-written attention kernels through a structured lift-transfer-lower workflow. CuBridge starts from expert-written CUDA attention kernels and lifts them into an executable intermediate representation that makes execution orchestration explicit while abstracting low-level CUDA syntax. Given a user-provided PyTorch specification, CuBridge generates and verifies a target IR program, then reconstructs optimized CUDA code via reference-guided lowering. Across diverse attention variants and GPU platforms, CuBridge consistently produces correct kernels and substantially outperforms general frameworks, compiler-based approaches, and prior LLM-based methods.

Preprint, 2022, arXiv

A Prompting-based Approach for Adversarial Example Generation and Robustness Enhancement

Recent years have seen the wide application of NLP models in crucial areas such as finance, medical treatment, and news media, raising concerns about model robustness and vulnerabilities. In this paper, we propose a novel prompt-based adversarial attack to compromise NLP models and a robustness enhancement technique. We first construct malicious prompts for each instance and generate adversarial examples via mask-and-filling under the effect of a malicious purpose. Our attack technique targets the inherent vulnerabilities of NLP models, allowing us to generate samples even without interacting with the victim NLP model, as long as it is based on pre-trained language models (PLMs). Furthermore, we design a prompt-based adversarial training method to improve the robustness of PLMs. As our training method does not actually generate adversarial samples, it can be applied to large-scale training sets efficiently. The experimental results show that our attack method can achieve a high attack success rate with more diverse, fluent and natural adversarial examples. In addition, our robustness enhancement method can significantly improve the robustness of models to resist adversarial attacks. Our work indicates that the prompting paradigm has great potential in probing some fundamental flaws of PLMs and fine-tuning them for downstream tasks.

Preprint, 2022, arXiv

Geometric Theory for Program Testing

Formal methods for verification of programs are extended to testing of programs. Their combination is intended to lead to benefits in reliable program development, testing, and evolution. Our geometric theory of testing is intended to serve as the specification of a testing environment, included as the last stage of a toolchain that assists professional programmers, amateurs, and students of Computer Science. The testing environment includes an automated algorithm which locates errors in a test that has been run, and assists in correcting them. It does this by displaying, on a monitor screen, a stick diagram of causal chains in the execution of the program under test. The diagram can then be navigated backwards in the familiar style of a satnav following roads on a map. This will reveal selections of places at which the program should be modified to remove the error.

Preprint, 2022, arXiv

RegMiner: Towards Constructing a Large Regression Dataset from Code Evolution History

Bug datasets consisting of real-world bugs are important artifacts for researchers and programmers, which lay an empirical and experimental foundation for various SE/PL research such as fault localization, software testing, and program repair. All known state-of-the-art datasets are constructed manually, which inevitably limits their scalability, representativeness, and support for emerging data-driven research. In this work, we propose an approach to automate the process of harvesting replicable regression bugs from the code evolution history. We focus on regression bug datasets, as they (1) manifest how a bug is introduced and fixed (as normal bugs), (2) support regression bug analysis, and (3) incorporate a much stronger specification (i.e., the original passing version) for general bug analysis. Technically, we address an information retrieval problem on code evolution history. Given a code repository, we search for regressions where a test can pass a regression-fixing commit, fail a regression-inducing commit, and pass a working commit. In this work, we address the challenges of (1) identifying potential regression-fixing commits from the code evolution history, (2) migrating the test and its code dependencies over the history, and (3) minimizing the compilation overhead during the regression search. We build our tool, RegMiner, which harvested 537 regressions over 66 projects in 3 weeks, creating the largest replicable regression dataset within the shortest period, to the best of our knowledge. Moreover, our empirical study on our regression dataset shows a gap between the popular regression fault localization techniques (e.g., delta-debugging) and the real fix, revealing new data-driven research opportunities.
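The regression criterion stated in the abstract above (a test passes on the fixing commit, fails on the inducing commit, and passes on an earlier working commit) can be sketched as follows. This is a simplified illustration, not RegMiner's implementation: it assumes a linear commit history and a hypothetical `test_passes` callable standing in for test migration, compilation and execution, and it walks the history one commit at a time.

```python
def find_regression(history, fix_index, test_passes):
    """Search backwards from a candidate regression-fixing commit.

    history: list of commit ids, oldest first.
    fix_index: index of the candidate regression-fixing commit.
    test_passes: callable(commit_id) -> bool, a stand-in for running the test.
    Returns (working, inducing, fixing) commit ids, or None if no regression.
    """
    if not test_passes(history[fix_index]):
        return None  # the test must pass on the fixing commit
    i = fix_index - 1
    if i < 0 or test_passes(history[i]):
        return None  # the commit just before the fix must fail
    # Walk back through failing commits to the earliest failing one.
    while i - 1 >= 0 and not test_passes(history[i - 1]):
        i -= 1
    if i == 0:
        return None  # the test never passed before: a plain bug, not a regression
    return history[i - 1], history[i], history[fix_index]


# Toy history: the test passes on c0 and c1, fails on c2 and c3, passes on c4.
history = ["c0", "c1", "c2", "c3", "c4"]
results = {"c0": True, "c1": True, "c2": False, "c3": False, "c4": True}
regression = find_regression(history, 4, results.get)
```

In the toy history, `c1` is the working commit, `c2` the regression-inducing commit and `c4` the regression-fixing commit.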

Preprint, 2021, arXiv

DeepVisualInsight: Time-Travelling Visualization for Spatio-Temporal Causality of Deep Classification Training

Understanding how the predictions of deep learning models are formed during the training process is crucial to improve model performance and fix model defects, especially when we need to investigate nontrivial training strategies such as active learning, and track the root cause of unexpected training results such as performance degeneration. In this work, we propose a time-travelling visual solution, DeepVisualInsight (DVI), aiming to manifest the spatio-temporal causality while training a deep learning image classifier. The spatio-temporal causality demonstrates how the gradient-descent algorithm and various training data sampling techniques can influence and reshape the layout of learnt input representation and the classification boundaries in consecutive epochs. Such causality allows us to observe and analyze the whole learning process in the visible low-dimensional space. Technically, we propose four spatial and temporal properties and design our visualization solution to satisfy them. These properties preserve the most important information when (inverse-)projecting input samples between the visible low-dimensional and the invisible high-dimensional space, for causal analyses. Our extensive experiments show that, compared to baseline approaches, we achieve the best visualization performance regarding the spatial/temporal properties and visualization efficiency. Moreover, our case study shows that our visual solution can well reflect the characteristics of various training scenarios, showing good potential of DVI as a debugging tool for analyzing deep learning training processes.

Preprint, 2021, arXiv

Extracting Optimal Explanations for Ensemble Trees via Logical Reasoning

Ensemble trees are a popular machine learning model which often yields high prediction performance when analysing structured data. Although individual small decision trees are deemed explainable by nature, an ensemble of large trees is often difficult to understand. In this work, we propose an approach called optimised explanation (OptExplain) that faithfully extracts global explanations of ensemble trees using a combination of logical reasoning, sampling and optimisation. Building on top of this, we propose a method called the profile of equivalent classes (ProClass), which uses MAX-SAT to simplify the explanation even further. Our experimental study on several datasets shows that our approach can provide high-quality explanations for large ensemble tree models, and that it outperforms recent top performers.

Preprint, 2021, arXiv

Repairing Adversarial Texts through Perturbation

It is known that neural networks are subject to attacks through adversarial perturbations, i.e., inputs which are maliciously crafted through perturbations to induce wrong predictions. Furthermore, such attacks are impossible to eliminate, i.e., the adversarial perturbation is still possible after applying mitigation methods such as adversarial training. Multiple approaches have been developed to detect and reject such adversarial inputs, mostly in the image domain. Rejecting suspicious inputs, however, may not always be feasible or ideal. First, normal inputs may be rejected due to false alarms generated by the detection algorithm. Second, denial-of-service attacks may be conducted by feeding such systems with adversarial inputs. To address the gap, in this work, we propose an approach to automatically repair adversarial texts at runtime. Given a text which is suspected to be adversarial, we novelly apply multiple adversarial perturbation methods in a positive way to identify a repair, i.e., a slightly mutated but semantically equivalent text that the neural network correctly classifies. Our approach has been evaluated on multiple models trained for natural language processing tasks and the results show that our approach is effective, i.e., it successfully repairs about 80% of the adversarial texts. Furthermore, depending on the applied perturbation method, an adversarial text could be repaired in as little as one second on average.

Preprint, 2021, arXiv

Self-Checking Deep Neural Networks in Deployment

The widespread adoption of Deep Neural Networks (DNNs) in important domains raises questions about the trustworthiness of DNN outputs. Even a highly accurate DNN will make mistakes some of the time, and in settings like self-driving vehicles these mistakes must be quickly detected and properly dealt with in deployment. Just as our community has developed effective techniques and mechanisms to monitor and check programmed components, we believe it is now necessary to do the same for DNNs. In this paper we present DNN self-checking as a process by which internal DNN layer features are used to check DNN predictions. We detail SelfChecker, a self-checking system that monitors DNN outputs and triggers an alarm if the internal layer features of the model are inconsistent with the final prediction. SelfChecker also provides advice in the form of an alternative prediction. We evaluated SelfChecker on four popular image datasets and three DNN models and found that SelfChecker triggers correct alarms on 60.56% of wrong DNN predictions, and false alarms on 2.04% of correct DNN predictions. This is a substantial improvement over prior work (SELFORACLE, DISSECTOR, and ConfidNet). In experiments with self-driving car scenarios, SelfChecker triggers more correct alarms than SELFORACLE for two DNN models (DAVE-2 and Chauffeur) with comparable false alarms. Our implementation is available as open source.
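The alarm-and-advice behaviour described in the abstract above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not SelfChecker's actual procedure: it assumes each monitored layer has already been mapped to a class label by some unspecified means, and it uses a simple majority rule. The function name `self_check` is hypothetical.

```python
from collections import Counter


def self_check(final_prediction, layer_predictions):
    """Return (alarm, advice) for one input.

    final_prediction: class label output by the model.
    layer_predictions: class labels inferred from each monitored layer's
        features (how they are inferred is out of scope for this sketch).
    alarm is True when most layers disagree with the final prediction;
    advice is then the most common disagreeing label, otherwise None.
    """
    if not layer_predictions:
        return False, None  # nothing to check against
    disagreeing = [p for p in layer_predictions if p != final_prediction]
    alarm = len(disagreeing) > len(layer_predictions) / 2
    advice = Counter(disagreeing).most_common(1)[0][0] if alarm else None
    return alarm, advice


# Three of four layers point to "dog" while the model outputs "cat".
result = self_check("cat", ["dog", "dog", "cat", "dog"])
```

Here the sketch raises an alarm and suggests "dog" as the alternative prediction.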

Preprint, 2020, arXiv

Automated synthesis of local time requirement for service composition

Service composition aims at achieving a business goal by composing existing service-based applications or components. The response time of a service is crucial, especially in time-critical business environments, and is often stated as a clause in service level agreements between service providers and service users. To meet the guaranteed response time requirement of a composite service, it is important to select a feasible set of component services such that their response times will collectively satisfy the response time requirement of the composite service. In this work, we use the BPEL modeling language, which aims at specifying Web services. We extend it with timing parameters, and equip it with a formal semantics. Then, we propose a fully automated approach to synthesize the response time requirement of component services modeled using BPEL, in the form of a constraint on the local response times. The synthesized requirement will guarantee the satisfaction of the global response time requirement, statically or dynamically. We implemented our work in a tool, Selamat, and performed several experiments to evaluate the validity of our approach.
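To make the relation between local and global response times concrete, here is a minimal sketch. It is an assumption-laden illustration, not the paper's synthesis algorithm: it covers only two simple composition patterns (purely sequential and purely parallel invocation), whereas BPEL processes are richer, and the function names are hypothetical.

```python
def composite_response_time(local_times, structure):
    """Worst-case response time of a simple composite service.

    local_times: response times of the component services (same unit).
    structure: "sequence" (invoked one after another) or
        "parallel" (invoked concurrently, waiting for all to finish).
    """
    if structure == "sequence":
        return sum(local_times)
    if structure == "parallel":
        return max(local_times)
    raise ValueError(f"unknown structure: {structure}")


def meets_requirement(local_times, structure, global_limit):
    """Check whether local response times satisfy the global requirement."""
    return composite_response_time(local_times, structure) <= global_limit


# Two components with local times 2 and 3 against a global limit of 4.
sequential_ok = meets_requirement([2, 3], "sequence", 4)  # 2 + 3 exceeds 4
parallel_ok = meets_requirement([2, 3], "parallel", 4)    # max(2, 3) is within 4
```

In this simplified setting, the constraint on local response times for a sequence is that their sum stays within the global limit, while for a parallel composition each local time must individually stay within it.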