Researcher profile

Kuan-Yu Chen

Kuan-Yu Chen contributes to research discovery and scholarly infrastructure.

Researcher · Affiliation not imported · Open to collaborate

Trust snapshot

Quick read

Trust 21 (Emerging) · Verification L1 · Unclaimed author
8 works
0 followers
7 topics
4 close collaborators

Published work

8 published items

preprint · 2026 · arXiv

BoostLLM: Boosting-inspired LLM Fine-tuning for Few-shot Tabular Classification

Large language models (LLMs) have recently been adapted to tabular prediction by serializing structured features into natural language, but their performance in low-data regimes remains limited compared to gradient-boosted decision trees (GBDTs). In this work, we revisit the boosting paradigm, traditionally associated with tree ensembles, and ask whether it can be applied as a general training principle for LLM fine-tuning. We propose BoostLLM, a framework that transforms parameter-efficient fine-tuning into a multi-round residual optimization process by training sequential PEFT adapters as weak learners. To incorporate tabular inductive bias, BoostLLM integrates decision-tree paths as a second input view alongside raw features; analysis reveals that the path view acts as a structured teacher in early training steps before the model shifts toward feature-driven representations. Empirically, BoostLLM achieves consistent improvements over standard fine-tuning across multiple LLM backbones and datasets, matching or surpassing XGBoost across a wide range of shot counts and outperforming GPT-4o-based methods with a 4B model. We further show that the framework scales: pairing with stronger tree models and extended boosting horizons yields additional gains under appropriate stabilization. These results suggest that boosting can serve as a general training principle for LLM fine-tuning, particularly in low-data regimes for structured data.

preprint · 2022 · arXiv

First Results from the Taiwan Axion Search Experiment with Haloscope at 19.6 $μ$eV

This Letter reports on the first results from the Taiwan Axion Search Experiment with Haloscope, a search for axions using a microwave cavity at frequencies between 4.70750 and 4.79815 GHz. Apart from the non-axion signals, no candidates with a significance more than 3.355 were found. The experiment excludes models with the axion-two-photon coupling $\left|g_{aγγ}\right|\gtrsim 8.2\times 10^{-14}$ GeV$^{-1}$, a factor of eleven above the benchmark KSVZ model, reaching a sensitivity three orders of magnitude better than any existing limits in the mass range 19.4687 < $m_a$ < 19.8436 $μ$eV. It is also the first time that a haloscope-type experiment places constraints on $g_{aγγ}$ in this mass region.

preprint · 2022 · arXiv

Non-autoregressive Transformer-based End-to-end ASR using BERT

Transformer-based models have led to significant innovation in classical and practical subjects as varied as speech processing, natural language processing, and computer vision. On top of the Transformer, attention-based end-to-end automatic speech recognition (ASR) models have recently become popular. Specifically, non-autoregressive modeling, which boasts fast inference and performance comparable to conventional autoregressive methods, is an emerging research topic. In the context of natural language processing, the bidirectional encoder representations from Transformers (BERT) model has received widespread attention, partially due to its ability to infer contextualized word representations and to enable superior performance for downstream tasks while needing only simple fine-tuning. Motivated by the success, we intend to view speech recognition as a downstream task of BERT, thus an ASR system is expected to be deduced by performing fine-tuning. Consequently, to not only inherit the advantages of non-autoregressive ASR models but also enjoy the benefits of a pre-trained language model (e.g., BERT), we propose a non-autoregressive Transformer-based end-to-end ASR model based on BERT. We conduct a series of experiments on the AISHELL-1 dataset that demonstrate competitive or superior results for the model when compared to state-of-the-art ASR systems.

preprint · 2022 · arXiv

Taiwan Axion Search Experiment with Haloscope: CD102 Analysis Details

This paper presents the analysis of the data acquired during the first physics run of the Taiwan Axion Search Experiment with Haloscope (TASEH), a search for axions using a microwave cavity at frequencies between 4.70750 and 4.79815 GHz. The data were collected from October 13, 2021 to November 15, 2021, and are referred to as the CD102 data. The analysis of the TASEH CD102 data excludes models with the axion-two-photon coupling $|g_{aγγ}| \gtrsim 8.2\times 10^{-14}$ GeV$^{-1}$, a factor of eleven above the benchmark KSVZ model for the mass range 19.4687 < $m_a$ < 19.8436 $μ$eV.

preprint · 2022 · arXiv

Taiwan Axion Search Experiment with Haloscope: Designs and operations

We report on a haloscope axion search experiment near $19.6\ {\rm μeV}$ from the TASEH collaboration. The experiment is carried out via a frequency-tunable cavity detector with a volume $V = 0.234\ {\rm liter}$ in a magnetic field $B_0 = 8\ {\rm T}$. With a signal receiver that has a system noise temperature $T_{\rm sys} \cong 2.2\ {\rm K}$ and experiment time about 1 month, the search excludes values of the axion-photon coupling constant $g_{\rm aγγ} \gtrsim 8.1 \times 10^{-14} \ {\rm GeV}^{-1}$, a factor of 11 above the KSVZ model, at the 95\% confidence level in the mass range of $19.4687-19.8436\ {\rm μeV}$. We present the experimental setup and procedures to accomplish this search.

preprint · 2021 · arXiv

Speech Recognition by Simply Fine-tuning BERT

We propose a simple method for automatic speech recognition (ASR) by fine-tuning BERT, which is a language model (LM) trained on large-scale unlabeled text data and can generate rich contextual representations. Our assumption is that given a history context sequence, a powerful LM can narrow the range of possible choices and the speech signal can be used as a simple clue. Hence, compared to conventional ASR systems that train a powerful acoustic model (AM) from scratch, we believe that speech recognition is possible by simply fine-tuning a BERT model. As an initial study, we demonstrate the effectiveness of the proposed idea on the AISHELL dataset and show that stacking a very simple AM on top of BERT can yield reasonable performance.

preprint · 2020 · arXiv

An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering

In a spoken multiple-choice question answering (SMCQA) task, given a passage, a question, and multiple choices all in the form of speech, the machine needs to pick the correct choice to answer the question. While the audio could contain useful cues for SMCQA, usually only the auto-transcribed text is utilized in system development. Thanks to large-scale pre-trained language representation models, such as the bidirectional encoder representations from transformers (BERT), systems with only auto-transcribed text can still achieve a certain level of performance. However, previous studies have evidenced that acoustic-level statistics can offset text inaccuracies caused by the automatic speech recognition systems or representation inadequacy lurking in word embedding generators, thereby making the SMCQA system robust. Along this line of research, this study concentrates on designing a BERT-based SMCQA framework, which not only inherits the advantages of contextualized language representations learned by BERT, but also integrates the complementary acoustic-level information distilled from audio with the text-level information. Consequently, an audio-enriched BERT-based SMCQA framework is proposed. A series of experiments demonstrates remarkable improvements in accuracy over selected baselines and SOTA systems on a published Chinese SMCQA dataset.

preprint · 2020 · arXiv

Investigation of Sentiment Controllable Chatbot

Conventional seq2seq chatbot models attempt only to find sentences with the highest probabilities conditioned on the input sequences, without considering the sentiment of the output sentences. In this paper, we investigate four models to scale or adjust the sentiment of the chatbot response: a persona-based model, reinforcement learning, a plug and play model, and CycleGAN, all based on the seq2seq model. We also develop machine-evaluated metrics to estimate whether the responses are reasonable given the input. These metrics, together with human evaluation, are used to analyze the performance of the four models in terms of different aspects; reinforcement learning and CycleGAN are shown to be very attractive.
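
A note on the TASEH entries above: they quote both a cavity frequency range (4.70750 to 4.79815 GHz) and an axion mass range (19.4687 to 19.8436 μeV). The two are tied by the standard relation h f = m_a c^2, so either can be recovered from the other. The sketch below is only an illustrative consistency check, not code from the TASEH collaboration; the function names are hypothetical, and the Planck constant is taken from the exact SI defining values.

```python
# Illustrative check of the haloscope relation h * f = m_a * c^2.
# Function names are hypothetical; they do not come from the TASEH papers.

# Planck constant in eV*s, from the exact SI defining constants
# h = 6.62607015e-34 J*s and e = 1.602176634e-19 C.
PLANCK_EV_S = 6.62607015e-34 / 1.602176634e-19


def axion_mass_uev_to_freq_ghz(mass_uev: float) -> float:
    """Convert an axion mass (micro-eV, as m_a * c^2) to a photon frequency in GHz."""
    energy_ev = mass_uev * 1e-6
    return energy_ev / PLANCK_EV_S / 1e9


def freq_ghz_to_axion_mass_uev(freq_ghz: float) -> float:
    """Convert a cavity frequency in GHz to the matching axion mass in micro-eV."""
    energy_ev = freq_ghz * 1e9 * PLANCK_EV_S
    return energy_ev * 1e6


if __name__ == "__main__":
    # Mass-range endpoints quoted in the abstracts.
    for mass in (19.4687, 19.8436):
        print(f"{mass} ueV -> {axion_mass_uev_to_freq_ghz(mass):.5f} GHz")
```

Run as a script, this prints the frequencies corresponding to the two mass endpoints. They should agree with the quoted frequency range to within the rounding of the published figures.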