Researcher profile

Jan Christian Blaise Cruz

Jan Christian Blaise Cruz contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - Baseline
5works
0followers
1topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2026arXiv

Idea First, Code Later: Disentangling Problem Solving from Code Generation in Evaluating LLMs for Competitive Programming

Large Language Models (LLMs) increasingly succeed on competitive programming problems, yet existing evaluations conflate algorithmic reasoning with code-level implementation. We argue that competitive programming is fundamentally a problem-solving task and propose centering natural-language editorials in both solution generation and evaluation. Generating an editorial prior to code improves solve rates for some LLMs, with substantially larger gains when using expertly written gold editorials. However, even with gold editorials, models continue to struggle with implementation, while the gap between generated and gold editorials reveals a persistent problem-solving bottleneck in specifying correct and complete algorithms. Beyond pass/fail metrics, we diagnose reasoning errors by comparing model-generated editorials to gold standards using expert annotations and validate an LLM-as-a-judge protocol for scalable evaluation. We introduce a dataset of 83 ICPC-style problems with gold editorials and full test suites, and evaluate 19 LLMs, arguing that future benchmarks should explicitly separate problem solving from implementation.

preprint2026arXiv

Multilinguality as Sense Adaptation

We approach multilinguality as sense adaptation: aligning latent meaning representations across languages rather than relying solely on shared parameters and scale. In this paper, we introduce SENse-based Symmetric Interlingual Alignment (SENSIA), which adapts a Backpack language model from one language to another by explicitly aligning sense-level mixtures and contextual representations on parallel data, while jointly training a target-language language modeling loss to preserve fluency. Across benchmarks on four typologically diverse languages, SENSIA generally outperforms comparable multilingual alignment methods and achieves competitive accuracy against monolingual from-scratch baselines while using 2-4x less target-language data. Analyses of learned sense geometry indicate that local sense topology and global structure relative to English are largely preserved, and ablations show that the method is robust in terms of design and scale.

preprint2022arXiv

Using Synthetic Data for Conversational Response Generation in Low-resource Settings

Response generation is a task in natural language processing (NLP) where a model is trained to respond to human statements. Conversational response generators take this one step further with the ability to respond within the context of previous responses. While there are existing techniques for training such models, they all require an abundance of conversational data which are not always available for low-resource languages. In this research, we make three contributions. First, we released the first Filipino conversational dataset collected from a popular Philippine online forum, which we named the PEx Conversations Dataset. Second, we introduce a data augmentation (DA) methodology for Filipino data by employing a Tagalog RoBERTa model to increase the size of the existing corpora. Lastly, we published the first Filipino conversational response generator capable of generating responses related to the previous 3 responses. With the supplementary synthetic data, we were able to improve the performance of the response generator by up to 12.2% in BERTScore, 10.7% in perplexity, and 11.7% in content word usage as compared to training with zero synthetic data.

preprint2020arXiv

Establishing Baselines for Text Classification in Low-Resource Languages

While transformer-based finetuning techniques have proven effective in tasks that involve low-resource, low-data environments, a lack of properly established baselines and benchmark datasets make it hard to compare different approaches that are aimed at tackling the low-resource setting. In this work, we provide three contributions. First, we introduce two previously unreleased datasets as benchmark datasets for text classification and low-resource multilabel text classification for the low-resource language Filipino. Second, we pretrain better BERT and DistilBERT models for use within the Filipino setting. Third, we introduce a simple degradation test that benchmarks a model's resistance to performance degradation as the number of training samples are reduced. We analyze our pretrained model's degradation speeds and look towards the use of this method for comparing models aimed at operating within the low-resource setting. We release all our models and datasets for the research community to use.

preprint2020arXiv

Localization of Fake News Detection via Multitask Transfer Learning

The use of the internet as a fast medium of spreading fake news reinforces the need for computational tools that combat it. Techniques that train fake news classifiers exist, but they all assume an abundance of resources including large labeled datasets and expert-curated corpora, which low-resource languages may not have. In this work, we make two main contributions: First, we alleviate resource scarcity by constructing the first expertly-curated benchmark dataset for fake news detection in Filipino, which we call "Fake News Filipino." Second, we benchmark Transfer Learning (TL) techniques and show that they can be used to train robust fake news classifiers from little data, achieving 91% accuracy on our fake news dataset, reducing the error by 14% compared to established few-shot baselines. Furthermore, lifting ideas from multitask learning, we show that augmenting transformer-based transfer techniques with auxiliary language modeling losses improves their performance by adapting to writing style. Using this, we improve TL performance by 4-6%, achieving an accuracy of 96% on our best model. Lastly, we show that our method generalizes well to different types of news articles, including political news, entertainment news, and opinion articles.