Source author record

Haowei Hua

Haowei Hua appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computation and Language, cond-mat.mtrl-sci, Machine Learning

Catalog footprint

What is connected

2 works

3 topics

4 close collaborators

Preprint, 2026, arXiv

Empirical Comparison of Encoder-Based Language Models and Feature-Based Supervised Machine Learning Approaches to Automated Scoring of Long Essays

Long context may pose challenges for encoder-only language models in text processing, specifically for automated scoring of essays. This study trained several commonly used encoder-based language models for automated scoring of long essays. The performance of these trained models was evaluated and compared with the ensemble models built upon the base language models with a token limit of 512. The experimented models include BERT-based models (BERT, RoBERTa, DistilBERT, and DeBERTa), ensemble models integrating embeddings from multiple encoder models, and ensemble models of feature-based supervised machine learning models, including Gradient-Boosted Decision Trees, eXtreme Gradient Boosting, and Light Gradient Boosting Machine. We trained, validated, and tested each model on a dataset of 17,307 essays, with an 80%/10%/10% split, and evaluated model performance using Quadratic Weighted Kappa. This study revealed that an ensemble-of-embeddings model that combines multiple pre-trained language model representations with a gradient-boosting classifier as the ensemble model significantly outperforms individual language models at scoring long essays.
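The abstract names Quadratic Weighted Kappa as its evaluation metric without defining it. As an illustration only, here is a minimal pure-Python sketch of the standard definition for integer ratings. The function name `quadratic_weighted_kappa` and its parameters are hypothetical and are not taken from the paper; the authors' actual evaluation code is not shown in this record.

```python
def quadratic_weighted_kappa(y_true, y_pred, min_rating=None, max_rating=None):
    """Quadratic Weighted Kappa for two equal-length lists of integer ratings.

    Illustrative sketch of the textbook definition; not the paper's code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    if min_rating is None:
        min_rating = min(min(y_true), min(y_pred))
    if max_rating is None:
        max_rating = max(max(y_true), max(y_pred))

    k = max_rating - min_rating + 1  # number of rating categories
    n = len(y_true)
    if k < 2:
        return 1.0  # only one category exists, so agreement is trivially perfect

    # Observed confusion matrix: rows are true ratings, columns are predictions.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[t - min_rating][p - min_rating] += 1

    # Marginal histograms give the agreement expected by chance.
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    numerator = 0.0
    denominator = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2  # quadratic disagreement penalty
            expected = hist_true[i] * hist_pred[j] / n
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0.0:
        return 1.0  # both raters used a single identical rating throughout
    return 1.0 - numerator / denominator
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. In practice, scikit-learn's `cohen_kappa_score` with `weights="quadratic"` computes the same quantity.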

Preprint, 2026, arXiv

Scalable Dielectric Tensor Predictions for Inorganic Materials using Equivariant Graph Neural Networks

Accurate prediction of dielectric tensors is essential for accelerating the discovery of next-generation inorganic dielectric materials. Existing machine learning approaches, such as equivariant graph neural networks, typically rely on specially-designed network architectures to enforce O(3) equivariance. However, to preserve equivariance, these specially-designed models restrict the update of equivariant features during message passing to linear transformations or gated equivariant nonlinearities. The inability to implicitly characterize more complex nonlinear structures may reduce the predictive accuracy of the model. In this study, we introduce a frame-averaging-based approach to achieve equivariant dielectric tensor prediction. We propose GoeCTP, an O(3)-equivariant framework that predicts dielectric tensors without imposing any structural restrictions on the backbone network. We benchmark its performance against several state-of-the-art models and further employ it for large-scale virtual screening of thermodynamically stable materials from the Materials Project database. GoeCTP successfully identifies various promising candidates, such as Zr(InBr$_3$)$_2$ (band gap $E_g = 2.41$ eV, dielectric constant $\overline{\varepsilon} = 194.72$) and SeI$_2$ (anisotropy ratio $\alpha_r = 96.763$), demonstrating its accuracy and efficiency in accelerating the discovery of advanced inorganic dielectric materials.
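The abstract describes frame averaging as a way to obtain O(3) equivariance without constraining the backbone. The sketch below illustrates the generic idea on a finite point cloud with PCA-derived frames. It is an assumption-laden toy, not GoeCTP's implementation: `pca_frames`, `toy_backbone`, and `frame_averaged_tensor` are hypothetical names, the backbone is an arbitrary fixed-weight network, and a real crystal model would have to handle periodicity and would likely build its frames differently.

```python
import itertools

import numpy as np


def pca_frames(points):
    """Center the points and return the 8 sign-flipped PCA frames (O(3) case)."""
    centered = points - points.mean(axis=0)
    covariance = centered.T @ centered
    # Columns of eigvecs are principal axes, ordered by ascending eigenvalue.
    _, eigvecs = np.linalg.eigh(covariance)
    frames = [
        eigvecs * np.array(signs)  # broadcasting flips the sign of each column
        for signs in itertools.product((1.0, -1.0), repeat=3)
    ]
    return centered, frames


def toy_backbone(points, w1, w2):
    """Arbitrary, non-equivariant map from (n, 3) coordinates to a 3x3 matrix."""
    hidden = np.tanh(points @ w1)                   # shape (n, h)
    raw = (hidden @ w2).sum(axis=0).reshape(3, 3)   # pool over points
    return 0.5 * (raw + raw.T)                      # symmetric, like a dielectric tensor


def frame_averaged_tensor(points, backbone):
    """Average F @ backbone(X @ F) @ F.T over all frames F.

    Assumes the three covariance eigenvalues are distinct; degenerate
    geometries need a more careful frame construction.
    """
    centered, frames = pca_frames(points)
    total = np.zeros((3, 3))
    for frame in frames:
        local = centered @ frame                    # coordinates expressed in this frame
        total += frame @ backbone(local) @ frame.T  # map the prediction back
    return total / len(frames)
```

Because the set of frames transforms with the input, applying an orthogonal matrix Q to the points changes the averaged output T to Q T Qᵀ, even though `toy_backbone` itself has no symmetry built in. That is the sense in which the backbone is left unrestricted.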
