Source author record

Chaowei Yang

Chaowei Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: physics.soc-ph; Artificial Intelligence; Populations and Evolution; Social and Information Networks

Catalog footprint

What is connected

4 works

4 topics

4 close collaborators

Preprint, 2026, arXiv

Benchmarking Small Language Models and Small Reasoning Language Models on System Log Severity Classification

System logs are crucial for monitoring and diagnosing modern computing infrastructure, but their scale and complexity require reliable and efficient automated interpretation. Since severity levels are predefined metadata in system log messages, having a model merely classify them offers limited standalone practical value, revealing little about its underlying ability to interpret system logs. We argue that severity classification is more informative when treated as a benchmark for probing runtime log comprehension rather than as an end task. Using real-world journalctl data from Linux production servers, we evaluate nine small language models (SLMs) and small reasoning language models (SRLMs) under zero-shot, few-shot, and retrieval-augmented generation (RAG) prompting. The results reveal strong stratification. Qwen3-4B achieves the highest accuracy at 95.64% with RAG, while Gemma3-1B improves from 20.25% under few-shot prompting to 85.28% with RAG. Notably, the tiny Qwen3-0.6B reaches 88.12% accuracy despite weak performance without retrieval. In contrast, several SRLMs, including Qwen3-1.7B and DeepSeek-R1-Distill-Qwen-1.5B, degrade substantially when paired with RAG. Efficiency measurements further separate models: most Gemma and Llama variants complete inference in under 1.2 seconds per log, whereas Phi-4-Mini-Reasoning exceeds 228 seconds per log while achieving <10% accuracy. These findings suggest that (1) architectural design, (2) training objectives, and (3) the ability to integrate retrieved context under strict output constraints jointly determine performance. By emphasizing small, deployable models, this benchmark aligns with real-time requirements of digital twin (DT) systems and shows that severity classification serves as a lens for evaluating model competence and real-time deployability, with implications for root cause analysis (RCA) and broader DT integration.

Preprint, 2021, arXiv

Condition Sensing for Electricity Infrastructure in Disasters by Mining Public Topics from Social Media

Timely and reliable sensing of infrastructure conditions is critical in disaster management for planning effective infrastructure restoration. Social media, a near real-time information source, has been widely used in disasters to form timely situational awareness. Yet using social media to sense electricity infrastructure conditions has not been explored. This study addresses that gap by mining public topics from social media. To this end, we proposed a systematic and customized approach in which (1) electricity-related social media data is extracted by a classifier built on Bidirectional Encoder Representations from Transformers (BERT); and (2) public topics are modeled with unigrams, bigrams, and trigrams to capture the formulaic expressions of infrastructure conditions in social media. Electricity infrastructure in Florida impacted by Hurricane Irma is studied for illustration and demonstration. Results show that the proposed approach is capable of sensing the temporal evolution and geographic differences of electricity infrastructure conditions.
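
The n-gram step in (2) can be illustrated with a minimal sketch. This is not the authors' topic model: it only shows how unigrams, bigrams, and trigrams could be counted from short posts. The function names, the whitespace tokenizer, and the sample posts are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(posts, max_n=3):
    """Count all 1-grams up to max_n-grams across a list of posts."""
    counts = Counter()
    for post in posts:
        # Naive lowercase whitespace tokenizer (an assumption, not from the paper).
        tokens = post.lower().split()
        for n in range(1, max_n + 1):
            counts.update(ngrams(tokens, n))
    return counts


# Hypothetical posts, invented for illustration only.
posts = ["power is out in Miami", "still no power in Miami"]
counts = ngram_counts(posts)
print(counts.most_common(5))
```

Counting multi-word units keeps phrases such as "power is out" together as a single feature, which is the motivation the abstract gives for going beyond unigrams.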

Preprint, 2020, arXiv

Spatiotemporal Patterns of COVID-19 Impact on Human Activities and Environment in China Using Nighttime Light and Air Quality Data

To analyze the impact of COVID-19 on people's lives, activities, and the natural environment, this paper investigates the spatial and temporal characteristics of Night Time Light (NTL) radiance and Air Quality Index (AQI) before and during the pandemic in mainland China. Our results show that the monthly average NTL brightness is much lower during the quarantine period than before. This study categorizes NTL into three classes: residential area, transportation and public facilities, and commercial centers, with NTL radiance ranges of 5-20, 20-40, and greater than 40 nW/(cm*cm*sr), respectively. We found that the Number Of Pixels (NOP) with NTL detection increased in the residential area and decreased in the commercial centers for most of the provinces after the shutdown, while transportation and public facilities generally stayed the same. More specifically, we examined these factors in Wuhan, where the first confirmed cases were reported and where the earliest quarantine measures were taken. After the beginning of the lockdown, pixels associated with commercial centers showed lower NTL radiance values, indicating dimming, while residential area pixels recorded increased brightness. The study also found a significant decreasing trend in the daily average AQI for the whole country, with cleaner air in most provinces during February and March compared to January 2020. In conclusion, the outbreak and spread of COVID-19 have had a crucial impact on people's daily lives and activity ranges through the increased implementation of lockdown and quarantine policies. On the other hand, the air quality of China has improved with the reduction of non-essential industries and motor vehicle usage.
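
The three radiance classes described above can be sketched as a simple threshold rule. The abstract gives the ranges 5-20, 20-40, and greater than 40 but does not say which class a value of exactly 20 belongs to; the sketch below assumes it falls in the higher class. The function names and sample values are hypothetical, and this is not the authors' processing code.

```python
from collections import Counter


def classify_ntl(radiance):
    """Map an NTL radiance value in nW/(cm*cm*sr) to a class label.

    Assumed boundaries: [5, 20) residential, [20, 40] transportation and
    public facilities, > 40 commercial. Values below 5 are left unclassified.
    """
    if radiance > 40:
        return "commercial centers"
    if radiance >= 20:
        return "transportation and public facilities"
    if radiance >= 5:
        return "residential area"
    return None


def count_pixels(radiances):
    """Number of pixels (NOP) per class, skipping unclassified pixels."""
    labels = (classify_ntl(r) for r in radiances)
    return Counter(label for label in labels if label is not None)


# Hypothetical pixel values, invented for illustration only.
sample = [3.0, 7.5, 19.9, 20.0, 40.0, 40.1, 85.0]
nop = count_pixels(sample)
print(nop)
```

Comparing such per-class pixel counts before and after a given date is one way to express the NOP changes the abstract reports.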

Preprint, 2020, arXiv

Taking the pulse of COVID-19: A spatiotemporal perspective

The sudden outbreak of the Coronavirus disease (COVID-19) swept across the world in early 2020, triggering the lockdowns of several billion people across many countries, including China, Spain, India, the U.K., Italy, France, Germany, and most states of the U.S. The transmission of the virus accelerated rapidly, with the most confirmed cases in the U.S., and New York City became an epicenter of the pandemic by the end of March. In response to this national and global emergency, the NSF Spatiotemporal Innovation Center brought together a taskforce of international researchers and assembled and implemented strategies to rapidly respond to this crisis, to support research, save lives, and protect the health of global citizens. This perspective paper presents our collective view on the global health emergency and our effort in collecting, analyzing, and sharing relevant data on global policy and government responses, geospatial indicators of the outbreak, and evolving forecasts; in developing research capabilities and mitigation measures with global scientists; in promoting collaborative research on outbreak dynamics; and in reflecting on the dynamic responses from human societies.
