Source author record

Chenxing Zhong

Chenxing Zhong appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Software Engineering Artificial Intelligence

Catalog footprint

What is connected

2works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

An Execution-Verified Multi-Language Benchmark for Code Semantic Reasoning

Evaluating whether large language models (LLMs) can recover execution-relevant program structure, rather than only produce code that passes tests, remains an open problem. Existing code benchmarks emphasize test-passing outputs, from standalone programming tasks (HumanEval, MBPP, LiveCodeBench) to repository repair (SWE-Bench); this is useful, but offers limited diagnostic signal about which program semantics a model can recover from source. We introduce TraceEval, to our knowledge the first execution-verified, multi-language benchmark for code semantic reasoning: recovering a program's runtime call structure from source code. Unlike prior call-graph benchmarks that rely on static-tool output or hand-annotated ground truth, every positive edge in TraceEval is mechanically witnessed by validation execution, eliminating annotator disagreement and label noise for observed behavior. TraceEval consists of (i) 10,583 real-world programs (2,129 test, 8,454 train) extracted from 1,600+ open-source repositories across Python, JavaScript, and Java via an LLM-assisted harness-generation pipeline with tracer validation; and (ii) a reproducible pipeline that converts any open-source repository into new verified benchmark instances. We evaluate 10 LLMs at zero-shot on the held-out test split. The strongest model, Claude-Opus-4.6, reaches an average F1 of 72.9% across the three languages. To demonstrate the train split's utility as a supervision substrate, we fine-tune the Qwen2.5-Coder family on it: lifts of up to +55.6 F1 bring tuned Qwen2.5-Coder-32B to 71.2%, within 1.7 F1 of zero-shot Claude-Opus-4.6. We release the benchmark, pipeline, baselines, and a datasheet at https://github.com/yikun-li/TraceEva

preprint2022arXiv

A Cross-Company Ethnographic Study on Software Teams for DevOps and Microservices: Organization, Benefits, and Issues

Context: DevOps and microservices are acknowledged to be important new paradigms to tackle contemporary software demands and provide capabilities for rapid and reliable software development. Industrial reports show that they are quickly adopted together in massive software companies. However, because of the technical and organizational requirements, many difficulties against efficient implementation of the both emerge in real software teams. Objectives: This study aims to discover the organization, benefits and issues of software teams using DevOps & microservices from an immersive perspective. Method: An ethnographic study was carried out in three companies with different business, size, products, customers, and degree of globalization. All the three companies claimed their adoption of DevOps and microservices. Seven months (cumulative) of participant observations and nine interviews with practitioners were conducted to collect the data of software teams related to DevOps and microservices. A cross-company empirical investigation using grounded theory was done by analyzing the archive data. Results: The adoption of DevOps and microservices brings benefits to rapid delivery, ability improvements and burden reduction, whilst the high cost and lack of practical guidance were emerged. Moreover, our observations and interviews reflect that in software teams, the relationship between DevOps and microservices is not significant, which differs from the relationship described in the previous studies. Four lessons for practitioners and four implications for researchers were discussed based on our findings. Conclusion: Our findings contribute to the understanding of the organization, benefits and issues of adopting DevOps and microservices from an immersive perspective of software teams.