Source author record

Jingwen Zhang

Jingwen Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Artificial Intelligence, astro-ph.SR, Machine Learning, Software Engineering, astro-ph.EP, Computation and Language, Computer Science and Game Theory, Computer Vision, cs.CY, Graphics, Human-Computer Interaction, Multiagent Systems, Neurons and Cognition, physics.soc-ph, Populations and Evolution, Quantitative Methods, Social and Information Networks

Catalog footprint

What is connected

11 works

17 topics

4 close collaborators


Preprint, 2026, arXiv

SSR: Safeguarding Staking Rewards by Defining and Detecting Logical Defects in DeFi Staking

Decentralized Finance (DeFi) staking is one of the most prominent applications within the DeFi ecosystem, where DeFi projects enable users to stake tokens on the platform and reward participants with additional tokens. However, logical defects in DeFi staking could enable attackers to claim unwarranted rewards by manipulating reward amounts, repeatedly claiming rewards, or engaging in other malicious actions. To mitigate these threats, we conducted the first study focused on defining and detecting logical defects in DeFi staking. Through the analysis of 64 security incidents and 144 audit reports, we identified six distinct types of logical defects, each accompanied by detailed descriptions and code examples. Building on this empirical research, we developed SSR (Safeguarding Staking Reward), a static analysis tool designed to detect logical defects in DeFi staking contracts. SSR utilizes a large language model (LLM) to extract fundamental information about staking logic and constructs a DeFi staking model. It then identifies logical defects by analyzing the model and the associated semantic features. We constructed a ground truth dataset based on known security incidents and audit reports to evaluate the effectiveness of SSR. The results indicate that SSR achieves an overall precision of 92.31%, a recall of 87.92%, and an F1-score of 88.85%. Additionally, to assess the prevalence of logical defects in real-world smart contracts, we compiled a large-scale dataset of 15,992 DeFi staking contracts. SSR detected that 3,557 (22.24%) of these contracts contained at least one logical defect.
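The evaluation figures above rest on standard definitions, which can be sketched as follows. This is a minimal illustration, not SSR's evaluation code: the function names and the confusion counts are hypothetical, and only the 3,557 and 15,992 figures come from the abstract. The abstract does not say how the overall precision, recall, and F1-score are aggregated across defect types, so the sketch does not try to reproduce them.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def prevalence_percent(flagged: int, total: int) -> float:
    """Share of contracts flagged with at least one defect, in percent."""
    return 100.0 * flagged / total


# Hypothetical counts for a single defect type (not taken from the paper).
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2)

# Figures quoted in the abstract: 3,557 flagged out of 15,992 contracts.
share = prevalence_percent(3557, 15992)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f} prevalence={share:.2f}%")
```

Rounding the computed share to two decimals gives the 22.24% quoted in the abstract.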

Preprint, 2025, arXiv

Dynamical Architectures of S-type Transiting Planets in Binaries II: A Dichotomy in Orbital Alignment of Small Planets in Close Binary Systems

Stellar multiplicity plays a crucial role in shaping planet formation and dynamical evolution. We present a survey of 54 TESS Objects of Interest (TOIs) within 300 pc that exhibit significant Hipparcos-Gaia astrometric accelerations. We identified 35 TOIs with stellar companions at projected separations between $0.1^{\prime\prime}$ and $2^{\prime\prime}$ (or $10-200$ AU). We also identified 12 TOIs that could host planetary-mass or brown dwarf companions, including 6 that are newly discovered. Furthermore, we perform three-dimensional orbital characterization for 12 binaries hosting confirmed planets or planet candidates, allowing us to constrain the line-of-sight mutual inclination, $\Delta I_{\mathrm{los}}$, between the planetary and binary orbits. Combining our sample with previous measurements, we apply Bayesian hierarchical analysis to a total of 26 binary systems with S-type transiting planets ($r_p<5R_{\oplus}$). Specifically, we fit the $\Delta I_{\mathrm{los}}$ distribution with both single (Rayleigh) and mixture models (two-component Rayleigh and Rayleigh-isotropic mixture). We find the mixture models are strongly favored ($\log Z\gtrsim13.9$, or $\approx5\sigma$), indicating the observed planet-binary $\Delta I_{\mathrm{los}}$ values likely originate from two underlying populations: one nearly aligned ($\sigma_1 = 2^{\circ}.4^{+0.7}_{-0.9}$) and one with more scattered mutual inclinations ($\sigma_2 = 23^{\circ}.6^{+8.8}_{-7.1}$). Alternatively, the misaligned systems can be equally well described by an isotropic distribution of inclinations. This observed dichotomy likely reflects different dynamical histories. Notably, the misaligned population only emerges in systems with stellar periastron distances $>40$ AU while systems with close-in or eccentric stellar companions (periastron distances $<40$ AU) preserve planet-binary alignment.
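The single-Rayleigh versus two-component Rayleigh comparison can be illustrated with a plain likelihood calculation. This is a sketch under stated assumptions, not the Bayesian hierarchical analysis the abstract describes: it compares log-likelihoods rather than evidences, the two scale values are the central estimates quoted above, and the mixing weight and the toy inclination sample are hypothetical.

```python
import math


def rayleigh_pdf(x: float, sigma: float) -> float:
    """Rayleigh density with scale sigma, defined for x >= 0."""
    if x < 0:
        return 0.0
    return (x / sigma**2) * math.exp(-(x**2) / (2 * sigma**2))


def mixture_pdf(x: float, weight: float, sigma1: float, sigma2: float) -> float:
    """Two-component Rayleigh mixture; weight is the share of component 1."""
    return weight * rayleigh_pdf(x, sigma1) + (1 - weight) * rayleigh_pdf(x, sigma2)


def log_likelihood(samples, weight: float, sigma1: float, sigma2: float) -> float:
    """Sum of log mixture densities over the sample."""
    return sum(math.log(mixture_pdf(x, weight, sigma1, sigma2)) for x in samples)


# Scales from the abstract, in degrees. The weight and sample are hypothetical.
SIGMA_ALIGNED, SIGMA_SCATTERED = 2.4, 23.6
toy_inclinations = [1.0, 2.0, 3.0, 2.5, 20.0, 35.0]

# Mixture of an aligned and a scattered population.
ll_mixture = log_likelihood(toy_inclinations, 0.5, SIGMA_ALIGNED, SIGMA_SCATTERED)
# Single aligned population (weight 1.0 switches the second component off).
ll_single = log_likelihood(toy_inclinations, 1.0, SIGMA_ALIGNED, SIGMA_ALIGNED)
print(f"mixture: {ll_mixture:.1f}, single: {ll_single:.1f}")
```

On this toy sample the mixture has the higher log-likelihood, because a narrow single Rayleigh assigns very low density to the two large inclinations.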

Preprint, 2024, arXiv

A Survey of Large Language Models for Code: Evolution, Benchmarking, and Future Trends

General large language models (LLMs), represented by ChatGPT, have demonstrated significant potential in tasks such as code generation in software engineering. This has led to the development of specialized LLMs for software engineering, known as Code LLMs. A considerable portion of Code LLMs is derived from general LLMs through model fine-tuning. As a result, Code LLMs are often updated frequently and their performance can be influenced by the base LLMs. However, there is currently a lack of systematic investigation into Code LLMs and their performance. In this study, we conduct a comprehensive survey and analysis of the types of Code LLMs and their differences in performance compared to general LLMs. We aim to address three questions: (1) What LLMs are specifically designed for software engineering tasks, and what is the relationship between these Code LLMs? (2) Do Code LLMs really outperform general LLMs in software engineering tasks? (3) Which LLMs are more proficient in different software engineering tasks? To answer these questions, we first collect relevant literature and work from five major databases and open-source communities, resulting in 134 works for analysis. Next, we categorize the Code LLMs based on their publishers and examine their relationships with general LLMs and among themselves. Furthermore, we investigate the performance differences between general LLMs and Code LLMs in various software engineering tasks to demonstrate the impact of base models and Code LLMs. Finally, we comprehensively compiled the performance of LLMs across multiple mainstream benchmarks to identify the best-performing LLMs for each software engineering task. Our research not only assists developers of Code LLMs in choosing base models for the development of more advanced LLMs but also provides insights for practitioners to better understand key improvement directions for Code LLMs.

Preprint, 2022, arXiv

Auction-Based Ex-Post-Payment Incentive Mechanism Design for Horizontal Federated Learning with Reputation and Contribution Measurement

Federated learning trains models across devices with distributed data, while protecting privacy and obtaining a model similar to that of centralized ML. A large number of workers with data and computing power are the foundation of federated learning. However, the inevitable costs prevent self-interested workers from serving for free. Moreover, due to data isolation, task publishers lack effective methods to select, evaluate and pay reliable workers with high-quality data. Therefore, we design an auction-based incentive mechanism for horizontal federated learning with reputation and contribution measurement. By designing a reasonable method of measuring contribution, we establish the reputation of workers, which is easy to decline and difficult to improve. Through reverse auctions, workers bid for tasks, and the task publisher selects workers combining reputation and bid price. With the budget constraint, winning workers are paid based on performance. We proved that our mechanism satisfies the individual rationality of the honest worker, budget feasibility, truthfulness, and computational efficiency.

Preprint, 2022, arXiv

Online Auction-Based Incentive Mechanism Design for Horizontal Federated Learning with Budget Constraint

Federated learning makes it possible for all parties with data isolation to train the model collaboratively and efficiently while satisfying privacy protection. To obtain a high-quality model, an incentive mechanism is necessary to motivate more high-quality workers with data and computing power. The existing incentive mechanisms are applied in offline scenarios, where the task publisher collects all bids and selects workers before the task. In practice, however, workers arrive online in different orders before or during the task. Therefore, we propose a reverse auction-based online incentive mechanism for horizontal federated learning with a budget constraint. Workers submit bids when they arrive online. The task publisher with a limited budget leverages the information of the arrived workers to decide whether to select the new worker. Theoretical analysis proves that our mechanism satisfies budget feasibility, computational efficiency, individual rationality, consumer sovereignty, time truthfulness, and cost truthfulness with a sufficient budget. The experimental results show that our online mechanism is efficient and can obtain high-quality models.
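The online setting described above, where each arriving bid is decided on immediately under a fixed budget, can be sketched as follows. This is a deliberately simplified threshold rule and not the paper's mechanism: the `Bid` fields, the quality score, and the `max_price_per_quality` threshold are all hypothetical, and paying each winner its own bid gives budget feasibility but none of the truthfulness guarantees the abstract claims.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    worker: str
    price: float    # payment the worker asks for
    quality: float  # hypothetical quality score known to the publisher


def select_online(bids, budget: float, max_price_per_quality: float):
    """Decide on each arriving bid immediately, never exceeding the budget."""
    remaining = budget
    selected = []
    for bid in bids:  # bids are processed in arrival order
        affordable = bid.price <= remaining
        good_value = bid.price <= max_price_per_quality * bid.quality
        if affordable and good_value:
            selected.append(bid.worker)
            remaining -= bid.price  # pay-as-bid, for illustration only
    return selected, remaining


arrivals = [
    Bid("w1", price=4.0, quality=2.0),
    Bid("w2", price=9.0, quality=1.0),
    Bid("w3", price=5.0, quality=3.0),
    Bid("w4", price=3.0, quality=2.0),
]
chosen, left = select_online(arrivals, budget=10.0, max_price_per_quality=2.5)
print(chosen, left)
```

Because decisions are irrevocable, a later worker such as w4 can be turned away for lack of budget even though its bid is good value, which is the trade-off an online mechanism has to manage.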

Preprint, 2021, arXiv

Analyzing the Spatiotemporal Interaction and Propagation of ATN Biomarkers in Alzheimer's Disease using Longitudinal Neuroimaging Data

Three major biomarkers, beta-amyloid (A), pathologic tau (T), and neurodegeneration (N), are recognized as valid proxies for neuropathologic changes of Alzheimer's disease. While there are extensive studies on cerebrospinal fluid biomarkers (amyloid, tau), the spatial propagation pattern across the brain is missing and their interactive mechanisms with neurodegeneration are still unclear. To this end, we aim to analyze the spatiotemporal associations between ATN biomarkers using large-scale neuroimaging data. We first investigate the temporal appearances of amyloid plaques, tau tangles, and neuronal loss by modeling the longitudinal transition trajectories. Second, we propose linear mixed-effects models to quantify the pathological interactions and propagation of ATN biomarkers at each brain region. Our analysis of the current data shows that there exists a temporal latency between the build-up of amyloid and the onset of tau pathology and neurodegeneration. The propagation pattern of amyloid can be characterized by its diffusion along the topological brain network. Our models provide sufficient evidence that the progression of pathological tau and neurodegeneration share a strong regional association, which is different from amyloid.

Preprint, 2020, arXiv

A White-light Flare Powered by Magnetic Reconnection in the Lower Solar Atmosphere

White-light flares (WLFs), first observed in 1859, are a type of solar flare showing an obvious enhancement of the visible continuum emission. This type of enhancement often occurs in the most energetic flares, and is usually interpreted as a consequence of efficient heating in the lower solar atmosphere through non-thermal electrons propagating downward from the energy release site in the corona. However, this coronal-reconnection model has difficulty in explaining the recently discovered small WLFs. Here we report a C2.3 white-light flare, which is associated with several observational phenomena: fast decrease in opposite-polarity photospheric magnetic fluxes, disappearance of two adjacent pores, significant heating of the lower chromosphere, negligible increase of hard X-ray flux, and an associated U-shaped magnetic field configuration. All these suggest that this white-light flare is powered by magnetic reconnection in the lower part of the solar atmosphere rather than by reconnection higher up in the corona.

Preprint, 2020, arXiv

Effects of Persuasive Dialogues: Testing Bot Identities and Inquiry Strategies

Intelligent conversational agents, or chatbots, can take on various identities and are increasingly engaging in more human-centered conversations with persuasive goals. However, little is known about how identities and inquiry strategies influence the conversation's effectiveness. We conducted an online study in which a chatbot attempted to persuade 790 participants to donate to charity. We designed a two-by-four factorial experiment (two chatbot identities and four inquiry strategies) where participants were randomly assigned to different conditions. Findings showed that the perceived identity of the chatbot had significant effects on the persuasion outcome (i.e., donation) and interpersonal perceptions (i.e., competence, confidence, warmth, and sincerity). Further, we identified interaction effects among perceived identities and inquiry strategies. We discuss the theoretical and practical implications of the findings for developing ethical and effective persuasive chatbots. Our published data, code, and analyses serve as the first step towards building competent ethical persuasive chatbots.

Preprint, 2020, arXiv

Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good

Developing intelligent persuasive conversational agents to change people's opinions and actions for social good is the frontier in advancing the ethical development of automated dialogue systems. To do so, the first step is to understand the intricate organization of strategic disclosures and appeals employed in human persuasion conversations. We designed an online persuasion task where one participant was asked to persuade the other to donate to a specific charity. We collected a large dataset with 1,017 dialogues and annotated emerging persuasion strategies from a subset. Based on the annotation, we built a baseline classifier with context information and sentence-level features to predict the 10 persuasion strategies used in the corpus. Furthermore, to develop an understanding of personalized persuasion processes, we analyzed the relationships between individuals' demographic and psychological backgrounds, including personality, morality, and value systems, and their willingness to donate. Then, we analyzed which types of persuasion strategies led to a greater amount of donation depending on the individuals' personal backgrounds. This work lays the groundwork for developing a personalized persuasive dialogue system.

Preprint, 2020, arXiv

Predictive and Generative Neural Networks for Object Functionality

Humans can predict the functionality of an object even without any surroundings, since their knowledge and experience would allow them to "hallucinate" the interaction or usage scenarios involving the object. We develop predictive and generative deep convolutional neural networks to replicate this feat. Specifically, our work focuses on functionalities of man-made 3D objects characterized by human-object or object-object interactions. Our networks are trained on a database of scene contexts, called interaction contexts, each consisting of a central object and one or more surrounding objects, that represent object functionalities. Given a 3D object in isolation, our functional similarity network (fSIM-NET), a variation of the triplet network, is trained to predict the functionality of the object by inferring functionality-revealing interaction contexts. fSIM-NET is complemented by a generative network (iGEN-NET) and a segmentation network (iSEG-NET). iGEN-NET takes a single voxelized 3D object with a functionality label and synthesizes a voxelized surround, i.e., the interaction context which visually demonstrates the corresponding functionality. iSEG-NET further separates the interacting objects into different groups according to their interaction types.
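Since fSIM-NET is described as a variation of the triplet network, the basic idea behind triplet training can be sketched with the standard triplet margin loss. This is an illustration only: the abstract does not give fSIM-NET's actual loss, and the two-dimensional embeddings and the margin value here are hypothetical.

```python
import math


def euclidean(a, b) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_margin_loss(anchor, positive, negative, margin: float = 1.0) -> float:
    """Zero once the positive is closer to the anchor than the negative by the margin."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical embeddings: an object, a functionally similar context,
# and a functionally dissimilar context.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]   # distance 1 from the anchor
negative = [3.0, 4.0]   # distance 5 from the anchor

loss_satisfied = triplet_margin_loss(anchor, positive, negative)  # 1 - 5 + 1 < 0
loss_violated = triplet_margin_loss(anchor, negative, positive)   # 5 - 1 + 1
print(loss_satisfied, loss_violated)
```

Minimizing this loss pulls matching pairs together and pushes mismatched pairs apart in the embedding space, which is what lets a network rank candidate interaction contexts for an isolated object.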

Preprint, 2020, arXiv

Using Reports of Own and Others' Symptoms and Diagnosis on Social Media to Predict COVID-19 Case Counts: Observational Infoveillance Study in Mainland China

Can public social media data be harnessed to predict COVID-19 case counts? We analyzed approximately 15 million COVID-19-related posts on Weibo, a popular Twitter-like social media platform in China, from November 1, 2019 to March 31, 2020. We developed a machine learning classifier to identify "sick posts," which are reports of one's own and other people's symptoms and diagnosis related to COVID-19. We then modeled the predictive power of sick posts and other COVID-19 posts on daily case counts. We found that reports of symptoms and diagnosis of COVID-19 significantly predicted daily case counts, up to 14 days ahead of official statistics. Other COVID-19 posts, however, did not have similar predictive power. For a subset of geotagged posts (3.10% of all retrieved posts), we found that the predictive pattern held true for both Hubei province and the rest of mainland China, regardless of unequal distribution of healthcare resources and outbreak timeline. Researchers and disease control agencies should pay close attention to the social media infosphere regarding COVID-19. On top of monitoring overall search and posting activities, it is crucial to sift through the contents and efficiently identify true signals from noise.
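The idea of posts leading case counts by some number of days can be illustrated with a lagged correlation. The abstract does not specify the predictive model, so this sketch swaps in a plain Pearson correlation at each lag. The data are synthetic, with cases constructed to trail posts by a hypothetical 3 days, and all names are illustrative.

```python
def pearson(xs, ys) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(posts, cases, lag: int) -> float:
    """Correlate posts on day t with cases on day t + lag."""
    if lag == 0:
        return pearson(posts, cases)
    return pearson(posts[:-lag], cases[lag:])


# Synthetic daily counts: cases are twice the posts from LEAD days earlier.
posts = [1, 3, 2, 8, 5, 13, 9, 21, 14, 30, 22, 40, 35, 50, 41, 60]
LEAD = 3
cases = [0] * LEAD + [2 * p for p in posts[:-LEAD]]

# Scan candidate lags and keep the one with the strongest correlation.
best_lag = max(range(0, 8), key=lambda k: lagged_correlation(posts, cases, k))
print(best_lag)
```

On real data the correlation would be far from perfect, and a lag scan like this is only a first look before fitting a proper time-series model.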
