Researcher profile

Yilong Wang

Yilong Wang contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
13 works
0 followers
13 topics
4 close collaborators

Published work

13 published item(s)

preprint, 2026, arXiv

Enhancing Multilingual Counterfactual Generation through Alignment-as-Preference Optimization

Self-generated counterfactual explanations (SCEs) are minimally modified inputs (minimality) generated by large language models (LLMs) that flip their own predictions (validity), offering a causally grounded approach to unraveling black-box LLM behavior. Yet extending them beyond English remains challenging: existing methods struggle to produce valid SCEs in non-dominant languages, and a persistent trade-off between validity and minimality undermines explanation quality. We introduce Macro, a preference alignment framework that applies Direct Preference Optimization (DPO) to multilingual SCE generation, using a composite scoring function to construct preference pairs that effectively translate the trade-off into measurable preference signals. Experiments across four LLMs and seven typologically diverse languages show that Macro improves validity by 12.55% on average over the chain-of-thought baseline without degrading minimality, while avoiding the severe minimality violations of the translation-based baseline. Compared to supervised fine-tuning, Macro achieves superior performance on both metrics, confirming that explicit preference optimization is essential for balancing this trade-off. Further analyses reveal that Macro increases cross-lingual perturbation alignment and mitigates common generation errors. Our results highlight preference optimization as a promising direction for enhancing multilingual model explanations.

preprint, 2026, arXiv

iFlip: Iterative Feedback-driven Counterfactual Example Refinement

Counterfactual examples are minimal edits to an input that alter a model's prediction. They are widely employed in explainable AI to probe model behavior and in natural language processing (NLP) to augment training data. However, generating valid counterfactuals with large language models (LLMs) remains challenging, as existing single-pass methods often fail to induce reliable label changes, neglecting LLMs' self-correction capabilities. To explore this untapped potential, we propose iFlip, an iterative refinement approach that leverages three types of feedback, including model confidence, feature attribution, and natural language. Our results show that iFlip achieves an average 57.8% higher validity than the five state-of-the-art baselines, as measured by the label flipping rate. The user study further corroborates that iFlip outperforms baselines in completeness, overall satisfaction, and feasibility. In addition, ablation studies demonstrate that three components are paramount for iFlip to generate valid counterfactuals: leveraging an appropriate number of iterations, pointing to highly attributed words, and early stopping. Finally, counterfactuals generated by iFlip enable effective counterfactual data augmentation, substantially improving model performance and robustness.
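
The abstract above measures validity by the label flipping rate. A minimal Python sketch of that metric follows, assuming the common definition (the fraction of counterfactuals whose predicted label differs from the prediction on the original input). The function name and the toy labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def label_flip_rate(original_labels, counterfactual_labels):
    """Fraction of examples whose predicted label changed after the edit.

    Assumed definition for illustration; not taken from the paper's code.
    """
    if len(original_labels) != len(counterfactual_labels):
        raise ValueError("label lists must have the same length")
    if not original_labels:
        raise ValueError("at least one example is required")
    flips = sum(o != c for o, c in zip(original_labels, counterfactual_labels))
    return flips / len(original_labels)


# Toy predictions (hypothetical): three of the four counterfactuals flip the label.
rate = label_flip_rate(
    ["pos", "neg", "pos", "neg"],
    ["neg", "neg", "neg", "pos"],
)
```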

preprint, 2022, arXiv

Higher central charges and Witt groups

In this paper, we introduce the definitions of signatures of braided fusion categories, which are proved to be invariants of their Witt equivalence classes. These signature assignments define group homomorphisms on the Witt group. The higher central charges of pseudounitary modular categories can be expressed in terms of these signatures, which are applied to prove that the Ising modular categories have infinitely many square roots in the Witt group. This result is further applied to prove a conjecture of Davydov-Nikshych-Ostrik on the super-Witt group: the torsion subgroup generated by the completely anisotropic s-simple braided fusion categories has infinite rank.

preprint, 2022, arXiv

Image Fragile Watermarking Algorithm Based on Deneighborhood Mapping

To address the security risk caused by fixed offset mapping and the limited recoverability of random mapping used in image watermarking, we propose an image self-embedding fragile watermarking algorithm based on deneighborhood mapping. First, the image is divided into several 2×2 blocks, and an authentication watermark and a recovery watermark are generated based on the average value of the image blocks. Then, the deneighborhood mapping is implemented as follows: for each image block, its mapping block is randomly selected from outside its neighborhood, whose size is specified by a parameter. Finally, the authentication watermark and the recovery watermark are embedded in the image block itself and its corresponding mapping block. Theoretical analysis indicates that, in the case of continuous region tampering, the proposed watermarking method can achieve a better recovery rate for tampered image blocks than the method based on random mapping. The experimental results verify the rationality and effectiveness of the theoretical analysis. Moreover, compared with existing embedding algorithms based on random mapping, chaos mapping and Arnold mapping, the proposed algorithm achieves a higher average recovery rate of the tampered region in the case of continuous region tampering.
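
As a rough illustration of the mapping step described in the abstract above, the following Python sketch picks, for each block, a random mapping block outside a square neighborhood. The neighborhood shape (Chebyshev distance on the block grid), the parameter name `radius`, the function name, and the absence of a one-to-one constraint are all assumptions made for illustration; the paper's actual construction may differ.

```python
import random


def deneighborhood_mapping(rows, cols, radius, seed=0):
    """Map each block (r, c) on a rows x cols block grid to a random block
    whose Chebyshev distance from (r, c) is greater than `radius`.

    Hypothetical sketch: the neighborhood definition and the lack of a
    one-to-one constraint are assumptions, not the paper's specification.
    """
    rng = random.Random(seed)
    blocks = [(r, c) for r in range(rows) for c in range(cols)]
    mapping = {}
    for r, c in blocks:
        # Keep only blocks that lie outside the neighborhood of (r, c).
        candidates = [
            (rr, cc)
            for rr, cc in blocks
            if max(abs(rr - r), abs(cc - c)) > radius
        ]
        if not candidates:
            raise ValueError("radius is too large for this block grid")
        mapping[(r, c)] = rng.choice(candidates)
    return mapping


# An 8x8 image split into 2x2 blocks gives a 4x4 block grid.
mapping = deneighborhood_mapping(rows=4, cols=4, radius=1)
```

Because adjacent blocks are never mapped to each other, a tampered contiguous region is less likely to destroy both a block and the copy of its recovery data, which is the intuition the abstract gives for the improved recovery rate.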

preprint, 2022, arXiv

Near-real-time estimates of daily CO2 emissions from 1500 cities worldwide

Building on near-real-time and spatially explicit estimates of daily carbon dioxide (CO2) emissions, here we present and analyze a new city-level dataset of fossil fuel and cement emissions. Carbon Monitor Cities provides daily, city-level estimates of emissions from January 2019 through December 2021 for 1500 cities in 46 countries, and disaggregates five sectors: power generation, residential (buildings), industry, ground transportation, and aviation. The dataset aims to improve the timeliness and temporal resolution of city-level emission inventories, and it includes estimates for both functional urban areas and city administrative areas that are consistent with global and regional totals. Comparisons with other datasets (namely CEADs, MEIC, Vulcan, and CDP) were performed, and we estimate the overall uncertainty to be 21.7%. Carbon Monitor Cities is a near-real-time, city-level emission dataset that includes cities around the world, including the first estimates for many cities in low-income countries. A more complete description of this dataset is published in Scientific Data (https://doi.org/10.1038/s41597-022-01657-z).

preprint, 2022, arXiv

The Witt classes of $SO(2r)_{2r}$

We study the Witt classes of the modular categories $SO(2r)_{2r}$ associated with quantum groups of type $D_r$ at $4r-2$th roots of unity. From these classes we derive infinitely many Witt classes of order 2 that are linearly independent modulo the subgroup generated by the pointed modular categories. In particular we produce an example of a simple, completely anisotropic modular category that is not pointed whose Witt class has order 2, answering a question of Davydov, Müger, Nikshych and Ostrik. Our results show that the trivial Witt class $[Vec]$ has infinitely many square roots modulo the pointed classes, in analogy with the recent construction of infinitely many square roots of the Ising Witt classes modulo the pointed classes constructed in a similar way from certain type $B_r$ modular categories. We compare the subgroups generated by the Ising square roots and $[Vec]$ square roots and provide evidence that they also generate linearly independent subgroups.

preprint, 2021, arXiv

Global Daily CO$_2$ emissions for the year 2020

The diurnal cycle of CO$_2$ emissions from fossil fuel combustion and cement production reflects seasonality, weather conditions, working days, and more recently the impact of the COVID-19 pandemic. Here, for the first time, we provide a daily CO$_2$ emission dataset for the whole year of 2020 calculated from inventory and near-real-time activity data (the Carbon Monitor project: https://carbonmonitor.org). Preliminary estimates that did not cover the entire year of 2020 previously suggested that the pandemic may have caused an annual decline of more than 8% in global CO$_2$ emissions. Here we show from detailed estimates of the full-year data that the global reduction was only 5.4% (-1,901 MtCO$_2$). This decrease is 5 times larger than the annual emission drop at the peak of the 2008 Global Financial Crisis. However, global CO$_2$ emissions gradually recovered towards 2019 levels from late April with global partial re-opening. More importantly, global CO$_2$ emissions even increased slightly by +0.9% in December 2020 compared with 2019, indicating a rebound trend in global emissions. Later waves of COVID-19 infections in late 2020 and the corresponding lockdowns caused further CO$_2$ emission reductions, particularly in western countries, but to a much smaller extent than the declines in the first wave. That even substantial world-wide lockdowns of activity led to a one-time decline in global CO$_2$ emissions of only 5.4% in one year highlights the significant challenges for climate change mitigation that we face in the post-COVID era. These declines are significant, but will be quickly overtaken by new emissions unless the COVID-19 crisis is utilized as a break-point with our fossil-fuel trajectory, notably through policies that make the COVID-19 recovery an opportunity to green national energy and development plans.

preprint, 2021, arXiv

Global Gridded Daily CO$_2$ Emissions

Precise and high-resolution carbon dioxide (CO$_2$) emission data is of great importance for achieving carbon neutrality around the world. Here we present for the first time the near-real-time Global Gridded Daily CO$_2$ Emission Datasets (called GRACED) from fossil fuel and cement production with a global spatial resolution of 0.1$^\circ$ by 0.1$^\circ$ and a temporal resolution of 1 day. Gridded fossil emissions are computed for different sectors based on the daily national CO$_2$ emissions from a near-real-time dataset (Carbon Monitor), the spatial patterns of the point source emission dataset Global Carbon Grid (GID), the Emission Database for Global Atmospheric Research (EDGAR) and the spatiotemporal patterns of satellite nitrogen dioxide (NO$_2$) retrievals. Our study on global CO$_2$ emissions responds to the growing and urgent need for high-quality, fine-grained near-real-time CO$_2$ emission estimates to support global emissions monitoring across various spatial scales. We show the spatial patterns of emission changes for the power, industry, residential consumption, ground transportation, domestic and international aviation, and international shipping sectors between 2019 and 2020. This helps us gain insight into the relative contributions of various sectors and provides a faster and finer-grained overview of where and when fossil CO$_2$ emissions have decreased and rebounded in response to emergencies (e.g. COVID-19) and other disturbances of human activities than any previously published dataset. As the world recovers from the pandemic and decarbonizes its energy systems, regular updates of this dataset will allow policymakers to more closely monitor the effectiveness of climate and energy policies and quickly adapt.

preprint, 2021, arXiv

Modular categories with transitive Galois actions

In this paper, we study modular categories whose Galois group actions on their simple objects are transitive. We show that such modular categories admit unique factorization into prime transitive factors. The representations of $SL_2(\mathbb{Z})$ associated with transitive modular categories are proven to be minimal and irreducible. Together with the Verlinde formula, we characterize prime transitive modular categories as the Galois conjugates of the adjoint subcategory of the quantum group modular category $\mathcal{C}(\mathfrak{sl}_2,p-2)$ for some prime $p > 3$. As a consequence, we completely classify transitive modular categories. Transitivity of super-modular categories can be similarly defined. A unique factorization of any transitive super-modular category into s-simple transitive factors is obtained, and the split transitive super-modular categories are completely classified.

preprint, 2020, arXiv

Carbon Monitor: a near-real-time daily dataset of global CO2 emission from fossil fuel and cement production

We constructed a near-real-time daily CO2 emission dataset, namely the Carbon Monitor, to monitor the variations of CO2 emissions from fossil fuel combustion and cement production since January 1st 2019 at national level with near-global coverage on a daily basis, with the potential to be frequently updated. Daily CO2 emissions are estimated from a diverse range of activity data, including: hourly to daily electrical power generation data of 29 countries, monthly production data and production indices of industry processes of 62 countries/regions, and daily mobility data and mobility indices of road transportation of 416 cities worldwide. Individual flight location data and monthly data were utilised for estimates of the aviation and maritime transportation sectors. In addition, monthly fuel consumption data corrected for daily air temperature of 206 countries were used for estimating the emissions from commercial and residential buildings. This Carbon Monitor dataset manifests the dynamic nature of CO2 emissions through daily, weekly and seasonal variations as influenced by workdays and holidays, as well as the unfolding impacts of the COVID-19 pandemic. The Carbon Monitor near-real-time CO2 emission dataset shows a 7.8% decline of CO2 emissions globally from Jan 1st to Apr 30th in 2020 when compared with the same period in 2019, and detects a re-growth of CO2 emissions by late April, which is mainly attributed to the recovery of economic activities in China and partial easing of lockdowns in other countries. Further, this daily updated CO2 emission dataset could offer a range of opportunities for related scientific research and policy making.

preprint, 2020, arXiv

COVID-19 causes record decline in global CO2 emissions

The considerable cessation of human activities during the COVID-19 pandemic has affected global energy use and CO2 emissions. Here we show that the unprecedented decrease in global fossil CO2 emissions from January to April 2020 was 7.8% (938 Mt CO2, with a 2-σ uncertainty of +6.8%) when compared with the same period last year. In addition to other emerging estimates of COVID impacts based on monthly energy supply or estimated parameters, this study takes a further step by constructing near-real-time daily CO2 emission inventories based on activity from power generation (for 29 countries), industry (for 73 countries), road transportation (for 406 cities), aviation and maritime transportation, and commercial and residential sector emissions (for 206 countries). The estimates distinguish the decline of CO2 due to COVID-19 from the daily, weekly and seasonal variations as well as holiday events. The COVID-related decreases in CO2 emissions occurred in road transportation (340.4 Mt CO2, -15.5%), power (292.5 Mt CO2, -6.4% compared to 2019), industry (136.2 Mt CO2, -4.4%), aviation (92.8 Mt CO2, -28.9%), residential (43.4 Mt CO2, -2.7%), and international shipping (35.9 Mt CO2, -15%). Regionally, decreases in China were the largest and earliest (234.5 Mt CO2, -6.9%), followed by Europe (EU-27 & UK) (138.3 Mt CO2, -12.0%) and the U.S. (162.4 Mt CO2, -9.5%). The declines of CO2 are consistent with regional nitrogen oxides concentrations observed by satellites and ground-based networks, but the calculated signal of emission decreases (about 1 Gt CO2) will have little impact (less than 0.13 ppm by April 30, 2020) on the observed global CO2 concentration. However, with the observed fast CO2 recovery in China and partial re-opening globally, our findings suggest that the longer-term effects on CO2 emissions are unknown and should be carefully monitored using multiple measures.
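
The sector figures quoted in the abstract above can be cross-checked against the headline total with simple arithmetic. The Python sketch below uses only numbers from the abstract; the variable names are hypothetical, and the implied baseline assumes that 938 Mt CO2 corresponds to exactly 7.8%, which the abstract does not state.

```python
# Sector decreases (Mt CO2) quoted in the abstract for January-April 2020.
sector_decline_mt = {
    "road transportation": 340.4,
    "power": 292.5,
    "industry": 136.2,
    "aviation": 92.8,
    "residential": 43.4,
    "international shipping": 35.9,
}
reported_total_mt = 938.0  # headline decrease from the abstract
reported_decline = 0.078   # 7.8% relative decrease

# Sum the sector values and compare with the headline total.
sector_sum_mt = round(sum(sector_decline_mt.values()), 1)
gap_mt = round(sector_sum_mt - reported_total_mt, 1)

# Implied January-April 2019 baseline, under the assumption stated above.
implied_baseline_mt = reported_total_mt / reported_decline

print(f"sum of sectors: {sector_sum_mt} Mt CO2")
print(f"gap to headline total: {gap_mt} Mt CO2")
print(f"implied 2019 baseline: {implied_baseline_mt:.0f} Mt CO2")
```

The sector values add up to slightly more than the headline figure. The abstract does not explain the difference, so it may simply reflect rounding in the quoted numbers.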

preprint, 2020, arXiv

SemEval-2020 Task 4: Commonsense Validation and Explanation

In this paper, we present SemEval-2020 Task 4, Commonsense Validation and Explanation (ComVE), which includes three subtasks aiming to evaluate whether a system can distinguish a natural language statement that makes sense to humans from one that does not, and provide the reasons. Specifically, in our first subtask, the participating systems are required to choose, from two natural language statements of similar wording, the one that makes sense and the one that does not. The second subtask additionally asks a system to select, from three options, the key reason why a given statement does not make sense. In the third subtask, a participating system needs to generate the reason. In the end, the task attracted 39 teams participating in at least one of the three subtasks. For Subtask A and Subtask B, the performance of top-ranked systems is close to that of humans. However, for Subtask C, there is still a relatively large gap between system and human performance. The dataset used in our task can be found at https://github.com/wangcunxiang/SemEval2020-Task4-Commonsense-Validation-and-Explanation; the leaderboard can be found at https://competitions.codalab.org/competitions/21080#results.