Researcher profile

Yi Cai

Yi Cai contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
10 works
0 followers
12 topics
4 close collaborators

Published work

10 published item(s)

preprint, 2026, arXiv

Phase transitions for unique codings of fat Sierpinski gaskets with multiple digits

Given an integer $M\ge 1$ and $\beta\in(1, M+1)$, let $S_{\beta, M}$ be the fat Sierpinski gasket in $\mathbb R^2$ generated by the iterated function system $\left\{f_d(x)=\frac{x+d}{\beta}: d\in\Omega_M\right\}$, where $\Omega_M=\{(i,j)\in\mathbb Z_{\ge 0}^2: i+j\le M\}$. Then each $x\in S_{\beta, M}$ may be represented as a series $x=\sum_{i=1}^\infty\frac{d_i}{\beta^i}=:\Pi_\beta((d_i))$, and the infinite sequence $(d_i)\in\Omega_M^{\mathbb N}$ is called a \emph{coding} of $x$. Since $\beta<M+1$, a point in $S_{\beta, M}$ may have multiple codings. Let $U_{\beta, M}$ be the set of $x\in S_{\beta, M}$ having a unique coding, that is \[ U_{\beta, M}=\left\{x\in S_{\beta, M}: \#\Pi_\beta^{-1}(x)=1\right\}. \] When $M=1$, Kong and Li [2020, Nonlinearity] described two critical bases for the phase transitions of the intrinsic univoque set $\widetilde U_{\beta, 1}$, which is a subset of $U_{\beta, 1}$. In this paper we consider $M\ge 2$, and characterize the two critical bases $\beta_G(M)$ and $\beta_c(M)$ for the phase transitions of $U_{\beta, M}$: (i) if $\beta\in(1, \beta_G(M)]$, then $U_{\beta, M}$ is finite; (ii) if $\beta\in(\beta_G(M), \beta_c(M))$ then $U_{\beta, M}$ is countably infinite; (iii) if $\beta=\beta_c(M)$ then $U_{\beta, M}$ is uncountable and has zero Hausdorff dimension; (iv) if $\beta>\beta_c(M)$ then $U_{\beta, M}$ has positive Hausdorff dimension. Our results can also be applied to the intrinsic univoque set $\widetilde{U}_{\beta, M}$. Moreover, we show that the first critical base $\beta_G(M)$ is a Perron number, while the second critical base $\beta_c(M)$ is a transcendental number.
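The coding map in the abstract above can be illustrated numerically. The following is a minimal sketch, not the authors' code: the function names `digit_set` and `project` are hypothetical, and the projection is only a partial sum over a finite prefix of a coding, so it approximates the point up to a tail of size at most $M/(\beta^n(\beta-1))$ in each coordinate after $n$ digits.

```python
def digit_set(M):
    """Return Omega_M = {(i, j) with i, j >= 0 and i + j <= M} as a list."""
    return [(i, j) for i in range(M + 1) for j in range(M + 1 - i)]


def project(digits, beta):
    """Partial sum of sum_{k>=1} d_k / beta^k over a finite prefix of a coding.

    `digits` is a finite list of pairs taken from Omega_M; the result is an
    approximation of the point in the plane that the full coding represents.
    """
    x = y = 0.0
    for k, (a, b) in enumerate(digits, start=1):
        x += a / beta**k
        y += b / beta**k
    return (x, y)


if __name__ == "__main__":
    # Digit set for M = 2 has (M + 1)(M + 2) / 2 = 6 elements.
    print(digit_set(2))
    # Two-digit prefix in base 2 (illustrative values only).
    print(project([(1, 0), (0, 1)], 2.0))
```

Deciding whether a point has a unique coding requires the combinatorial analysis of the paper; this sketch only evaluates the series for a given digit sequence.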

preprint, 2026, arXiv

Strat-Reasoner: Reinforcing Strategic Reasoning of LLMs in Multi-Agent Games

While Large Language Models (LLMs) excel in certain reasoning tasks, they struggle in multi-agent games where the final outcome depends on the joint strategies of all agents. In multi-agent games, the non-stationarity of other agents poses significant challenges for the evaluation of the reasoning process and for credit assignment over multiple reasoning steps. Existing single-agent reinforcement learning (RL) approaches and their multi-agent extensions fail to address these challenges because they do not incorporate other agents in the reasoning process. In this work, we propose Strat-Reasoner, a novel RL-based framework that improves LLMs' strategic reasoning ability in multi-agent games. We introduce a novel recursive reasoning paradigm where an agent's reasoning also integrates other agents' reasoning processes. To provide effective reward signals for the intermediate reasoning sequences, we employ a centralized Chain-of-Thought (CoT) comparison module to evaluate the reasoning quality. Finally, we compute an accurate hybrid advantage and develop a group-relative RL approach to optimize the LLM policy. Experimental results show that Strat-Reasoner substantially improves the strategic abilities of the underlying LLMs, achieving a 22.1% average performance improvement across various multi-agent games.

preprint, 2022, arXiv

Bases which admit exactly two expansions

For a positive integer $m$ let $\Omega_m=\{0,1, \cdots , m\}$ and \begin{align*} \mathcal B_2(m)=&\left \{q\in(1,m+1]: \text{$\exists\; x\in [0, m/(q-1)]$ has exactly }\right. \\ &\left. \text{two different $q$-expansions w.r.t. $\Omega_m$}\right \}. \end{align*} Sidorov \cite{S} first studied the set $\mathcal B_2(1)$ and raised some questions. Komornik and Kong \cite{KK} further studied the set $\mathcal B_2(1)$ and answered some of Sidorov's questions. In the present paper, we consider the set $\mathcal B_2(m)$ for a general positive integer $m$ and generalise the results obtained by Komornik and Kong.
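To make the notion of a $q$-expansion concrete, here is a minimal sketch of the standard greedy and lazy digit algorithms over the digit set $\{0,\dots,m\}$. It is not taken from the paper: the function names are hypothetical, and exact rational arithmetic is assumed so that the digits are not affected by rounding. If the greedy and lazy prefixes of a point differ, that point has at least two expansions; the sketch does not decide whether it has exactly two.

```python
import math
from fractions import Fraction


def greedy_expansion(x, q, m, n):
    """First n digits of the greedy (lexicographically largest) q-expansion.

    Assumes 1 < q <= m + 1 and 0 <= x <= m / (q - 1).
    """
    q = Fraction(q)
    r = Fraction(x)
    digits = []
    for _ in range(n):
        r *= q
        d = min(m, math.floor(r))  # largest admissible digit
        digits.append(d)
        r -= d
    return digits


def lazy_expansion(x, q, m, n):
    """First n digits of the lazy (lexicographically smallest) q-expansion."""
    q = Fraction(q)
    bound = Fraction(m) / (q - 1)  # largest value any tail can represent
    r = Fraction(x)
    digits = []
    for _ in range(n):
        r *= q
        d = max(0, math.ceil(r - bound))  # smallest digit keeping r in range
        digits.append(d)
        r -= d
    return digits


if __name__ == "__main__":
    q = Fraction(3, 2)
    print(greedy_expansion(1, q, 1, 3))  # [1, 0, 1]
    print(lazy_expansion(1, q, 1, 3))    # [0, 1, 0]
```

In this example the two prefixes differ, so $x=1$ has more than one expansion in base $q=3/2$ with digits $\{0,1\}$.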

preprint, 2022, arXiv

CLOSE: Curriculum Learning On the Sharing Extent Towards Better One-shot NAS

One-shot Neural Architecture Search (NAS) has been widely used to discover architectures due to its efficiency. However, previous studies reveal that one-shot performance estimations of architectures might not be well correlated with their performances in stand-alone training because of the excessive sharing of operation parameters (i.e., large sharing extent) between architectures. Thus, recent methods construct even more over-parameterized supernets to reduce the sharing extent. But these improved methods introduce a large number of extra parameters and thus cause an undesirable trade-off between the training costs and the ranking quality. To alleviate the above issues, we propose to apply Curriculum Learning On Sharing Extent (CLOSE) to train the supernet both efficiently and effectively. Specifically, we train the supernet with a large sharing extent (an easier curriculum) at the beginning and gradually decrease the sharing extent of the supernet (a harder curriculum). To support this training strategy, we design a novel supernet (CLOSENet) that decouples the parameters from operations to realize a flexible sharing scheme and adjustable sharing extent. Extensive experiments demonstrate that CLOSE can obtain a better ranking quality across different computational budget constraints than other one-shot supernets, and is able to discover superior architectures when combined with various search strategies. Code is available at https://github.com/walkerning/aw_nas.

preprint, 2021, arXiv

Enabling Lower-Power Charge-Domain Nonvolatile In-Memory Computing with Ferroelectric FETs

Compute-in-memory (CiM) is a promising approach to alleviating the memory wall problem for domain-specific applications. Compared to current-domain CiM solutions, charge-domain CiM shows the opportunity for higher energy efficiency and resistance to device variations. However, the area occupation and standby leakage power of existing SRAM-based charge-domain CiM (CD-CiM) are high. This paper proposes the first concept and analysis of CD-CiM using nonvolatile memory (NVM) devices. The design implementation and performance evaluation are based on a proposed 2-transistor-1-capacitor (2T1C) CiM macro using ferroelectric field-effect-transistors (FeFETs), which is free from leakage power and much denser than the SRAM solution. With the supply voltage between 0.45 V and 0.90 V and the operating frequency between 100 MHz and 1.0 GHz, binary neural network application simulations show over 47%, 60%, and 64% energy consumption reduction from existing SRAM-based CD-CiM, SRAM-based current-domain CiM, and RRAM-based current-domain CiM, respectively. For classifications in MNIST and CIFAR-10 data sets, the proposed FeFET-based CD-CiM achieves an accuracy over 95% and 80%, respectively.

preprint, 2021, arXiv

Quadratic fractional solitons

We introduce a system combining the quadratic self-attractive or composite quadratic-cubic nonlinearity, acting in combination with fractional diffraction, which is characterized by its Lévy index $\alpha$. The model applies to a gas of quantum particles moving by Lévy flights, with the quadratic term representing the Lee-Huang-Yang correction to the mean-field interactions. A family of fundamental solitons is constructed in a numerical form, while the dependence of its norm on the chemical potential characteristic is obtained in an exact analytical form. The family of \textit{quasi-Townes solitons}, appearing in the limit case of $\alpha=1/2$, is investigated by means of a variational approximation. A nonlinear lattice, represented by spatially periodic modulation of the quadratic term, is also briefly addressed. The consideration of the interplay of competing quadratic (attractive) and cubic (repulsive) terms with a lattice potential reveals families of single-, double-, and triple-peak gap solitons (GSs) in two finite bandgaps. The competing nonlinearity gives rise to alternating regions of stability and instability of the GSs, with the stability intervals shrinking as the number of peaks in the GS increases.

preprint, 2021, arXiv

XPROAX-Local explanations for text classification with progressive neighborhood approximation

The importance of the neighborhood for training a local surrogate model to approximate the local decision boundary of a black box classifier has already been highlighted in the literature. Several attempts have been made to construct a better neighborhood for high dimensional data, like texts, by using generative autoencoders. However, existing approaches mainly generate neighbors by selecting purely at random from the latent space and struggle under the curse of dimensionality to learn a good local decision boundary. To overcome this problem, we propose a progressive approximation of the neighborhood using counterfactual instances as initial landmarks and a careful 2-stage sampling approach to refine counterfactuals and generate factuals in the neighborhood of the input instance to be explained. Our work focuses on textual data, and our explanations consist of both word-level explanations from the original instance (intrinsic) and the neighborhood (extrinsic), and factual and counterfactual instances discovered during the neighborhood generation process that further reveal the effect of altering certain parts of the input text. Our experiments on real-world datasets demonstrate that our method outperforms the competitors in terms of usefulness and stability (for the qualitative part) and completeness, compactness and correctness (for the quantitative part).

preprint, 2020, arXiv

Difference of Cantor sets and frequencies in Thue--Morse type sequences

In a recent paper, Baker and Kong studied the Hausdorff dimension of the intersection of Cantor sets with their translations. We extend their results to more general Cantor sets. The proofs rely on the frequencies of digits in unique expansions in non-integer bases. In this connection, we introduce a practical method to determine the frequency of any given finite block in Thue--Morse type sequences.
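As a rough illustration of what a block frequency is, the sketch below counts occurrences of a finite block in a prefix of the classical Thue--Morse sequence. This is an empirical count, not the method introduced in the paper, and it covers only the classical sequence rather than the more general Thue--Morse type sequences the abstract refers to; the function names are hypothetical.

```python
def thue_morse(n):
    """First n terms of the classical Thue-Morse sequence.

    Term k is the parity of the number of 1 bits in the binary form of k.
    """
    return [bin(k).count("1") % 2 for k in range(n)]


def block_frequency(seq, block):
    """Fraction of length-len(block) windows of seq that equal block."""
    block = list(block)
    windows = len(seq) - len(block) + 1
    if windows <= 0:
        return 0.0
    hits = sum(1 for i in range(windows) if seq[i:i + len(block)] == block)
    return hits / windows


if __name__ == "__main__":
    t = thue_morse(1 << 12)
    for block in ([0], [0, 0], [0, 1], [0, 0, 0]):
        print(block, block_frequency(t, block))
```

The block `[0, 0, 0]` never occurs because the classical Thue--Morse sequence is cube-free, so its empirical frequency is zero for every prefix length.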

preprint, 2020, arXiv

Intersections of Sierpinski gasket with its translation

Let $E$ be the Sierpinski gasket, i.e., the self-similar set generated by the IFS $\left \{f_a(x)=\frac{x+a}{q}: a\in \{(0,0), (0,1), (1,0)\}\right \}$. In this paper, we provide a description of the following set for $2<q<3$: \begin{equation*} D_q=\{\dim _H(E\cap (E+t)):\;t\in T\}, \end{equation*} where $T$ is the set of $t=(t_1, t_2)$ such that $t\in E-E$ and $t_1, t_2$ have unique $q$-expansions w.r.t. $\{-1,0,1\}$.