Source author record

Yifeng Liu

Yifeng Liu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher

Unclaimed source record

math.AG, math.NT, math.RT, Artificial Intelligence, Computation and Language, Machine Learning, math.CT

Catalog footprint

What is connected

8 works

7 topics

4 close collaborators


preprint, 2026, arXiv

Tensor Product Attention Is All You Need

Scaling language models to handle longer input sequences typically necessitates large key-value (KV) caches, resulting in substantial memory overhead during inference. In this paper, we propose Tensor Product Attention (TPA), a novel attention mechanism that uses tensor decompositions to represent queries, keys, and values compactly, substantially shrinking the KV cache size at inference time. By factorizing these representations into contextual low-rank components and seamlessly integrating with Rotary Position Embedding (RoPE), TPA achieves improved model quality alongside memory efficiency. Based on TPA, we introduce the Tensor ProducT ATTenTion Transformer (T6), a new model architecture for sequence modeling. Through extensive empirical evaluation on language modeling tasks, we demonstrate that T6 surpasses or matches the performance of standard Transformer baselines including Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped-Query Attention (GQA), and Multi-Head Latent Attention (MLA) across various metrics, including perplexity and a range of established evaluation benchmarks. Notably, TPA's memory efficiency and computational efficiency at the decoding stage enable processing longer sequences under fixed resource constraints, addressing a critical scalability challenge in modern language models. Project Page: https://github.com/tensorgi/TPA.
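To make the idea of low-rank factorized keys concrete, here is a minimal sketch, not the authors' implementation. It assumes each token's key matrix (heads by head dimension) is rebuilt as an average of a few outer products, and that only the factors are cached. The function names, the sizes, the 1/rank scaling and the random stand-in factors are all hypothetical; in the abstract's setting the factors would be computed from each token's hidden state, and RoPE is left out here.

```python
import random


def reconstruct_key(a_factors, b_factors):
    """Rebuild one token's (num_heads x head_dim) key matrix from its factors.

    a_factors: `rank` vectors of length num_heads (head-side factors)
    b_factors: `rank` vectors of length head_dim (feature-side factors)
    The 1/rank averaging is an illustrative assumption, not from the abstract.
    """
    rank = len(a_factors)
    num_heads = len(a_factors[0])
    head_dim = len(b_factors[0])
    return [
        [sum(a[h] * b[d] for a, b in zip(a_factors, b_factors)) / rank
         for d in range(head_dim)]
        for h in range(num_heads)
    ]


def cache_floats_per_token(num_heads, head_dim, rank=None):
    """Floats cached per token for keys: full matrix, or low-rank factors."""
    if rank is None:
        return num_heads * head_dim
    return rank * (num_heads + head_dim)


random.seed(0)
num_heads, head_dim, rank = 4, 8, 2  # hypothetical sizes

# Random stand-ins for the contextual factors of a single token.
a_factors = [[random.gauss(0, 1) for _ in range(num_heads)] for _ in range(rank)]
b_factors = [[random.gauss(0, 1) for _ in range(head_dim)] for _ in range(rank)]

key = reconstruct_key(a_factors, b_factors)
full = cache_floats_per_token(num_heads, head_dim)
factored = cache_floats_per_token(num_heads, head_dim, rank)
print(len(key), len(key[0]), full, factored)  # prints: 4 8 32 24
```

With these toy sizes the factors take 24 floats per token instead of 32 for the full key matrix; the saving grows when the rank is small relative to the number of heads and the head dimension.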

preprint, 2018, arXiv

Supersingular locus of Hilbert modular varieties, arithmetic level raising and Selmer groups

This article has three goals. First, we generalize the result of Deuring and Serre on the characterization of the supersingular locus of modular curves to all Shimura varieties given by totally indefinite quaternion algebras over totally real number fields. Second, we generalize the result of Ribet on arithmetic level raising to such Shimura varieties in the inert case. Third, as an application to number theory, we use the previous results to study the Selmer group of a certain triple product motive of an elliptic curve, in the context of the Bloch--Kato conjecture.

preprint, 2015, arXiv

Gluing restricted nerves of $\infty$-categories

In this article, we develop a general technique for gluing subcategories of $\infty$-categories. We obtain categorical equivalences between simplicial sets associated to certain multisimplicial sets. Such equivalences can be used to construct functors in different contexts. One of our results generalizes Deligne's gluing theory developed in the construction of the extraordinary pushforward operation in étale cohomology of schemes. Our results are applied in subsequent articles to construct Grothendieck's six operations in étale cohomology of Artin stacks.

preprint, 2015, arXiv

Hirzebruch-Zagier cycles and twisted triple product Selmer groups

Let $E$ be an elliptic curve over $\mathbb{Q}$ and $A$ be another elliptic curve over a real quadratic number field. We construct a $\mathbb{Q}$-motive of rank $8$, together with a distinguished class in the associated Bloch-Kato Selmer group, using Hirzebruch-Zagier cycles, that is, graphs of Hirzebruch-Zagier morphisms. We show that, under certain assumptions on $E$ and $A$, the non-vanishing of the central critical value of the (twisted) triple product $L$-function attached to $(E,A)$ implies that the dimension of the associated Bloch-Kato Selmer group of the motive is $0$; and the non-vanishing of the distinguished class implies that the dimension of the associated Bloch-Kato Selmer group of the motive is $1$. This can be viewed as the triple product version of Kolyvagin's work on bounding Selmer groups of a single elliptic curve using Heegner points.

preprint, 2012, arXiv

Uniqueness of Fourier-Jacobi models: the Archimedean case

We prove uniqueness of Fourier-Jacobi models for general linear groups, unitary groups, symplectic groups and metaplectic groups, over an archimedean local field.

preprint, 2011, arXiv

On quadratic distinction of automorphic sheaves

We prove a geometric version of a classical result on the characterization of an irreducible cuspidal automorphic representation of $\mathrm{GL}_n(\mathbb{A}_E)$ being the base change of a stable cuspidal packet of the quasi-split unitary group associated to the quadratic extension $E/F$, via the nonvanishing of certain period integrals, called being distinguished. We show that certain cohomology of an automorphic sheaf of $\mathrm{GL}_{n,X'}$ is nonvanishing if and only if the corresponding local system $E$ on $X'$ is conjugate self-dual with respect to an étale double cover $X'/X$ of curves, which directly relates to the base change from the associated unitary group. In particular, the geometric setting makes sense for any base field.

preprint, 2010, arXiv

A non-archimedean analogue of Calabi-Yau theorem for totally degenerate abelian varieties

We show an example of a non-archimedean version of the Calabi-Yau theorem in complex geometry. Precisely, we consider totally degenerate abelian varieties and certain probability measures on their associated analytic spaces in the sense of Berkovich.

preprint, 2010, arXiv

Relative trace formulae toward Bessel and Fourier-Jacobi periods of unitary groups

We propose a relative trace formula approach and state the corresponding fundamental lemma toward the global restriction problem involving Bessel or Fourier-Jacobi periods of unitary groups $\mathrm{U}_n\times\mathrm{U}_m$, extending the work of Jacquet-Rallis for $m=n-1$ (which is a Bessel period). In particular, when $m=0$, we recover a relative trace formula proposed by Flicker concerning Kloosterman/Fourier integrals on quasi-split unitary groups. As evidence for our approach, we prove the fundamental lemma for $\mathrm{U}_n\times\mathrm{U}_n$ in positive characteristics.
