Source author record

Sirui Li

Sirui Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

math.AP, cond-mat.soft, Machine Learning, physics.flu-dyn, Artificial Intelligence, Computer Vision

Catalog footprint

What is connected

10 works

6 topics

4 close collaborators

preprint, 2026, arXiv

Docs2Synth: A Synthetic Data Trained Retriever Framework for Scanned Visually Rich Documents Understanding

Visually rich document understanding (VRDU) in regulated domains is particularly challenging, since scanned documents often contain sensitive, evolving, and domain-specific knowledge. This leads to two major challenges: the lack of manual annotations for model adaptation and the difficulty for pretrained models to stay up-to-date with domain-specific facts. While Multimodal Large Language Models (MLLMs) show strong zero-shot abilities, they still suffer from hallucination and limited domain grounding. In contrast, discriminative Vision-Language Pre-trained Models (VLPMs) provide reliable grounding but require costly annotations to cover new domains. We introduce Docs2Synth, a synthetic-supervision framework that enables retrieval-guided inference for private and low-resource domains. Docs2Synth automatically processes raw document collections, generates and verifies diverse QA pairs via an agent-based system, and trains a lightweight visual retriever to extract domain-relevant evidence. During inference, the retriever collaborates with an MLLM through an iterative retrieval--generation loop, reducing hallucination and improving response consistency. We further deliver Docs2Synth as an easy-to-use Python package, enabling plug-and-play deployment across diverse real-world scenarios. Experiments on multiple VRDU benchmarks show that Docs2Synth substantially enhances grounding and domain generalization without requiring human annotations.

preprint, 2026, arXiv

OptiMind: Teaching LLMs to Think Like Optimization Experts

Mathematical programming -- the task of expressing operations and decision-making problems in precise mathematical language -- is fundamental across domains, yet remains a skill-intensive process requiring operations research expertise. Recent advances in large language models for complex reasoning have spurred interest in automating this task, translating natural language into executable optimization models. Current approaches, however, achieve limited accuracy, hindered by scarce and noisy training data without leveraging domain knowledge. In this work, we systematically integrate optimization expertise to improve formulation accuracy for mixed-integer linear programming, a key family of mathematical programs. Our OptiMind framework leverages semi-automated, class-based error analysis to guide both training and inference, explicitly preventing common mistakes within each optimization class. Our resulting fine-tuned LLM significantly improves formulation accuracy by 20.7% across multiple optimization benchmarks, with consistent gains under test-time scaling methods such as self-consistency and multi-turn feedback, enabling further progress toward robust LLM-assisted optimization formulation.
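
Self-consistency, named in the abstract above as a test-time scaling method, is commonly realized as majority voting over several sampled outputs. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes each sampled formulation has already been solved and reduced to an optimal objective value. The function name `self_consistent_value`, the use of `None` for a failed solve, and the rounding tolerance are all hypothetical choices made for illustration.

```python
from collections import Counter
from typing import Optional, Sequence


def self_consistent_value(
    values: Sequence[Optional[float]], ndigits: int = 4
) -> Optional[float]:
    """Return the most frequent objective value among sampled formulations.

    `values` holds one optimal objective value per sampled formulation;
    None marks a sample whose generated model failed to solve (assumption).
    """
    # Discard failed samples and round so that values differing only by
    # solver noise land in the same bucket.
    rounded = [round(v, ndigits) for v in values if v is not None]
    if not rounded:
        return None
    # Majority vote: pick the value that occurs most often.
    return Counter(rounded).most_common(1)[0][0]


# Hypothetical objective values from five sampled formulations.
samples = [42.0, 42.00001, None, 37.5, 42.0]
print(self_consistent_value(samples))  # prints 42.0
```

Voting on the solved objective value rather than on the formulation text is one possible design choice: two formulations can differ in wording or variable naming while describing the same optimization model.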

preprint, 2026, arXiv

WildTableBench: Benchmarking Multimodal Foundation Models on Table Understanding In the Wild

Using multimodal foundation models to analyze table images is a high-value yet challenging application in consumer and enterprise scenarios. Despite its importance, current evaluations rely largely on structured-text tables or clean rendered images, leaving the visual complexity of in-the-wild table images underexplored. Such images feature varied layouts and diverse domains that demand sophisticated structural perception and numerical reasoning. To bridge this gap, we introduce WildTableBench, the first question-answering benchmark for naturally occurring table images from real-world settings. WildTableBench comprises 402 high-information-density table images collected from online forums and websites across diverse domains, together with 928 manually annotated and verified questions spanning 17 subtypes across five categories. We evaluate 21 frontier proprietary and open-source multimodal foundation models on this benchmark. Only one model exceeds 50% accuracy, while all remaining models range from 4.1% to 49.9%. We further conduct diagnostic analyses to characterize model failures and reveal persistent weaknesses in structural perception and reasoning. These results and analyses provide useful insights into current model capabilities and establish WildTableBench as a valuable diagnostic benchmark for table image understanding.

preprint, 2024, arXiv

Rigorous uniaxial limit of the Qian--Sheng inertial Q-tensor hydrodynamics for liquid crystals

This article is concerned with the rigorous connections between the inertial Qian--Sheng model and the Ericksen--Leslie model for the liquid crystal flow, under a more general condition on the coefficients. More specifically, in the framework of Hilbert expansions, we show that: (i) when the elastic coefficients tend to zero (also called the uniaxial limit), the smooth solution to the inertial Qian--Sheng model converges to that of the full inertial Ericksen--Leslie model; (ii) when the elastic coefficients and the inertial coefficient tend to zero simultaneously, the smooth solution to the inertial Qian--Sheng model converges to that of the noninertial Ericksen--Leslie model.

preprint, 2022, arXiv

Frame hydrodynamics of biaxial nematics from molecular-theory-based tensor models

Starting from a dynamic tensor model for two second-order tensors, we derive the frame hydrodynamics for the biaxial nematic phase using the Hilbert expansion. The coefficients in the frame model are derived from those in the tensor model. The energy dissipation of the tensor model is maintained in the frame model. The frame model reduces to the Ericksen--Leslie model if the biaxial bulk energy minimum of the tensor model is reduced to a uniaxial one.

preprint, 2022, arXiv

Improving Question Answering over Knowledge Graphs Using Graph Summarization

Question Answering (QA) systems over Knowledge Graphs (KGs), abbreviated KGQA, automatically answer natural language questions using triples contained in a KG. The key idea is to represent questions and entities of a KG as low-dimensional embeddings. Previous KGQA systems have attempted to represent entities using Knowledge Graph Embedding (KGE) and Deep Learning (DL) methods. However, KGEs are too shallow to capture expressive features, and DL methods process each triple independently. Recently, Graph Convolutional Networks (GCNs) have been shown to be excellent at providing entity embeddings. However, applying GCNs to KGQA is inefficient because GCNs treat all relations equally when aggregating neighbourhoods. Another problem with previous KGQA systems is that questions often have an uncertain number of answers. To address the above issues, we propose a graph summarization technique using a Recurrent Convolutional Neural Network (RCNN) and a GCN. The combination of GCN and RCNN ensures that the embeddings are propagated together with the relations relevant to the question, and thus yields better answers. The proposed graph summarization technique can be used to tackle the issue that KGQA systems cannot answer questions with an uncertain number of answers. In this paper, we demonstrate the proposed technique on the most common type of question, namely single-relation questions. Experiments demonstrate that the proposed graph summarization technique using RCNN and GCN provides better results than the GCN alone. The proposed graph summarization technique significantly improves the recall of actual answers when the questions have an uncertain number of answers.

preprint, 2022, arXiv

Rigorous biaxial limit of a molecular-theory-based two-tensor hydrodynamics

We consider a two-tensor hydrodynamics derived from the molecular model, where high-order tensors are determined by closure approximation through the maximum entropy state or the quasi-entropy. We prove the existence and uniqueness of local in time smooth solutions to the two-tensor system. Then, we rigorously justify the connection between the molecular-theory-based two-tensor hydrodynamics and the biaxial frame hydrodynamics. More specifically, in the framework of Hilbert expansion, we show the convergence of the solution to the two-tensor hydrodynamics to the solution to the frame hydrodynamics.

preprint, 2022, arXiv

Uniqueness of global weak solutions to the frame hydrodynamics for biaxial nematic phases in $\mathbb{R}^2$

We consider the hydrodynamics for biaxial nematic phases described by a field of orthonormal frames, which can be derived from a molecular-theory-based tensor model. We prove the uniqueness of global weak solutions to the Cauchy problem of the frame hydrodynamics in dimension two. The proof is mainly based on suitable weaker energy estimates within the Littlewood--Paley analysis. We take full advantage of the estimates of nonlinear terms with rotational derivatives on $SO(3)$, together with cancellation relations and dissipative structures of the biaxial frame system.

preprint, 2022, arXiv

Well-posedness of frame hydrodynamics for biaxial nematic liquid crystals

We consider the hydrodynamics for the biaxial nematic phase characterized by a field of orthonormal frames, which can be derived from a molecular-theory-based tensor model. In dimensions two and three, we establish the local well-posedness and the blow-up criterion for smooth solutions to the frame hydrodynamic model. Furthermore, we prove the global existence of weak solutions in $\mathbb{R}^2$ which are nonsmooth at finitely many singular times.

preprint, 2014, arXiv

Local well-posedness and small Deborah limit of a molecule-based $Q$-tensor system

In this paper, we consider a hydrodynamic $Q$-tensor system for nematic liquid crystal flow, which is derived from the Doi-Onsager molecular theory by the Bingham closure. We first prove the existence and uniqueness of a local strong solution. Furthermore, by letting the Deborah number go to zero and using the Hilbert expansion method, we present a rigorous derivation of the Ericksen-Leslie theory from the molecule-based $Q$-tensor theory.
