Source author record

Fei Wei

Fei Wei appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Topics: Distributed, Parallel, and Cluster Computing; math.NA; math.NT; cond-mat.mtrl-sci; Information Theory; math.IT; Computational Engineering, Finance, and Science; Cryptography and Security; Machine Learning; math.CA; Numerical Analysis; Computation and Language; Computer Vision; Databases; math.ST; Mathematical Software; physics.chem-ph; Statistics Theory

Catalog footprint

What is connected

19 works

18 topics

4 close collaborators


preprint, 2026, arXiv

Accurate Table Question Answering with Accessible LLMs

Given a table T in a database and a question Q in natural language, the table question answering (TQA) task aims to return an accurate answer to Q based on the content of T. Recent state-of-the-art solutions leverage large language models (LLMs) to obtain high-quality answers. However, most rely on proprietary, large-scale LLMs with costly API access, posing a significant financial barrier. This paper instead focuses on TQA with smaller, open-weight LLMs that can run on a desktop or laptop. This setting is challenging, as such LLMs typically have weaker capabilities than large proprietary models, leading to substantial performance degradation with existing methods. We observe that a key reason for this degradation is that prior approaches often require the LLM to solve a highly sophisticated task using long, complex prompts, which exceed the capabilities of small open-weight LLMs. Motivated by this observation, we present Orchestra, a multi-agent approach that unlocks the potential of accessible LLMs for high-quality, cost-effective TQA. Orchestra coordinates a group of LLM agents, each responsible for a relatively simple task, through a structured, layered workflow to solve complex TQA problems -- akin to an orchestra. By reducing the prompt complexity faced by each agent, Orchestra significantly improves output reliability. We implement Orchestra on top of AgentScope, an open-source multi-agent framework, and evaluate it on multiple TQA benchmarks using a wide range of open-weight LLMs. Experimental results show that Orchestra achieves strong performance even with small- to medium-sized models. For example, with Qwen2.5-14B, Orchestra reaches 72.1% accuracy on WikiTQ, approaching the best prior result of 75.3% achieved with GPT-4; with larger Qwen, Llama, or DeepSeek models, Orchestra outperforms all prior methods and establishes new state-of-the-art results across all benchmarks.

preprint, 2023, arXiv

MobileVLM: A Fast, Strong and Open Vision Language Assistant for Mobile Devices

We present MobileVLM, a competent multimodal vision language model (MMVLM) targeted to run on mobile devices. It is an amalgamation of a myriad of mobile-oriented architectural designs and techniques, comprising a set of language models at the scale of 1.4B and 2.7B parameters, trained from scratch, a multimodal vision model pre-trained in the CLIP fashion, and cross-modality interaction via an efficient projector. We evaluate MobileVLM on several typical VLM benchmarks. Our models demonstrate on-par performance compared with a few much larger models. More importantly, we measure the inference speed on both a Qualcomm Snapdragon 888 CPU and an NVIDIA Jetson Orin GPU, and we obtain state-of-the-art performance of 21.5 tokens and 65.3 tokens per second, respectively. Our code will be made available at: https://github.com/Meituan-AutoML/MobileVLM.

preprint, 2022, arXiv

A Non-iterative Overlapping Schwarz Waveform Relaxation Algorithm for Wave Equation

The Schwarz Waveform Relaxation (SWR) algorithm exchanges boundary-value waveforms between neighbouring sub-domains, which provides a more efficient way to realize distributed computation than other Schwarz algorithms. However, traditional SWR converges slowly, and various optimization strategies have been introduced to accelerate it. In this paper, we propose a non-iterative overlapping variant of SWR for the wave equation, named the Relative Schwarz Waveform Relaxation algorithm (RSWR). RSWR is inspired by the physical observation, based on the Theory of Relativity, that the velocity of a wave is limited. A change of value at one point in space takes a time span $Δt$ to reach another point, and vice versa. This $Δt$ can be exploited to design distributed numerical algorithms, as we have done in RSWR. During each time span, RSWR needs only 3 steps to obtain a highly accurate waveform, using a predict-select-update strategy. The key to this strategy is finding the maximum time span for the waveform. The validity of RSWR can be proved straightforwardly. Numerical experiments show that RSWR is accurate and has the potential to be scalable and fast.
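The finite propagation speed that this abstract relies on can be seen in a small numerical experiment. The sketch below is not RSWR itself: it is a standard explicit leapfrog discretization of the 1D wave equation, and the function names, grid size and Courant number are assumptions chosen for illustration. It shows that a point disturbance spreads by at most one grid cell per time step, so a sub-domain far enough away cannot be affected within a short time window.

```python
def leapfrog_wave(u0, steps, r=0.5):
    """Advance the 1D wave equation with the explicit leapfrog scheme.

    u0    : initial displacement on a uniform grid (both ends held at zero)
    steps : number of time steps to take
    r     : Courant number c*dt/dx (assumed <= 1 for stability)
    """
    n = len(u0)
    prev = list(u0)  # displacement at time level 0
    curr = list(u0)  # time level 1: zero initial velocity, first-order start
    for _ in range(steps):
        nxt = [0.0] * n
        for i in range(1, n - 1):
            nxt[i] = (2.0 * curr[i] - prev[i]
                      + r * r * (curr[i + 1] - 2.0 * curr[i] + curr[i - 1]))
        prev, curr = curr, nxt
    return curr


def reach(u, centre):
    """Largest grid distance from `centre` at which u is nonzero."""
    return max((abs(i - centre) for i, v in enumerate(u) if v != 0.0), default=0)


if __name__ == "__main__":
    grid = [0.0] * 101
    grid[50] = 1.0  # point disturbance in the middle of the domain
    for k in (5, 10, 20):
        # The disturbance has spread exactly k cells after k steps.
        print(k, reach(leapfrog_wave(grid, k), 50))
```

With a three-point stencil the numerical domain of dependence grows by one cell per step, which is the discrete counterpart of the time span $Δt$ described above.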

preprint, 2022, arXiv

Anqie entropy and arithmetic compactification of natural numbers

To study arithmetic structures of natural numbers, we introduce a notion of entropy of arithmetic functions, called anqie entropy. This entropy possesses some crucial properties common to both Shannon's and Kolmogorov's entropies. We show that all arithmetic functions with zero anqie entropy form a C*-algebra. Its maximal ideal space defines our arithmetic compactification of natural numbers, which is totally disconnected but not extremally disconnected. We also compute the $K$-groups of the space of all continuous functions on the arithmetic compactification. As an application, we show that any topological dynamical system with topological entropy $λ$ can be approximated by symbolic dynamical systems with entropy less than or equal to $λ$.

preprint, 2022, arXiv

Cactus Mechanisms: Optimal Differential Privacy Mechanisms in the Large-Composition Regime

Most differential privacy mechanisms are applied (i.e., composed) numerous times on sensitive data. We study the design of optimal differential privacy mechanisms in the limit of a large number of compositions. As a consequence of the law of large numbers, in this regime the best privacy mechanism is the one that minimizes the Kullback-Leibler divergence between the conditional output distributions of the mechanism given two different inputs. We formulate an optimization problem to minimize this divergence subject to a cost constraint on the noise. We first prove that additive mechanisms are optimal. Since the optimization problem is infinite dimensional, it cannot be solved directly; nevertheless, we quantize the problem to derive near-optimal additive mechanisms that we call "cactus mechanisms" due to their shape. We show that our quantization approach can be arbitrarily close to an optimal mechanism. Surprisingly, for quadratic cost, the Gaussian mechanism is strictly sub-optimal compared to this cactus mechanism. Finally, we provide numerical results which indicate that the cactus mechanism outperforms the Gaussian mechanism for a finite number of compositions.
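To make the objective in this abstract concrete, the sketch below evaluates the Kullback-Leibler divergence between the output distributions of an additive mechanism on two inputs that differ by a shift `s`, for Gaussian and Laplace noise with the same variance (that is, the same quadratic cost). It does not construct a cactus mechanism. The function names, the shift and the integration range are assumptions for illustration, and the closed forms are the standard ones for these two noise families.

```python
import math


def kl_gaussian_shift(s, sigma):
    """KL( N(0, sigma^2) || N(s, sigma^2) ) in closed form."""
    return s * s / (2.0 * sigma * sigma)


def kl_laplace_shift(s, sigma):
    """KL between two Laplace densities of variance sigma^2 shifted by s."""
    b = sigma / math.sqrt(2.0)  # Laplace scale: variance is 2*b^2
    t = abs(s) / b
    return t + math.exp(-t) - 1.0


def gaussian_log_pdf(mean, sigma):
    c = -0.5 * math.log(2.0 * math.pi * sigma * sigma)
    return lambda x: c - (x - mean) ** 2 / (2.0 * sigma * sigma)


def laplace_log_pdf(mean, b):
    c = -math.log(2.0 * b)
    return lambda x: c - abs(x - mean) / b


def kl_numeric(log_p, log_q, lo, hi, n=100_000):
    """Midpoint-rule estimate of the integral of p*(log p - log q) on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        lp = log_p(x)
        total += math.exp(lp) * (lp - log_q(x))
    return total * h


if __name__ == "__main__":
    sigma = 1.0
    for s in (0.5, 1.0, 4.0):
        print(s, kl_gaussian_shift(s, sigma), kl_laplace_shift(s, sigma))
```

The numerical integrator is included only as a cross-check on the closed forms; the same routine could in principle score any other candidate noise density with the same second moment.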

preprint, 2022, arXiv

Disjointness of Möbius from asymptotically periodic functions

We investigate Sarnak's Möbius Disjointness Conjecture through asymptotically periodic functions. It is shown that Sarnak's conjecture for rigid dynamical systems is equivalent to the disjointness of Möbius from asymptotically periodic functions. We give sufficient conditions and a partial answer to the latter. As an application, we show that Sarnak's conjecture holds for a class of rigid dynamical systems, which improves an earlier result of Kanigowski-Lemańczyk-Radziwiłł.
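Disjointness of the Möbius function μ from a function f means that the averages (1/N) Σ_{n≤N} μ(n) f(n) tend to zero. The sketch below checks this numerically for a strictly periodic f, the classical case, rather than for the asymptotically periodic functions studied in the paper. The sieve, the function names and the period-4 test pattern are assumptions for illustration.

```python
def mobius_sieve(limit):
    """Return a list mu with mu[n] equal to the Möbius function of n, 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            # Each distinct prime factor flips the sign.
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # A squared prime factor forces mu to zero.
            for m in range(p * p, limit + 1, p * p):
                mu[m] = 0
    mu[0] = 0  # index 0 is unused
    return mu


def correlation(mu, pattern, n_max):
    """(1/N) * sum over n <= N of mu(n) * f(n), with f(n) = pattern[n % len(pattern)]."""
    q = len(pattern)
    return sum(mu[n] * pattern[n % q] for n in range(1, n_max + 1)) / n_max


if __name__ == "__main__":
    mu = mobius_sieve(100_000)
    pattern = [1, 0, -1, 0]  # a period-4 sequence
    for n_max in (1_000, 10_000, 100_000):
        print(n_max, correlation(mu, pattern, n_max))
```

A finite computation of this kind only illustrates the definition; it says nothing about the rigid systems treated in the abstract.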

preprint, 2022, arXiv

Möbius disjointness for a class of exponential functions

A vast class of exponential functions are shown to be deterministic. This class includes functions whose exponents are polynomial-like or "piece-wise" close to polynomials after differentiation. Many of these functions are proved to be disjoint from the Möbius function.

preprint, 2022, arXiv

The Saddle-Point Accountant for Differential Privacy

We introduce a new differential privacy (DP) accountant called the saddle-point accountant (SPA). SPA approximates privacy guarantees for the composition of DP mechanisms in an accurate and fast manner. Our approach is inspired by the saddle-point method -- a ubiquitous numerical technique in statistics. We prove rigorous performance guarantees by deriving upper and lower bounds for the approximation error offered by SPA. The crux of SPA is a combination of large-deviation methods with central limit theorems, which we derive via exponentially tilting the privacy loss random variables corresponding to the DP mechanisms. One key advantage of SPA is that it runs in constant time for the $n$-fold composition of a privacy mechanism. Numerical experiments demonstrate that SPA achieves comparable accuracy to state-of-the-art accounting methods with a faster runtime.

preprint, 2020, arXiv

Real-Space Imaging of the Ordered Small Molecule Orientations in Porous Frameworks by Electron Microscopy

Real-space imaging of small molecules under the electron microscope is always challenging, but it is in high demand for investigating nanoscale interactions such as hydrogen bonds and van der Waals (vdW) forces. In particular, identifying host-guest interactions in porous materials directly at the molecular level will bring deeper insight into the behavior of guest molecules during sorption, catalysis, gas separation and energy storage. In this work, we directly resolved the ordered configurations of p-xylenes (PXs) adsorbed in ZSM-5 frameworks by scanning transmission electron microscopy (STEM) with the integrated differential phase contrast (iDPC) technique to identify the host-guest vdW interactions. Based on these observations, we revealed that the PXs in one straight channel modified the channel geometry with a coherent orientation. The adjacent straight channels were deformed by up to 8.8% along different directions corresponding to three dominant PX configurations, resulting in a negligible overall expansion of the ZSM-5 lattice. We could also image the disorder and desorption of PXs in ZSM-5 channels during in situ heating. This work not only helped us to study the host-guest vdW interactions and the sorption behavior of PXs in ZSM-5, but also provided an efficient tool for imaging and studying other single-molecule behaviors under STEM.

preprint, 2020, arXiv

Revealing the configurations and host-guest interactions of small aromatics confined in porous frameworks by electron microscopy

Directly imaging the configurations of small molecules at ambient temperature will greatly promote the study of their chemical and physical properties, including the host-guest interactions of organics in porous materials during adsorption, catalysis and energy storage. However, owing to the current challenges of small-molecule imaging by (scanning) transmission electron microscopy ((S)TEM), we still lack a molecular-level understanding of host-guest interactions and other molecular behaviors. Here, we achieved STEM imaging of various small aromatics confined in MFI-type zeolite frameworks by using the integrated differential phase contrast (iDPC) technique. Owing to the strong confinement effect in MFI channels, the 1D solid-like aromatic columns showed coherent configurations, which were clearly resolved by enhancing the host-guest interactions. We also evaluated the strength of host-guest interactions directly by image analysis and revealed the desorption behavior of confined aromatics during in situ heating. These results not only helped us to reveal the configurations and host-guest interactions of small aromatics during adsorption/desorption in porous materials, but also expanded the applications of STEM to the study of other molecular behaviors in real space.

preprint, 2020, arXiv

Towards an Operational Definition of Group Network Codes

Group network codes are a generalization of linear codes that have seen several studies over the last decade. When studying network codes, the operations performed at internal network nodes, called local encoding functions, are of significant interest. While local encoding functions of linear codes are well understood (and of operational significance), no similar operational definition exists for group network codes. To bridge this gap, we study the connections between group network codes and a family of codes called Coordinate-Wise-Linear (CWL) codes. CWL codes generalize linear codes and, in addition, can be defined locally (i.e., operationally). In this work, we study the connection between CWL codes and group codes from both a local and global encoding perspective. We show that Abelian group codes can be expressed as CWL codes and, as a result, they inherit an operational definition.

preprint, 2015, arXiv

General divisor functions in arithmetic progressions to large moduli

We prove a result on the distribution of the general divisor functions in arithmetic progressions to smooth moduli which exceed the square root of the length.

preprint, 2013, arXiv

A Ni-Fe Layered Double Hydroxide-Carbon Nanotube Complex for Water Oxidation

Highly active, durable and cost-effective electrocatalysts for water oxidation to evolve oxygen gas hold a key to a range of renewable energy solutions including water splitting and rechargeable metal-air batteries. Here, we report the synthesis of ultrathin nickel iron layered double hydroxide nanoplates on mildly oxidized multi-walled carbon nanotubes. Incorporation of Fe into the nickel hydroxide induced the formation of NiFe-layered double hydroxide. The nanoplates were covalently attached to a network of nanotubes, affording excellent electrical wiring to the nanoplates. The ultra-thin Ni-Fe layered double hydroxide nanoplates/carbon nanotube complex was found to exhibit unusually high electro-catalytic activity and stability for oxygen evolution and outperformed commercial precious metal Ir catalysts.

preprint, 2011, arXiv

Mini-step Strategy for Transient Analysis

Domain decomposition methods are widely used to solve sparse linear systems arising from scientific problems, but they are not well suited to the sparse linear systems extracted from integrated circuits. The reason is that these systems may not be diagonally dominant, and domain decomposition methods may fail to converge for such matrices. In this paper, we propose a mini-step strategy for circuit transient analysis. Unlike the traditional large-step approach, this strategy generates diagonally dominant sparse linear systems. As a result, preconditioned domain decomposition methods can be used to simulate large integrated circuits on supercomputers and clouds.
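One way to see why a small time step can restore diagonal dominance is sketched below. The abstract does not give the paper's formulation, so this assumes a backward Euler discretization of C x' + G x = b(t) with a diagonal capacitance matrix C, which leads to the system matrix C/h + G. The matrices, step sizes and function names are hypothetical.

```python
def is_strictly_diagonally_dominant(a):
    """True if every |a[i][i]| exceeds the sum of the other |a[i][j]| in its row."""
    return all(
        abs(row[i]) > sum(abs(v) for j, v in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )


def backward_euler_matrix(c_diag, g, h):
    """Matrix C/h + G from backward Euler on C x' + G x = b(t), with diagonal C.

    c_diag : diagonal entries of C (assumed positive)
    g      : conductance-like matrix G as a list of rows
    h      : time step
    """
    n = len(g)
    return [
        [g[i][j] + (c_diag[i] / h if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # A hypothetical G whose off-diagonal entries outweigh its diagonal.
    g = [[1.0, -2.0, 0.0],
         [-2.0, 1.0, -2.0],
         [0.0, -2.0, 1.0]]
    c_diag = [1.0, 1.0, 1.0]
    for h in (10.0, 1.0, 0.1):
        a = backward_euler_matrix(c_diag, g, h)
        print(h, is_strictly_diagonally_dominant(a))
```

Under these assumptions the diagonal grows like 1/h while the off-diagonal entries stay fixed, so a sufficiently small step always yields a strictly diagonally dominant matrix.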

preprint, 2010, arXiv

Directed Transmission Method, A Fully Asynchronous approach to Solve Sparse Linear Systems in Parallel

In this paper, we propose a new distributed algorithm called the Directed Transmission Method (DTM). DTM is a fully asynchronous, continuous-time iterative algorithm for solving SPD sparse linear systems. As an architecture-aware algorithm, DTM can run freely on all kinds of heterogeneous parallel computers. We prove that DTM converges by making use of the final-value theorem of the Laplace transform. Numerical experiments show that DTM is stable and efficient.

preprint, 2010, arXiv

Transmission line inspires a new distributed algorithm to solve linear system of circuit

Transmission lines, or wires, are always troublesome for integrated circuit designers, but they can be helpful to parallel computing researchers. This paper proposes the Virtual Transmission Method (VTM), a new distributed, stationary iterative algorithm for solving the linear systems extracted from circuits. It tears the circuit apart with virtual transmission lines to achieve distributed computing. For symmetric positive definite (SPD) linear systems, VTM is proved to be convergent. For unsymmetric linear systems, numerical experiments show that VTM can achieve better convergence than traditional stationary algorithms. VTM can be accelerated by preconditioning techniques, and it converges quickly when its preconditioner is properly chosen.

preprint, 2010, arXiv

Transmission Line Inspires A New Distributed Algorithm to Solve the Nonlinear Dynamical System of Physical Circuit

Physical circuits, e.g. integrated circuits or power systems, are known to work in a distributed manner, but they cannot easily be simulated in a distributed way. This is mainly because the dynamical system of a physical circuit is nonlinear and its linearized system is nonsymmetric. This paper proposes a simple and natural strategy to mimic the distributed behavior of a physical circuit by mimicking the distributed behavior of its internal wires. The Mimic Transmission Method (MTM) is a new distributed algorithm for solving the nonlinear ordinary differential equations extracted from physical circuits. It maps the transmission delay of interconnects between subcircuits to the communication delay of the digital data links between processors. MTM is a black-box algorithm. By mimicking the transmission lines, MTM seals the nonlinear dynamical system within each subcircuit. As a result, we do not need to consider how to solve the nonlinear dynamical system or the nonsymmetric linear system in parallel. MTM is a global direct algorithm that performs only one distributed computation in each time window to obtain an accurate result, so convergence failures are not a concern.

preprint, 2010, arXiv

Virtual Transmission Method, A New Distributed Algorithm to Solve Sparse Linear System

In this paper, we propose a new parallel algorithm that works naturally on parallel computers with an arbitrary number of processors. This algorithm is named the Virtual Transmission Method (VTM). Its physical background is the lossless transmission line and the microwave network. The basic idea of VTM is to insert lossless transmission lines into the sparse linear system to achieve distributed computing. VTM is proved to converge for SPD linear systems. A preconditioning method and a performance model are presented. Numerical experiments show that VTM is efficient, accurate and stable. Along with VTM, we introduce a new technique for partitioning symmetric linear systems, named Generalized Node & Branch Tearing (GNBT). It is based on Kirchhoff's Current Law from circuit theory. We prove that GNBT can partition any SPD linear system.

preprint, 2010, arXiv

Waveform Transmission Method, a New Waveform-relaxation Based Algorithm to Solve Ordinary Differential Equations in Parallel

Waveform Relaxation method (WR) is a beautiful algorithm to solve Ordinary Differential Equations (ODEs). However, because of its poor convergence capability, it was rarely used. In this paper, we propose a new distributed algorithm, named Waveform Transmission Method (WTM), by virtually inserting waveform transmission lines into the dynamical system to achieve distributed computing of extremely large ODEs. WTM has better convergence capability than the traditional WR algorithms.
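For readers unfamiliar with waveform relaxation, the sketch below shows the traditional Jacobi variant that this abstract takes as its baseline, not WTM itself. Each unknown is integrated over the whole time window on its own, using the other unknown's waveform from the previous sweep. The two-variable test system, the forward Euler integrator and all names are assumptions for illustration.

```python
def coupled_euler(h, n):
    """Reference: forward Euler on the coupled system x' = -x + 0.5*y, y' = 0.5*x - y."""
    x, y = [1.0], [0.0]
    for j in range(n):
        x.append(x[j] + h * (-x[j] + 0.5 * y[j]))
        y.append(y[j] + h * (0.5 * x[j] - y[j]))
    return x, y


def jacobi_wr(h, n, sweeps):
    """Jacobi waveform relaxation on the same system.

    Returns the maximum deviation from the coupled solve after each sweep.
    """
    x_ref, y_ref = coupled_euler(h, n)
    x_old = [1.0] * (n + 1)  # initial guess: constant waveforms
    y_old = [0.0] * (n + 1)
    errors = []
    for _ in range(sweeps):
        x_new, y_new = [1.0], [0.0]
        for j in range(n):
            # Each equation sees only the OLD waveform of the other variable,
            # so the two integrations could run on separate processors.
            x_new.append(x_new[j] + h * (-x_new[j] + 0.5 * y_old[j]))
            y_new.append(y_new[j] + h * (0.5 * x_old[j] - y_new[j]))
        x_old, y_old = x_new, y_new
        err_x = max(abs(a - b) for a, b in zip(x_new, x_ref))
        err_y = max(abs(a - b) for a, b in zip(y_new, y_ref))
        errors.append(max(err_x, err_y))
    return errors


if __name__ == "__main__":
    for k, e in enumerate(jacobi_wr(h=0.01, n=200, sweeps=10), start=1):
        print(k, e)
```

On this short window with weak coupling the sweeps converge quickly; longer windows or stronger coupling slow the classical iteration down, which is the weakness the abstract refers to.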
