Source author record

Bin Fu

Bin Fu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computational Complexity · Data Structures and Algorithms · Computer Vision · Emerging Technologies · Artificial Intelligence · Computation and Language · Computational Geometry · Formal Languages and Automata Theory · Human-Computer Interaction

Catalog footprint

What is connected

20 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Patch-MoE Mamba: A Patch-Ordered Mixture-of-Experts State Space Architecture for Medical Image Segmentation

CNN- and Transformer-based architectures have achieved strong performance in medical image segmentation, but CNNs are limited in modeling long-range dependencies, while Transformers often suffer from quadratic computational and memory complexity. State space models, especially Mamba-based networks, offer an efficient alternative with linear sequence complexity. However, existing Mamba segmentation models still face two limitations: pixel-wise directional scanning can disrupt local 2D spatial structure, and simple summation-based fusion of scan directions cannot adapt well to diverse object sizes, shapes, and boundaries. To address these issues, we propose Patch-MoE Mamba, a patch-ordered mixture-of-experts state space architecture for medical image segmentation. It introduces a hierarchical patch-ordered scanning mechanism that preserves local spatial neighborhoods while capturing multi-scale context, and an MoE-based directional fusion module that adaptively combines multiple Mamba scanner outputs using four directional experts, a learnable concatenation expert, and residual directional aggregation. Experiments on five public polyp segmentation benchmarks and the ISIC 2017/2018 skin lesion segmentation datasets demonstrate the effectiveness and generality of Patch-MoE Mamba.

preprint · 2026 · arXiv

PICABench: How Far Are We from Physically Realistic Image Editing?

Image editing has achieved remarkable progress recently. Modern editing models can already follow complex instructions to manipulate the original content. However, beyond completing the editing instructions, the accompanying physical effects are key to realistic generation. For example, removing an object should also remove its shadow, reflections, and interactions with nearby objects. Unfortunately, existing models and benchmarks mainly focus on instruction completion but overlook these physical effects. So, at this moment, how far are we from physically realistic image editing? To answer this, we introduce PICABench, which systematically evaluates physical realism across eight sub-dimensions (spanning optics, mechanics, and state transitions) for most of the common editing operations (add, remove, attribute change, etc.). We further propose PICAEval, a reliable evaluation protocol that uses VLM-as-a-judge with per-case, region-level human annotations and questions. Beyond benchmarking, we also explore effective solutions by learning physics from videos and construct a training dataset, PICA-100K. After evaluating most of the mainstream models, we observe that physical realism remains a challenging problem with ample room for exploration. We hope that our benchmark and proposed solutions can serve as a foundation for future work moving from naive content editing toward physically consistent realism.

preprint · 2022 · arXiv

GenText: Unsupervised Artistic Text Generation via Decoupled Font and Texture Manipulation

Automatic artistic text generation is an emerging topic which receives increasing attention due to its wide applications. Artistic text can be divided into three components: content, font, and texture. Existing artistic text generation models usually focus on manipulating one of these components, which is a sub-optimal solution for controllable general artistic text generation. To remedy this issue, we propose a novel approach, namely GenText, to achieve general artistic text style transfer by separately migrating the font and texture styles from different source images to the target images in an unsupervised manner. Specifically, our current work incorporates three stages (stylization, destylization, and font transfer) into a unified platform with a single powerful encoder network and two separate style generator networks, one for font transfer, the other for stylization and destylization. The destylization stage first extracts the font style of the font reference image, then the font transfer stage generates the target content with the desired font style. Finally, the stylization stage renders the resulting font image with respect to the texture style in the reference image. Moreover, considering the difficulty of acquiring paired artistic text images, our model is designed under the unsupervised setting, where all stages can be effectively optimized from unpaired data. Qualitative and quantitative experiments on artistic text benchmarks demonstrate the superior performance of our proposed model. The code with models will become publicly available in the future.

preprint · 2022 · arXiv

Multitasking Scheduling with Shared Processing

Recently, the problem of multitasking scheduling has attracted a lot of attention in the service industries where workers frequently perform multiple tasks by switching from one task to another. Hall, Leung and Li (Discrete Applied Mathematics 2016) proposed a shared processing multitasking scheduling model which allows a team to continue to work on the primary tasks while processing the routinely scheduled activities as they occur. The processing sharing is achieved by allocating a fraction of the processing capacity to routine jobs and the remaining fraction, which we denote as sharing ratio, to the primary jobs. In this paper, we generalize this model to parallel machines and allow the fraction of the processing capacity assigned to routine jobs to vary from one to another. The objectives are minimizing makespan and minimizing the total completion time. We show that for both objectives, there is no polynomial time approximation algorithm unless P=NP if the sharing ratios are arbitrary for all machines. Then we consider the problems where the sharing ratios on some machines have a constant lower bound. For each objective, we analyze the performance of the classical scheduling algorithms and their variations and then develop a polynomial time approximation scheme when the number of machines is a constant.

preprint · 2022 · arXiv

Streaming Algorithms for Multitasking Scheduling with Shared Processing

In this paper, we design the first streaming algorithms for the problem of multitasking scheduling on parallel machines with shared processing. In one pass, our streaming approximation schemes can provide an approximate value of the optimal makespan. If the jobs can be read in two passes, the algorithm can find the schedule with the approximate value. This work not only provides an algorithmic big data solution for the studied problem, but also gives an insight into the design of streaming algorithms for other problems in the area of scheduling.

preprint · 2022 · arXiv

Streaming Approximation Scheme for Minimizing Total Completion Time on Parallel Machines Subject to Varying Processing Capacity

We study the problem of minimizing total completion time on parallel machines subject to varying processing capacity. In this paper, we develop an approximation scheme for the problem under the data stream model, where the input data is too massive to fit into memory and can therefore be scanned in only a few passes. Our algorithm can compute the approximate value of the optimal total completion time in one pass and output the schedule with the approximate value in two passes.

preprint · 2020 · arXiv

A Survey on Complex Question Answering over Knowledge Base: Recent Advances and Challenges

Question Answering (QA) over Knowledge Base (KB) aims to automatically answer natural language questions via well-structured relation information between entities stored in knowledge bases. In order to make KBQA more applicable in actual scenarios, researchers have shifted their attention from simple questions to complex questions, which require more KB triples and constraint inference. In this paper, we introduce the recent advances in complex QA. Besides traditional methods relying on templates and rules, the research is categorized into a taxonomy that contains two main branches, namely Information Retrieval-based and Neural Semantic Parsing-based. After describing the methods of these branches, we analyze directions for future research and introduce the models proposed by the Alime team.

preprint · 2020 · arXiv

Hardness of Sparse Sets and Minimal Circuit Size Problem

We develop a polynomial method on finite fields to amplify the hardness of sparse sets in nondeterministic time complexity classes on a randomized streaming model. One of our results shows that if there exists a $2^{n^{o(1)}}$-sparse set in $NTIME(2^{n^{o(1)}})$ that does not have any randomized streaming algorithm with $n^{o(1)}$ updating time and $n^{o(1)}$ space, then $NEXP\neq BPP$, where an $f(n)$-sparse set is a language that has at most $f(n)$ strings of length $n$. We also show that if MCSP is $ZPP$-hard under polynomial time truth-table reductions, then $EXP\neq ZPP$.

preprint · 2016 · arXiv

Concentration Independent Random Number Generation in Tile Self-Assembly

In this paper we introduce the robust random number generation problem where the goal is to design an abstract tile assembly system (aTAM system) whose terminal assemblies can be split into $n$ partitions such that a resulting assembly of the system lies within each partition with probability $1/n$, regardless of the relative concentration assignment of the tile types in the system. First, we show this is possible for $n=2$ (a robust fair coin flip) within the aTAM, and that such systems guarantee a worst case $\mathcal{O}(1)$ space usage. We accompany our primary construction with variants that show trade-offs in space complexity, initial seed size, temperature, tile complexity, bias, and extensibility, and also prove some negative results. As an application, we combine our coin-flip system with a result of Chandran, Gopalkrishnan, and Reif to show that for any positive integer $n$, there exists an $\mathcal{O}(\log n)$ tile system that assembles a constant-width linear assembly of expected length $n$ for any concentration assignment. We then extend our robust fair coin flip result to solve the problem of robust random number generation in the aTAM for all $n$. Two variants of robust random bit generation solutions are presented: an unbounded space solution and a bounded space solution which incurs a small bias. Further, we consider the harder scenario where tile concentrations change arbitrarily at each assembly step and show that while this is not possible in the aTAM, the problem can be solved by exotic tile assembly models from the literature.

preprint · 2016 · arXiv

Partial Sublinear Time Approximation and Inapproximation for Maximum Coverage

We develop a randomized approximation algorithm for the classical maximum coverage problem, which, given a list of sets $A_1,A_2,\cdots, A_m$ and an integer parameter $k$, asks to select $k$ sets $A_{i_1}, A_{i_2},\cdots, A_{i_k}$ with maximum union $A_{i_1}\cup A_{i_2}\cup\cdots\cup A_{i_k}$. In our algorithm, each input set $A_i$ is a black box that can provide its size $|A_i|$, generate a random element of $A_i$, and answer the membership query $(x\in A_i?)$ in $O(1)$ time. Our algorithm gives a $(1-{1\over e})$-approximation for the maximum coverage problem in $O(p(m))$ time, which is independent of the sizes of the input sets. No $O(p(m)n^{1-ε})$ time $(1-{1\over e})$-approximation algorithm for maximum coverage had previously been found for any function $p(m)$ that only depends on the number of sets, where $n=\max(|A_1|,\cdots,| A_m|)$ (the largest size of input sets). The notion of a partial sublinear time algorithm is introduced. For a computational problem with input size controlled by two parameters $n$ and $m$, a partial sublinear time algorithm for it runs in $O(p(m)n^{1-ε})$ time or $O(q(n)m^{1-ε})$ time. The maximum coverage problem has a partial sublinear time $O(p(m))$ constant factor approximation. On the other hand, we show that the maximum coverage problem has no partial sublinear $O(q(n)m^{1-ε})$ time constant factor approximation algorithm. This separates partial sublinear time computation from conventional sublinear time computation by disproving the existence of a sublinear time approximation algorithm for the maximum coverage problem.
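For orientation, the $(1-{1\over e})$ factor quoted above is the guarantee of the classical greedy algorithm for maximum coverage. The sketch below is that textbook greedy baseline, not the paper's randomized black-box algorithm: it reads every set in full, so its running time depends on the set sizes. The function name `greedy_max_coverage` and the example data are illustrative assumptions.

```python
from typing import List, Set, Tuple


def greedy_max_coverage(sets: List[Set[int]], k: int) -> Tuple[List[int], Set[int]]:
    """Textbook greedy baseline (assumption: sets are given explicitly).

    Returns the chosen set indices and the elements they cover.
    """
    covered: Set[int] = set()
    chosen: List[int] = []
    for _ in range(min(k, len(sets))):
        # Pick the unused set that adds the most not-yet-covered elements.
        best = max(
            (i for i in range(len(sets)) if i not in chosen),
            key=lambda i: len(sets[i] - covered),
        )
        if not sets[best] - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


if __name__ == "__main__":
    example = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
    print(greedy_max_coverage(example, 2))
```

The black-box model in the abstract replaces the explicit set differences used here with size, random-element, and membership queries, which is what removes the dependence on set sizes.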

preprint · 2013 · arXiv

Derandomizing Polynomial Identity over Finite Fields Implies Super-Polynomial Circuit Lower Bounds for NEXP

We show that derandomizing polynomial identity testing over an arbitrary finite field implies that NEXP does not have polynomial size boolean circuits. In other words, for any finite field F(q) of size q, $PIT_q\in NSUBEXP\Rightarrow NEXP\not\subseteq P/poly$, where $PIT_q$ is the polynomial identity testing problem over F(q), and NSUBEXP is the nondeterministic subexponential time class of languages. Our result is in contrast to Kabanets and Impagliazzo's existing theorem that derandomizing polynomial identity testing in the integer ring Z implies that NEXP does not have polynomial size boolean circuits or the permanent over Z does not have polynomial size arithmetic circuits.
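As background, the randomized test that "derandomizing" refers to can be illustrated with the standard Schwartz-Zippel check: evaluate both polynomials at random points and compare. This is a minimal sketch under stated assumptions, not anything taken from the paper: polynomials are given as Python callables, and the field is the large prime field modulo `P = 2**61 - 1` (chosen so that the error probability is negligible for low-degree inputs), whereas the abstract concerns an arbitrary finite field F(q).

```python
import random
from typing import Callable, Optional, Sequence

P = 2**61 - 1  # assumed field size: a Mersenne prime, far above the degrees used here

Poly = Callable[[Sequence[int]], int]


def probably_identical(
    f: Poly, g: Poly, num_vars: int, trials: int = 20,
    rng: Optional[random.Random] = None,
) -> bool:
    """Randomized identity test: False is always correct, True may err.

    For a nonzero difference of total degree d, a single random point is a
    root with probability at most d / P (Schwartz-Zippel lemma).
    """
    rng = rng or random.Random()
    for _ in range(trials):
        point = [rng.randrange(P) for _ in range(num_vars)]
        if (f(point) - g(point)) % P != 0:
            return False  # found a witness that f and g differ
    return True


if __name__ == "__main__":
    lhs = lambda v: pow(v[0] + v[1], 2, P)
    rhs = lambda v: (v[0] * v[0] + 2 * v[0] * v[1] + v[1] * v[1]) % P
    print(probably_identical(lhs, rhs, num_vars=2))
```

A derandomization would replace the random points with a deterministically constructed set of evaluation points, which is the kind of capability the abstract connects to circuit lower bounds.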

preprint · 2012 · arXiv

On the Complexity of Approximate Sum of Sorted List

We consider the complexity of computing the approximate sum $a_1+a_2+...+a_n$ of a sorted list of numbers $a_1\le a_2\le ...\le a_n$. We show an algorithm that computes a $(1+ε)$-approximation for the sum of a sorted list of nonnegative numbers in $O({1\over ε}\min(\log n, {\log ({x_{max}\over x_{min}})})\cdot (\log {1\over ε}+\log\log n))$ time, where $x_{max}$ and $x_{min}$ are the largest and the least positive elements of the input list, respectively. We prove a lower bound of $Ω(\min(\log n,\log ({x_{max}\over x_{min}})))$ time for every O(1)-approximation algorithm for the sum of a sorted list of nonnegative elements. We also show that there is no sublinear time approximation algorithm for the sum of a sorted list that contains at least one negative number.
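To see why sortedness helps, here is a simple bucketing sketch, offered as an illustration rather than as the paper's algorithm (it does not achieve the bound quoted above). It jumps through the list with binary search, grouping elements whose values lie within a factor $(1+ε)$ of the bucket's first element and counting each of them as that first element. It uses at most about $\log_{1+ε}(x_{max}/x_{min})+1$ binary searches. The name `approx_sorted_sum` is an assumption.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence


def approx_sorted_sum(a: Sequence[float], eps: float) -> float:
    """Underestimate the sum of a sorted nonnegative list within a factor 1 + eps.

    Up to floating-point rounding, the result S satisfies S <= sum(a) <= (1 + eps) * S.
    """
    n = len(a)
    i = bisect_right(a, 0)  # skip leading zeros; they contribute nothing
    total = 0.0
    while i < n:
        v = a[i]
        # First index whose value reaches (1 + eps) * v; a[i:j] all lie in [v, (1 + eps) * v).
        j = bisect_left(a, (1 + eps) * v, i, n)
        j = max(j, i + 1)  # guard against eps so small that (1 + eps) * v == v
        total += (j - i) * v  # count every bucket member as v
        i = j
    return total


if __name__ == "__main__":
    print(approx_sorted_sum([1, 1, 2, 4, 8], 0.5))
```

Only the bucket boundaries are inspected, so long runs of similar values cost a single binary search instead of a linear scan.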

preprint · 2012 · arXiv

Sublinear Time Approximate Sum via Uniform Random Sampling

We investigate the approximation for computing the sum $a_1+...+a_n$ with an input of a list of nonnegative elements $a_1,..., a_n$. If all elements are in the range $[0,1]$, there is a randomized algorithm that can compute a $(1+ε)$-approximation for the sum problem in time ${O({n(\log\log n)\over\sum_{i=1}^n a_i})}$, where $ε$ is a constant in $(0,1)$. Our randomized algorithm is based on uniform random sampling, which selects one element with equal probability from the input list each time. We also prove a lower bound of $Ω({n\over \sum_{i=1}^n a_i})$, which almost matches the upper bound, for this problem.
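The sampling primitive described above can be sketched as a plain Monte Carlo estimator: draw elements uniformly with replacement, average them, and scale by $n$. This is only an illustration of uniform random sampling, not the paper's algorithm. In particular, the sample budget `num_samples` is a hypothetical parameter supplied by the caller, whereas the abstract's bound ties the work to $n/\sum_{i=1}^n a_i$.

```python
import random
from typing import Optional, Sequence


def sampled_sum(
    a: Sequence[float], num_samples: int, rng: Optional[random.Random] = None
) -> float:
    """Estimate sum(a) from num_samples uniform draws with replacement.

    The estimator n * (sample mean) is unbiased; its accuracy depends on
    num_samples, which this sketch does not attempt to choose.
    """
    rng = rng or random.Random()
    n = len(a)
    picked = sum(a[rng.randrange(n)] for _ in range(num_samples))
    return n * picked / num_samples


if __name__ == "__main__":
    values = [i / 1000 for i in range(1000)]
    print(sampled_sum(values, 20000, random.Random(7)), sum(values))
```

Intuitively, a list with a small total needs more draws before enough mass is seen, which is consistent with the $n/\sum_{i=1}^n a_i$ term in both the upper and lower bounds quoted above.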

preprint · 2012 · arXiv

Sublinear Time Motif Discovery from Multiple Sequences

A natural probabilistic model for motif discovery has been used to experimentally test the quality of motif discovery programs. In this model, there are $k$ background sequences, and each character in a background sequence is a random character from an alphabet $Σ$. A motif $G=g_1g_2...g_m$ is a string of $m$ characters. Each background sequence is implanted with a probabilistically generated approximate copy of $G$. For a probabilistically generated approximate copy $b_1b_2...b_m$ of $G$, every character $b_i$ is probabilistically generated such that the probability for $b_i\neq g_i$ is at most $α$. We develop three algorithms that under the probabilistic model can find the implanted motif with high probability via a tradeoff between computational time and the probability of mutation. The methods developed in this paper have been used in a software implementation. We observed some encouraging results that show improved performance for motif detection compared with other software.
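The probabilistic model above can be made concrete with a small generator for test instances. This is a sketch of the data model only, not of the paper's discovery algorithms. Several details are assumptions added for illustration: all sequences share one length, the implant position is uniform, and each motif character mutates with probability exactly `alpha` to a different character (the abstract only requires "at most $α$"). The name `implant_motif` is hypothetical.

```python
import random
from typing import List, Optional


def implant_motif(
    motif: str, k: int, length: int, alphabet: str, alpha: float,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Generate k random background sequences, each with a noisy copy of motif.

    Assumes len(alphabet) >= 2 and length >= len(motif).
    """
    rng = rng or random.Random()
    m = len(motif)
    sequences = []
    for _ in range(k):
        # Background: independent uniform characters from the alphabet.
        background = [rng.choice(alphabet) for _ in range(length)]
        # Approximate copy: each position mutates with probability alpha.
        copy = [
            rng.choice([c for c in alphabet if c != g]) if rng.random() < alpha else g
            for g in motif
        ]
        start = rng.randrange(length - m + 1)  # assumed uniform implant position
        background[start:start + m] = copy
        sequences.append("".join(background))
    return sequences


if __name__ == "__main__":
    for seq in implant_motif("ACGTAC", k=3, length=30, alphabet="ACGT", alpha=0.1):
        print(seq)
```

Instances like these are the kind of input on which the quality of a motif finder can be measured, since the implanted motif is known in advance.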

preprint · 2011 · arXiv

A Dense Hierarchy of Sublinear Time Approximation Schemes for Bin Packing

The bin packing problem is to find the minimum number of bins of size one to pack a list of items with sizes $a_1,..., a_n$ in $(0,1]$. Using uniform sampling, which selects a random element from the input list each time, we develop a randomized $O({n(\log n)(\log\log n)\over \sum_{i=1}^n a_i}+({1\over ε})^{O({1\over ε})})$ time $(1+ε)$-approximation scheme for the bin packing problem. We show that every randomized algorithm with uniform random sampling needs $Ω({n\over \sum_{i=1}^n a_i})$ time to give a $(1+ε)$-approximation. For each function $s(n): N\rightarrow N$, define $\sum(s(n))$ to be the set of all bin packing problems with the sum of item sizes equal to $s(n)$. For a constant $b\in (0,1)$, every problem in $\sum(n^{b})$ has an $O(n^{1-b}(\log n)(\log\log n)+({1\over ε})^{O({1\over ε})})$ time $(1+ε)$-approximation for an arbitrary constant $ε$. On the other hand, there is no $o(n^{1-b})$ time $(1+ε)$-approximation scheme for the bin packing problems in $\sum(n^{b})$ for some constant $ε>0$.

preprint · 2011 · arXiv

Self-Assembly with Geometric Tiles

In this work we propose a generalization of Winfree's abstract Tile Assembly Model (aTAM) in which tile types are assigned rigid shapes, or geometries, along each tile face. We examine the number of distinct tile types needed to assemble shapes within this model, the temperature required for efficient assembly, and the problem of designing compact geometric faces to meet given compatibility specifications. Our results show a dramatic decrease in the number of tile types needed to assemble $n \times n$ squares to $Θ(\sqrt{\log n})$ at temperature 1 for the most simple model which meets a lower bound from Kolmogorov complexity, and $O(\log\log n)$ in a model in which tile aggregates must move together through obstacle free paths within the plane. This stands in contrast to the $Θ(\log n / \log\log n)$ tile types at temperature 2 needed in the basic aTAM. We also provide a general method for simulating a large and computationally universal class of temperature 2 aTAM systems with geometric tiles at temperature 1. Finally, we consider the problem of computing a set of compact geometric faces for a tile system to implement a given set of compatibility specifications. We show a number of bounds on the complexity of geometry size needed for various classes of compatibility specifications, many of which we directly apply to our tile assembly results to achieve non-trivial reductions in geometry size.

preprint · 2010 · arXiv

Algorithms for Testing Monomials in Multivariate Polynomials

This paper is our second step towards developing a theory of testing monomials in multivariate polynomials. The central question is to ask whether a polynomial represented by an arithmetic circuit has some types of monomials in its sum-product expansion. The complexity aspects of this problem and its variants have been investigated in our first paper by Chen and Fu (2010), laying a foundation for further study. In this paper, we present two pairs of algorithms. First, we prove that there is a randomized $O^*(p^k)$ time algorithm for testing $p$-monomials in an $n$-variate polynomial of degree $k$ represented by an arithmetic circuit, while a deterministic $O^*(6.4^k + p^k)$ time algorithm is devised when the circuit is a formula; here $p$ is a given prime number. Second, we present a deterministic $O^*(2^k)$ time algorithm for testing multilinear monomials in $Π_mΣ_2Π_t\times Π_kΠ_3$ polynomials, while a randomized $O^*(1.5^k)$ algorithm is given for these polynomials. The first algorithm extends the recent work by Koutis (2008) and Williams (2009) on testing multilinear monomials. Group algebra is exploited in the algorithm designs, in cooperation with the randomized polynomial identity testing over a finite field by Agrawal and Biswas (2003), the deterministic noncommutative polynomial identity testing by Raz and Shpilka (2005) and the perfect hashing functions by Chen et al. (2007). Finally, we prove that testing some special types of multilinear monomial is W[1]-hard, giving evidence that testing for specific monomials is not fixed-parameter tractable.

preprint · 2010 · arXiv

Multivariate Polynomial Integration and Derivative Are Polynomial Time Inapproximable unless P=NP

We investigate the complexity of integration and derivative for multivariate polynomials in the standard computation model. The integration is in the unit cube $[0,1]^d$ for a multivariate polynomial of the form $f(x_1,\cdots, x_d)=p_1(x_1,\cdots, x_d)p_2(x_1,\cdots, x_d)\cdots p_k(x_1,\cdots, x_d)$, where each $p_i(x_1,\cdots, x_d)=\sum_{j=1}^d q_j(x_j)$ with all single variable polynomials $q_j(x_j)$ of degree at most two and constant coefficients. We show that there is no polynomial time approximation within any factor for the integration $\int_{[0,1]^d}f(x_1,\cdots,x_d)\,dx_1\cdots dx_d$ unless $P=NP$. For the complexity of multivariate derivative, we consider the functions of the form $f(x_1,\cdots, x_d)=p_1(x_1,\cdots, x_d)p_2(x_1,\cdots, x_d)\cdots p_k(x_1,\cdots, x_d),$ where each $p_i(x_1,\cdots, x_d)$ is of degree at most $2$ with $0,1$ coefficients. We also show that unless $P=NP$, there is no polynomial time approximation within any factor for its derivative ${\partial^{d} f(x_1,\cdots, x_d)\over \partial x_1\cdots \partial x_d}$ at the origin point $(x_1,\cdots, x_d)=(0,\cdots,0)$. Our results show that the derivative may not be easier than the integration in high dimension. We also give some tractable cases of high dimension integration and derivative.

preprint · 2010 · arXiv

NE is not NP Turing Reducible to Nonexpoentially Dense NP Sets

A long-standing open problem in computational complexity theory is to separate NE from BPP, which is a subclass of $NP_T(NP\cap P/poly)$. In this paper, we show that $NE\not\subseteq NP_T(NP \cap \text{Nonexponentially-Dense-Class})$, where Nonexponentially-Dense-Class is the class of languages $A$ without exponential density (for each constant $c>0$, $|A^{\le n}|\le 2^{n^c}$ for infinitely many integers $n$). Our result implies $NE\not\subseteq NP_T(Pad(NP, g(n)))$ for every time constructible super-polynomial function $g(n)$ such as $g(n)=n^{\lceil\log\lceil\log n\rceil\rceil}$, where $Pad(NP, g(n))$ is the class of all languages $L_B=\{s10^{g(|s|)-|s|-1}:s\in B\}$ for $B\in NP$. We also show $NE\not\subseteq NP_T(P_{tt}(NP)\cap Tally)$.

preprint · 2010 · arXiv

The Complexity of Testing Monomials in Multivariate Polynomials

This paper initiates a theory of testing monomials in multivariate polynomials. The central question is whether a polynomial represented by a certain economically compact structure has a multilinear monomial in its sum-product expansion. The complexity aspects of this problem and its variants are investigated with two objectives. One is to understand whether, and to what extent, this problem relates to critical problems in complexity. The other is to exploit possibilities of applying algebraic properties of polynomials to the study of those problems. A series of results about $ΠΣΠ$ and $ΠΣ$ polynomials are obtained in this paper, laying a basis for further study along this line.
