Source author record

Qiang Huo

Qiang Huo appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Information Theory, math.IT, Computer Vision, Computation and Language, eess.AS, Emerging Technologies, math.CA, math.DS, Sound

Catalog footprint

What is connected

12 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Mean Assouad dimension and spectrum, with applications to infinite dimensional fractals

We introduce the mean Assouad dimension of a dynamical system, motivated by the Assouad dimension in fractal geometry. Using dimension interpolation, we further define the mean Assouad spectrum. This provides a new family of bi-Lipschitz invariants of dynamical systems. We study its basic properties and calculate it for several classes of dynamical systems. As an application, we determine explicit formulae for the mean Assouad dimension and spectrum of infinite-dimensional Bedford-McMullen carpet systems, contributing to the program of studying infinite-dimensional fractals, initiated recently by Tsukamoto.

preprint · 2022 · arXiv

APRNet: Attention-based Pixel-wise Rendering Network for Photo-Realistic Text Image Generation

Style-guided text image generation aims to synthesize a text image by imitating a reference image's appearance while keeping the text content unaltered. Text image appearance includes many aspects. In this paper, we focus on transferring the style image's background and foreground color patterns to the content image to generate a photo-realistic text image. To achieve this goal, we propose 1) a content-style cross-attention based pixel sampling approach to roughly mimic the style text image's background; 2) a pixel-wise style modulation technique to transfer the varying color patterns of the style image to the content image in a spatially adaptive way; 3) a cross-attention based multi-scale style fusion approach to solve the text foreground misalignment issue between style and content images; 4) an image patch shuffling strategy to create style, content and ground-truth image tuples for training. Experimental results on Chinese handwriting text image synthesis with the SCUT-HCCDoc and CASIA-OLHWDB datasets demonstrate that the proposed method can improve the quality of synthetic text images and make them more photo-realistic.

preprint · 2022 · arXiv

TSRFormer: Table Structure Recognition with Transformers

We present a new table structure recognition (TSR) approach, called TSRFormer, to robustly recognize the structures of complex tables with geometrical distortions from various table images. Unlike previous methods, we formulate table separation line prediction as a line regression problem instead of an image segmentation problem and propose a new two-stage DETR-based separator prediction approach, dubbed Separator REgression TRansformer (SepRETR), to predict separation lines from table images directly. To make the two-stage DETR framework work efficiently and effectively for the separation line prediction task, we propose two improvements: 1) a prior-enhanced matching strategy to solve the slow convergence issue of DETR; 2) a new cross-attention module that samples features directly from a high-resolution convolutional feature map, so that high localization accuracy is achieved at low computational cost. After separation line prediction, a simple relation network based cell merging module is used to recover spanning cells. With these new techniques, our TSRFormer achieves state-of-the-art performance on several benchmark datasets, including SciTSR, PubTabNet and WTW. Furthermore, we have validated the robustness of our approach to tables with complex structures, borderless cells, large blank spaces, empty or spanning cells, as well as distorted or even curved shapes, on a more challenging real-world in-house dataset.

preprint · 2020 · arXiv

A Study on Effects of Implicit and Explicit Language Model Information for DBLSTM-CTC Based Handwriting Recognition

Deep Bidirectional Long Short-Term Memory (DBLSTM) with a Connectionist Temporal Classification (CTC) output layer has been established as one of the state-of-the-art solutions for handwriting recognition. It is well known that a DBLSTM trained with a CTC objective function learns both local character image dependency for character modeling and long-range contextual dependency for implicit language modeling. In this paper, we study the effects of implicit and explicit language model information for DBLSTM-CTC based handwriting recognition by comparing decoding performance with and without an explicit language model. It is observed that even when one million lines of training sentences are used to train the DBLSTM, an explicit language model is still helpful. To deal with such a large-scale training problem, a GPU-based training tool has been developed for CTC training of DBLSTM using a mini-batch based epochwise Back Propagation Through Time (BPTT) algorithm.
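For readers unfamiliar with CTC output layers, the following is a minimal sketch of best-path decoding, the simplest way to read out a label sequence without any explicit language model. It is not taken from the paper: the function name `ctc_collapse`, the choice of index 0 as the blank label, and the example path are all assumptions made for illustration.

```python
def ctc_collapse(path, blank=0):
    """Collapse a frame-level CTC label path into an output label sequence.

    Standard CTC rule: merge consecutive repeated labels, then drop blanks.
    `blank=0` is an assumed convention, not something stated in the abstract.
    """
    out, prev = [], None
    for label in path:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


# Hypothetical per-frame argmax labels; a blank between the two 3s
# keeps them as separate output characters.
decoded = ctc_collapse([0, 3, 3, 0, 3, 5, 5, 0])
```

An explicit language model would instead rescore competing label sequences during decoding, which is the comparison the abstract describes.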

preprint · 2020 · arXiv

ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks

We introduce a new arbitrary-shaped text detection approach named ReLaText by formulating text detection as a visual relationship detection problem. To demonstrate the effectiveness of this new formulation, we start by using a "link" relationship to address the challenging text-line grouping problem. The key idea is to decompose text detection into two subproblems, namely detection of text primitives and prediction of link relationships between nearby text primitive pairs. Specifically, an anchor-free region proposal network based text detector is first used to detect text primitives of different scales from different feature maps of a feature pyramid network, from which a text primitive graph is constructed by linking each pair of nearby text primitives detected from the same feature map with an edge. Then, a Graph Convolutional Network (GCN) based link relationship prediction module is used to prune wrongly linked edges in the text primitive graph to generate a number of disjoint subgraphs, each representing a detected text instance. As a GCN can effectively leverage context information to improve link prediction accuracy, our GCN-based text-line grouping approach can achieve better text detection accuracy than previous text-line grouping methods, especially when dealing with text instances with large inter-character or very small inter-line spacings. Consequently, the proposed ReLaText achieves state-of-the-art performance on five public text detection benchmarks, namely RCTW-17, MSRA-TD500, Total-Text, CTW1500 and DAST1500.

preprint · 2020 · arXiv

The Investigation of Negative Capacitance Vertical Nanowire FETs Based on SPICE Model at Device-Circuit Level

In this study, a SPICE model for the negative capacitance vertical nanowire field-effect transistor (NC VNW-FET), based on the BSIM-CMG model and the Landau-Khalatnikov (LK) equation, is presented. Owing to the limitation of short gate length, there is a lack of controllable and integrative structures for high-performance NC VNW-FETs. A new kind of structure is proposed for NC VNW-FETs at the sub-3nm node. Moreover, in order to understand and improve NC VNW-FETs, the S-shaped polarization-voltage curve (S-curve) is divided into four regions and some new design rules are proposed. Using the SPICE model, device-circuit co-optimization is implemented. The co-design of gate work function (WF) and NC is investigated. A ring oscillator is simulated to analyze the circuit energy-delay, and it is shown that a significant energy reduction, up to 88%, can be achieved at iso-delay for NC VNW-FETs at low supply voltage. This study gives a credible method to analyze the performance of NC-based devices and circuits and reveals the potential of NC VNW-FETs in low-power applications.

preprint · 2016 · arXiv

Source and Physical-Layer Network Coding for Correlated Two-Way Relaying

In this paper, we study a half-duplex two-way relay channel (TWRC) with correlated sources exchanging bidirectional information. When both sources know the correlation statistics, a source compression with physical-layer network coding (SCPNC) scheme is proposed to perform distributed compression at each source node. When only the relay knows the correlation statistics, we propose a relay compression with physical-layer network coding (RCPNC) scheme to compress the bidirectional messages at the relay. Closed-form block error rate (BLER) expressions for both schemes are derived and verified through simulations. It is shown that the proposed schemes achieve considerable improvements in both error performance and throughput compared with the conventional non-compression scheme in correlated two-way relay networks (CTWRNs).

preprint · 2014 · arXiv

Compressed Relaying for Two-Way Relay Networks with Correlated Sources

In this letter, a compressed relaying scheme via Huffman and physical-layer network coding (HPNC) is proposed for two-way relay networks with correlated sources (TWRN-CS). In the HPNC scheme, both sources first transmit their correlated raw source messages to the relay simultaneously. The relay performs physical-layer network coding (PNC) on the received symbols, compresses the PNC-coded symbols using Huffman coding, and broadcasts the compressed symbols to both source nodes. Then, each source decodes the other source's messages by using its own messages as side information. The compression rate and block error rate (BLER) of the proposed scheme are analyzed. Simulation results demonstrate that the HPNC scheme can effectively improve the network throughput while achieving superior BLER performance compared with the conventional non-compressed relaying scheme in TWRN-CS.
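The relay-side pipeline described in this abstract can be illustrated with a small, idealized sketch. It assumes a noiseless channel, models PNC as a symbol-wise XOR, and builds the Huffman table from the observed XOR symbols and shares it with both sources; none of these details are stated in the abstract, and the function names and example messages are hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from observed symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: only one distinct symbol
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, c0 = heapq.heappop(heap)
        f1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]


def relay_compress(a, b):
    """Relay side (assumed noiseless): XOR the messages, then Huffman-encode."""
    xored = [x ^ y for x, y in zip(a, b)]
    code = huffman_code(xored)
    return "".join(code[s] for s in xored), code


def source_decode(bits, code, own):
    """Source side: Huffman-decode, then XOR with the node's own message."""
    inverse = {b: s for s, b in code.items()}
    xored, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:  # prefix code: the first match is a full codeword
            xored.append(inverse[buf])
            buf = ""
    return [x ^ o for x, o in zip(xored, own)]


# Hypothetical correlated messages: 2-bit symbols that mostly agree.
a = [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3]
b = [0, 1, 2, 3, 0, 1, 2, 2, 0, 1, 3, 3]
bits, code = relay_compress(a, b)
recovered_b = source_decode(bits, code, a)  # what source A recovers
recovered_a = source_decode(bits, code, b)  # what source B recovers
```

Because correlated sources make the XOR stream heavily skewed toward zero, the Huffman-coded broadcast is shorter than the raw 2 bits per symbol, which is the intuition behind the throughput gain claimed in the abstract.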

preprint · 2014 · arXiv

Selective Combining for Hybrid Cooperative Networks

In this study, we consider selective combining in hybrid cooperative networks (the SCHCN scheme) with one source node, one destination node and $N$ relay nodes. In the SCHCN scheme, each relay first adaptively chooses between the amplify-and-forward protocol and the decode-and-forward protocol on a per-frame basis by examining the error-detecting code result, and $N_c$ ($1\leq N_c \leq N$) relays are selected to forward their received signals to the destination. We first develop a signal-to-noise ratio (SNR) threshold-based frame error rate (FER) approximation model. Then, the theoretical FER expressions for the SCHCN scheme are derived by utilizing the proposed SNR threshold-based FER approximation model. The analytical FER expressions are validated through simulation results.
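As background on what an SNR threshold-based FER approximation looks like in its simplest form, here is a single-link sketch. It assumes Rayleigh block fading, so the instantaneous SNR is exponentially distributed, and declares a frame lost whenever the SNR falls below a threshold. The threshold value, the fading model and the function names are assumptions for illustration; the paper's own model and its extension to the relay scheme are not reproduced here.

```python
import math
import random


def fer_threshold_rayleigh(avg_snr, snr_threshold):
    """Threshold model: a frame fails iff instantaneous SNR < snr_threshold.

    Under the assumed Rayleigh fading, the SNR is exponential with mean
    avg_snr, so FER ~= P(SNR < threshold) = 1 - exp(-threshold / avg_snr).
    Both arguments are linear-scale SNRs, not dB.
    """
    return 1.0 - math.exp(-snr_threshold / avg_snr)


def fer_monte_carlo(avg_snr, snr_threshold, frames=200_000, seed=0):
    """Empirical check of the same model by drawing per-frame SNRs."""
    rng = random.Random(seed)
    failures = sum(
        rng.expovariate(1.0 / avg_snr) < snr_threshold for _ in range(frames)
    )
    return failures / frames


# Hypothetical operating point: average SNR 10, threshold 2 (linear scale).
analytic = fer_threshold_rayleigh(10.0, 2.0)
simulated = fer_monte_carlo(10.0, 2.0)
```

The appeal of this kind of model is that the FER reduces to an outage-style probability, which keeps closed-form analysis of multi-relay schemes tractable.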

preprint · 2014 · arXiv

Study on Downlink Spectral Efficiency in Orthogonal Frequency Division Multiple Access Systems

In previous studies on the capacity of orthogonal frequency division multiple access (OFDMA) systems, it is usually assumed that co-channel interference (CCI) from adjacent cells is a Gaussian-distributed random variable. However, very little work has shown that the Gaussian assumption does not hold true in OFDMA systems. In this paper, the statistical properties of CCI in downlink OFDMA systems are studied, and the spectral efficiency of downlink OFDMA systems is analyzed based on the derived statistical model. First, the probability density function (PDF) of CCI in downlink OFDMA cellular systems is studied, taking into account path loss, multipath fading and Gaussian-like transmit signals. Moreover, some closed-form expressions of the PDF are obtained for special cases. The derived results show that the PDFs of CCI have a heavy tail and deviate significantly from the Gaussian distribution. Then, based on the derived statistical properties of CCI, the downlink spectral efficiency is derived. Numerical and simulation results justify the derived statistical CCI model and spectral efficiency.

preprint · 2013 · arXiv

A Distributed Differential Space-Time Coding Scheme With Analog Network Coding in Two-Way Relay Networks

In this paper, we consider general two-way relay networks (TWRNs) with two source nodes and N relay nodes. A distributed differential space-time coding with analog network coding (DDSTC-ANC) scheme is proposed. A simple blind estimation and a differential signal detector are developed to recover the desired signal at each source. The pairwise error probability (PEP) and block error rate (BLER) of the DDSTC-ANC scheme are analyzed. Exact and simplified PEP expressions are derived. To improve the system performance, the optimum power allocation (OPA) between the source and relay nodes is determined based on the simplified PEP expression. The analytical results are verified through simulations.

preprint · 2011 · arXiv

Performance Analysis of Hybrid Relay Selection in Cooperative Wireless Systems

The hybrid relay selection (HRS) scheme, which adaptively chooses between the amplify-and-forward (AF) and decode-and-forward (DF) protocols, is very effective in achieving robust performance in wireless networks. This paper analyzes the frame error rate (FER) of the HRS scheme in general cooperative wireless networks with and without error control coding at the source node. We first develop an improved signal-to-noise ratio (SNR) threshold-based FER approximation model. Then, we derive an analytical average FER expression as well as an asymptotic expression at high SNR for the HRS scheme and generalize them to other relaying schemes. Simulation results are in excellent agreement with the theoretical analysis, which validates the derived FER expressions.
