Source author record

Kai Ding

Kai Ding appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, Computation and Language, cond-mat.mtrl-sci, physics.app-ph, physics.med-ph, physics.optics

Catalog footprint

What is connected

5 works

6 topics

4 close collaborators

Preprint · 2022 · arXiv

Don't Forget Me: Accurate Background Recovery for Text Removal via Modeling Local-Global Context

Text removal has attracted increasing attention due to its various applications in privacy protection, document restoration, and text editing. It has shown significant progress with deep neural networks. However, most existing methods often generate inconsistent results for complex backgrounds. To address this issue, we propose a Contextual-guided Text Removal Network, termed CTRNet. CTRNet explores both low-level structure and high-level discriminative context features as prior knowledge to guide the process of background restoration. We further propose a Local-global Content Modeling (LGCM) block with CNNs and a Transformer-Encoder to capture local features and establish long-term relationships among pixels globally. Finally, we incorporate LGCM with context guidance for feature modeling and decoding. Experiments on the benchmark datasets SCUT-EnsText and SCUT-Syn show that CTRNet significantly outperforms the existing state-of-the-art methods. Furthermore, a qualitative experiment on examination papers also demonstrates the generalization ability of our method. The code and supplementary materials are available at https://github.com/lcy0604/CTRNet.

Preprint · 2022 · arXiv

LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

Structured document understanding has attracted considerable attention and made significant progress recently, owing to its crucial role in intelligent document processing. However, most existing related models can only deal with the document data of specific language(s) (typically English) included in the pre-training collection, which is extremely limited. To address this issue, we propose a simple yet effective Language-independent Layout Transformer (LiLT) for structured document understanding. LiLT can be pre-trained on the structured documents of a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models. Experimental results on eight languages have shown that LiLT can achieve competitive or even superior performance on diverse widely-used downstream benchmarks, which enables language-independent benefit from the pre-training of document layout structure. Code and model are publicly available at https://github.com/jpWang/LiLT.

Preprint · 2022 · arXiv

SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

End-to-end scene text spotting has attracted great attention in recent years owing to the success of exploiting the intrinsic synergy between scene text detection and recognition. However, recent state-of-the-art methods usually incorporate detection and recognition simply by sharing the backbone, which does not directly take advantage of the feature interaction between the two tasks. In this paper, we propose a new end-to-end scene text spotting framework termed SwinTextSpotter. Using a transformer encoder with a dynamic head as the detector, we unify the two tasks with a novel Recognition Conversion mechanism to explicitly guide text localization through recognition loss. The straightforward design results in a concise framework that requires neither an additional rectification module nor character-level annotation for arbitrarily-shaped text. Qualitative and quantitative experiments on the multi-oriented datasets RoIC13 and ICDAR 2015, the arbitrarily-shaped datasets Total-Text and CTW1500, and the multi-lingual datasets ReCTS (Chinese) and VinText (Vietnamese) demonstrate that SwinTextSpotter significantly outperforms existing methods. Code is available at https://github.com/mxin262/SwinTextSpotter.

Preprint · 2019 · arXiv

Plasmonic Titanium Nitride via Atomic Layer Deposition: A Low-Temperature Route

To integrate plasmonic devices into industry, it is essential to develop scalable and CMOS compatible plasmonic materials. In this work, we report high plasmonic quality titanium nitride (TiN) on c-plane sapphire by plasma enhanced atomic layer deposition (PE-ALD). TiN with low losses and high metallicity was achieved at temperatures below 500°C, by exploring the effects of chemisorption time, substrate temperature and plasma exposure time on material properties. Reduction in chemisorption time mitigates premature precursor decomposition at $T_S > 375$°C, and a trade-off between reduced impurity concentration and structural degradation caused by plasma bombardment is achieved for 25 s plasma exposure. 85 nm thick TiN films grown at a substrate temperature of 450°C, compatible with CMOS processes, with 0.5 s chemisorption time and 25 s plasma exposure exhibited a high plasmonic figure of merit ($|\varepsilon'/\varepsilon''|$) of 2.8 and resistivity of 31 μΩ·cm. These TiN thin films fabricated with subwavelength apertures were shown to exhibit extraordinary transmission.
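As a rough illustration of the two numbers quoted in this abstract, the sketch below computes a plasmonic figure of merit as the magnitude of the ratio of the real to the imaginary part of a complex permittivity, and converts the reported resistivity and thickness into a sheet resistance. The permittivity value is hypothetical, chosen only so that the ratio equals the reported 2.8. The sheet resistance is not stated in the abstract; it is derived here under the assumption of a uniform film. The function names are illustrative.

```python
def plasmonic_figure_of_merit(eps: complex) -> float:
    """Return |eps' / eps''| for a complex permittivity eps = eps' + i*eps''."""
    if eps.imag == 0:
        raise ValueError("imaginary part of the permittivity must be non-zero")
    return abs(eps.real / eps.imag)


def sheet_resistance_ohm_per_sq(resistivity_uohm_cm: float, thickness_nm: float) -> float:
    """Sheet resistance R_s = rho / t, assuming a uniform film (an assumption)."""
    rho_ohm_m = resistivity_uohm_cm * 1e-8  # 1 micro-ohm-cm = 1e-8 ohm-m
    thickness_m = thickness_nm * 1e-9       # 1 nm = 1e-9 m
    return rho_ohm_m / thickness_m


# Hypothetical permittivity, picked so the ratio matches the reported value of 2.8.
example_eps = complex(-2.8, 1.0)
fom = plasmonic_figure_of_merit(example_eps)

# Resistivity (31 micro-ohm-cm) and thickness (85 nm) are taken from the abstract.
r_sheet = sheet_resistance_ohm_per_sq(31.0, 85.0)

print(f"figure of merit: {fom:.2f}")
print(f"sheet resistance: {r_sheet:.2f} ohm/sq")
```

Under the uniform-film assumption, the reported resistivity and thickness correspond to a sheet resistance of roughly 3.6 Ω per square.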

Preprint · 2012 · arXiv

Comparison of Image Registration Based Measures of Regional Lung Ventilation from Dynamic Spiral CT with Xe-CT

Purpose: Regional lung volume change as a function of lung inflation serves as an index of parenchymal and airway status as well as an index of regional ventilation and can be used to detect pathologic changes over time. In this article, we propose a new regional measure of lung mechanics: the specific air volume change by corrected Jacobian. Methods: 4DCT and Xe-CT data sets from four adult sheep are used in this study. Nonlinear, 3D image registration is applied to register an image acquired near end inspiration to an image acquired near end expiration. Approximately 200 annotated anatomical points are used as landmarks to evaluate registration accuracy. Three different registration-based measures of regional lung mechanics are derived and compared: the specific air volume change calculated from the Jacobian (SAJ); the specific air volume change calculated by the corrected Jacobian (SACJ); and the specific air volume change by intensity change (SAI). Results: After registration, the mean registration error is on the order of 1 mm. For cubic ROIs of size 20 mm $\times$ 20 mm $\times$ 20 mm, the SAJ and SACJ measures show significantly higher correlation (linear regression, average $r^2=0.75$ and $r^2=0.82$) with the Xe-CT based measure of specific ventilation (sV) than the SAI measure. For ROIs in slabs along the ventral-dorsal vertical direction with size of 150 mm $\times$ 8 mm $\times$ 40 mm, the SAJ, SACJ, and SAI all show high correlation (linear regression, average $r^2=0.88$, $r^2=0.92$ and $r^2=0.87$) with the Xe-CT based sV without significant differences when comparing between the three methods. Conclusion: Given a deformation field by an image registration algorithm, significant differences between the SAJ, SACJ, and SAI measures were found at a regional level compared to the Xe-CT sV in four sheep that were studied.
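The abstract does not give the formulas for SAJ, SACJ or SAI, so the sketch below covers only the building block common to Jacobian-based measures: the determinant of the Jacobian of a spatial transformation, which gives the local volume ratio, and the relative local volume change J - 1. The air-content correction behind SACJ and the intensity-based SAI are not reproduced. The function names, the finite-difference step and the toy transformation are all assumptions made for illustration; a real pipeline would evaluate the deformation field produced by the registration.

```python
from typing import Callable, Tuple

Point = Tuple[float, float, float]


def jacobian_determinant(transform: Callable[[Point], Point],
                         p: Point, h: float = 1e-3) -> float:
    """Determinant of the 3x3 Jacobian of `transform` at p, by central differences."""
    m = []
    for axis in range(3):
        fwd, bwd = list(p), list(p)
        fwd[axis] += h
        bwd[axis] -= h
        f, b = transform(tuple(fwd)), transform(tuple(bwd))
        # m[axis][i] approximates d(transform_i) / d(x_axis).
        m.append([(f[i] - b[i]) / (2.0 * h) for i in range(3)])
    # m is the transpose of the Jacobian; the determinant is unchanged by transposition.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def relative_volume_change(transform: Callable[[Point], Point], p: Point) -> float:
    """Relative local volume change J - 1 (J > 1 expansion, J < 1 contraction)."""
    return jacobian_determinant(transform, p) - 1.0


def uniform_expansion(p: Point) -> Point:
    """Toy transformation: 10% stretch along every axis (hypothetical, not from the paper)."""
    return (1.1 * p[0], 1.1 * p[1], 1.1 * p[2])


print(round(relative_volume_change(uniform_expansion, (10.0, 20.0, 30.0)), 3))
```

For the toy transformation the Jacobian determinant is 1.1 cubed, so the local volume grows by about 33%. Averaging such values over an ROI would give a regional measure of the kind the abstract compares against Xe-CT.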
