Researcher profile

Jun Tang

Jun Tang contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
5 topics
4 close collaborators

Published work

5 published items

Preprint, 2026, arXiv

CC-OCR V2: Benchmarking Large Multimodal Models for Literacy in Real-world Document Processing

Large Multimodal Models (LMMs) have recently shown strong performance on Optical Character Recognition (OCR) tasks, demonstrating their promising capability in document literacy. However, their effectiveness in real-world applications remains underexplored, as existing benchmarks adopt task scopes misaligned with practical applications and assume homogeneous acquisition conditions. To address this gap, we introduce CC-OCR V2, a comprehensive and challenging OCR benchmark tailored to real-world document processing. CC-OCR V2 focuses on practical enterprise document processing tasks and incorporates hard and corner cases that are critical yet underrepresented in prior benchmarks, covering 5 major OCR-centric tracks: text recognition, document parsing, document grounding, key information extraction, and document question answering, comprising 7,093 high-difficulty samples. Extensive experiments on 14 advanced LMMs reveal that current models fall short of real-world application requirements. Even state-of-the-art LMMs exhibit substantial performance degradation across diverse tasks and scenarios. These findings reveal a significant gap between performance on current benchmarks and effectiveness in real-world applications. We release the full dataset and evaluation toolkit at https://github.com/eioss/CC-OCR-V2.

Preprint, 2022, arXiv

Vision-Language Pre-Training for Boosting Scene Text Detectors

Recently, vision-language joint representation learning has proven to be highly effective in various scenarios. In this paper, we specifically adapt vision-language joint learning for scene text detection, a task that intrinsically involves cross-modal interaction between the two modalities: vision and language, since text is the written form of language. Concretely, we propose to learn contextualized, joint representations through vision-language pre-training, for the sake of enhancing the performance of scene text detectors. Towards this end, we devise a pre-training architecture with an image encoder, a text encoder and a cross-modal encoder, as well as three pretext tasks: image-text contrastive learning (ITC), masked language modeling (MLM) and word-in-image prediction (WIP). The pre-trained model is able to produce more informative representations with richer semantics, which could readily benefit existing scene text detectors (such as EAST and PSENet) in the down-stream text detection task. Extensive experiments on standard benchmarks demonstrate that the proposed paradigm can significantly improve the performance of various representative text detectors, outperforming previous pre-training approaches. The code and pre-trained models will be publicly released.

Preprint, 2020, arXiv

Deep Time-Stream Framework for Click-Through Rate Prediction by Tracking Interest Evolution

Click-through rate (CTR) prediction is an essential task in industrial applications such as video recommendation. Recently, deep learning models have been proposed to learn the representation of users' overall interests, while ignoring the fact that interests may dynamically change over time. We argue that it is necessary to consider the continuous-time information in CTR models to track user interest trends from rich historical behaviors. In this paper, we propose a novel Deep Time-Stream framework (DTS) which introduces the time information via an ordinary differential equation (ODE). DTS continuously models the evolution of interests using a neural network, and thus is able to tackle the challenge of dynamically representing users' interests based on their historical behaviors. In addition, our framework can be seamlessly applied to any existing deep CTR models by leveraging the additional Time-Stream Module, while no changes are made to the original CTR models. Experiments on a public dataset as well as a real industry dataset with billions of samples demonstrate the effectiveness of the proposed approaches, which achieve superior performance compared with existing methods.

Preprint, 2013, arXiv

Superconductivity induced by U-doping in the SmFeAsO system

Through partial substitution of Sm by U in SmFeAsO, a different member of the family of iron-based superconductors was successfully synthesized. X-ray diffraction measurements show that the lattice constants along the a and c axes are both squeezed through U doping, indicating a successful substitution of U at the Sm site. The parent compound shows a strong resistivity anomaly near 150 K, associated with spin-density-wave instability. U doping suppresses this instability and leads to a transition to the superconducting state at temperatures up to 49 K. Magnetic measurements confirm the bulk superconductivity in this system. For the sample with a doping level of x = 0.2, the external magnetic field suppresses the onset temperature very slowly, indicating a rather high upper critical field. In addition, the Hall effect measurements show that U clearly dopes electrons into the material.

Preprint, 2011, arXiv

Evidence for line nodes in the energy gap of the overdoped Ba(Fe$_{1-x}$Co$_{x}$)$_{2}$As$_{2}$ from low-temperature specific heat measurements

Low-temperature specific heat (SH) is measured on Ba(Fe$_{1-x}$Co$_{x}$)$_2$As$_2$ single crystals in a wide doping region under different magnetic fields. For the overdoped sample, we find clear evidence for the presence of a $T^2$ term in the data, which is absent for both the underdoped and optimally doped samples, suggesting the presence of line nodes in the energy gap of the overdoped samples. Moreover, the field-induced electronic specific heat coefficient $\Delta\gamma(H)$ increases more quickly with the field for the overdoped sample than for the underdoped and optimally doped ones, lending further support to our argument. Our results suggest that the superconducting gap(s) in the present system may have different structures that depend strongly on the doping region.