Source author record

Chunyi Peng

Chunyi Peng appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computation and Language Cryptography and Security cs.CY Distributed, Parallel, and Cluster Computing eess.SY Networking and Internet Architecture Systems and Control

Catalog footprint

What is connected

4works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

CC-OCR V2: Benchmarking Large Multimodal Models for Literacy in Real-world Document Processing

Large Multimodal Models (LMMs) have recently shown strong performance on Optical Character Recognition (OCR) tasks, demonstrating their promising capability in document literacy. However, their effectiveness in real-world applications remains underexplored, as existing benchmarks adopt task scopes misaligned with practical applications and assume homogeneous acquisition conditions. To address this gap, we introduce CC-OCR V2, a comprehensive and challenging OCR benchmark tailored to real-world document processing. CC-OCR V2 focuses on practical enterprise document processing tasks and incorporates hard and corner cases that are critical yet underrepresented in prior benchmarks, covering 5 major OCR-centric tracks: text recognition, document parsing, document grounding, key information extraction, and document question answering, comprising 7,093 high-difficulty samples. Extensive experiments on 14 advanced LMMs reveal that current models fall short of real-world application requirements. Even state-of-the-art LMMs exhibit substantial performance degradation across diverse tasks and scenarios. These findings reveal a significant gap between performance on current benchmarks and effectiveness in real-world applications. We release the full dataset and evaluation toolkit at https://github.com/eioss/CC-OCR-V2.

preprint2019arXiv

Resilient Cyberphysical Systems and their Application Drivers: A Technology Roadmap

Cyberphysical systems (CPS) are ubiquitous in our personal and professional lives, and they promise to dramatically improve micro-communities (e.g., urban farms, hospitals), macro-communities (e.g., cities and metropolises), urban structures (e.g., smart homes and cars), and living structures (e.g., human bodies, synthetic genomes). The question that we address in this article pertains to designing these CPS systems to be resilient-from-the-ground-up, and through progressive learning, resilient-by-reaction. An optimally designed system is resilient to both unique attacks and recurrent attacks, the latter with a lower overhead. Overall, the notion of resilience can be thought of in the light of three main sources of lack of resilience, as follows: exogenous factors, such as natural variations and attack scenarios; mismatch between engineered designs and exogenous factors ranging from DDoS (distributed denial-of-service) attacks or other cybersecurity nightmares, so called "black swan" events, disabling critical services of the municipal electrical grids and other connected infrastructures, data breaches, and network failures; and the fragility of engineered designs themselves encompassing bugs, human-computer interactions (HCI), and the overall complexity of real-world systems. In the paper, our focus is on design and deployment innovations that are broadly applicable across a range of CPS application areas.

preprint2015arXiv

iCellular: Define Your Own Cellular Network Access on Commodity Smartphones

Leveraging multi-carrier access offers a promising approach to boosting access quality in mobile networks. However, our experiments show that the potential benefits are hard to fulfill due to fundamental limitations in the network-controlled design. To overcome these limitations, we propose iCellular, which allows users to define and intelligently select their own cellular network access from multiple carriers. iCellular reuses the existing device-side mechanisms and the standard cellular network procedure, but leverages the end device's intelligence to be proactive and adaptive in multi-carrier selection. It performs adaptive monitoring to ensure responsive selection and minimal service disruption, and enhances carrier selection with online learning and runtime decision fault prevention. It is deployable on commodity phones without any infrastructure/hardware change. We implement iCellular on commodity Nexus 6 phones and leverage Google Project-Fi's efforts to test multi-carrier access among two top US carriers: T-Mobile and Sprint. Our experiments confirm that iCellular helps users with up to 3.74x throughput improvement (7x suspension and 1.9x latency reduction) over the state-of-art selection. Moreover, iCellular locates the best-quality carrier in most cases, with negligible overhead on CPU, memory and energy consumption.

preprint2015arXiv

New Threats to SMS-Assisted Mobile Internet Services from 4G LTE: Lessons Learnt from Distributed Mobile-Initiated Attacks towards Facebook and Other Services

Mobile Internet is becoming the norm. With more personalized mobile devices in hand, many services choose to offer alternative, usually more convenient, approaches to authenticating and delivering the content between mobile users and service providers. One main option is to use SMS (i.e., short messaging service). Such carrier-grade text service has been widely used to assist versatile mobile services, including social networking, banking, to name a few. Though the text service can be spoofed via certain Internet text service providers which cooperated with carriers, such attacks haven well studied and defended by industry due to the efforts of research community. However, as cellular network technology advances to the latest IP-based 4G LTE, we find that these mobile services are somehow exposed to new threats raised by this change, particularly on 4G LTE Text service (via brand-new distributed Mobile-Initiated Spoofed SMS attack which is not available in legacy 2G/3G systems). The reason is that messaging service over LTE shifts from the circuit-switched (CS) design to the packet-switched (PS) paradigm as 4G LTE supports PS only. Due to this change, 4G LTE Text Service becomes open to access. However, its shields to messaging integrity and user authentication are not in place. As a consequence, such weaknesses can be exploited to launch attacks (e.g., hijack Facebook accounts) against a targeted individual, a large scale of mobile users and even service providers, from mobile devices. Current defenses for Internet-Initiated Spoofed SMS attacks cannot defend the unprecedented attack. Our study shows that 53 of 64 mobile services over 27 industries are vulnerable to at least one threat. We validate these proof-of-concept attacks in one major US carrier which supports more than 100 million users. We finally propose quick fixes and discuss security insights and lessons we have learnt.

Chunyi Peng

What is connected

Connect this record

See the researcher in context

Building this map preview

4 published item(s)

CC-OCR V2: Benchmarking Large Multimodal Models for Literacy in Real-world Document Processing

Resilient Cyberphysical Systems and their Application Drivers: A Technology Roadmap

iCellular: Define Your Own Cellular Network Access on Commodity Smartphones

New Threats to SMS-Assisted Mobile Internet Services from 4G LTE: Lessons Learnt from Distributed Mobile-Initiated Attacks towards Facebook and Other Services