Researcher profile

Jon C. Calhoun

Jon C. Calhoun contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 15 - UnverifiedVerification L1Unclaimed author
3works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

3 published item(s)

preprint2025arXiv

Understanding LLM Checkpoint/Restore I/O Strategies and Patterns

As LLMs and foundation models scale, checkpoint/restore has become a critical pattern for training and inference. With 3D parallelism (tensor, pipeline, data), checkpointing involves many processes, each managing numerous tensors of varying shapes and sizes, that must be persisted frequently to stable storage (e.g., parallel file systems). This turns checkpoint/restore into a big-data I/O problem characterized by volume, variety, and velocity. The workflow must traverse the full storage stack -- from GPU memory through host memory and local storage to external repositories -- whose tiers differ by orders of magnitude in performance, creating bottlenecks under concurrency even with asynchronous flush/prefetch. Kernel-accelerated I/O libraries such as \texttt{liburing} may mitigate these issues versus POSIX, but their effectiveness for LLM checkpointing remains underexplored. We develop microbenchmarks to quantify trade-offs when using \texttt{liburing}, evaluating how aggregation, alignment, and I/O coalescing interact under buffered and direct I/O. We find that uncoalesced small-buffer operations halve throughput relative to synthetic workloads, while file system-aware aggregation restores bandwidth and reduces metadata overhead. Compared to state-of-the-art LLM checkpointing engines, our approach achieves up to $3.9\times$ higher write throughput than DataStates-LLM and $7.6\times$ higher than TorchSnapshot. These results highlight the need for aggregation and coalescing strategies that align with modern file systems and I/O backends.

preprint2020arXiv

Dynamic Error-bounded Lossy Compression (EBLC) to Reduce the Bandwidth Requirement for Real-time Vision-based Pedestrian Safety Applications

As camera quality improves and their deployment moves to areas with limited bandwidth, communication bottlenecks can impair real-time constraints of an ITS application, such as video-based real-time pedestrian detection. Video compression reduces the bandwidth requirement to transmit the video but degrades the video quality. As the quality level of the video decreases, it results in the corresponding decreases in the accuracy of the vision-based pedestrian detection model. Furthermore, environmental conditions (e.g., rain and darkness) alter the compression ratio and can make maintaining a high pedestrian detection accuracy more difficult. The objective of this study is to develop a real-time error-bounded lossy compression (EBLC) strategy to dynamically change the video compression level depending on different environmental conditions in order to maintain a high pedestrian detection accuracy. We conduct a case study to show the efficacy of our dynamic EBLC strategy for real-time vision-based pedestrian detection under adverse environmental conditions. Our strategy selects the error tolerances dynamically for lossy compression that can maintain a high detection accuracy across a representative set of environmental conditions. Analyses reveal that our strategy increases pedestrian detection accuracy up to 14% and reduces the communication bandwidth up to 14x for adverse environmental conditions compared to the same conditions but without our dynamic EBLC strategy. Our dynamic EBLC strategy is independent of detection models and environmental conditions allowing other detection models and environmental conditions to be easily incorporated in our strategy.

preprint2020arXiv

FRaZ: A Generic High-Fidelity Fixed-Ratio Lossy Compression Framework for Scientific Floating-point Data

With ever-increasing volumes of scientific floating-point data being produced by high-performance computing applications, significantly reducing scientific floating-point data size is critical, and error-controlled lossy compressors have been developed for years. None of the existing scientific floating-point lossy data compressors, however, support effective fixed-ratio lossy compression. Yet fixed-ratio lossy compression for scientific floating-point data not only compresses to the requested ratio but also respects a user-specified error bound with higher fidelity. In this paper, we present FRaZ: a generic fixed-ratio lossy compression framework respecting user-specified error constraints. The contribution is twofold. (1) We develop an efficient iterative approach to accurately determine the appropriate error settings for different lossy compressors based on target compression ratios. (2) We perform a thorough performance and accuracy evaluation for our proposed fixed-ratio compression framework with multiple state-of-the-art error-controlled lossy compressors, using several real-world scientific floating-point datasets from different domains. Experiments show that FRaZ effectively identifies the optimum error setting in the entire error setting space of any given lossy compressor. While fixed-ratio lossy compression is slower than fixed-error compression, it provides an important new lossy compression technique for users of very large scientific floating-point datasets.