Source author record

Xu Zhang

Xu Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision; Artificial Intelligence; Distributed, Parallel, and Cluster Computing; eess.SY; Emerging Technologies; Hardware Architecture; Information Theory; Machine Learning; math.IT; Networking and Internet Architecture; Systems and Control

Catalog footprint

What is connected

7 works

11 topics

4 close collaborators

Preprint, 2026, arXiv

Convergence Analysis of Weighted Median Opinion Dynamics with Higher-Order Effects

The weighted median mechanism provides a robust alternative to weighted averaging in opinion dynamics. Existing models, however, are predominantly formulated on pairwise interaction graphs, which limits their ability to represent higher-order environmental effects. In this work, a generalized weighted median opinion dynamics model is proposed by incorporating high-order interactions through a simplicial complex representation. The resulting dynamics are formulated as a nonlinear discrete-time system with synchronous opinion updates, in which intrinsic agent interactions and external environmental influences are jointly modeled. Sufficient conditions for asymptotic consensus are established for heterogeneous systems composed of opinionated and unopinionated agents. For homogeneous opinionated systems, convergence and convergence rates are rigorously analyzed using the Banach fixed-point theorem. Theoretical results demonstrate the stability of the proposed dynamics under mild conditions, and numerical simulations are provided to corroborate the analysis. This work extends median-based opinion dynamics to high-order interaction settings and provides a system-level framework for stability and consensus analysis.
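As a rough illustration of the basic mechanism this abstract builds on, the following minimal sketch performs one synchronous weighted-median update on an ordinary pairwise influence matrix. It does not implement the paper's simplicial-complex model or its environmental terms. The names `weighted_median`, `synchronous_step`, and the matrix `W` are hypothetical, and the lower weighted median is an assumed tie-breaking convention.

```python
def weighted_median(values, weights):
    """Lower weighted median: the smallest value whose cumulative
    weight (in sorted order) reaches half of the total weight."""
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= total / 2:
            return value
    return pairs[-1][0]  # guards against floating-point shortfall


def synchronous_step(opinions, W):
    """All agents update at once: agent i adopts the weighted median
    of the current opinions, weighted by row i of W (an assumption)."""
    return [weighted_median(opinions, row) for row in W]


if __name__ == "__main__":
    opinions = [0.0, 0.5, 1.0]
    W = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # equal influence, illustrative only
    print(synchronous_step(opinions, W))
```

Unlike a weighted average, the median ignores how far away an extreme opinion sits, which is the robustness property the abstract refers to.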

Preprint, 2026, arXiv

InterLight: Leveraging Intrinsic Illumination Priors for Low-Light Image Enhancement

Low-Light Image Enhancement (LLIE) has long been a challenging problem in low-level vision, as insufficient illumination often leads to low contrast, detail loss, and noise. Recent studies show that deep learning-based Retinex theory can effectively decouple illumination and reflectance. However, existing methods frequently suffer from over-enhancement or color distortion, and often assume uniform noise or ideal lighting. To address these limitations, we propose InterLight, a novel framework that systematically excavates and operationalizes intrinsic illumination priors for LLIE. Our core insight is that robust enhancement requires not just estimating illumination, but constructing an illumination-aware pipeline. We first inject sensor-level illumination-response priors via physics-guided augmentation, then represent the degradation through adaptive prompts conditioned on the scene's latent illumination state. This explicit representation directly guides a luminance-gated intrinsic memory mechanism to selectively compensate for information loss, prioritizing reconstruction in dark regions while preserving fidelity in bright ones. Finally, the entire process is regularized by a self-supervised consistency objective that distills illumination-invariant features. By deeply exploiting intrinsic illumination priors, our method achieves clearer textures and more visually coherent enhancement results. Extensive experiments across multiple benchmarks demonstrate the effectiveness of our approach. Code is available at: https://github.com/House-yuyu/InterLight.

Preprint, 2026, arXiv

Logic-Guided Multistage Inference for Explainable Multidefendant Judgment Prediction

Crime disrupts societal stability, making law essential for balance. In multidefendant cases, assigning responsibility is complex and challenges fairness, requiring precise role differentiation. However, judicial phrasing often obscures the roles of the defendants, hindering effective AI-driven analyses. To address this issue, we incorporate sentencing logic into a pretrained Transformer encoder framework to enhance intelligent assistance in multidefendant cases while ensuring legal interpretability. Within this framework, an oriented masking mechanism clarifies roles, and a comparative data construction strategy improves the model's sensitivity to culpability distinctions between principals and accomplices. Predicted guilt labels are further incorporated into a regression model through broadcasting, consolidating crime descriptions and court views. Our proposed masked multistage inference (MMSI) framework, evaluated on the custom IMLJP dataset for intentional injury cases, achieves significant accuracy improvements, outperforming baselines in role-based culpability differentiation. This work offers a robust solution for enhancing intelligent judicial systems, with code publicly available.

Preprint, 2026, arXiv

Low-Complexity Monitoring and Compensation of Transceiver IQ Imbalance by Multi-dimensional Architecture for Dual-Polarization 16 Quadrature Amplitude Modulation

In this paper, a low-complexity multi-dimensional architecture for IQ imbalance compensation is proposed, which reduces the effects of in-phase (I) and quadrature (Q) imbalance. The architecture uses a transceiver IQ skew estimation structure to compensate for IQ skew, and then uses a low-complexity MIMO equalizer to compensate for IQ amplitude/phase imbalance. In the transceiver IQ skew estimation structure, the receiver (RX) IQ skew is estimated by Gardner's phase detector, and the transmitter (TX) IQ skew is estimated by finding the value that yields the lowest equalizer error. The low-complexity MIMO equalizer consists of a complex-valued MIMO (CV-MIMO) and a two-layer multimodulus algorithm real-valued MIMO (TMMA-RV-MIMO), which employ a butterfly and a non-butterfly structure, respectively. The CV-MIMO performs polarization demultiplexing, and the TMMA-RV-MIMO equalizes each of the two polarizations. In addition, the TMMA-RV-MIMO can recover the carrier phase. A 100 km transmission simulation and experiment with 36 Gbaud dual-polarization 16 quadrature amplitude modulation (DP-16QAM) signals showed that, with the TX/RX IQ skew estimation, the estimation error is less than 0.9/0.25 ps. The low-complexity MIMO equalizer can tolerate a TX IQ amplitude imbalance of 0.1 and a phase imbalance of 5 degrees at a 0.3 dB Q-factor penalty. In total, the number of real multiplications is reduced by 55% compared with conventional approaches.

Preprint, 2026, arXiv

ResTok: Learning Hierarchical Residuals in 1D Visual Tokenizers for Autoregressive Image Generation

Existing 1D visual tokenizers for autoregressive (AR) generation largely follow the design principles of language modeling, as they are built directly upon transformers whose priors originate in language, yielding single-hierarchy latent tokens and treating visual data as flat sequential token streams. However, this language-like formulation overlooks key properties of vision, particularly the hierarchical and residual network designs that have long been essential for convergence and efficiency in visual models. To bring "vision" back to vision, we propose the Residual Tokenizer (ResTok), a 1D visual tokenizer that builds hierarchical residuals for both image tokens and latent tokens. The hierarchical representations obtained through progressive merging enable cross-level feature fusion at each layer, substantially enhancing representational capacity. Meanwhile, the semantic residuals between hierarchies prevent information overlap, yielding more concentrated latent distributions that are easier for AR modeling. Cross-level bindings consequently emerge without any explicit constraints. To accelerate the generation process, we further introduce a hierarchical AR generator that substantially reduces sampling steps by predicting an entire level of latent tokens at once rather than generating them strictly token-by-token. Extensive experiments demonstrate that restoring hierarchical residual priors in visual tokenization significantly improves AR image generation, achieving a gFID of 2.34 on ImageNet-256 with only 9 sampling steps. Code is available at https://github.com/Kwai-Kolors/ResTok.

Preprint, 2026, arXiv

Sortblock: Similarity-Aware Feature Reuse for Diffusion Model

Diffusion Transformers (DiTs) have demonstrated remarkable generative capabilities, particularly benefiting from Transformer architectures that enhance visual and artistic fidelity. However, their inherently sequential denoising process results in high inference latency, limiting their deployment in real-time scenarios. Existing training-free acceleration approaches typically reuse intermediate features at fixed timesteps or layers, overlooking the evolving semantic focus across denoising stages and Transformer blocks. To address this, we propose Sortblock, a training-free inference acceleration framework that dynamically caches block-wise features based on their similarity across adjacent timesteps. By ranking the evolution of residuals, Sortblock adaptively determines a recomputation ratio, selectively skipping redundant computations while preserving generation quality. Furthermore, we incorporate a lightweight linear prediction mechanism to reduce accumulated errors in skipped blocks. Extensive experiments across various tasks and DiT architectures demonstrate that Sortblock achieves over 2× inference speedup with minimal degradation in output quality, offering an effective and generalizable solution for accelerating diffusion-based generative models.
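The general idea of ranking blocks by how much their residuals change between adjacent timesteps can be sketched as below. This is only an illustration under stated assumptions, not the Sortblock implementation: cosine similarity is an assumed metric, the recomputation ratio is passed in by hand rather than determined adaptively, the linear prediction step is omitted, and `cosine_similarity` and `blocks_to_recompute` are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def blocks_to_recompute(prev_residuals, curr_residuals, ratio):
    """Rank blocks by residual similarity across two adjacent timesteps and
    return the indices of the least similar fraction `ratio` of blocks.
    The remaining blocks would reuse their cached features."""
    sims = [cosine_similarity(p, c)
            for p, c in zip(prev_residuals, curr_residuals)]
    count = math.ceil(ratio * len(sims))
    ranked = sorted(range(len(sims)), key=lambda i: sims[i])  # least similar first
    return sorted(ranked[:count])


if __name__ == "__main__":
    # Toy per-block residual vectors at two adjacent timesteps (made-up values).
    prev = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    curr = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
    print(blocks_to_recompute(prev, curr, ratio=0.5))
```

In this toy input, blocks 0 and 3 keep the same residual direction and would be served from the cache, while blocks 1 and 2 changed the most and are selected for recomputation.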

Preprint, 2024, arXiv

DFabric: Scaling Out Data Parallel Applications with CXL-Ethernet Hybrid Interconnects

Emerging interconnects, such as CXL and NVLink, have been integrated into the intra-host topology to scale to more accelerators, such as GPUs, and facilitate efficient communication between them. To keep pace with the accelerators' growing computing throughput, the interconnect has seen substantial enhancement in link bandwidth, e.g., 256 GBps for CXL 3.0 links, which surpasses Ethernet and InfiniBand network links by an order of magnitude or more. Consequently, when data-intensive jobs, such as LLM training, scale across multiple hosts beyond the reach limit of the interconnect, the performance is significantly hindered by the limited bandwidth of the network infrastructure. We address the problem by proposing DFabric, a two-tier interconnect architecture. First, DFabric disaggregates a rack's computing units with an interconnect fabric, i.e., CXL fabric, which scales at rack level, so that they can enjoy efficient intra-rack interconnection. Second, DFabric disaggregates NICs from hosts and consolidates them to form a NIC pool with CXL fabric. By providing sufficient aggregated capacity comparable to interconnect bandwidth, the NIC pool bridges efficient communication across racks or beyond the reach limit of the interconnect fabric. However, local memory access becomes the bottleneck when enabling each host to utilize the NIC pool efficiently. To this end, DFabric builds a memory pool with sufficient bandwidth by disaggregating host local memory and adding more memory devices. We have implemented a prototype of DFabric that can run applications transparently. We validated its performance gain by running various microbenchmarks and compute-intensive applications such as DNN and graph.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint