Source author record

Linjie Yang

Linjie Yang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision Information Theory math.IT eess.SP Machine Learning Artificial Intelligence Neural and Evolutionary Computing

Catalog footprint

What is connected

10 works

7 topics

4 close collaborators

preprint, 2026, arXiv

End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer

Autoregressive image modeling relies on visual tokenizers to compress images into compact latent representations. We design an end-to-end training pipeline that jointly optimizes reconstruction and generation, enabling direct supervision from generation results to the tokenizer. This contrasts with prior two-stage approaches that train tokenizers and generative models separately. We further investigate leveraging vision foundation models to improve 1D tokenizers for autoregressive modeling. Our autoregressive generative model achieves strong empirical results, including a state-of-the-art FID score of 1.48 without guidance on ImageNet 256x256 generation.

preprint, 2022, arXiv

Data-aided Active User Detection with False Alarm Correction in Grant-Free Transmission

In most existing grant-free (GF) studies, the two key tasks, namely active user detection (AUD) and payload data decoding, are handled separately. In this paper, a two-step data-aided AUD scheme is proposed, consisting of an initial AUD step and a false alarm correction step. To implement the initial AUD step, an embedded low-density-signature (LDS) based preamble pool is constructed. In addition, two message passing algorithm (MPA) based initial estimators are developed. In the false alarm correction step, a redundant factor graph is constructed based on the initial active user set, on which MPA is employed for data decoding. The remaining falsely detected inactive users are then identified by the false alarm corrector with the aid of decoded data symbols. Simulation results reveal that both the data decoding performance and the AUD performance are enhanced by more than 1.5 dB at the target accuracy of 10^-3 compared with the traditional compressed sensing (CS) based counterparts.

preprint, 2022, arXiv

Dynamic Proposals for Efficient Object Detection

Object detection is a basic computer vision task that localizes and categorizes objects in a given image. Most state-of-the-art detection methods use a fixed number of proposals as an intermediate representation of object candidates, which cannot adapt to different computational constraints during inference. In this paper, we propose a simple yet effective method that adapts to different computational resources by generating dynamic proposals for object detection. We first design a module that lets a single query-based model run inference with different numbers of proposals. Further, we extend it to a dynamic model that chooses the number of proposals according to the input image, greatly reducing computational costs. Our method achieves significant speed-up across a wide range of detection models, including two-stage and query-based models, while obtaining similar or even better accuracy.

preprint, 2022, arXiv

Grant-Free Transmission by LDPC Matrix Mapping and Integrated Cover-MPA Detector

In this paper, a novel transceiver architecture is proposed to simultaneously achieve efficient random access and reliable data transmission in massive IoT networks. At the transmitter side, each user is assigned a unique protocol sequence, which identifies the user and also indicates the user's channel access pattern. Hence, user identification is completed by detecting channel access patterns. In particular, the columns of the parity check matrix of a low-density-parity-check (LDPC) code are employed as protocol sequences. The design guideline for this LDPC parity check matrix and the associated performance analysis are provided in this paper. At the receiver side, a two-stage iterative detection architecture is designed, which consists of a group testing component and a payload data decoding component. They collaborate as follows: the group testing component maps detected protocol sequences to a Tanner graph, on which the second component executes its message passing algorithm. In turn, zero symbols detected by the message passing algorithm of the second component indicate potential false alarms made by the group testing component. Hence, the Tanner graph can evolve iteratively. The provided simulation results demonstrate that our transceiver design realizes a practical one-step grant-free transmission and has a compelling performance.
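The group testing step described above can be illustrated with a minimal sketch. It assumes a noiseless channel in which a slot is "busy" whenever at least one active user transmits in it, and it uses a generic cover-style decoder (declare a user a candidate when every slot in its access pattern is busy). The function name `cover_decode`, the toy sequences, and the user labels are hypothetical and are not taken from the paper.

```python
def cover_decode(sequences, busy_slots):
    """Return every user whose access pattern is fully covered by busy slots."""
    busy = set(busy_slots)
    return sorted(user for user, slots in sequences.items() if set(slots) <= busy)

# Hypothetical protocol sequences: each user maps to the slots it transmits in
# (think of the nonzero positions of one column of a toy parity check matrix).
sequences = {
    "u0": (0, 1),
    "u1": (1, 2),
    "u2": (2, 3),
    "u3": (0, 2),
}

truly_active = {"u0", "u1"}
# Noiseless assumption: a slot is busy if any active user transmits in it.
busy_slots = {slot for user in truly_active for slot in sequences[user]}

candidates = cover_decode(sequences, busy_slots)
false_alarms = [user for user in candidates if user not in truly_active]
print(candidates)    # ['u0', 'u1', 'u3']
print(false_alarms)  # ['u3']
```

Here the inactive user `u3` is covered by the slots of `u0` and `u1`, so the cover step alone reports it as a candidate. This is the kind of false alarm that, according to the abstract, the payload decoding stage helps to flag.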

preprint, 2022, arXiv

Improved Sparse Vector Code Based on Optimized Spreading Matrix for Short-Packet URLLC in mMTC

Recently, the sparse vector code (SVC) has emerged as a promising solution for short-packet transmission in massive machine type communication (mMTC) as well as ultra-reliable and low-latency communication (URLLC). In the SVC process, the encoding and decoding stages are jointly modeled as a standard compressed sensing (CS) problem. Hence, this paper aims to improve the decoding performance of SVC by optimizing the spreading matrix (i.e., the measurement matrix in CS). To this end, two greedy algorithms that minimize the mutual coherence of the spreading matrix in SVC are proposed. Specifically, for practical applications, the spreading matrices are further required to be bipolar, with entries constrained to +1 or -1. As a result, the optimized spreading matrices are highly efficient for storage, computation, and hardware realization. Simulation results reveal that, compared with the existing work, the block error rate (BLER) performance of SVC can be improved significantly with the optimized spreading matrices.
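The quantity being minimized above, mutual coherence, is the largest absolute normalized inner product between two distinct columns of a matrix. The sketch below only evaluates it for two small bipolar matrices and does not reproduce the paper's greedy algorithms. The matrices and the function name `mutual_coherence` are hypothetical, chosen for illustration.

```python
import math
from itertools import combinations

def mutual_coherence(matrix):
    """Largest absolute normalized inner product between two distinct columns."""
    columns = list(zip(*matrix))  # transpose: rows -> columns
    norms = [math.sqrt(sum(x * x for x in col)) for col in columns]
    worst = 0.0
    for (i, a), (j, b) in combinations(enumerate(columns), 2):
        inner = sum(x * y for x, y in zip(a, b))
        worst = max(worst, abs(inner) / (norms[i] * norms[j]))
    return worst

# Hypothetical 4x4 bipolar matrices (entries +1 or -1), for illustration only.
hadamard = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]
correlated = [
    [1,  1,  1,  1],
    [1,  1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]

coherence_hadamard = mutual_coherence(hadamard)      # orthogonal columns
coherence_correlated = mutual_coherence(correlated)  # some columns overlap
print(coherence_hadamard, coherence_correlated)      # 0.0 0.5
```

A practical spreading matrix has more columns than rows, so its coherence cannot reach zero. The square matrices here are used only to keep the arithmetic easy to check by hand.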

preprint, 2022, arXiv

Learning Versatile Neural Architectures by Propagating Network Codes

This work explores how to design a single neural network capable of adapting to multiple heterogeneous vision tasks, such as image segmentation, 3D detection, and video recognition. This goal is challenging because both neural architecture search (NAS) spaces and methods in different tasks are inconsistent. We solve this challenge from both sides. We first introduce a unified design space for multiple tasks and build a multitask NAS benchmark (NAS-Bench-MR) on many widely used datasets, including ImageNet, Cityscapes, KITTI, and HMDB51. We further propose Network Coding Propagation (NCP), which back-propagates gradients of neural predictors to directly update architecture codes along the desired gradient directions to solve various tasks. In this way, optimal architecture configurations can be found by NCP in our large search space in seconds. Unlike prior NAS work that typically focuses on a single task, NCP has several unique benefits. (1) NCP transforms architecture optimization from data-driven to architecture-driven, enabling joint search of an architecture across multiple tasks with different data distributions. (2) NCP learns from network codes rather than original data, enabling it to update the architecture efficiently across datasets. (3) In addition to our NAS-Bench-MR, NCP performs well on other NAS benchmarks, such as NAS-Bench-201. (4) Thorough studies of NCP on inter-, cross-, and intra-tasks highlight the importance of cross-task neural architecture design, i.e., multitask neural architectures and architecture transferring between different tasks. Code is available at https://github.com/dingmyu/NCP.
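The idea of updating an architecture code along a predictor's gradient can be sketched as follows. This is a toy illustration under strong assumptions: the "predictor" is a hand-written quadratic with an analytic gradient, standing in for a trained neural predictor, and the names `BEST_CODE`, `predicted_accuracy`, `predictor_gradient`, and `propagate` are hypothetical and do not come from the NCP codebase.

```python
# Hypothetical optimum of the toy predictor, e.g. (depth, width, resolution scale).
BEST_CODE = [3.0, 1.5, 0.5]

def predicted_accuracy(code):
    """Toy differentiable predictor: highest when code equals BEST_CODE."""
    return 1.0 - sum((c - b) ** 2 for c, b in zip(code, BEST_CODE))

def predictor_gradient(code):
    """Analytic gradient of predicted_accuracy with respect to the code."""
    return [-2.0 * (c - b) for c, b in zip(code, BEST_CODE)]

def propagate(code, steps=50, lr=0.1):
    """Gradient ascent on the code itself; the predictor stays fixed."""
    for _ in range(steps):
        grad = predictor_gradient(code)
        code = [c + lr * g for c, g in zip(code, grad)]
    return code

start = [1.0, 1.0, 1.0]
found = propagate(start)
print(predicted_accuracy(start) < predicted_accuracy(found))  # True
```

The point of the sketch is that only the code vector is updated while the predictor is held fixed, so no training data for the underlying task is touched during the search.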

preprint, 2020, arXiv

AdaBits: Neural Network Quantization with Adaptive Bit-Widths

Deep neural networks with adaptive configurations have gained increasing attention due to the instant and flexible deployment of these models on platforms with different resource budgets. In this paper, we investigate a novel option to achieve this goal by enabling adaptive bit-widths of weights and activations in the model. We first examine the benefits and challenges of training quantized models with adaptive bit-widths, and then experiment with several approaches including direct adaptation, progressive training and joint training. We discover that joint training is able to produce performance on the adaptive model comparable to that of individual models. We also propose a new technique named Switchable Clipping Level (S-CL) to further improve quantized models at the lowest bit-width. With our proposed techniques applied to several models, including MobileNet-V1/V2 and ResNet-50, we demonstrate that the bit-width of weights and activations is a new option for adaptively executable deep neural networks, offering a distinct opportunity for improved accuracy-efficiency trade-off as well as instant adaptation according to the platform constraints in real-world applications.
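To make the notion of an adaptive bit-width concrete, the sketch below applies plain uniform quantization to the same values at several bit-widths and reports the worst-case rounding error. It is a generic quantizer, not the AdaBits training procedure or the S-CL technique. The function names, the sample activations, and the fixed clipping value of 1.0 are assumptions made for illustration.

```python
def quantize(values, bits, clip=1.0):
    """Uniformly quantize values clipped to [0, clip] onto 2**bits levels."""
    intervals = 2 ** bits - 1          # number of steps between 0 and clip
    step = clip / intervals
    return [round(min(max(v, 0.0), clip) / step) * step for v in values]

def max_error(values, bits, clip=1.0):
    """Worst-case absolute error between clipped and quantized values."""
    quantized = quantize(values, bits, clip)
    return max(abs(min(max(v, 0.0), clip) - q) for v, q in zip(values, quantized))

# Hypothetical activations; the same values are served at three bit-widths.
activations = [0.0, 0.13, 0.37, 0.5, 0.82, 1.0]
errors = {bits: max_error(activations, bits) for bits in (2, 4, 8)}
for bits, err in errors.items():
    print(bits, err)
```

With uniform rounding the error at each bit-width is bounded by half a quantization step, which is why lower bit-widths trade accuracy for cheaper storage and arithmetic.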

preprint, 2020, arXiv

AtomNAS: Fine-Grained End-to-End Neural Architecture Search

Search space design is critical to neural architecture search (NAS) algorithms. We propose a fine-grained search space composed of atomic blocks, a minimal search unit that is much smaller than the ones used in recent NAS algorithms. This search space allows a mix of operations by composing different types of atomic blocks, while the search space in previous methods only allows homogeneous operations. Based on this search space, we propose a resource-aware architecture search framework which automatically assigns the computational resources (e.g., output channel numbers) for each operation by jointly considering the performance and the computational cost. In addition, to accelerate the search process, we propose a dynamic network shrinkage technique which prunes, on the fly, the atomic blocks with negligible influence on outputs. Instead of a search-and-retrain two-stage paradigm, our method simultaneously searches and trains the target architecture. Our method achieves state-of-the-art performance under several FLOPs configurations on ImageNet with a small search cost. We open our entire codebase at: https://github.com/meijieru/AtomNAS.

preprint, 2020, arXiv

Neural Architecture Search for Lightweight Non-Local Networks

Non-Local (NL) blocks have been widely studied in various vision tasks. However, embedding NL blocks in mobile neural networks has rarely been explored, mainly due to the following challenges: 1) NL blocks generally have a heavy computation cost, which makes them difficult to apply where computational resources are limited, and 2) it is an open problem to discover an optimal configuration to embed NL blocks into mobile neural networks. We propose AutoNL to overcome the above two obstacles. Firstly, we propose a Lightweight Non-Local (LightNL) block by squeezing the transformation operations and incorporating compact features. With the novel design choices, the proposed LightNL block is 400x computationally cheaper than its conventional counterpart without sacrificing the performance. Secondly, by relaxing the structure of the LightNL block to be differentiable during training, we propose an efficient neural architecture search algorithm to learn an optimal configuration of LightNL blocks in an end-to-end manner. Notably, using only 32 GPU hours, the searched AutoNL model achieves 77.7% top-1 accuracy on ImageNet under a typical mobile setting (350M FLOPs), significantly outperforming previous mobile models including MobileNetV2 (+5.7%), FBNet (+2.8%) and MnasNet (+2.1%). Code and models are available at https://github.com/LiYingwei/AutoNL.

preprint, 2015, arXiv

A Large-Scale Car Dataset for Fine-Grained Categorization and Verification

Updated on 24/09/2015: This update provides preliminary experiment results for fine-grained classification on the surveillance data of CompCars. The train/test splits are provided in the updated dataset. See details in Section 6.
