Source author record

Jianguo Zhang

Jianguo Zhang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, Artificial Intelligence, math.OA, math.KT, Computation and Language, eess.IV, Machine Learning, math.FA, math.GR, Software Engineering

Catalog footprint

What is connected

15 works

10 topics

4 close collaborators

preprint | 2026 | arXiv

The Baum-Connes and the Mishchenko-Kasparov assembly maps for group extensions

The Baum-Connes assembly map with coefficients $e_{\ast}$ and the Mishchenko-Kasparov assembly map with coefficients $μ_{\ast}$ are two homomorphisms from the equivariant $K$-homology of classifying spaces of groups to the $K$-theory of reduced crossed products. In this paper, we investigate these two assembly maps for group extensions $1\rightarrow N \rightarrow Γ\xrightarrow{q} Γ/ N \rightarrow 1$. Firstly, under the assumption that $e_{\ast}$ is isomorphic for $q^{-1}(F)$ for any finite subgroup $F$ of $Γ/N$, we prove that $e_{\ast}$ is injective, surjective and isomorphic for $Γ$ if they are also true for $Γ/N$, respectively. Secondly, under the assumption that $e_{\ast}$ is rationally isomorphic for $N$, we verify that $μ_{\ast}$ is rationally injective for $Γ$ if it is also rationally injective for $Γ/N$. Finally, when $Γ$ is an isometric semi-direct product $N\rtimes G$, we confirm that $e_{\ast}$ is injective, surjective and isomorphic for $Γ$ if they also hold for $G$ and $Γ$ satisfies three partial conjectures along $N$, respectively. As applications, we show that the strong Novikov conjecture, the surjective assembly conjecture and the Baum-Connes conjecture with coefficients are closed under direct products, central extensions of groups and extensions by finite groups. Meanwhile, we also show that the rational analytic Novikov conjecture with coefficients is preserved under extensions of finite groups. Besides, we employ these results to obtain some new examples for the rational analytic and the strong Novikov conjecture beyond the class of coarsely embeddable groups.

preprint | 2023 | arXiv

Deep Learning for Code Intelligence: Survey, Benchmark and Toolkit

Code intelligence leverages machine learning techniques to extract knowledge from extensive code corpora, with the aim of developing intelligent tools to improve the quality and productivity of computer programming. Currently, there is already a thriving research community focusing on code intelligence, with efforts spanning software engineering, machine learning, data mining, natural language processing, and programming languages. In this paper, we conduct a comprehensive literature review on deep learning for code intelligence, from the aspects of code representation learning, deep learning techniques, and application tasks. We also benchmark several state-of-the-art neural models for code intelligence, and provide an open-source toolkit tailored for the rapid prototyping of deep-learning-based code intelligence models. In particular, we inspect the existing code intelligence models on the basis of code representation learning, and provide a comprehensive overview to enhance comprehension of the present state of code intelligence. Furthermore, we publicly release the source code and data resources to provide the community with a ready-to-use benchmark, which can facilitate the evaluation and comparison of existing and future code intelligence models (https://xcodemind.github.io). Finally, we also point out several challenging and promising directions for future research.

preprint | 2022 | arXiv

$B^p_r(F_n)$ has no nontrivial idempotents

We show that there is no nontrivial idempotent in the reduced group $\ell^p$-operator algebra $B^p_r(F_n)$ of the free group $F_n$ on $n$ generators for each positive integer $n$.

preprint | 2022 | arXiv

Are Pretrained Transformers Robust in Intent Classification? A Missing Ingredient in Evaluation of Out-of-Scope Intent Detection

Pre-trained Transformer-based models were reported to be robust in intent classification. In this work, we first point out the importance of in-domain out-of-scope detection in few-shot intent recognition tasks and then illustrate the vulnerability of pre-trained Transformer-based models against samples that are in-domain but out-of-scope (ID-OOS). We construct two new datasets, and empirically show that pre-trained models do not perform well on both ID-OOS examples and general out-of-scope examples, especially on fine-grained few-shot intent detection tasks. To figure out how the models mistakenly classify ID-OOS intents as in-scope intents, we further conduct analysis on confidence scores and the overlapping keywords, as well as point out several prospective directions for future work. Resources are available on https://github.com/jianguoz/Few-Shot-Intent-Detection.

preprint | 2022 | arXiv

Domain-Adaptive 3D Medical Image Synthesis: An Efficient Unsupervised Approach

Medical image synthesis has attracted increasing attention because it could generate missing image data, improve diagnosis and benefit many downstream tasks. However, the synthesis models developed so far do not adapt to unseen data distributions that present domain shift, limiting their applicability in clinical routine. This work focuses on exploring domain adaptation (DA) of 3D image-to-image synthesis models. First, we highlight the technical difference in DA between classification, segmentation and synthesis models. Second, we present a novel efficient adaptation approach based on a 2D variational autoencoder which approximates 3D distributions. Third, we present empirical studies on the effect of the amount of adaptation data and the key hyper-parameters. Our results show that the proposed approach can significantly improve the synthesis accuracy on unseen domains in a 3D setting. The code is publicly available at https://github.com/WinstonHuTiger/2D_VAE_UDA_for_3D_sythesis

preprint | 2022 | arXiv

Open Cones and $K$-theory for $\ell^p$ Roe Algebras

In this paper, we verify the $\ell^p$ coarse Baum-Connes conjecture for open cones and show that the $K$-theory for $\ell^p$ Roe algebras of open cones is independent of $p\in[1,\infty)$. Combined with the result of T. Fukaya and S.-I. Oguni, we give an application to the class of coarsely convex spaces that includes geodesic Gromov hyperbolic spaces, CAT(0)-spaces, certain Artin groups and Helly groups equipped with the word length metric.

preprint | 2022 | arXiv

Partial Least Square Regression via Three-factor SVD-type Manifold Optimization for EEG Decoding

Partial least square regression (PLSR) is a widely-used statistical model to reveal the linear relationships of latent factors that come from the independent variables and dependent variables. However, traditional methods to solve PLSR models are usually based on the Euclidean space, and easily get stuck in a local minimum. To this end, we propose a new method to solve the partial least square regression, named PLSR via optimization on bi-Grassmann manifold (PLSRbiGr). Specifically, we first leverage the three-factor SVD-type decomposition of the cross-covariance matrix defined on the bi-Grassmann manifold, converting the orthogonal constrained optimization problem into an unconstrained optimization problem on bi-Grassmann manifold, and then incorporate the Riemannian preconditioning of matrix scaling to regulate the Riemannian metric in each iteration. PLSRbiGr is validated with a variety of experiments for decoding EEG signals in motor imagery (MI) and steady-state visual evoked potential (SSVEP) tasks. Experimental results demonstrate that PLSRbiGr outperforms competing algorithms in multiple EEG decoding tasks, which will greatly facilitate small sample data learning.

preprint | 2022 | arXiv

Sparse Local Patch Transformer for Robust Face Alignment and Landmarks Inherent Relation Learning

Heatmap regression methods have dominated the face alignment area in recent years, yet they ignore the inherent relation between different landmarks. In this paper, we propose a Sparse Local Patch Transformer (SLPT) for learning the inherent relation. The SLPT generates the representation of each single landmark from a local patch and aggregates them by an adaptive inherent relation based on the attention mechanism. The subpixel coordinate of each landmark is predicted independently based on the aggregated feature. Moreover, a coarse-to-fine framework is further introduced and incorporated with the SLPT, which enables the initial landmarks to gradually converge to the target facial landmarks using fine-grained features from dynamically resized local patches. Extensive experiments carried out on three popular benchmarks, including WFLW, 300W and COFW, demonstrate that the proposed method works at the state-of-the-art level with much less computational complexity by learning the inherent relation between facial landmarks. The code is available at the project website.

preprint | 2021 | arXiv

$L^p$ Coarse Baum-Connes Conjecture and $K$-theory for $L^p$ Roe Algebras

In this paper, we verify the $L^p$ coarse Baum-Connes conjecture for spaces with finite asymptotic dimension for $p\in[1,\infty)$. We also show that the $K$-theory of $L^p$ Roe algebras is independent of $p\in(1,\infty)$ for spaces with finite asymptotic dimension.

preprint | 2021 | arXiv

Deep Class-Specific Affinity-Guided Convolutional Network for Multimodal Unpaired Image Segmentation

Multi-modal medical image segmentation plays an essential role in clinical diagnosis. It remains challenging as the input modalities are often not well-aligned spatially. Existing learning-based methods mainly consider sharing trainable layers across modalities and minimizing visual feature discrepancies. While the problem is often formulated as joint supervised feature learning, multiple-scale features and class-specific representation have not yet been explored. In this paper, we propose an affinity-guided fully convolutional network for multimodal image segmentation. To learn effective representations, we design class-specific affinity matrices to encode the knowledge of hierarchical feature reasoning, together with the shared convolutional layers to ensure the cross-modality generalization. Our affinity matrix does not depend on spatial alignments of the visual features and thus allows us to train with unpaired, multimodal inputs. We extensively evaluate our method on two public multimodal benchmark datasets and outperform state-of-the-art methods.

preprint | 2020 | arXiv

Domain Adaptive Medical Image Segmentation via Adversarial Learning of Disease-Specific Spatial Patterns

In medical imaging, the heterogeneity of multi-centre data impedes the applicability of deep learning-based methods and results in significant performance degradation when applying models in an unseen data domain, e.g. a new centre or a new scanner. In this paper, we propose an unsupervised domain adaptation framework for boosting image segmentation performance across multiple domains without using any manual annotations from the new target domains, but by re-calibrating the networks on a few images from the target domain. To achieve this, we enforce architectures to be adaptive to new data by rejecting improbable segmentation patterns and implicitly learning through semantic and boundary information, thus to capture disease-specific spatial patterns in an adversarial optimization. The adaptation process needs continuous monitoring; however, as we cannot assume the presence of ground-truth masks for the target domain, we propose two new metrics to monitor the adaptation process, and strategies to train the segmentation algorithm in a stable fashion. We build upon well-established 2D and 3D architectures and perform extensive experiments on three cross-centre brain lesion segmentation tasks, involving multicentre public and in-house datasets. We demonstrate that recalibrating the deep networks on a few unlabeled images from the target domain improves the segmentation accuracy significantly.

preprint | 2020 | arXiv

Generalisable Cardiac Structure Segmentation via Attentional and Stacked Image Adaptation

Tackling domain shifts in multi-centre and multi-vendor data sets remains challenging for cardiac image segmentation. In this paper, we propose a generalisable segmentation framework for cardiac image segmentation in which multi-centre, multi-vendor, multi-disease datasets are involved. A generative adversarial network with an attention loss was proposed to translate the images from existing source domains to a target domain, thus to generate good-quality synthetic cardiac structure and enlarge the training set. A stack of data augmentation techniques was further used to simulate real-world transformation to boost the segmentation performance for unseen domains. We achieved an average Dice score of 90.3% for the left ventricle, 85.9% for the myocardium, and 86.5% for the right ventricle on the hidden validation set across four vendors. We show that the domain shifts in heterogeneous cardiac imaging datasets can be drastically reduced by two aspects: 1) good-quality synthetic data by learning the underlying target domain distribution, and 2) stacked classical image processing techniques for data augmentation.

preprint | 2020 | arXiv

MultiWOZ 2.2 : A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines

MultiWOZ is a well-known task-oriented dialogue dataset containing over 10,000 annotated dialogues spanning 8 domains. It is extensively used as a benchmark for dialogue state tracking. However, recent works have reported the presence of substantial noise in the dialogue state annotations. MultiWOZ 2.1 identified and fixed many of these erroneous annotations and user utterances, resulting in an improved version of this dataset. This work introduces MultiWOZ 2.2, which is yet another improved version of this dataset. Firstly, we identify and fix dialogue state annotation errors across 17.3% of the utterances on top of MultiWOZ 2.1. Secondly, we redefine the ontology by disallowing vocabularies of slots with a large number of possible values (e.g., restaurant name, time of booking). In addition, we introduce slot span annotations for these slots to standardize them across recent models, which previously used custom string matching heuristics to generate them. We also benchmark a few state-of-the-art dialogue state tracking models on the corrected dataset to facilitate comparison for future work. In the end, we discuss best practices for dialogue data collection that can help avoid annotation errors.

preprint | 2020 | arXiv

TEA: Temporal Excitation and Aggregation for Action Recognition

Temporal modeling is key for action recognition in videos. It normally considers both short-range motions and long-range aggregations. In this paper, we propose a Temporal Excitation and Aggregation (TEA) block, including a motion excitation (ME) module and a multiple temporal aggregation (MTA) module, specifically designed to capture both short- and long-range temporal evolution. In particular, for short-range motion modeling, the ME module calculates the feature-level temporal differences from spatiotemporal features. It then utilizes the differences to excite the motion-sensitive channels of the features. The long-range temporal aggregations in previous works are typically achieved by stacking a large number of local temporal convolutions. Each convolution processes a local temporal window at a time. In contrast, the MTA module proposes to deform the local convolution to a group of sub-convolutions, forming a hierarchical residual architecture. Without introducing additional parameters, the features will be processed with a series of sub-convolutions, and each frame could complete multiple temporal aggregations with neighborhoods. The final equivalent receptive field of temporal dimension is accordingly enlarged, which is capable of modeling the long-range temporal relationship over distant frames. The two components of the TEA block are complementary in temporal modeling. Finally, our approach achieves impressive results at low FLOPs on several action recognition benchmarks, such as Kinetics, Something-Something, HMDB51, and UCF101, which confirms its effectiveness and efficiency.

preprint | 2016 | arXiv

A Multi-task Deep Network for Person Re-identification

Person re-identification (ReID) focuses on identifying people across different scenes in video surveillance, which is usually formulated as a binary classification task or a ranking task in current person ReID approaches. In this paper, we take both tasks into account and propose a multi-task deep network (MTDnet) that makes use of their own advantages and jointly optimizes the two tasks for person ReID. To the best of our knowledge, we are the first to integrate both tasks in one network to solve person ReID. We show that our proposed architecture significantly boosts the performance. Furthermore, deep architecture in general requires a sufficient dataset for training, which is usually not met in person ReID. To cope with this situation, we further extend the MTDnet and propose a cross-domain architecture that is capable of using an auxiliary set to assist training on small target sets. In the experiments, our approach outperforms most existing person ReID algorithms on representative datasets including CUHK03, CUHK01, VIPeR, iLIDS and PRID2011, which clearly demonstrates the effectiveness of the proposed approach.
