Source author record

Chang Shu

Chang Shu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.


Computer Vision; Graphics; Artificial Intelligence; Computation and Language; Computational Geometry; physics.flu-dyn; Human-Computer Interaction; Machine Learning; physics.comp-ph

Catalog footprint

What is connected

15 works

9 topics

4 close collaborators

preprint, 2022, arXiv

Accurate near wall steady flow field prediction using Physics Informed Neural Network (PINN)

In this paper, a Physics-Informed Neural Network (PINN) is explored to obtain accurate flow predictions in the near-wall region from measurements (or sampling points) taken away from the wall. In fluid mechanics experiments, it is often difficult to measure velocity accurately near the wall. The present study therefore presents a new approach to recovering the flow solution near the wall. Laminar boundary-layer flow over a flat plate is considered in order to explore the ability of PINN to accurately predict the flow field. All the required sampling data are obtained from CFD simulations. A wide range of Reynolds numbers, from Re=500 to 100000, is investigated. First, using PINN, the boundary layer solution is obtained with three different types of boundary conditions. Further, the influence of the location of the sampling points on the accuracy is analysed. The velocity profiles and the skin friction coefficient distribution show that the PINN results are reasonably accurate near the wall with only a few sampling points away from the wall. This approach has potential application in experiments to obtain near-wall solutions accurately from measurements away from the wall.

preprint, 2022, arXiv

Deep Multi-Branch Aggregation Network for Real-Time Semantic Segmentation in Street Scenes

Real-time semantic segmentation, which aims to achieve high segmentation accuracy at real-time inference speed, has received substantial attention over the past few years. However, many state-of-the-art real-time semantic segmentation methods tend to sacrifice some spatial details or contextual information for fast inference, thus leading to degradation in segmentation quality. In this paper, we propose a novel Deep Multi-branch Aggregation Network (called DMA-Net) based on the encoder-decoder structure to perform real-time semantic segmentation in street scenes. Specifically, we first adopt ResNet-18 as the encoder to efficiently generate various levels of feature maps from different stages of convolutions. Then, we develop a Multi-branch Aggregation Network (MAN) as the decoder to effectively aggregate different levels of feature maps and capture the multi-scale information. In MAN, a lattice enhanced residual block is designed to enhance feature representations of the network by taking advantage of the lattice structure. Meanwhile, a feature transformation block is introduced to explicitly transform the feature map from the neighboring branch before feature aggregation. Moreover, a global context block is used to exploit the global contextual information. These key components are tightly combined and jointly optimized in a unified network. Extensive experimental results on the challenging Cityscapes and CamVid datasets demonstrate that our proposed DMA-Net obtains 77.0% and 73.6% mean Intersection over Union (mIoU) at inference speeds of 46.7 FPS and 119.8 FPS, respectively, using only a single NVIDIA GTX 1080Ti GPU. This shows that DMA-Net provides a good tradeoff between segmentation quality and speed for semantic segmentation in street scenes.
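The mIoU figure quoted above is a standard segmentation metric. As a rough illustration of how it is usually computed, and not the evaluation code used for DMA-Net, the following sketch averages per-class Intersection over Union across flattened label lists. The function name `mean_iou` and the choice to skip classes absent from both prediction and ground truth are assumptions made for this example; benchmark toolkits may also handle ignore labels differently.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean Intersection over Union across classes present in pred or target.

    Hypothetical helper for illustration; labels are integers in [0, num_classes).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes
    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A union B| = |A| + |B| - |A intersect B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes that appear in neither list (an assumption)
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Four pixels, two classes: class 0 has IoU 1/2, class 1 has IoU 2/3.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In practice the same counts are accumulated over a whole validation set before the per-class ratios are taken, rather than averaged image by image.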

preprint, 2022, arXiv

ICAF: Iterative Contrastive Alignment Framework for Multimodal Abstractive Summarization

Integrating multimodal knowledge for the abstractive summarization task is an active research area, with present techniques inheriting the fusion-then-generation paradigm. Due to semantic gaps between computer vision and natural language processing, current methods often treat multiple data points as separate objects and rely on attention mechanisms to search for connections in order to fuse them. In addition, many frameworks lack awareness of cross-modal matching, which reduces performance. To address these two drawbacks, we propose an Iterative Contrastive Alignment Framework (ICAF) that uses recurrent alignment and contrast to capture the coherence between images and texts. Specifically, we design a recurrent alignment (RA) layer to gradually investigate fine-grained semantic relationships between image patches and text tokens. At each step during the encoding process, cross-modal contrastive losses are applied to directly optimize the embedding space. According to ROUGE, relevance scores, and human evaluation, our model outperforms the state-of-the-art baselines on the MSMO dataset. Experiments on the applicability of our proposed framework and on hyperparameter settings have also been conducted.
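The abstract does not specify the form of the cross-modal contrastive losses. As a loosely related illustration only, the sketch below implements a symmetric InfoNCE-style loss, a common choice for aligning paired image and text embeddings. It is not the ICAF loss; the function name `info_nce`, the temperature value, and the assumption that the i-th image matches the i-th text are all hypothetical.

```python
import math
from typing import List, Sequence


def _normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def _cross_entropy_diagonal(logits: List[List[float]]) -> float:
    """Mean cross-entropy where row k's correct class is column k."""
    total = 0.0
    for k, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[k]
    return total / len(logits)


def info_nce(image_embs: Sequence[Sequence[float]],
             text_embs: Sequence[Sequence[float]],
             temperature: float = 0.07) -> float:
    """Symmetric InfoNCE loss over a batch of paired embeddings (illustrative)."""
    if len(image_embs) != len(text_embs) or not image_embs:
        raise ValueError("need the same non-zero number of image and text embeddings")
    imgs = [_normalize(v) for v in image_embs]
    txts = [_normalize(v) for v in text_embs]
    # Cosine similarity between every image and every text, scaled by temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    transposed = [list(col) for col in zip(*logits)]
    # Average the image-to-text and text-to-image directions.
    return 0.5 * (_cross_entropy_diagonal(logits) + _cross_entropy_diagonal(transposed))


if __name__ == "__main__":
    matched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
    mismatched = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
    print(matched < mismatched)
```

The loss is small when each image embedding is closest to its own text and large when the pairs are swapped, which is the behaviour a contrastive alignment objective is meant to reward.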

preprint, 2022, arXiv

Pre-trained Language Models as Re-Annotators

Annotation noise is widespread in datasets, but manually revising a flawed corpus is time-consuming and error-prone. Hence, given the prior knowledge in Pre-trained Language Models and the expected uniformity across all annotations, we attempt to reduce annotation noise in the corpus automatically through two tasks: (1) Annotation Inconsistency Detection, which indicates the credibility of annotations, and (2) Annotation Error Correction, which rectifies the abnormal annotations. We investigate how to acquire semantic-sensitive annotation representations from Pre-trained Language Models, expecting examples with identical annotations to be embedded at mutually adjacent positions even without fine-tuning. We propose a novel credibility score to reveal the likelihood of annotation inconsistencies based on the neighbouring consistency. Then, we fine-tune a classifier based on Pre-trained Language Models with cross-validation for annotation correction. The annotation corrector is further elaborated with two approaches: (1) soft labelling by Kernel Density Estimation and (2) a novel distant-peer contrastive loss. We study re-annotation in relation extraction and create a new manually revised dataset, Re-DocRED, for evaluating document-level re-annotation. The proposed credibility scores show promising agreement with human revisions, achieving a Binary F1 of 93.4 and 72.5 in detecting inconsistencies on TACRED and DocRED respectively. Moreover, the neighbour-aware classifiers based on distant-peer contrastive learning and uncertain labels achieve Macro F1 up to 66.2 and 57.8 in correcting annotations on TACRED and DocRED respectively. These improvements are not merely theoretical: automatically denoised training sets demonstrate up to 3.6% performance improvement for state-of-the-art relation extraction models.

preprint, 2022, arXiv

SideRT: A Real-time Pure Transformer Architecture for Single Image Depth Estimation

Since context modeling is critical for estimating depth from a single image, researchers have put tremendous effort into obtaining global context. Many global manipulations are designed for traditional CNN-based architectures to overcome the locality of convolutions. Attention mechanisms or transformers, originally designed for capturing long-range dependencies, might be a better choice, but they usually complicate architectures and could lead to a decrease in inference speed. In this work, we propose a pure transformer architecture called SideRT that can attain excellent predictions in real-time. In order to capture better global context, Cross-Scale Attention (CSA) and Multi-Scale Refinement (MSR) modules are designed to work collaboratively to fuse features of different scales efficiently. CSA modules focus on fusing features of high semantic similarities, while MSR modules aim to fuse features at corresponding positions. These two modules contain a few learnable parameters without convolutions, based on which a lightweight yet effective model is built. This architecture achieves state-of-the-art performance in real-time (51.3 FPS) and becomes much faster with a reasonable performance drop on a smaller backbone Swin-T (83.1 FPS). Furthermore, its performance surpasses the previous state-of-the-art by a large margin, improving the AbsRel metric by 6.9% on KITTI and 9.7% on NYU. To the best of our knowledge, this is the first work to show that transformer-based networks can attain state-of-the-art performance in real-time in the single image depth estimation field. Code will be made available soon.
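AbsRel, cited above, is the mean absolute relative error between predicted and ground-truth depth. The sketch below shows the usual definition on plain Python lists. It is not the SideRT evaluation code; the function name `abs_rel` and the convention of skipping pixels without valid (positive) ground truth are assumptions for this example.

```python
from typing import Sequence


def abs_rel(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean of |pred - gt| / gt over pixels with positive ground-truth depth."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    # Pixels with gt <= 0 are treated as missing depth and ignored (an assumption).
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth depth values")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


if __name__ == "__main__":
    # Errors of 0% and 20% average to roughly 0.1; the third pixel has no ground truth.
    print(abs_rel([2.0, 4.0, 7.0], [2.0, 5.0, 0.0]))
```

Lower is better, so an improvement of 6.9% in this metric means the error shrinks relative to the previous best value.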

preprint, 2020, arXiv

Feature-metric Loss for Self-supervised Learning of Depth and Egomotion

Photometric loss is widely used for self-supervised depth and egomotion estimation. However, the loss landscapes induced by photometric differences are often problematic for optimization, caused by plateau landscapes for pixels in textureless regions or multiple local minima for less discriminative pixels. In this work, feature-metric loss is proposed and defined on feature representation, where the feature representation is also learned in a self-supervised manner and regularized by both first-order and second-order derivatives to constrain the loss landscapes to form proper convergence basins. Comprehensive experiments and detailed analysis via visualization demonstrate the effectiveness of the proposed feature-metric loss. In particular, our method improves state-of-the-art methods on KITTI from 0.885 to 0.925 measured by $\delta_1$ for depth estimation, and significantly outperforms previous methods for visual odometry.
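The $\delta_1$ figure above is the standard threshold accuracy for depth estimation: the fraction of pixels whose predicted and true depths differ by a ratio of less than 1.25. The sketch below illustrates that definition. It is not the paper's evaluation code; the function name `delta_accuracy` and the handling of non-positive depths are assumptions for this example.

```python
from typing import Sequence


def delta_accuracy(pred: Sequence[float], gt: Sequence[float],
                   threshold: float = 1.25) -> float:
    """Fraction of valid pixels with max(pred/gt, gt/pred) below the threshold."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    # Both depths must be positive for the ratio to be meaningful (an assumption).
    valid = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not valid:
        raise ValueError("no valid depth pairs")
    hits = sum(1 for p, g in valid if max(p / g, g / p) < threshold)
    return hits / len(valid)


if __name__ == "__main__":
    # Ratios are 1.0, 1.2, about 1.33 and 1.05, so three of four pixels pass.
    print(delta_accuracy([1.0, 2.0, 3.0, 10.0], [1.0, 2.4, 4.0, 10.5]))
```

Higher is better here, so the reported change from 0.885 to 0.925 means more pixels fall within the 1.25 ratio band.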

preprint, 2020, arXiv

How Furiously Can Colourless Green Ideas Sleep? Sentence Acceptability in Context

We study the influence of context on sentence acceptability. First we compare the acceptability ratings of sentences judged in isolation, with a relevant context, and with an irrelevant context. Our results show that context induces a cognitive load for humans, which compresses the distribution of ratings. Moreover, in relevant contexts we observe a discourse coherence effect which uniformly raises acceptability. Next, we test unidirectional and bidirectional language models in their ability to predict acceptability ratings. The bidirectional models show very promising results, with the best model achieving a new state-of-the-art for unsupervised acceptability prediction. The two sets of experiments provide insights into the cognitive aspects of sentence processing and central issues in the computational modelling of text and discourse.

preprint, 2020, arXiv

Non-iterative Simultaneous Rigid Registration Method for Serial Sections of Biological Tissue

In this paper, we propose a novel non-iterative algorithm to simultaneously estimate the optimal rigid transformations for serial section images, which is a key component in volume reconstruction of serial sections of biological tissue. In order to avoid the error accumulation and propagation caused by current algorithms, we add the extra condition that the positions of the first and the last section images should remain unchanged. This constrained simultaneous registration problem has not been solved before. Our algorithm is non-iterative and can simultaneously compute rigid transformations for a large number of serial section images in a short time. We prove that our algorithm attains the optimal solution under ideal conditions. We test our algorithm on synthetic data and real data to verify its effectiveness.

preprint, 2020, arXiv

Propagation of weakly stretched premixed spherical spray flames in localized homogeneous and heterogeneous reactants

Propagation of weakly stretched spherical flames in partially pre-vaporized fuel sprays is theoretically investigated in this work. A general theory is developed to describe flame propagation speed, flame temperature, droplet evaporation onset and completion locations. The influences of liquid fuel and gas mixture properties on spherical spray flame propagation are studied. The results indicate that the spray flame propagation speed is enhanced with increased droplet mass loading and/or evaporation heat exchange coefficient (or evaporation rate). Opposite trends are found when the latent heat is high, due to strong evaporation heat absorption. Fuel vapor and temperature gradients are observed in the post-flame evaporation zone of heterogeneous flames. The location of the evaporation completion front changes considerably with flame radius, but the evaporation onset location varies little relative to the flame front when the flame propagates. For larger droplet loading and smaller evaporation rate, the fuel droplet tends to complete evaporation behind the flame front. Flame bifurcation occurs with high droplet mass loading under large latent heat, leading to multiplicity of flame propagation speed, droplet evaporation onset and completion fronts. The flame enhancement or weakening effects of the fuel droplet sprays are revealed by enhanced or suppressed heat and mass diffusion processes in the pre-flame zone. In addition, for heterogeneous flames, heat and mass diffusion also exist in the post-flame zone. The mass diffusion for both homogeneous and heterogeneous flames is enhanced with decreased Lewis number. The magnitude of the Markstein length is considerably reduced with increased droplet loading. Moreover, post-flame droplet burning behind the heterogeneous flame influences the flame propagation speed and Markstein length when the liquid fuel loading is relatively low.

preprint, 2014, arXiv

Estimation of Human Body Shape and Posture Under Clothing

Estimating the body shape and posture of a dressed human subject in motion represented as a sequence of (possibly incomplete) 3D meshes is important for virtual change rooms and security. To solve this problem, statistical shape spaces encoding human body shape and posture variations are commonly used to constrain the search space for the shape estimate. In this work, we propose a novel method that uses a posture-invariant shape space to model body shape variation combined with a skeleton-based deformation to model posture variation. Our method can estimate the body shape and posture of both static scans and motion sequences of dressed human body scans. In the case of motion sequences, our method takes advantage of motion cues to solve for a single body shape estimate along with a sequence of posture estimates. We apply our approach to both static scans and motion sequences and demonstrate that our method achieves higher fitting accuracy than a variant of the popular SCAPE model used as the statistical model.

preprint, 2014, arXiv

Finite Element Based Tracking of Deforming Surfaces

We present an approach to robustly track the geometry of an object that deforms over time from a set of input point clouds captured from a single viewpoint. The deformations we consider are caused by applying forces to known locations on the object's surface. Our method combines the use of prior information on the geometry of the object modeled by a smooth template and the use of a linear finite element method to predict the deformation. This allows the accurate reconstruction of both the observed and the unobserved sides of the object. We present tracking results for noisy low-quality point clouds acquired by either a stereo camera or a depth camera, and simulations with point clouds corrupted by different error terms. We show that our method is also applicable to large non-linear deformations.

preprint, 2013, arXiv

Fully Automatic Expression-Invariant Face Correspondence

We consider the problem of computing accurate point-to-point correspondences among a set of human face scans with varying expressions. Our fully automatic approach does not require any manually placed markers on the scan. Instead, the approach learns the locations of a set of landmarks present in a database and uses this knowledge to automatically predict the locations of these landmarks on a newly available scan. The predicted landmarks are then used to compute point-to-point correspondences between a template model and the newly available scan. To accurately fit the expression of the template to the expression of the scan, we use a blendshape model as the template. Our algorithm was tested on a database of human faces of different ethnic groups with strongly varying expressions. Experimental results show that the obtained point-to-point correspondence is both highly accurate and consistent for most of the tested 3D face models.

preprint, 2012, arXiv

Estimating 3D Human Shapes from Measurements

The recent advances in 3-D imaging technologies give rise to databases of human shapes, from which statistical shape models can be built. These statistical models represent prior knowledge of the human shape and enable us to solve shape reconstruction problems from partial information. Generating human shape from traditional anthropometric measurements is such a problem, since these 1-D measurements encode 3-D shape information. Combined with a statistical shape model, these easy-to-obtain measurements can be leveraged to create 3D human shapes. However, existing methods limit the creation of the shapes to the space spanned by the database and thus require a large amount of training data. In this paper, we introduce a technique that extrapolates the statistically inferred shape to fit the measurement data using nonlinear optimization. This method ensures that the generated shape is both human-like and satisfies the measurement conditions. We demonstrate the effectiveness of the method and compare it to existing approaches through extensive experiments, using both synthetic data and real human measurements.

preprint, 2011, arXiv

Automatically Creating Design Models from 3D Anthropometry Data

When designing a product that needs to fit the human shape, designers often use a small set of 3D models, called design models, either in physical or digital form, as representative shapes to cover the shape variabilities of the population for which the products are designed. Until recently, the process of creating these models has been an art involving manual interaction and empirical guesswork. The availability of the 3D anthropometric databases provides an opportunity to create design models optimally. In this paper, we propose a novel way to use 3D anthropometric databases to generate design models that represent a given population for design applications such as the sizing of garments and gear. We generate the representative shapes by solving a covering problem in a parameter space. Well-known techniques in computational geometry are used to solve this problem. We demonstrate the method using examples in designing glasses and helmets.

preprint, 2008, arXiv

Morphing of Triangular Meshes in Shape Space

We present a novel approach to morph between two isometric poses of the same non-rigid object given as triangular meshes. We model the morphs as linear interpolations in a suitable shape space $\mathcal{S}$. For triangulated 3D polygons, we prove that interpolating linearly in this shape space corresponds to the most isometric morph in $\mathbb{R}^3$. We then extend this shape space to arbitrary triangulations in 3D using a heuristic approach and show the practical use of the approach using experiments. Furthermore, we discuss a modified shape space that is useful for isometric skeleton morphing. All of the newly presented approaches solve the morphing problem without the need to solve a minimization problem.
