Researcher profile

Minghui Liu

Minghui Liu contributes to research discovery and scholarly infrastructure.

Researcher. Affiliation not imported. Open to collaborate.

Trust snapshot

Quick read

Trust 21 (Emerging), Verification L1, Unclaimed author
8 works
0 followers
5 topics
4 close collaborators

Published work

8 published items

Preprint, 2022, arXiv

Channel Self-Supervision for Online Knowledge Distillation

Recently, researchers have shown increased interest in online knowledge distillation. Adopting a one-stage, end-to-end training fashion, online knowledge distillation uses the aggregated intermediate predictions of multiple peer models for training. However, the absence of a powerful teacher model may result in a homogeneity problem among group peers, adversely affecting the effectiveness of group distillation. In this paper, we propose a novel online knowledge distillation method, Channel Self-Supervision for Online Knowledge Distillation (CSS), which structures diversity in terms of input, target, and network to alleviate the homogenization problem. Specifically, we construct a dual-network multi-branch structure and enhance inter-branch diversity through self-supervised learning, adopting feature-level transformation and augmenting the corresponding labels. Meanwhile, the dual-network structure has a larger space of independent parameters to resist the homogenization problem during distillation. Extensive quantitative experiments on CIFAR-100 show that our method provides greater diversity than OKDDip and also yields a notable performance improvement, even over state-of-the-art methods such as PCL. The results on three fine-grained datasets (StanfordDogs, StanfordCars, CUB-200-2011) also show the significant generalization capability of our approach.

Preprint, 2022, arXiv

Factorization of the forward-backward charge asymmetry and measurements of the weak mixing angle and proton structure at hadron colliders

The forward-backward charge asymmetry (AFB) at hadron colliders is sensitive to both the electroweak (EW) symmetry breaking, represented by the effective weak mixing angle, and the proton structure information in the initial state, modeled by the parton distribution functions (PDFs). Due to their strong correlation, the precision with which the weak mixing angle and the PDFs can be determined from the measured AFB spectrum is limited. In this paper, we define a set of structure parameters which factorize the unique proton information on the relative difference between quarks and antiquarks in the AFB observation. Unlike the conventional way of extracting the weak mixing angle from the convolution of PDF and EW calculations, we propose a new method to simultaneously determine the value of the weak mixing angle and the proton structure terms by fitting to the observed AFB distribution. We also point out the necessity of specifying additional observations to further reduce the uncertainties on the individual proton structure terms, so that model-independent high-precision measurements can be achieved at future LHC experiments.

Preprint, 2022, arXiv

Hint for a minimal interaction length in $e^+e^-\to\gamma\gamma$ annihilation in total cross section of centre-of-mass energies 55-207 GeV

The measurements of the total cross section of the $e^+e^-\to\gamma\gamma$ reaction from the VENUS, TOPAZ, OPAL, DELPHI, ALEPH and L3 collaborations, collected from 1989 to 2003, are used to perform a $\chi^{2}$ test to search for a finite interaction length in the direct contact term. Comparing the experimental total cross section with the QED cross section in a $\chi^{2}$ test makes it possible to set a limit on a finite interaction length, $r_{e}=(1.25\pm 0.16)\times 10^{-17}$ cm. In direct-contact-term annihilation, this interaction length is a measure of the size of the electron.

Preprint, 2022, arXiv

On Schur Algebras and Derivations of Free Lie Algebras

We investigate the action of the Schur algebra on the Lie algebras of derivations of free Lie algebras and the operad structures constructed from it. We also show that the Lie algebra of derivations is generated by quadratic derivations together with the action of the Schur operad. Applications to certain subgroups of the automorphism group of a finitely generated free group are given as well.

Preprint, 2022, arXiv

Selective Output Smoothing Regularization: Regularize Neural Networks by Softening Output Distributions

In this paper, we propose Selective Output Smoothing Regularization (SOSR), a novel regularization method for training Convolutional Neural Networks (CNNs). Inspired by the diverse effects that different samples have on training, Selective Output Smoothing Regularization improves performance by encouraging the model to produce equal logits on incorrect classes for samples that it classifies correctly and over-confidently. This plug-and-play regularization method can be conveniently incorporated into almost any CNN-based project without extra hassle. Extensive experiments show that Selective Output Smoothing Regularization consistently achieves significant improvement on image classification benchmarks such as CIFAR-100, Tiny ImageNet, ImageNet, and CUB-200-2011. In particular, our method obtains 77.30% accuracy on ImageNet with ResNet-50, a gain of 1.1% over the baseline (76.2%). We also empirically demonstrate that our method yields further improvements when combined with other widely used regularization techniques. On Pascal detection, using the SOSR-trained ImageNet classifier as the pretrained model leads to better detection performance.
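
The selection rule described in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual loss, so everything here is an assumption: the penalty is taken to be the variance of the logits on the incorrect classes, the over-confidence test is a fixed probability threshold, and the names `selective_smoothing_penalty` and `confidence_threshold` are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def selective_smoothing_penalty(logits, label, confidence_threshold=0.99):
    """Illustrative penalty, not the paper's loss.

    Returns 0.0 unless the sample is classified correctly AND the
    probability of the true class exceeds confidence_threshold
    (an assumed criterion). Otherwise returns the variance of the logits
    on the incorrect classes, which is zero when those logits are equal.
    """
    probs = softmax(logits)
    predicted = max(range(len(logits)), key=lambda i: logits[i])
    if predicted != label or probs[label] <= confidence_threshold:
        return 0.0
    wrong = [z for i, z in enumerate(logits) if i != label]
    mean = sum(wrong) / len(wrong)
    return sum((z - mean) ** 2 for z in wrong) / len(wrong)


# Over-confident, correct sample with unequal wrong-class logits: penalized.
confident = selective_smoothing_penalty([10.0, 1.0, -1.0], label=0)
# Correct but not over-confident sample: left alone.
uncertain = selective_smoothing_penalty([2.0, 1.0, 0.0], label=0)
```

Under these assumptions, adding such a term to the usual classification loss would push the wrong-class logits toward a common value only for the selected samples.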

Preprint, 2022, arXiv

Temporally Resolution Decrement: Utilizing the Shape Consistency for Higher Computational Efficiency

Image resolution, which is closely related to accuracy and computational cost, plays a pivotal role in network training. In this paper, we observe that a reduced image retains relatively complete shape semantics but loses extensive texture information. Inspired by the consistency of the shape semantics as well as the fragility of the texture information, we propose a novel training strategy named Temporally Resolution Decrement. Specifically, we randomly reduce the training images to a smaller resolution in the time domain. When training alternates between the reduced images and the original images, the unstable texture information in the images results in a weaker correlation between the texture-related patterns and the correct label, naturally forcing the model to rely more on shape properties that are robust and conform to the human decision rule. Surprisingly, our approach greatly improves both the training and inference efficiency of convolutional neural networks. On ImageNet classification, using only 33% of the computation (randomly reducing the training image to 112$\times$112 in 90% of the epochs) still improves ResNet-50 from 76.32% to 77.71%. Combined with the strong training procedure for ResNet-50 on ImageNet, our method achieves 80.42% top-1 accuracy while saving 37.5% of the computational overhead. To the best of our knowledge, this is the highest ImageNet single-crop accuracy on ResNet-50 under 224$\times$224 without extra data or distillation.
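
The 33% figure quoted in this abstract is consistent with a simple back-of-the-envelope estimate. The sketch below assumes that compute per image scales with its pixel count and that the full training resolution is 224$\times$224 (the abstract names 224$\times$224 only for evaluation); the function name and parameters are hypothetical, not taken from the paper.

```python
def relative_training_cost(full_res, reduced_res, reduced_fraction):
    """Estimate training compute relative to always training at full_res.

    Assumes cost per image is proportional to its pixel count, which is
    only a rough approximation for convolutional networks.
    reduced_fraction is the share of epochs run at reduced_res.
    """
    per_image_ratio = (reduced_res / full_res) ** 2
    return reduced_fraction * per_image_ratio + (1.0 - reduced_fraction)


# 90% of epochs at 112x112, the remaining 10% at 224x224.
cost = relative_training_cost(full_res=224, reduced_res=112, reduced_fraction=0.9)
print(f"{cost:.3f}")
```

A 112$\times$112 image has one quarter of the pixels of a 224$\times$224 image, so the estimate is 0.9 × 0.25 + 0.1 = 0.325, roughly a third of the full-resolution cost.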

Preprint, 2022, arXiv

White Paper Assistance: A Step Forward Beyond the Shortcut Learning

The promising performance of CNNs often overshadows the need to examine whether they are solving tasks in the way we actually intend. We show through experiments that even over-parameterized models still solve a dataset by recklessly leveraging spurious correlations, or so-called 'shortcuts'. To combat this unintended propensity, we borrow the idea of a printer test page and propose a novel approach called White Paper Assistance. Our proposed method uses a white paper to detect the extent to which the model prefers certain characteristic patterns, and alleviates this preference by forcing the model to make a random guess on the white paper. We show consistent accuracy improvements across various architectures, datasets and combinations with other techniques. Experiments also demonstrate the versatility of our approach on fine-grained recognition, imbalanced classification and robustness to corruptions.
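
One way to read "forcing the model to make a random guess on the white paper" is as a penalty that vanishes when the prediction on an all-white input is uniform over the classes. The sketch below uses the KL divergence from the uniform distribution for that purpose; this choice, and the name `kl_to_uniform`, are assumptions for illustration and are not stated in the abstract.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_to_uniform(logits_on_white_image):
    """KL(p || uniform) for p = softmax(logits).

    Equals 0 when the model guesses uniformly at random on the white
    input and grows as the model prefers particular classes.
    """
    probs = softmax(logits_on_white_image)
    k = len(probs)
    return sum(p * math.log(p * k) for p in probs if p > 0.0)


# No class preference on the white input: zero penalty.
unbiased = kl_to_uniform([0.0, 0.0, 0.0, 0.0])
# Strong preference for class 0 on the white input: positive penalty.
biased = kl_to_uniform([5.0, 0.0, 0.0, 0.0])
```

The value of such a penalty on the white input also serves as a rough measure of how strongly the model prefers certain classes in the absence of any image content.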

Preprint, 2021, arXiv

Reduction of the electroweak correlation in the PDF updating by using the forward-backward asymmetry of Drell-Yan process

We propose a new observable for the measurement of the forward-backward asymmetry $(A_{FB})$ in Drell-Yan lepton production. At hadron colliders, the $A_{FB}$ distribution is sensitive to both the electroweak (EW) fundamental parameter $\sin^2 \theta_{W}$, the weak mixing angle, and the parton distribution functions (PDFs). Hence, the determination of $\sin^2 \theta_{W}$ and the updating of PDFs by directly using the same $A_{FB}$ spectrum are strongly correlated. This correlation would introduce large bias or uncertainty into precise measurements in both the EW and PDF sectors. In this article, we show that the sensitivity of $A_{FB}$ to $\sin^2 \theta_{W}$ is dominated by its average value around the $Z$ pole region, while the shape (or gradient) of the $A_{FB}$ spectrum is insensitive to $\sin^2 \theta_{W}$ and contains important information on the PDF modeling. Accordingly, a new observable related to the gradient of the spectrum is introduced and is demonstrated to significantly reduce the potential bias on the determination of $\sin^2 \theta_{W}$ when updating the PDFs using the same $A_{FB}$ data.
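
The distinction between the average value and the gradient of the $A_{FB}$ spectrum can be made concrete with a small sketch. It uses the standard counting definition $A_{FB} = (N_F - N_B)/(N_F + N_B)$ per mass bin; the mean and least-squares slope below are illustrative stand-ins only, since the abstract does not specify the paper's actual observable. All function names are hypothetical.

```python
def forward_backward_asymmetry(n_forward, n_backward):
    """Standard counting definition: (N_F - N_B) / (N_F + N_B)."""
    return (n_forward - n_backward) / (n_forward + n_backward)


def mean_and_slope(xs, ys):
    """Return the mean of ys and the least-squares slope of ys against xs.

    Used here as simple proxies for the 'average value' and 'gradient'
    of a binned spectrum; not the observable defined in the paper.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return y_mean, sxy / sxx


# Single-bin example: 60 forward and 40 backward events.
afb = forward_backward_asymmetry(60, 40)
# Sanity check of the helper on an exactly linear spectrum y = 1 + 2x.
average, slope = mean_and_slope([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

In this picture, a uniform shift of every bin changes the mean but leaves the slope unchanged, which is the kind of separation between the two quantities that the abstract relies on.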