Source author record

Xiaoyong Wei

Xiaoyong Wei appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Computer Vision, cond-mat.mtrl-sci, Information Retrieval, Multimedia, Artificial Intelligence, cond-mat.mes-hall, Databases, physics.app-ph

Catalog footprint

What is connected

7 works

8 topics

4 close collaborators

preprint, 2023, arXiv

Flexo-photovoltaic effect and above-bandgap photovoltage in halide perovskites

Halide perovskites have outstanding photovoltaic properties which have been optimized through interfacial engineering. However, as these materials approach the limits imposed by the physics of semiconductor junctions, it is urgent to explore alternatives, such as the bulk photovoltaic effect, whose physical origin is different and not bound by the same limits. In this context, we focus on the flexo-photovoltaic effect, a type of bulk photovoltaic effect that was recently observed in oxides under strain gradients. We have measured the flexo-photovoltaic effect of MAPbBr3 and MAPbI3 crystals under bending and found it to be orders of magnitude larger than for SrTiO3, the benchmark flexo-photovoltaic oxide. For sufficiently large strain gradients, photovoltages bigger than the bandgap can be produced. Bulk photovoltaic effects are additive and, for MAPbI3, the flexo-photovoltage exists on top of a native bulk photovoltage that is hysteretic, consistent with the electrically switchable macroscopic polarization of this material. The results suggest that harnessing the flexo-photovoltaic effect through strain gradient engineering can provide a functional leap forward for halide perovskites.

preprint, 2022, arXiv

Deep learning-based person re-identification methods: A survey and outlook of recent works

In recent years, with the increasing demand for public safety and the rapid development of intelligent surveillance networks, person re-identification (Re-ID) has become one of the hot research topics in the computer vision field. The main research goal of person Re-ID is to retrieve persons with the same identity from different cameras. However, traditional person Re-ID methods require manual marking of person targets, which incurs high labor costs. With the widespread application of deep neural networks, many deep learning-based person Re-ID methods have emerged. This paper therefore aims to help researchers understand the latest research results and future trends in the field. First, we summarize several recently published person Re-ID surveys and complement them with the latest research methods to systematically classify deep learning-based person Re-ID methods. Second, we propose a multi-dimensional taxonomy that classifies current deep learning-based person Re-ID methods into four categories according to metric and representation learning: deep metric learning, local feature learning, generative adversarial learning and sequence feature learning. Furthermore, we subdivide these four categories according to their methodologies and motivations, discussing the advantages and limitations of some subcategories. Finally, we discuss some challenges and possible research directions for person Re-ID.

preprint, 2022, arXiv

Entity-Graph Enhanced Cross-Modal Pretraining for Instance-level Product Retrieval

Our goal in this research is to study a more realistic environment in which we can conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories. We first contribute the Product1M datasets and define two practical instance-level retrieval tasks to enable evaluation on price comparison and personalized recommendation. For both instance-level tasks, accurately pinpointing the product target mentioned in the visual-linguistic data and effectively decreasing the influence of irrelevant content is quite challenging. To address this, we train a more effective cross-modal pretraining model that can adaptively incorporate key concept information from the multi-modal data, using an entity graph whose nodes and edges respectively denote entities and the similarity relations between entities. Specifically, a novel Entity-Graph Enhanced Cross-Modal Pretraining (EGE-CMP) model is proposed for instance-level commodity retrieval. It explicitly injects entity knowledge in both node-based and subgraph-based ways into the multi-modal networks via a self-supervised hybrid-stream transformer, which can reduce the confusion between different object contents and thereby guide the network to focus on entities with real semantics. Experimental results verify the efficacy and generalizability of EGE-CMP, which outperforms several SOTA cross-modal baselines such as CLIP, UNITER and CAPTURE.

preprint, 2022, arXiv

Global-Local Dynamic Feature Alignment Network for Person Re-Identification

The misalignment of human images caused by bounding box detection errors or partial occlusions is one of the main challenges in person Re-Identification (Re-ID) tasks. Previous local-based methods mainly focus on learning local features in predefined semantic regions of pedestrians. These methods usually use local hard alignment or introduce auxiliary information such as key human pose points to match local features, which is often not applicable when large scene differences are encountered. To solve these problems, we propose a simple and efficient Local Sliding Alignment (LSA) strategy that dynamically aligns the local features of two images by setting a sliding window on the local stripes of the pedestrian. LSA can effectively suppress spatial misalignment and does not need extra supervision information. We then design a Global-Local Dynamic Feature Alignment Network (GLDFA-Net) framework, which contains both global and local branches. We introduce LSA into the local branch of GLDFA-Net to guide the computation of distance metrics, which can further improve accuracy in the testing phase. Evaluation experiments on several mainstream datasets, including Market-1501, DukeMTMC-reID, CUHK03 and MSMT17, show that our method achieves accuracy competitive with several state-of-the-art person Re-ID methods. Specifically, it achieves 86.1% mAP and 94.8% Rank-1 accuracy on Market-1501.
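The sliding-window idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each image is already represented as a list of per-stripe feature vectors ordered top to bottom, that stripes are compared by Euclidean distance, and that the per-stripe minimum distances are simply summed. The function name `sliding_alignment_distance` and the `window` parameter are hypothetical.

```python
import math


def sliding_alignment_distance(stripes_a, stripes_b, window=1):
    """Illustrative stripe-alignment distance (an assumption, not the paper's exact LSA).

    stripes_a, stripes_b: lists of equal length; each item is the feature
    vector of one horizontal stripe, ordered top to bottom.
    window: how many stripes above and below position i may be matched.
    """
    if len(stripes_a) != len(stripes_b):
        raise ValueError("both images need the same number of stripes")
    n = len(stripes_a)
    total = 0.0
    for i, feat in enumerate(stripes_a):
        # Candidate stripes in image B lie within the sliding window around i.
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        # Keep the best (smallest) distance inside the window.
        total += min(math.dist(feat, stripes_b[j]) for j in range(lo, hi))
    return total


# Toy data: image B shows the same content shifted down by one stripe.
image_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_b = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
hard = sliding_alignment_distance(image_a, image_b, window=0)
soft = sliding_alignment_distance(image_a, image_b, window=1)
```

With `window=0` the function reduces to hard stripe-to-stripe matching, so the shifted toy image is penalized; widening the window lets shifted stripes find their counterparts and lowers the distance.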

preprint, 2022, arXiv

Indicative Image Retrieval: Turning Blackbox Learning into Grey

Deep learning became the game changer for image retrieval soon after it was introduced. It made feature extraction (by representation learning) the core of image retrieval, reducing relevance/matching evaluation to simple similarity metrics. In many applications, we need the matching evidence to be indicated rather than just a ranked list (e.g., the locations of the target proteins/cells/lesions in medical images). This is analogous to highlighting matched words in search engines. However, this is not easy to implement without explicit relevance/matching modeling. The deep representation learning models are not feasible because of their blackbox nature. In this paper, we revisit the importance of relevance/matching modeling in the deep learning era with an indicative retrieval setting. The study shows that it is possible to skip the representation learning and model the matching evidence directly. Removing the dependency on pre-trained models avoids many related issues (e.g., the domain gap between classification and retrieval, the detail diffusion caused by convolution, and so on). More importantly, the study demonstrates that the matching can be explicitly modeled and backtracked later to generate the matching evidence indications, which can improve the explainability of deep inference. Our method achieves the best performance reported in the literature on both Oxford-5k and Paris-6k, setting a new record of 97.77% on Oxford-5k (97.81% on Paris-6k) without extracting any deep features.

preprint, 2022, arXiv

M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining

Despite the potential of multi-modal pre-training to learn highly discriminative feature representations from complementary data modalities, current progress is being slowed by the lack of large-scale modality-diverse datasets. By leveraging the natural suitability of E-commerce, where different modalities capture complementary semantic information, we contribute a large-scale multi-modal pre-training dataset M5Product. The dataset comprises 5 modalities (image, text, table, video, and audio), covers over 6,000 categories and 5,000 attributes, and is 500× larger than the largest publicly available dataset with a similar number of modalities. Furthermore, M5Product contains incomplete modality pairs and noise while also having a long-tailed distribution, resembling most real-world problems. We further propose Self-harmonized ContrAstive LEarning (SCALE), a novel pretraining framework that integrates the different modalities into a unified model through an adaptive feature fusion mechanism, where the importance of each modality is learned directly from the modality embeddings and impacts the inter-modality contrastive learning and masked tasks within a multi-modal transformer model. We evaluate the current multi-modal pre-training state-of-the-art approaches and benchmark their ability to learn from unlabeled data when faced with the large number of modalities in the M5Product dataset. We conduct extensive experiments on four downstream tasks and demonstrate the superiority of our SCALE model, providing insights into the importance of dataset scale and diversity.
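As a rough illustration of what "importance learned directly from the modality embeddings" could look like, the sketch below scores each available modality embedding against a gating vector, normalizes the scores with a softmax, and forms a weighted sum. This is an assumption for illustration only, not the SCALE architecture: the gating vector `gate` stands in for learned parameters, and `fuse_modalities` is a hypothetical name. A missing modality is handled by leaving it out of the input dictionary.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def fuse_modalities(embeddings, gate):
    """Weighted fusion of modality embeddings (illustrative assumption).

    embeddings: dict mapping modality name -> embedding vector; modalities
                that are missing for a product are simply absent.
    gate: hypothetical scoring vector with the same dimension as the
          embeddings, standing in for learned parameters.
    Returns (weights per modality, fused vector).
    """
    names = list(embeddings)
    # Score each modality from its own embedding.
    scores = [sum(g * x for g, x in zip(gate, embeddings[n])) for n in names]
    weights = softmax(scores)
    # Fused embedding is the importance-weighted sum of the modalities.
    fused = [
        sum(w * embeddings[n][k] for w, n in zip(weights, names))
        for k in range(len(gate))
    ]
    return dict(zip(names, weights)), fused


# Toy product with only two of the five modalities present.
toy = {"image": [1.0, 0.0], "text": [0.0, 1.0]}
uniform_weights, uniform_fused = fuse_modalities(toy, gate=[0.0, 0.0])
skewed_weights, skewed_fused = fuse_modalities(toy, gate=[1.0, 0.0])
```

A zero gate yields equal weights, while a gate aligned with one modality's embedding shifts weight toward that modality.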

preprint, 2014, arXiv

More ferroelectrics discovered by switching spectroscopy piezoresponse force microscopy?

The local hysteresis loop obtained by switching spectroscopy piezoresponse force microscopy (SS-PFM) is usually regarded as a typical signature of ferroelectric switching. However, such hysteresis loops have also been observed in a broad variety of non-ferroelectric materials in the past several years, which casts doubt on the view that the local hysteresis loops in SS-PFM originate from ferroelectricity. Therefore, it is crucial to explore the mechanism of local hysteresis loops obtained in SS-PFM testing. Here we propose that non-ferroelectric materials can also exhibit amplitude butterfly loops and phase hysteresis loops in SS-PFM testing due to the Maxwell force, as long as the material shows macroscopic D-E hysteresis loops under cyclic electric field loading, no matter what the inherent physical mechanism is. To verify this, both macroscopic D-E and microscopic SS-PFM testing are conducted on a soda-lime glass and a non-ferroelectric dielectric material, Ba0.4Sr0.6TiO3. Results show that both materials exhibit D-E hysteresis loops and SS-PFM phase hysteresis loops, which supports our view.
