Source author record

Yixuan Xu

Yixuan Xu appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Computer Vision, cond-mat.mtrl-sci

Catalog footprint

What is connected

3 works

2 topics

4 close collaborators

Preprint, 2026, arXiv

Qwen-Image-2.0 Technical Report

We present Qwen-Image-2.0, an omni-capable image generation foundation model that unifies high-fidelity generation and precise image editing within a single framework. Despite recent progress, existing models still struggle with ultra-long text rendering, multilingual typography, high-resolution photorealism, robust instruction following, and efficient deployment, especially in text-rich and compositionally complex scenarios. Qwen-Image-2.0 addresses these challenges by coupling Qwen3-VL as the condition encoder with a Multimodal Diffusion Transformer for joint condition-target modeling, supported by large-scale data curation and a customized multi-stage training pipeline. This enables strong multimodal understanding while preserving flexible generation and editing capabilities. The model supports instructions of up to 1K tokens for generating text-rich content such as slides, posters, infographics, and comics, while significantly improving multilingual text fidelity and typography. It also enhances photorealistic generation with richer details, more realistic textures, and coherent lighting, and follows complex prompts more reliably across diverse styles. Extensive human evaluations show that Qwen-Image-2.0 substantially outperforms previous Qwen-Image models in both generation and editing, marking a step toward more general, reliable, and practical image generation foundation models.

Preprint, 2023, arXiv

Coupling of structure and magnetism to spin splitting in hybrid organic-inorganic perovskites

Hybrid organic-inorganic perovskites are famous for the diversity of their chemical compositions, phases and phase transitions, and associated physical properties. We use a combination of experimental and computational techniques to reveal strong coupling between structure, magnetism, and spin splitting in a representative of the largest family of hybrid organic-inorganic perovskites: the formates. With the help of first-principles simulations, we find spin splitting in both conduction and valence bands of [NH$_2$NH$_3$]Co(HCOO)$_3$, induced by spin-orbit interactions, which can reach up to 14 meV. Our magnetic measurements reveal that this material exhibits canted antiferromagnetism below 15.5 K. The direction of the associated antiferromagnetic order parameter is strongly coupled with the spin splitting already in the centrosymmetric phase, allowing for the creation and annihilation of spin splitting through the application of a magnetic field. Furthermore, the structural phase transition into the experimentally observed polar Pna2$_1$ phase completely changes the aforementioned spin splitting and its coupling to magnetic degrees of freedom. This reveals that in [NH$_2$NH$_3$]Co(HCOO)$_3$, the structure and magnetism are strongly coupled to spin splitting in a way that allows for its manipulation through both magnetic and electric fields. As an example, for a given point inside the Brillouin zone of the centrosymmetric Pnma phase of [NH$_2$NH$_3$]Co(HCOO)$_3$, spin splitting can be turned on/off by aligning the antiferromagnetic vector along certain crystallographic directions or through inducing a polar phase by the application of an electric field. We believe that our findings offer an important step toward fundamental understanding and practical applications of materials with coupled properties.

Preprint, 2022, arXiv

A Versatile Multi-View Framework for LiDAR-based 3D Object Detection with Guidance from Panoptic Segmentation

3D object detection using LiDAR data is an indispensable component for autonomous driving systems. Yet, only a few LiDAR-based 3D object detection methods leverage segmentation information to further guide the detection process. In this paper, we propose a novel multi-task framework that jointly performs 3D object detection and panoptic segmentation. In our method, the 3D object detection backbone in the Bird's-Eye-View (BEV) plane is augmented by the injection of Range-View (RV) feature maps from the 3D panoptic segmentation backbone. This enables the detection backbone to leverage multi-view information to address the shortcomings of each projection view. Furthermore, foreground semantic information is incorporated to ease the detection task by highlighting the locations of each object class in the feature maps. Finally, a new center density heatmap generated based on the instance-level information further guides the detection backbone by suggesting possible box center locations for objects. Our method works with any BEV-based 3D object detection method, and as shown by extensive experiments on the nuScenes dataset, it provides significant performance gains. Notably, the proposed method based on a single-stage CenterPoint 3D object detection network achieved state-of-the-art performance on the nuScenes 3D Detection Benchmark with 67.3 NDS.