Source author record

Lu Jiang

Lu Jiang appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

Computer Vision, Machine Learning, cond-mat.mtrl-sci, Artificial Intelligence, Information Retrieval, Computation and Language, cond-mat.mes-hall, cond-mat.str-el, cs.CY, Multimedia, q-fin.RM, q-fin.ST, Systems and Control

Catalog footprint

What is connected

28 works

13 topics

4 close collaborators

preprint, 2026, arXiv

From Passive Reuse to Active Reasoning: Grounding Large Language Models for Neuro-Symbolic Experience Replay

While experience replay is essential for data efficiency in reinforcement learning (RL), standard methods treat the replay buffer as a passive memory system, prioritizing samples based on numerical prediction errors rather than their semantic significance. This approach stands in contrast to human learning, which accelerates mastery by actively abstracting fragmented experiences into behavioral rules. To bridge this gap, we propose Neuro-Symbolic Experience Replay (NSER), a framework that transforms experience replay from a passive sample reuse mechanism into an active engine for knowledge construction. Specifically, NSER addresses the incompatibility between linguistic reasoning and numerical optimization through a novel neuro-symbolic grounding pipeline. It leverages Large Language Models (LLMs) in a zero-shot manner to induce candidate behavioral rules from accumulated trajectories, grounds these insights into differentiable first-order logic representations, and utilizes the resulting symbolic structures to dynamically reweight the replay distribution. By allowing abstract knowledge to directly shape policy optimization, NSER achieves consistently superior sample efficiency and convergence speed across reactive, rule-based, and procedural benchmarks.
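The idea of reweighting the replay distribution with symbolic structure can be illustrated with a minimal sketch. The abstract gives no formula, so everything here is a hypothetical stand-in rather than the NSER method: the function name `replay_probabilities`, the exponents `alpha` and `beta`, and the assumption that each transition carries a rule-satisfaction score in [0, 1] are all illustrative choices.

```python
import math


def replay_probabilities(td_errors, rule_scores, alpha=0.6, beta=1.0, eps=1e-6):
    """Sampling probabilities that combine a numerical and a symbolic signal.

    td_errors: per-transition prediction errors (the usual priority signal).
    rule_scores: hypothetical per-transition rule-satisfaction scores in [0, 1].
    alpha, beta: illustrative knobs, not values taken from the abstract.
    """
    priorities = [
        (abs(td) + eps) ** alpha * math.exp(beta * score)
        for td, score in zip(td_errors, rule_scores)
    ]
    total = sum(priorities)
    # Normalize so the priorities form a distribution over the buffer.
    return [p / total for p in priorities]


# Equal prediction errors: the transition that matches a rule is sampled more often.
probs = replay_probabilities([1.0, 1.0, 1.0], [0.0, 0.5, 1.0])
```

With `beta=0` the sketch reduces to sampling by prediction error alone, which is the passive baseline the abstract contrasts against.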

preprint, 2026, arXiv

Uni-FinLLM: A Unified Multimodal Large Language Model with Modular Task Heads for Micro-Level Stock Prediction and Macro-Level Systemic Risk Assessment

Financial institutions and regulators require systems that integrate heterogeneous data to assess risks from stock fluctuations to systemic vulnerabilities. Existing approaches often treat these tasks in isolation, failing to capture cross-scale dependencies. We propose Uni-FinLLM, a unified multimodal large language model that uses a shared Transformer backbone and modular task heads to jointly process financial text, numerical time series, fundamentals, and visual data. Through cross-modal attention and multi-task optimization, it learns a coherent representation for micro-, meso-, and macro-level predictions. Evaluated on stock forecasting, credit-risk assessment, and systemic-risk detection, Uni-FinLLM significantly outperforms baselines. It raises stock directional accuracy to 67.4% (from 61.7%), credit-risk accuracy to 84.1% (from 79.6%), and macro early-warning accuracy to 82.3%. Results validate that a unified multimodal LLM can jointly model asset behavior and systemic vulnerabilities, offering a scalable decision-support engine for finance.

preprint, 2026, arXiv

UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

With the rapid growth of video generative models (VGMs), it is essential to develop reliable and comprehensive automatic metrics for AI-generated videos (AIGVs). Existing methods either use off-the-shelf models optimized for other tasks or rely on human assessment data to train specialized evaluators. These approaches are constrained to specific evaluation aspects and are difficult to scale with the increasing demands for finer-grained and more comprehensive evaluations. To address this issue, this work investigates the feasibility of using multimodal large language models (MLLMs) as a unified evaluator for AIGVs, leveraging their strong visual perception and language understanding capabilities. To evaluate the performance of automatic metrics in unified AIGV evaluation, we introduce a benchmark called UVE-Bench. UVE-Bench collects videos generated by state-of-the-art VGMs and provides pairwise human preference annotations across 15 evaluation aspects. Using UVE-Bench, we extensively evaluate 18 MLLMs. Our empirical results suggest that while advanced MLLMs (e.g., Qwen2VL-72B and InternVL2.5-78B) still lag behind human evaluators, they demonstrate promising ability in unified AIGV evaluation, significantly surpassing existing specialized evaluation methods. Additionally, we conduct an in-depth analysis of key design choices that impact the performance of MLLM-driven evaluators, offering valuable insights for future research on AIGV evaluation.

preprint, 2023, arXiv

A Multi-Source Information Learning Framework for Airbnb Price Prediction

With the development of technology and the sharing economy, Airbnb, a well-known short-term rental platform, has become the first choice for many young people. Airbnb pricing has long been a problem worth studying. While previous studies achieve promising results, several deficiencies remain: (1) the feature attributes of rentals are not rich enough; (2) the research on rental text information is not deep enough; (3) few studies predict the rental price in combination with the points of interest (POI) around the house. To address these challenges, we propose a multi-source information embedding (MSIE) model to predict Airbnb rental prices. Specifically, we first select statistical features to embed the original rental data. Second, we generate word feature vectors and emotional scores from three different kinds of text information and combine them to form the text feature embedding. Third, we use the points of interest (POI) around the rental house to generate a variety of spatial network graphs, and learn the network embedding to obtain the spatial feature embedding. Finally, this paper combines the three modules into multi-source rental representations, and uses a fully connected neural network to predict the price. The analysis of the experimental results shows the effectiveness of our proposed model.

preprint, 2023, arXiv

Multi-View MOOC Quality Evaluation via Information-Aware Graph Representation Learning

In this paper, we study the problem of MOOC quality evaluation, which is essential for improving the course materials, promoting students' learning efficiency, and benefiting user services. While achieving promising performance, current works still struggle with the complicated interactions and relationships of entities in MOOC platforms. To tackle these challenges, we formulate the problem as a course representation learning task and develop an Information-aware Graph Representation Learning (IaGRL) method for multi-view MOOC quality evaluation. Specifically, we first build a MOOC Heterogeneous Network (HIN) to represent the interactions and relationships among entities in MOOC platforms. We then decompose the MOOC HIN into multiple single-relation graphs based on meta-paths to depict the multi-view semantics of courses. The course representation learning can be further converted to a multi-view graph representation task. Different from traditional graph representation learning, the learned course representations are expected to match the following three types of validity: (1) the agreement on expressiveness between the raw course portfolio and the learned course representations; (2) the consistency between the representations in each view and the unified representations; (3) the alignment between the course and MOOC platform representations. Therefore, we propose to exploit mutual information for preserving the validity of course representations. We conduct extensive experiments over real-world MOOC datasets to demonstrate the effectiveness of our proposed method.

preprint, 2023, arXiv

Muse: Text-To-Image Generation via Masked Generative Transformers

We present Muse, a text-to-image Transformer model that achieves state-of-the-art image generation performance while being significantly more efficient than diffusion or autoregressive models. Muse is trained on a masked modeling task in discrete token space: given the text embedding extracted from a pre-trained large language model (LLM), Muse is trained to predict randomly masked image tokens. Compared to pixel-space diffusion models, such as Imagen and DALL-E 2, Muse is significantly more efficient due to the use of discrete tokens and requiring fewer sampling iterations; compared to autoregressive models, such as Parti, Muse is more efficient due to the use of parallel decoding. The use of a pre-trained LLM enables fine-grained language understanding, translating to high-fidelity image generation and the understanding of visual concepts such as objects, their spatial relationships, pose, cardinality etc. Our 900M parameter model achieves a new SOTA on CC3M, with an FID score of 6.06. The Muse 3B parameter model achieves an FID of 7.88 on zero-shot COCO evaluation, along with a CLIP score of 0.32. Muse also directly enables a number of image editing applications without the need to fine-tune or invert the model: inpainting, outpainting, and mask-free editing. More results are available at https://muse-model.github.io

preprint, 2022, arXiv

Confident Learning: Estimating Uncertainty in Dataset Labels

Learning exists in the context of data, yet notions of confidence typically focus on model predictions, not label quality. Confident learning (CL) is an alternative approach which focuses instead on label quality by characterizing and identifying label errors in datasets, based on the principles of pruning noisy data, counting with probabilistic thresholds to estimate noise, and ranking examples to train with confidence. Whereas numerous studies have developed these principles independently, here, we combine them, building on the assumption of a class-conditional noise process to directly estimate the joint distribution between noisy (given) labels and uncorrupted (unknown) labels. This results in a generalized CL which is provably consistent and experimentally performant. We present sufficient conditions where CL exactly finds label errors, and show CL performance exceeding seven recent competitive approaches for learning with noisy labels on the CIFAR dataset. Uniquely, the CL framework is not coupled to a specific data modality or model (e.g., we use CL to find several label errors in the presumed error-free MNIST dataset and improve sentiment classification on text data in Amazon Reviews). We also employ CL on ImageNet to quantify ontological class overlap (e.g., estimating 645 "missile" images are mislabeled as their parent class "projectile"), and moderately increase model accuracy (e.g., for ResNet) by cleaning data prior to training. These results are replicable using the open-source cleanlab release.
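The "counting with probabilistic thresholds" principle can be illustrated with a minimal sketch. It assumes that the threshold for each class is the mean predicted probability of that class over the examples given that label, which is one reading of the approach and is not spelled out in the abstract. The function name `confident_joint` and the toy data are hypothetical; the open-source cleanlab release remains the reference implementation.

```python
def confident_joint(labels, pred_probs):
    """Count examples by (given label, confidently predicted label).

    labels: list of given (possibly noisy) integer labels.
    pred_probs: one list of class probabilities per example.
    Assumption: the threshold for class j is the mean predicted probability
    of class j over the examples whose given label is j.
    """
    num_classes = len(pred_probs[0])
    thresholds = []
    for j in range(num_classes):
        probs_j = [p[j] for y, p in zip(labels, pred_probs) if y == j]
        # A class with no examples can never be confidently predicted.
        thresholds.append(sum(probs_j) / len(probs_j) if probs_j else float("inf"))

    counts = [[0] * num_classes for _ in range(num_classes)]
    for y, p in zip(labels, pred_probs):
        # Classes whose probability clears their own threshold.
        confident = [j for j in range(num_classes) if p[j] >= thresholds[j]]
        if confident:
            best = max(confident, key=lambda j: p[j])
            counts[y][best] += 1
    return counts


# Toy data: the third example is labeled 0 but the model is confident it is 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1], [0.8, 0.2], [0.1, 0.9],
    [0.2, 0.8], [0.4, 0.6], [0.05, 0.95],
]
joint = confident_joint(labels, pred_probs)
```

Off-diagonal entries of `joint` count examples whose given label disagrees with a confident prediction, so they serve as candidate label errors. Examples that clear no threshold, such as the fifth one here, are left out of the count.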

preprint, 2022, arXiv

Discrete Representations Strengthen Vision Transformer Robustness

Vision Transformer (ViT) is emerging as the state-of-the-art architecture for image recognition. While recent studies suggest that ViTs are more robust than their convolutional counterparts, our experiments find that ViTs trained on ImageNet are overly reliant on local textures and fail to make adequate use of shape information. ViTs thus have difficulties generalizing to out-of-distribution, real-world data. To address this deficiency, we present a simple and effective architecture modification to ViT's input layer by adding discrete tokens produced by a vector-quantized encoder. Different from the standard continuous pixel tokens, discrete tokens are invariant under small perturbations and contain less information individually, which encourages ViTs to learn global information that is invariant. Experimental results demonstrate that adding discrete representation on four architecture variants strengthens ViT robustness by up to 12% across seven ImageNet robustness benchmarks while maintaining the performance on ImageNet.

preprint, 2022, arXiv

Improved Masked Image Generation with Token-Critic

Non-autoregressive generative transformers recently demonstrated impressive image generation performance, and orders of magnitude faster sampling than their autoregressive counterparts. However, optimal parallel sampling from the true joint distribution of visual tokens remains an open challenge. In this paper we introduce Token-Critic, an auxiliary model to guide the sampling of a non-autoregressive generative transformer. Given a masked-and-reconstructed real image, the Token-Critic model is trained to distinguish which visual tokens belong to the original image and which were sampled by the generative transformer. During non-autoregressive iterative sampling, Token-Critic is used to select which tokens to accept and which to reject and resample. Coupled with Token-Critic, a state-of-the-art generative transformer significantly improves its performance, and outperforms recent diffusion models and GANs in terms of the trade-off between generated image quality and diversity, in the challenging class-conditional ImageNet generation.

preprint, 2022, arXiv

MaskGIT: Masked Generative Image Transformer

Generative transformers have experienced rapid popularity growth in the computer vision community in synthesizing high-fidelity and high-resolution images. The best generative transformer models so far, however, still treat an image naively as a sequence of tokens, and decode an image sequentially following the raster scan ordering (i.e. line-by-line). We find this strategy neither optimal nor efficient. This paper proposes a novel image synthesis paradigm using a bidirectional transformer decoder, which we term MaskGIT. During training, MaskGIT learns to predict randomly masked tokens by attending to tokens in all directions. At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation. Our experiments demonstrate that MaskGIT significantly outperforms the state-of-the-art transformer model on the ImageNet dataset, and accelerates autoregressive decoding by up to 64x. Besides, we illustrate that MaskGIT can be easily extended to various image editing tasks, such as inpainting, extrapolation, and image manipulation.
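The inference procedure described above (predict all tokens at once, then refine iteratively) can be sketched as a small decoding loop. This is only an illustration under stated assumptions: the cosine mask schedule and the rule of re-masking the least confident positions are assumptions not given in the abstract, `toy_predict` is a random stand-in for the bidirectional transformer, and `MASK`, `num_masked`, and `iterative_decode` are hypothetical names.

```python
import math
import random

MASK = -1  # hypothetical id marking a masked position


def num_masked(step, total_steps, seq_len):
    """Tokens left masked after `step` (1-based), using a cosine schedule (an assumption)."""
    ratio = math.cos(math.pi / 2 * step / total_steps)
    return int(math.floor(ratio * seq_len))


def toy_predict(tokens, vocab_size, rng):
    """Stand-in for the transformer: a random (token, confidence) guess per position."""
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def iterative_decode(seq_len=16, vocab_size=8, total_steps=4, seed=0):
    rng = random.Random(seed)
    tokens = [MASK] * seq_len
    for step in range(1, total_steps + 1):
        guesses = toy_predict(tokens, vocab_size, rng)
        # Fill every masked position in parallel; keep already-fixed tokens.
        proposal = [g[0] if t == MASK else t for t, g in zip(tokens, guesses)]
        # Fixed tokens get infinite confidence so they are never re-masked.
        conf = [g[1] if t == MASK else float("inf") for t, g in zip(tokens, guesses)]
        k = num_masked(step, total_steps, seq_len)
        # Re-mask the k least confident positions for the next round.
        remask = set(sorted(range(seq_len), key=lambda i: conf[i])[:k])
        tokens = [MASK if i in remask else proposal[i] for i in range(seq_len)]
    return tokens


decoded = iterative_decode()
```

With 16 tokens and 4 steps, this schedule leaves 14, 11, 6, and then 0 positions masked, so the sequence is complete after a fixed number of parallel passes instead of one pass per token.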

preprint, 2022, arXiv

Pyramid Adversarial Training Improves ViT Performance

Aggressive data augmentation is a key component of the strong generalization capabilities of Vision Transformer (ViT). One such data augmentation technique is adversarial training (AT); however, many prior works have shown that this often results in poor clean accuracy. In this work, we present pyramid adversarial training (PyramidAT), a simple and effective technique to improve ViT's overall performance. We pair it with a "matched" Dropout and stochastic depth regularization, which adopts the same Dropout and stochastic depth configuration for the clean and adversarial samples. Similar to the improvements on CNNs by AdvProp (not directly applicable to ViT), our pyramid adversarial training breaks the trade-off between in-distribution accuracy and out-of-distribution robustness for ViT and related architectures. It leads to 1.82% absolute improvement on ImageNet clean accuracy for the ViT-B model when trained only on ImageNet-1K data, while simultaneously boosting performance on 7 ImageNet robustness metrics, by absolute numbers ranging from 1.76% to 15.68%. We set a new state-of-the-art for ImageNet-C (41.42 mCE), ImageNet-R (53.92%), and ImageNet-Sketch (41.04%) without extra data, using only the ViT-B/16 backbone and our pyramid adversarial training. Our code is publicly available at pyramidat.github.io.

preprint, 2020, arXiv

AdvAug: Robust Adversarial Augmentation for Neural Machine Translation

In this paper, we propose a new adversarial augmentation method for Neural Machine Translation (NMT). The main idea is to minimize the vicinal risk over virtual sentences sampled from two vicinity distributions, of which the crucial one is a novel vicinity distribution for adversarial sentences that describes a smooth interpolated embedding space centered around observed training sentence pairs. We then discuss our approach, AdvAug, to train NMT models using the embeddings of virtual sentences in sequence-to-sequence learning. Experiments on Chinese-English, English-French, and English-German translation benchmarks show that AdvAug achieves significant improvements over the Transformer (up to 4.9 BLEU points), and substantially outperforms other data augmentation techniques (e.g. back-translation) without using extra corpora.
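The "smooth interpolated embedding space" can be illustrated with a minimal sketch of sampling one virtual sentence between two embedded sentences. The mixup-style form below, with a coefficient drawn from a Beta distribution, is an assumption about the shape of the vicinity distribution and not a statement of the AdvAug procedure. The function name, the `alpha` value, and the requirement that both sentences have the same length are illustrative.

```python
import random


def interpolate_embeddings(emb_a, emb_b, alpha=0.4, rng=None):
    """Return a virtual sentence embedding between emb_a and emb_b.

    emb_a, emb_b: lists of token embeddings (lists of floats) of equal length,
    for example after padding. alpha is an illustrative Beta parameter.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # interpolation weight in [0, 1]
    mixed = [
        [lam * a + (1.0 - lam) * b for a, b in zip(tok_a, tok_b)]
        for tok_a, tok_b in zip(emb_a, emb_b)
    ]
    return mixed, lam


# Two one-token "sentences" with two-dimensional embeddings.
virtual, lam = interpolate_embeddings([[0.0, 2.0]], [[1.0, 4.0]], rng=random.Random(0))
```

In such a setup one of the two inputs would be an adversarially perturbed sentence, and the same weight `lam` would typically be applied on the target side so that the virtual source and target stay paired.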

preprint, 2020, arXiv

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

Performing controlled experiments on noisy data is essential in understanding deep learning across noise levels. Due to the lack of suitable datasets, previous research has only examined deep learning on controlled synthetic label noise, and real-world label noise has never been studied in a controlled setting. This paper makes three contributions. First, we establish the first benchmark of controlled real-world label noise from the web. This new benchmark enables us to study the web label noise in a controlled setting for the first time. The second contribution is a simple but effective method to overcome both synthetic and real noisy labels. We show that our method achieves the best result on our dataset as well as on two public benchmarks (CIFAR and WebVision). Third, we conduct the largest study by far into understanding deep neural networks trained on noisy labels across different noise levels, noise types, network architectures, and training settings. The data and code are released at the following link: http://www.lujiang.info/cnlw.html

preprint, 2020, arXiv

Neural Design Network: Graphic Layout Generation with Constraints

Graphic design is essential for visual communication with layouts being fundamental to composing attractive designs. Layout generation differs from pixel-level image synthesis and is unique in terms of the requirement of mutual relations among the desired components. We propose a method for design layout generation that can satisfy user-specified constraints. The proposed neural design network (NDN) consists of three modules. The first module predicts a graph with complete relations from a graph with user-specified relations. The second module generates a layout from the predicted graph. Finally, the third module fine-tunes the predicted layout. Quantitative and qualitative experiments demonstrate that the generated layouts are visually similar to real design layouts. We also construct real designs based on predicted layouts for a better understanding of the visual quality. Finally, we demonstrate a practical application on layout recommendation.

preprint, 2020, arXiv

RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval

Image generation from scene description is a cornerstone technique for controlled generation, which is beneficial to applications such as content creation and image editing. In this work, we aim to synthesize images from scene description with retrieved patches as reference. We propose a differentiable retrieval module. With the differentiable retrieval module, we can (1) make the entire pipeline end-to-end trainable, enabling the learning of better feature embedding for retrieval; (2) encourage the selection of mutually compatible patches with additional objective functions. We conduct extensive quantitative and qualitative experiments to demonstrate that the proposed method can generate realistic and diverse images, where the retrieved patches are reasonable and mutually compatible.

preprint, 2020, arXiv

Revisiting EmbodiedQA: A Simple Baseline and Beyond

In Embodied Question Answering (EmbodiedQA), an agent interacts with an environment to gather necessary information for answering user questions. Existing works have laid a solid foundation towards solving this interesting problem. But the current performance, especially in navigation, suggests that EmbodiedQA might be too challenging for contemporary approaches. In this paper, we empirically study this problem and introduce 1) a simple yet effective baseline that achieves promising performance; 2) an easier and practical setting for EmbodiedQA where an agent has a chance to adapt the trained model to a new environment before it actually answers users' questions. In this new setting, we randomly place a few objects in new environments, and upgrade the agent policy by a distillation network to retain the generalization ability from the trained model. On the EmbodiedQA v1 benchmark, under the standard setting, our simple baseline achieves results very competitive with the state of the art; in the new setting, we find that the introduced small change in settings yields a notable gain in navigation.

preprint, 2020, arXiv

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction

This paper studies the problem of predicting future trajectories of people in unseen cameras of novel scenarios and views. We approach this problem through the real-data-free setting in which the model is trained only on 3D simulation data and applied out-of-the-box to a wide variety of real cameras. We propose a novel approach to learn robust representation through augmenting the simulation training data such that the representation can better generalize to unseen real-world test data. The key idea is to mix the feature of the hardest camera view with the adversarial feature of the original view. We refer to our method as SimAug. We show that SimAug achieves promising results on three real-world benchmarks using zero real training data, and state-of-the-art performance in the Stanford Drone and the VIRAT/ActEV dataset when using in-domain training data.

preprint, 2020, arXiv

Simplifying Reinforced Feature Selection via Restructured Choice Strategy of Single Agent

Feature selection aims to select a subset of features to optimize the performance of downstream predictive tasks. Recently, multi-agent reinforced feature selection (MARFS) has been introduced to automate feature selection, by creating an agent for each feature to select or deselect the corresponding feature. Although MARFS enjoys the automation of the selection process, it suffers not only from the data complexity in terms of contents and dimensionality, but also from computational costs that increase exponentially with the number of agents. This concern leads to a new research question: Can we simplify the selection process of agents under the reinforcement learning context so as to improve the efficiency and reduce the costs of feature selection? To address the question, we develop a single-agent reinforced feature selection approach integrated with a restructured choice strategy. Specifically, the restructured choice strategy includes: 1) we exploit only one single agent to handle the selection task of multiple features, instead of using multiple agents; 2) we develop a scanning method to empower the single agent to make multiple selection/deselection decisions in each round of scanning; 3) we exploit the relevance of features to predictive labels to prioritize the scanning order of the agent over multiple features; 4) we propose a convolutional auto-encoder algorithm, integrated with the encoded index information of features, to improve state representation; 5) we design a reward scheme that takes into account both prediction accuracy and feature redundancy to facilitate the exploration process. Finally, we present extensive experimental results to demonstrate the efficiency and effectiveness of the proposed method.

preprint, 2020, arXiv

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

This paper studies the problem of predicting the distribution over multiple possible future paths of people as they move through various visual scenes. We make two main contributions. The first contribution is a new dataset, created in a realistic 3D simulator, which is based on real world trajectory data, and then extrapolated by human annotators to achieve different latent goals. This provides the first benchmark for quantitative evaluation of the models to predict multi-future trajectories. The second contribution is a new model to generate multiple plausible future trajectories, which contains novel designs of using multi-scale location encodings and convolutional RNNs over graphs. We refer to our model as Multiverse. We show that our model achieves the best results on our dataset, as well as on the real-world VIRAT/ActEV dataset (which just contains one possible future).

preprint, 2016, arXiv

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning

Learning video concept detectors automatically from big but noisy web data with no additional manual annotations is a novel but challenging area in the multimedia and machine learning communities. A considerable number of videos on the web are associated with rich but noisy contextual information, such as the title, which provides weak annotations or labels about the video content. To leverage the big noisy web labels, this paper proposes a novel method called WEbly-Labeled Learning (WELL), which is built on a state-of-the-art machine learning algorithm inspired by the human learning process. WELL introduces a number of novel multi-modal approaches to incorporate meaningful prior knowledge, called curriculum, from the noisy web videos. To investigate this problem, we empirically study the curriculum constructed from the multi-modal features of videos collected from YouTube and Flickr. The efficacy and the scalability of WELL have been extensively demonstrated on two public benchmarks, including the largest multimedia dataset and the largest manually-labeled video set. The comprehensive experimental results demonstrate that WELL outperforms state-of-the-art studies by a statistically significant margin on learning concepts from noisy web video data. In addition, the results also verify that WELL is robust to the level of noisiness in the video data. Notably, WELL trained on sufficient noisy web labels is able to achieve accuracy comparable to supervised learning methods trained on clean manually-labeled data.

preprint, 2016, arXiv

Strategies for Searching Video Content with Text Queries or Video Examples

The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search. However, metadata is often lacking for user-generated videos, so these videos are unsearchable by current search engines. Content-based video retrieval (CBVR) tackles this metadata-scarcity problem by directly analyzing the visual and audio streams of each video. CBVR encompasses multiple research topics, including low-level feature design, feature fusion, semantic detector training and video search/reranking. We present novel strategies in these topics to enhance CBVR in both accuracy and speed under different query inputs, including pure textual queries and query by video examples. Our proposed strategies have been incorporated into our submission for the TRECVID 2014 Multimedia Event Detection evaluation, where our system outperformed other submissions in both text queries and video example queries, thus demonstrating the effectiveness of our proposed approaches.

preprint, 2016, arXiv

What Objective Does Self-paced Learning Indeed Optimize?

Self-paced learning (SPL) is a recently proposed methodology designed by simulating the learning principle of humans and animals. A variety of SPL realization schemes have been designed for different computer vision and pattern recognition tasks, and empirically shown to be effective in these applications. However, its theoretical underpinnings remain uninvestigated. To address this issue, this study attempts to provide some new theoretical understanding of the SPL scheme. Specifically, we prove that the solving strategy of SPL accords with a majorization minimization algorithm implemented on a latent objective function. Furthermore, we find that the loss function contained in this latent objective has a similar configuration to the non-convex regularized penalty (NSPR) known in statistics and machine learning. This connection inspires us to discover a more intrinsic relationship between SPL regimes and NSPR forms, such as SCAD, LOG and EXP. The robustness of SPL can then be explained in detail. We also analyze the capability of SPL in terms of its easy-loss-prior embedding property, and provide an insightful interpretation of the mechanism behind the effectiveness of previous SPL variations. Besides, we design a group-partial-order loss prior, which is especially useful for weakly labeled large-scale data processing tasks. By applying SPL with this loss prior to the FCVID dataset, which is currently one of the biggest manually annotated video datasets, our method achieves state-of-the-art performance beyond previous methods, which further supports the proposed theoretical arguments.
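The link between the SPL solving strategy and a latent objective can be made concrete for the simplest case. The sketch below assumes the original hard-threshold SPL regularizer, which the abstract does not spell out: for fixed model parameters the optimal sample weight is 1 when the loss is below the age parameter `lam` and 0 otherwise, and minimizing over the weights leaves a capped loss `min(loss, lam)` per sample, up to a constant. The function names are hypothetical.

```python
def spl_weights(losses, lam):
    """Closed-form weight update for hard SPL: keep only the 'easy' samples."""
    return [1.0 if loss < lam else 0.0 for loss in losses]


def spl_objective(losses, weights, lam):
    """SPL objective for fixed model parameters: sum(v_i * l_i) - lam * sum(v_i)."""
    return sum(v * l for v, l in zip(weights, losses)) - lam * sum(weights)


def latent_loss(loss, lam):
    """Per-sample latent loss implied by hard SPL: the loss capped at lam."""
    return min(loss, lam)


losses = [0.1, 0.5, 2.0]
lam = 1.0
weights = spl_weights(losses, lam)  # the sample with loss 2.0 is dropped
# With optimal weights, the SPL objective equals the summed latent loss minus n * lam.
lhs = spl_objective(losses, weights, lam)
rhs = sum(latent_loss(l, lam) for l in losses) - lam * len(losses)
```

Because the capped loss stops growing once a sample's loss exceeds `lam`, large-loss samples such as outliers or mislabeled examples have bounded influence, which is one way to read the robustness claim in the abstract.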

preprint, 2015, arXiv

Selective control of oxygen sublattice stability by epitaxial strain in Ruddlesden-Popper films

Oxygen-defect control has long been considered an influential tuning knob for producing various property responses in complex oxide films. In addition to physical property changes, modification to the lattice structure, specifically lattice expansion, with increasing oxygen vacancy concentrations has been reported often and has become the convention for oxide materials. However, the current understanding of the lattice behavior in oxygen-deficient films becomes disputable when considering compounds containing different bonding environments or atomic layering. Moreover, tensile strain has recently been discovered to stabilize oxygen vacancies in epitaxial films, which further complicates the interpretation of lattice behavior resulting from their appearance. Here, we report on the selective strain control of oxygen vacancy formation and resulting lattice responses in the layered, Ruddlesden-Popper phases, La1.85Sr0.15CuO4. We found that a drastically reduced Gibbs free energy for oxygen vacancy formation near the typical growth temperature for tensile-strained epitaxial LSCO accounts for the large oxygen non-stoichiometry. Additionally, oxygen vacancies form preferentially in the equatorial position of the CuO2 plane, leading to a lattice contraction, rather than the expected expansion, observed with apical oxygen vacancies. Since oxygen stoichiometry plays a key role in determining the physical properties of many complex oxides, the strong strain coupling of oxygen nonstoichiometry and the unusual structural response reported here can provide new perspectives and understanding to the structure and property relationships of many other functional oxide materials.

preprint, 2014, arXiv

Active control of magnetoresistance of organic spin valves using ferroelectricity

Organic spintronic devices have been appealing because of the long spin lifetime of the charge carriers in the organic materials and their low cost, flexibility and chemical diversity. In previous studies, the control of resistance of organic spin valves is generally achieved by the alignment of the magnetization directions of the two ferromagnetic electrodes, generating magnetoresistance. Here we employ a new knob to tune the resistance of organic spin valves by adding a thin ferroelectric interfacial layer between the ferromagnetic electrode and the organic spacer. We show that the resistance can be controlled not only by the spin alignment of the two ferromagnetic electrodes, but also by the electric polarization of the interfacial ferroelectric layer: the MR of the spin valve depends strongly on the history of the bias voltage, which is correlated with the polarization of the ferroelectric layer; the MR even changes sign when the electric polarization of the ferroelectric layer is reversed. This new tunability can be understood in terms of the change of relative energy level alignment between the ferromagnetic electrode and the organic spacer caused by the electric dipole moment of the ferroelectric layer. These findings enable active control of resistance using both electric and magnetic fields, opening up the possibility of multi-state organic spin valves and shedding light on the mechanism of spin transport in organic spin valves.

preprint 2014 arXiv

Discrete-Time Output-Feedback Robust Repetitive Control for a Class of Nonlinear Systems by Additive State Decomposition

The discrete-time robust repetitive control (RC, also used for repetitive controller) problem for nonlinear systems is both challenging and practical. This paper proposes a discrete-time output-feedback RC design for a class of systems subject to measurable nonlinearities to track a reference robustly with respect to period variation. The design relies on additive state decomposition, by which the output-feedback RC problem is decomposed into an output-feedback RC problem for a linear time-invariant system and a state-feedback stabilization problem for a nonlinear system. Thanks to the decomposition, existing controller design methods in both the frequency domain and the time domain can be employed to make the robustness and discretization for a nonlinear system tractable. To demonstrate the effectiveness of the design, an illustrative example is given.

preprint 2013 arXiv

Growth diagram of La0.7Sr0.3MnO3 thin films using pulsed laser deposition

An experimental study was conducted on controlling the growth mode of La0.7Sr0.3MnO3 thin films on SrTiO3 substrates using pulsed laser deposition (PLD) by tuning growth temperature, pressure and laser fluence. Different thin film morphology, crystallinity and stoichiometry have been observed depending on growth parameters. To understand the microscopic origin, the adatom nucleation, step advance processes and their relationship to film growth were theoretically analyzed and a growth diagram was constructed. Three boundaries between highly and poorly crystallized growth, 2D and 3D growth, stoichiometric and non-stoichiometric growth were identified in the growth diagram. A good fit of our experimental observation with the growth diagram was found. This case study demonstrates that a more comprehensive understanding of the growth mode in PLD is possible.

preprint 2013 arXiv

Tunneling Electroresistance Induced by Interfacial Phase Transitions in Ultrathin Oxide Heterostructures

The ferroelectric (FE) control of electronic transport is one of the emerging technologies in oxide heterostructures. Many previous studies of FE tunnel junctions (FTJs) exploited solely the differences in the electrostatic potential across the FTJs that are induced by changes in the FE polarization direction. Here, we show that in practice the junction current ratios between the two polarization states can be further enhanced by the electrostatic modification in the correlated electron oxide electrodes, and that FTJs with nanometer-thin layers can effectively produce a considerably large electroresistance ratio at room temperature. To understand these surprising results, we employed an additional control parameter, which is related to the crossing of electronic and magnetic phase boundaries of the correlated electron oxide. The FE-induced phase modulation at the heterointerface ultimately results in an enhanced electroresistance effect. Our study highlights that the strong coupling between degrees of freedom across heterointerfaces could yield versatile and novel applications in oxide electronics.

preprint 2012 arXiv

Strongly coupled phase transition in ferroelectric/correlated electron oxide heterostructures

We fabricated ultrathin ferroelectric/correlated electron oxide heterostructures composed of the ferroelectric Pb(Zr0.2Ti0.8)O3 and the correlated electron oxide (CEO) La0.8Sr0.2MnO3 on SrTiO3 substrates by pulsed laser epitaxy. The hole accumulation in the ultrathin CEO layer was substantially modified by heterostructuring with the ferroelectric layer, resulting in an insulator-metal transition. In particular, our thickness dependent study showed that drastic changes in transport and magnetic properties were strongly coupled to the modulation of charge carriers by ferroelectric field effect, which was confined to the vicinity of the interface. Thus, our results provide crucial evidence that strong ferroelectric field effect control can be achieved in ultrathin (10 nm) heterostructures, yielding at least a 100,000-fold change in resistivity.

28 published items

From Passive Reuse to Active Reasoning: Grounding Large Language Models for Neuro-Symbolic Experience Replay

Uni-FinLLM: A Unified Multimodal Large Language Model with Modular Task Heads for Micro-Level Stock Prediction and Macro-Level Systemic Risk Assessment

UVE: Are MLLMs Unified Evaluators for AI-Generated Videos?

A Multi-Source Information Learning Framework for Airbnb Price Prediction

Multi-View MOOC Quality Evaluation via Information-Aware Graph Representation Learning

Muse: Text-To-Image Generation via Masked Generative Transformers

Confident Learning: Estimating Uncertainty in Dataset Labels

Discrete Representations Strengthen Vision Transformer Robustness

Improved Masked Image Generation with Token-Critic

MaskGIT: Masked Generative Image Transformer

Pyramid Adversarial Training Improves ViT Performance

AdvAug: Robust Adversarial Augmentation for Neural Machine Translation

Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels

Neural Design Network: Graphic Layout Generation with Constraints

RetrieveGAN: Image Synthesis via Differentiable Patch Retrieval

Revisiting EmbodiedQA: A Simple Baseline and Beyond

SimAug: Learning Robust Representations from Simulation for Trajectory Prediction

Simplifying Reinforced Feature Selection via Restructured Choice Strategy of Single Agent

The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction

Exploiting Multi-modal Curriculum in Noisy Web Data for Large-scale Concept Learning

Strategies for Searching Video Content with Text Queries or Video Examples

What Objective Does Self-paced Learning Indeed Optimize?

Selective control of oxygen sublattice stability by epitaxial strain in Ruddlesden-Popper films

Active control of magnetoresistance of organic spin valves using ferroelectricity

Discrete-Time Output-Feedback Robust Repetitive Control for a Class of Nonlinear Systems by Additive State Decomposition

Growth diagram of La0.7Sr0.3MnO3 thin films using pulsed laser deposition

Tunneling Electroresistance Induced by Interfacial Phase Transitions in Ultrathin Oxide Heterostructures

Strongly coupled phase transition in ferroelectric/correlated electron oxide heterostructures