Researcher profile

Lei Hou

Lei Hou contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
11topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2026arXiv

MAIC-UI: Making Interactive Courseware with Generative UI

Creating interactive STEM courseware traditionally requires HTML/CSS/JavaScript expertise, leaving barriers for educators. While generative AI can produce HTML codes, existing tools generate static presentations rather than interactive simulations, struggle with long documents, and lack pedagogical accuracy mechanisms. Furthermore, full regeneration for modifications requires 200--600 seconds, disrupting creative flow. We present MAIC-UI, a zero-code authoring system that enables educators to create and rapidly edit interactive courseware from textbooks, PPTs, and PDFs. MAIC-UI employs: (1) structured knowledge analysis with multi-modal understanding to ensure pedagogical rigor; (2) a two-stage generate-verify-optimize pipeline separating content alignment from visual refinement; and (3) Click-to-Locate editing with Unified Diff-based incremental generation achieving sub-10-second iteration cycles. A controlled lab study with 40 participants shows MAIC-UI reduces editing iterations (4.9 vs. 7.0) and significantly improves learnability and controllability compared to direct Text-to-HTML generation. A three-month classroom deployment with 53 high school students demonstrates that MAIC-UI fosters learning agency and reduces outcome disparities -- the pilot class achieved 9.21-point gains in STEM subjects compared to -2.32 points in control classes. Our code is available at https://github.com/THU-MAIC/MAIC-UI.

preprint2023arXiv

Syntactically Robust Training on Partially-Observed Data for Open Information Extraction

Open Information Extraction models have shown promising results with sufficient supervision. However, these models face a fundamental challenge that the syntactic distribution of training data is partially observable in comparison to the real world. In this paper, we propose a syntactically robust training framework that enables models to be trained on a syntactic-abundant distribution based on diverse paraphrase generation. To tackle the intrinsic problem of knowledge deformation of paraphrasing, two algorithms based on semantic similarity matching and syntactic tree walking are used to restore the expressionally transformed knowledge. The training framework can be generally applied to other syntactic partial observable domains. Based on the proposed framework, we build a new evaluation set called CaRB-AutoPara, a syntactically diverse dataset consistent with the real-world setting for validating the robustness of the models. Experiments including a thorough analysis show that the performance of the model degrades with the increase of the difference in syntactic distribution, while our framework gives a robust boundary. The source code is publicly available at https://github.com/qijimrc/RobustOIE.

preprint2022arXiv

A Roadmap for Big Model

With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks becomes a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and the BM application in many fields. At present, there is a lack of research work that sorts out the overall progress of BMs and guides the follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and applications with BMs, dividing the BM review into four parts: Resource, Models, Key Technologies and Application. We introduce 16 specific BM-related topics in those four parts, they are Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue and Protein Research. In each topic, we summarize clearly the current studies and propose some future research directions. At the end of this paper, we conclude the further development of BMs in a more general view.

preprint2022arXiv

A Wavelet Transform and self-supervised learning-based framework for bearing fault diagnosis with limited labeled data

Traditional supervised bearing fault diagnosis methods rely on massive labelled data, yet annotations may be very time-consuming or infeasible. The fault diagnosis approach that utilizes limited labelled data is becoming increasingly popular. In this paper, a Wavelet Transform (WT) and self-supervised learning-based bearing fault diagnosis framework is proposed to address the lack of supervised samples issue. Adopting the WT and cubic spline interpolation technique, original measured vibration signals are converted to the time-frequency maps (TFMs) with a fixed scale as inputs. The Vision Transformer (ViT) is employed as the encoder for feature extraction, and the self-distillation with no labels (DINO) algorithm is introduced in the proposed framework for self-supervised learning with limited labelled data and sufficient unlabeled data. Two rolling bearing fault datasets are used for validations. In the case of both datasets only containing 1% labelled samples, utilizing the feature vectors extracted by the trained encoder without fine-tuning, over 90\% average diagnosis accuracy can be obtained based on the simple K-Nearest Neighbor (KNN) classifier. Furthermore, the superiority of the proposed method is demonstrated in comparison with other self-supervised fault diagnosis methods.

preprint2022arXiv

KQA Pro: A Dataset with Explicit Compositional Programs for Complex Question Answering over Knowledge Base

Complex question answering over knowledge base (Complex KBQA) is challenging because it requires various compositional reasoning capabilities, such as multi-hop inference, attribute comparison, set operation. Existing benchmarks have some shortcomings that limit the development of Complex KBQA: 1) they only provide QA pairs without explicit reasoning processes; 2) questions are poor in diversity or scale. To this end, we introduce KQA Pro, a dataset for Complex KBQA including ~120K diverse natural language questions. We introduce a compositional and interpretable programming language KoPL to represent the reasoning process of complex questions. For each question, we provide the corresponding KoPL program and SPARQL query, so that KQA Pro serves for both KBQA and semantic parsing tasks. Experimental results show that SOTA KBQA methods cannot achieve promising results on KQA Pro as on current datasets, which suggests that KQA Pro is challenging and Complex KBQA requires further research efforts. We also treat KQA Pro as a diagnostic dataset for testing multiple reasoning skills, conduct a thorough evaluation of existing models and discuss further directions for Complex KBQA. Our codes and datasets can be obtained from https://github.com/shijx12/KQAPro_Baselines.

preprint2022arXiv

LEVEN: A Large-Scale Chinese Legal Event Detection Dataset

Recognizing facts is the most fundamental step in making judgments, hence detecting events in the legal documents is important to legal case analysis tasks. However, existing Legal Event Detection (LED) datasets only concern incomprehensive event types and have limited annotated data, which restricts the development of LED methods and their downstream applications. To alleviate these issues, we present LEVEN a large-scale Chinese LEgal eVENt detection dataset, with 8,116 legal documents and 150,977 human-annotated event mentions in 108 event types. Not only charge-related events, LEVEN also covers general events, which are critical for legal case understanding but neglected in existing LED datasets. To our knowledge, LEVEN is the largest LED dataset and has dozens of times the data scale of others, which shall significantly promote the training and evaluation of LED methods. The results of extensive experiments indicate that LED is challenging and needs further effort. Moreover, we simply utilize legal events as side information to promote downstream applications. The method achieves improvements of average 2.2 points precision in low-resource judgment prediction, and 1.5 points mean average precision in unsupervised case retrieval, which suggests the fundamentality of LED. The source code and dataset can be obtained from https://github.com/thunlp/LEVEN.

preprint2022arXiv

Multimodal Entity Tagging with Multimodal Knowledge Base

To enhance research on multimodal knowledge base and multimodal information processing, we propose a new task called multimodal entity tagging (MET) with a multimodal knowledge base (MKB). We also develop a dataset for the problem using an existing MKB. In an MKB, there are entities and their associated texts and images. In MET, given a text-image pair, one uses the information in the MKB to automatically identify the related entity in the text-image pair. We solve the task by using the information retrieval paradigm and implement several baselines using state-of-the-art methods in NLP and CV. We conduct extensive experiments and make analyses on the experimental results. The results show that the task is challenging, but current technologies can achieve relatively high performance. We will release the dataset, code, and models for future research.

preprint2022arXiv

Program Transfer for Answering Complex Questions over Knowledge Bases

Program induction for answering complex questions over knowledge bases (KBs) aims to decompose a question into a multi-step program, whose execution against the KB produces the final answer. Learning to induce programs relies on a large number of parallel question-program pairs for the given KB. However, for most KBs, the gold program annotations are usually lacking, making learning difficult. In this paper, we propose the approach of program transfer, which aims to leverage the valuable program annotations on the rich-resourced KBs as external supervision signals to aid program induction for the low-resourced KBs that lack program annotations. For program transfer, we design a novel two-stage parsing framework with an efficient ontology-guided pruning strategy. First, a sketch parser translates the question into a high-level program sketch, which is the composition of functions. Second, given the question and sketch, an argument parser searches the detailed arguments from the KB for functions. During the searching, we incorporate the KB ontology to prune the search space. The experiments on ComplexWebQuestions and WebQuestionSP show that our method outperforms SOTA methods significantly, demonstrating the effectiveness of program transfer and our framework. Our codes and datasets can be obtained from https://github.com/THU-KEG/ProgramTransfer.

preprint2022arXiv

Schema-Free Dependency Parsing via Sequence Generation

Dependency parsing aims to extract syntactic dependency structure or semantic dependency structure for sentences. Existing methods suffer the drawbacks of lacking universality or highly relying on the auxiliary decoder. To remedy these drawbacks, we propose to achieve universal and schema-free Dependency Parsing (DP) via Sequence Generation (SG) DPSG by utilizing only the pre-trained language model (PLM) without any auxiliary structures or parsing algorithms. We first explore different serialization designing strategies for converting parsing structures into sequences. Then we design dependency units and concatenate these units into the sequence for DPSG. Thanks to the high flexibility of the sequence generation, our DPSG can achieve both syntactic DP and semantic DP using a single model. By concatenating the prefix to indicate the specific schema with the sequence, our DPSG can even accomplish multi-schemata parsing. The effectiveness of our DPSG is demonstrated by the experiments on widely used DP benchmarks, i.e., PTB, CODT, SDP15, and SemEval16. DPSG achieves comparable results with the first-tier methods on all the benchmarks and even the state-of-the-art (SOTA) performance in CODT and SemEval16. This paper demonstrates our DPSG has the potential to be a new parsing paradigm. We will release our codes upon acceptance.

preprint2022arXiv

Towards a General Pre-training Framework for Adaptive Learning in MOOCs

Adaptive learning aims to stimulate and meet the needs of individual learners, which requires sophisticated system-level coordination of diverse tasks, including modeling learning resources, estimating student states, and making personalized recommendations. Existing deep learning methods have achieved great success over statistical models; however, they still lack generalization for diverse tasks and suffer from insufficient capacity since they are composed of highly-coupled task-specific architectures and rely on small-scale, coarse-grained recommendation scenarios. To realize the idea of general adaptive systems proposed in pedagogical theory, with the emerging pre-training techniques in NLP, we try to conduct a practical exploration on applying pre-training to adaptive learning, to propose a unified framework based on data observation and learning style analysis, properly leveraging heterogeneous learning elements. Through a series of downstream tasks of Learning Recommendation, Learning Resource Evaluation, Knowledge Tracing, and Dropout Prediction, we find that course structures, text, and knowledge are helpful for modeling and inherently coherent to student non-sequential learning behaviors and that indirectly relevant information included in the pre-training foundation can be shared across downstream tasks to facilitate effectiveness. We finally build a simplified systematic application of adaptive learning and reflect on the insights brought back to pedagogy. The source code and dataset will be released.

preprint2019arXiv

Memories in the Photoluminescence Intermittency of Single Cesium Lead Bromide Nanocrystals

Single cesium lead bromide (CsPbBr3) nanocrystals show strong photoluminescence blinking, with on- and off- dwelling times following power-law distributions. We investigate the memory effect in the photoluminescence blinking of single CsPbBr3 nanocrystals and find positive correlations for successive on-times and successive off-times. This memory effect is not sensitive to the nature of the surface capping ligand and the embedding polymer. These observations suggest that photoluminescence intermittency and its memory are mainly controlled by intrinsic traps in the nanocrystals. These findings will help optimizing light-emitting devices based on inorganic perovskite nanocrystals.