Researcher profile

Kai Gao

Kai Gao contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
11works
0followers
8topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

11 published item(s)

preprint2022arXiv

Consistent Representation Learning for Continual Relation Extraction

Continual relation extraction (CRE) aims to continuously train a model on data with new relations while avoiding forgetting old ones. Some previous work has proved that storing a few typical samples of old relations and replaying them when learning new relations can effectively avoid forgetting. However, these memory-based methods tend to overfit the memory samples and perform poorly on imbalanced datasets. To solve these challenges, a consistent representation learning method is proposed, which maintains the stability of the relation embedding by adopting contrastive learning and knowledge distillation when replaying memory. Specifically, supervised contrastive learning based on a memory bank is first used to train each new task so that the model can effectively learn the relation representation. Then, contrastive replay is conducted of the samples in memory and makes the model retain the knowledge of historical relations through memory knowledge distillation to prevent the catastrophic forgetting of the old task. The proposed method can better learn consistent representations to alleviate forgetting effectively. Extensive experiments on FewRel and TACRED datasets show that our method significantly outperforms state-of-the-art baselines and yield strong robustness on the imbalanced dataset.

preprint2022arXiv

Continual Machine Reading Comprehension via Uncertainty-aware Fixed Memory and Adversarial Domain Adaptation

Continual Machine Reading Comprehension aims to incrementally learn from a continuous data stream across time without access the previous seen data, which is crucial for the development of real-world MRC systems. However, it is a great challenge to learn a new domain incrementally without catastrophically forgetting previous knowledge. In this paper, MA-MRC, a continual MRC model with uncertainty-aware fixed Memory and Adversarial domain adaptation, is proposed. In MA-MRC, a fixed size memory stores a small number of samples in previous domain data along with an uncertainty-aware updating strategy when new domain data arrives. For incremental learning, MA-MRC not only keeps a stable understanding by learning both memory and new domain data, but also makes full use of the domain adaptation relationship between them by adversarial learning strategy. The experimental results show that MA-MRC is superior to strong baselines and has a substantial incremental learning ability without catastrophically forgetting under two different continual MRC settings.

preprint2022arXiv

Demystifying Software Release Note Issues on GitHub

Release notes (RNs) summarize main changes between two consecutive software versions and serve as a central source of information when users upgrade software. While producing high quality RNs can be hard and poses a variety of challenges to developers, a comprehensive empirical understanding of these challenges is still lacking. In this paper, we bridge this knowledge gap by manually analyzing 1,731 latest GitHub issues to build a comprehensive taxonomy of RN issues with four dimensions: Content, Presentation, Accessibility, and Production. Among these issues, nearly half (48.47%) of them focus on Production; Content, Accessibility, and Presentation take 25.61%, 17.65%, and 8.27%, respectively. We find that: 1) RN producers are more likely to miss information than to include incorrect information, especially for breaking changes; 2) improper layout may bury important information and confuse users; 3) many users find RNs inaccessible due to link deterioration, lack of notification, and obfuscate RN locations; 4) automating and regulating RN production remains challenging despite the great needs of RN producers. Our taxonomy not only pictures a roadmap to improve RN production in practice but also reveals interesting future research directions for automating RN production.

preprint2022arXiv

M-SENA: An Integrated Platform for Multimodal Sentiment Analysis

M-SENA is an open-sourced platform for Multimodal Sentiment Analysis. It aims to facilitate advanced research by providing flexible toolkits, reliable benchmarks, and intuitive demonstrations. The platform features a fully modular video sentiment analysis framework consisting of data management, feature extraction, model training, and result analysis modules. In this paper, we first illustrate the overall architecture of the M-SENA platform and then introduce features of the core modules. Reliable baseline results of different modality features and MSA benchmarks are also reported. Moreover, we use model evaluation and analysis tools provided by M-SENA to present intermediate representation visualization, on-the-fly instance test, and generalization ability test results. The source code of the platform is publicly available at https://github.com/thuiar/M-SENA.

preprint2022arXiv

On Minimizing the Number of Running Buffers for Tabletop Rearrangement

For tabletop rearrangement problems with overhand grasps, storage space outside the tabletop workspace, or buffers, can temporarily hold objects which greatly facilitates the resolution of a given rearrangement task. This brings forth the natural question of how many running buffers are required so that certain classes of tabletop rearrangement problems are feasible. In this work, we examine the problem for both the labeled (where each object has a specific goal pose) and the unlabeled (where goal poses of objects are interchangeable) settings. On the structural side, we observe that finding the minimum number of running buffers (MRB) can be carried out on a dependency graph abstracted from a problem instance, and show that computing MRB on dependency graphs is NP-hard. We then prove that under both labeled and unlabeled settings, even for uniform cylindrical objects, the number of required running buffers may grow unbounded as the number of objects to be rearranged increases; we further show that the bound for the unlabeled case is tight. On the algorithmic side, we develop highly effective algorithms for finding MRB for both labeled and unlabeled tabletop rearrangement problems, scalable to over a hundred objects under very high object density. Employing these algorithms, empirical evaluations show that random labeled and unlabeled instances, which more closely mimics real-world setups, have much smaller MRBs.

preprint2022arXiv

Persistent Homology for Effective Non-Prehensile Manipulation

This work explores the use of topological tools for achieving effective non-prehensile manipulation in cluttered, constrained workspaces. In particular, it proposes the use of persistent homology as a guiding principle in identifying the appropriate non-prehensile actions, such as pushing, to clean a cluttered space with a robotic arm so as to allow the retrieval of a target object. Persistent homology enables the automatic identification of connected components of blocking objects in the space without the need for manual input or tuning of parameters. The proposed algorithm uses this information to push groups of cylindrical objects together and aims to minimize the number of pushing actions needed to reach to the target. Simulated experiments in a physics engine using a model of the Baxter robot show that the proposed topology-driven solution is achieving significantly higher success rate in solving such constrained problems relatively to state-of-the-art alternatives from the literature. It manages to keep the number of pushing actions low, is computationally efficient and the resulting decisions and motion appear natural for effectively solving such tasks.

preprint2022arXiv

Toward Efficient Task Planning for Dual-Arm Tabletop Object Rearrangement

We investigate the problem of coordinating two robot arms to solve non-monotone tabletop multi-object rearrangement tasks. In a non-monotone rearrangement task, complex object-object dependencies exist that require moving some objects multiple times to solve an instance. In working with two arms in a large workspace, some objects must be handed off between the robots, which further complicates the planning process. For the challenging dual-arm tabletop rearrangement problem, we develop effective task planning algorithms for scheduling the pick-n-place sequence that can be properly distributed between the two arms. We show that, even without using a sophisticated motion planner, our method achieves significant time savings in comparison to greedy approaches and naive parallelization of single-robot plans.

preprint2022arXiv

Uniform Object Rearrangement: From Complete Monotone Primitives to Efficient Non-Monotone Informed Search

Object rearrangement is a widely-applicable and challenging task for robots. Geometric constraints must be carefully examined to avoid collisions and combinatorial issues arise as the number of objects increases. This work studies the algorithmic structure of rearranging uniform objects, where robot-object collisions do not occur but object-object collisions have to be avoided. The objective is minimizing the number of object transfers under the assumption that the robot can manipulate one object at a time. An efficiently computable decomposition of the configuration space is used to create a "region graph", which classifies all continuous paths of equivalent collision possibilities. Based on this compact but rich representation, a complete dynamic programming primitive DFSDP performs a recursive depth first search to solve monotone problems quickly, i.e., those instances that do not require objects to be moved first to an intermediate buffer. DFSDP is extended to solve single-buffer, non-monotone instances, given a choice of an object and a buffer. This work utilizes these primitives as local planners in an informed search framework for more general, non-monotone instances. The search utilizes partial solutions from the primitives to identify the most promising choice of objects and buffers. Experiments demonstrate that the proposed solution returns near-optimal paths with higher success rate, even for challenging non-monotone instances, than other leading alternatives.

preprint2021arXiv

TEXTOIR: An Integrated and Visualized Platform for Text Open Intent Recognition

TEXTOIR is the first integrated and visualized platform for text open intent recognition. It is composed of two main modules: open intent detection and open intent discovery. Each module integrates most of the state-of-the-art algorithms and benchmark intent datasets. It also contains an overall framework connecting the two modules in a pipeline scheme. In addition, this platform has visualized tools for data and model management, training, evaluation and analysis of the performance from different aspects. TEXTOIR provides useful toolkits and convenient visualized interfaces for each sub-module (Toolkit code: https://github.com/thuiar/TEXTOIR), and designs a framework to implement a complete process to both identify known intents and discover open intents (Demo code: https://github.com/thuiar/TEXTOIR-DEMO).

preprint2020arXiv

A hybrid eikonal solver for accurate first-arrival traveltime computation in anisotropic media with strong contrasts

First-arrival traveltime computation is crucial for many applications such as traveltime tomography, Kirchhoff migration, etc. There exist two major issues in conventional eikonal solvers: the source singularity issue and insufficient numerical accuracy in complex media. Some existing eikonal solvers also exhibit the stability issue in media with strong contrasts in medium properties. We develop a stable and accurate hybrid eikonal solver for 2D and 3D transversely isotropic media with a tilted symmetry axis (TTI, or tilted transversely isotropic media). Our new eikonal solver combines the traveltime field factorization technique, the third-order Lax-Friedrichs update scheme, and a new method for computing the base traveltime field. The solver has the following three advantages. First, there is no need to assign exact traveltime values in the near-source region, and the computed traveltime field near the source location is accurate even for TTI media with strong anisotropy. Second, the computed traveltime field is high-order accurate in space. Third, the solver is numerically stable for 2D and 3D TTI media with strong anisotropy, complex structures, and strong contrasts in medium properties. We verify the stability and accuracy of our hybrid eikonal solver using several 2D and 3D TTI medium examples. The results show that our solver is stable and accurate in 2D and 3D complex TTI media, producing first-arrival traveltime fields that are consistent with full-wavefield solutions.

preprint2020arXiv

Three-dimensional seismic characterization and imaging of the Soda Lake geothermal field

Accurate characterization of subsurface geophysical properties and detection of the fault system are essential for geothermal energy exploration and production. The Soda Lake geothermal field is in western Nevada with a complex fault system. Previous seismic characterization only produced a low-resolution, smooth velocity model along with a simple, conceptual fault model. Using optimized correlation-based full-waveform inversion, wavefield-separation-based reverse-time migration, and automatic fault detection techniques, we present 3D seismic characterization for the Soda Lake geothermal field using 3D surface seismic data acquired with Vibroseis sources. We obtain 3D high-resolution velocity, density, and acoustic impedance models, 3D seismic images with different grid spacings, and a high-resolution fault system. Consistency check between the constructed faults and currently active injection and production geothermal wells verifies that our seismic inversion and imaging results and detected faults are reliable. These results can provide valuable information for optimizing well placement and geothermal energy production at the Soda Lake geothermal field.