Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
23works
0followers
18topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

23 published item(s)

preprint2026arXiv

Beyond Retrieval: A Multitask Benchmark and Model for Code Search

Code search has usually been evaluated as first-stage retrieval, even though production systems rely on broader pipelines with reranking and developer-style queries. Existing benchmarks also suffer from data contamination, label noise, and degenerate binary relevance. In this paper, we introduce \textsc{CoREB}, a contamination-limited, multitask \underline{co}de \underline{r}etrieval and r\underline{e}ranking \underline{b}enchmark, together with a fine-tuned code reranker, that goes beyond retrieval to cover the full code search pipeline. \textsc{CoREB} is built from counterfactually rewritten LiveCodeBench problems in five programming languages and delivered as timed releases with graded relevance judgments. We benchmark eleven embedding models and five rerankers across three tasks: text-to-code, code-to-text, and code-to-code. Our experiments reveal that: \circone code-specialised embeddings dominate code-to-code retrieval (${\sim}2{\times}$ over general encoders), yet no single model wins all three tasks; \circtwo short keyword queries, the format closest to real developer search, collapse every model to near-zero nDCG@10; \circthree off-the-shelf rerankers are task-asymmetric, with a 12-point swing on code-to-code and no baseline net-positive across all tasks; \circfour our fine-tuned \textsc{CoREB-Reranker} is the first to achieve consistent gains across all three tasks. The data and model are released.

preprint2026arXiv

Decompose to Understand, Fuse to Detect: Frequency-Decoupled Anomaly Detection for Encrypted Network Traffic

Network traffic anomaly detection represents a critical cybersecurity task, yet widespread encryption makes this task increasingly challenging. In response, image-based methods that model traffic as visual patterns have emerged as the dominant approach. However, this work pioneers the identification of a pervasive ``full-frequency'' characteristic and an associated limitation termed ``spectral mismatch'' within this paradigm. Specifically, while encrypted traffic exhibits prominent high-frequency components, mainstream reconstruction methods demonstrate an inherent bias toward learning low-frequency information. This fundamental mismatch results in incomplete representations that consequently degrade anomaly detection performance. To address this challenge, we propose FreeUp, a novel frequency-decoupled framework designed explicitly for encrypted traffic analysis. FreeUp decomposes traffic data into distinct low- and high-frequency bands, processing them through separate, dedicated branches along with a customized training strategy that ensures stable and independent frequency-specific learning. Furthermore, recognizing that simple reconstruction error proves inadequate for evaluating dual-branch architectures, we introduce an uncertainty-inspired fusion scoring mechanism. This mechanism quantifies the reconstruction uncertainty of the frequency-specific branches and dynamically integrates their outputs, yielding a more comprehensive and reliable anomaly score. Extensive experiments across multiple benchmarks demonstrate that FreeUp consistently outperforms state-of-the-art baselines. The code is available at https://github.com/ikun0124/FreeUp.

preprint2024arXiv

DB-GPT: Empowering Database Interactions with Private Large Language Models

The recent breakthroughs in large language models (LLMs) are positioned to transition many areas of software. Database technologies particularly have an important entanglement with LLMs as efficient and intuitive database interactions are paramount. In this paper, we present DB-GPT, a revolutionary and production-ready project that integrates LLMs with traditional database systems to enhance user experience and accessibility. DB-GPT is designed to understand natural language queries, provide context-aware responses, and generate complex SQL queries with high accuracy, making it an indispensable tool for users ranging from novice to expert. The core innovation in DB-GPT lies in its private LLM technology, which is fine-tuned on domain-specific corpora to maintain user privacy and ensure data security while offering the benefits of state-of-the-art LLMs. We detail the architecture of DB-GPT, which includes a novel retrieval augmented generation (RAG) knowledge system, an adaptive learning mechanism to continuously improve performance based on user feedback and a service-oriented multi-model framework (SMMF) with powerful data-driven agents. Our extensive experiments and user studies confirm that DB-GPT represents a paradigm shift in database interactions, offering a more natural, efficient, and secure way to engage with data repositories. The paper concludes with a discussion of the implications of DB-GPT framework on the future of human-database interaction and outlines potential avenues for further enhancements and applications in the field. The project code is available at https://github.com/eosphoros-ai/DB-GPT. Experience DB-GPT for yourself by installing it with the instructions https://github.com/eosphoros-ai/DB-GPT#install and view a concise 10-minute video at https://www.youtube.com/watch?v=KYs4nTDzEhk.

preprint2023arXiv

Gap Minimization for Knowledge Sharing and Transfer

Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of \emph{performance gap}, an intuitive and novel measure of the distance between learning tasks. Unlike existing measures which are used as tools to bound the difference of expected risks between tasks (e.g., $\mathcal{H}$-divergence or discrepancy distance), we theoretically show that the performance gap can be viewed as a data- and algorithm-dependent regularizer, which controls the model complexity and leads to finer guarantees. More importantly, it also provides new insights and motivates a novel principle for designing strategies for knowledge sharing and transfer: gap minimization. We instantiate this principle with two algorithms: 1. gapBoost, a novel and principled boosting algorithm that explicitly minimizes the performance gap between source and target domains for transfer learning; and 2. gapMTNN, a representation learning algorithm that reformulates gap minimization as semantic conditional matching for multitask learning. Our extensive evaluation on both transfer learning and multitask learning benchmark data sets shows that our methods outperform existing baselines.

preprint2023arXiv

Value Enhancement of Reinforcement Learning via Efficient and Robust Trust Region Optimization

Reinforcement learning (RL) is a powerful machine learning technique that enables an intelligent agent to learn an optimal policy that maximizes the cumulative rewards in sequential decision making. Most of methods in the existing literature are developed in \textit{online} settings where the data are easy to collect or simulate. Motivated by high stake domains such as mobile health studies with limited and pre-collected data, in this paper, we study \textit{offline} reinforcement learning methods. To efficiently use these datasets for policy optimization, we propose a novel value enhancement method to improve the performance of a given initial policy computed by existing state-of-the-art RL algorithms. Specifically, when the initial policy is not consistent, our method will output a policy whose value is no worse and often better than that of the initial policy. When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired ``value enhancement" property. The proposed method is generally applicable to any parametrized policy that belongs to certain pre-specified function class (e.g., deep neural networks). Extensive numerical studies are conducted to demonstrate the superior performance of our method.

preprint2022arXiv

Automatic Registration of Images with Inconsistent Content Through Line-Support Region Segmentation and Geometrical Outlier Removal

The implementation of automatic image registration is still difficult in various applications. In this paper, an automatic image registration approach through line-support region segmentation and geometrical outlier removal (ALRS-GOR) is proposed. This new approach is designed to address the problems associated with the registration of images with affine deformations and inconsistent content, such as remote sensing images with different spectral content or noise interference, or map images with inconsistent annotations. To begin with, line-support regions, namely a straight region whose points share roughly the same image gradient angle, are extracted to address the issues of inconsistent content existing in images. To alleviate the incompleteness of line segments, an iterative strategy with multi-resolution is employed to preserve global structures that are masked at full resolution by image details or noise. Then, Geometrical Outlier Removal (GOR) is developed to provide reliable feature point matching, which is based on affineinvariant geometrical classifications for corresponding matches initialized by SIFT. The candidate outliers are selected by comparing the disparity of accumulated classifications among all matches, instead of conventional methods which only rely on local geometrical relations. Various image sets have been considered in this paper for the evaluation of the proposed approach, including aerial images with simulated affine deformations, remote sensing optical and synthetic aperture radar images taken at different situations (multispectral, multisensor, and multitemporal), and map images with inconsistent annotations. Experimental results demonstrate the superior performance of the proposed method over the existing approaches for the whole data set.

preprint2022arXiv

Domain Generalization via Optimal Transport with Metric Similarity Learning

Generalizing knowledge to unseen domains, where data and labels are unavailable, is crucial for machine learning models. We tackle the domain generalization problem to learn from multiple source domains and generalize to a target domain with unknown statistics. The crucial idea is to extract the underlying invariant features across all the domains. Previous domain generalization approaches mainly focused on learning invariant features and stacking the learned features from each source domain to generalize to a new target domain while ignoring the label information, which will lead to indistinguishable features with an ambiguous classification boundary. For this, one possible solution is to constrain the label-similarity when extracting the invariant features and to take advantage of the label similarities for class-specific cohesion and separation of features across domains. Therefore we adopt optimal transport with Wasserstein distance, which could constrain the class label similarity, for adversarial training and also further deploy a metric learning objective to leverage the label information for achieving distinguishable classification boundary. Empirical results show that our proposed method could outperform most of the baselines. Furthermore, ablation studies also demonstrate the effectiveness of each component of our method.

preprint2022arXiv

Evolving Domain Generalization

Domain generalization aims to learn a predictive model from multiple different but related source tasks that can generalize well to a target task without the need of accessing any target data. Existing domain generalization methods ignore the relationship between tasks, implicitly assuming that all the tasks are sampled from a stationary environment. Therefore, they can fail when deployed in an evolving environment. To this end, we formulate and study the \emph{evolving domain generalization} (EDG) scenario, which exploits not only the source data but also their evolving pattern to generate a model for the unseen task. Our theoretical result reveals the benefits of modeling the relation between two consecutive tasks by learning a globally consistent directional mapping function. In practice, our analysis also suggests solving the DDG problem in a meta-learning manner, which leads to \emph{directional prototypical network}, the first method for the DDG problem. Empirical evaluation of both synthetic and real-world data sets validates the effectiveness of our approach.

preprint2022arXiv

Forgetting Prevention for Cross-regional Fraud Detection with Heterogeneous Trade Graph

With the booming growth of e-commerce, detecting financial fraud has become an urgent task to avoid transaction risks. Despite the successful applications of Graph Neural Networks (GNNs) in fraud detection, the existing solutions are only suitable for a narrow scope due to the limitation in data collection. Especially when expanding a business into new territory, e.g., new cities or new countries, developing a totally new model will bring the cost issue and result in forgetting previous knowledge. Moreover, recent works strive to devise GNNs to expose the implicit interactions behind financial transactions. However, most existing GNNs-based solutions concentrate on either homogeneous graphs or decomposing heterogeneous interactions into several homogeneous connections for convenience. To this end, this study proposes a novel solution based on heterogeneous trade graphs, namely HTG-CFD, to prevent knowledge forgetting of cross-regional fraud detection. In particular, the heterogeneous trade graph (HTG) is meticulously constructed from original transaction records to explore the complex semantics among different types of entities and relationships. And motivated by recent continual learning, we present a practical and task-oriented forgetting prevention method to alleviate knowledge forgetting in the context of cross-regional detection. Extensive experiments demonstrate that the proposed HTG-CFD not only promotes the performance in cross-regional scenarios but also significantly contributes to single-regional fraud detection.

preprint2022arXiv

Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding

This paper presents a method to explain how the information of each input variable is gradually discarded during the forward propagation in a deep neural network (DNN), which provides new perspectives to explain DNNs. We define two types of entropy-based metrics, i.e. (1) the discarding of pixel-wise information used in the forward propagation, and (2) the uncertainty of the input reconstruction, to measure input information contained by a specific layer from two perspectives. Unlike previous attribution metrics, the proposed metrics ensure the fairness of comparisons between different layers of different DNNs. We can use these metrics to analyze the efficiency of information processing in DNNs, which exhibits strong connections to the performance of DNNs. We analyze information discarding in a pixel-wise manner, which is different from the information bottleneck theory measuring feature information w.r.t. the sample distribution. Experiments have shown the effectiveness of our metrics in analyzing classic DNNs and explaining existing deep-learning techniques.

preprint2022arXiv

Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks

Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs, and various other document types, a flurry of table pre-training frameworks have been proposed following the success of text and images, and they have achieved new state-of-the-arts on various tasks such as table question answering, table type recognition, column relation classification, table search, formula prediction, etc. To fully use the supervision signals in unlabeled tables, a variety of pre-training objectives have been designed and evaluated, for example, denoising cell values, predicting numerical relationships, and implicitly executing SQLs. And to best leverage the characteristics of (semi-)structured tables, various tabular language models, particularly with specially-designed attention mechanisms, have been explored. Since tables usually appear and interact with free-form text, table pre-training usually takes the form of table-text joint pre-training, which attracts significant research interests from multiple domains. This survey aims to provide a comprehensive review of different model designs, pre-training objectives, and downstream tasks for table pre-training, and we further share our thoughts and vision on existing challenges and future opportunities.

preprint2022arXiv

TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data

Existing auto-regressive pre-trained language models (PLMs) like T5 and BART, have been well applied to table question answering by UNIFIEDSKG and TAPEX, respectively, and demonstrated state-of-the-art results on multiple benchmarks. However, auto-regressive PLMs are challenged by recent emerging numerical reasoning datasets, such as TAT-QA, due to the error-prone implicit calculation. In this paper, we present TaCube, to pre-compute aggregation/arithmetic results for the table in advance, so that they are handy and readily available for PLMs to answer numerical reasoning questions. TaCube systematically and comprehensively covers a collection of computational operations over table segments. By simply concatenating TaCube to the input sequence of PLMs, it shows significant experimental effectiveness. TaCube promotes the F1 score from 49.6% to 66.2% on TAT-QA and achieves new state-of-the-art results on WikiTQ (59.6% denotation accuracy). TaCube's improvements on numerical reasoning cases are even more notable: on TAT-QA, TaCube promotes the exact match accuracy of BART-large by 39.6% on sum, 52.5% on average, 36.6% on substraction, and 22.2% on division. We believe that TaCube is a general and portable pre-computation solution that can be potentially integrated to various numerical reasoning frameworks

preprint2021arXiv

Construction D' Lattices for Power-Constrained Communications

Designs and methods for nested lattice codes using Construction D' lattices for coding and convolutional code lattices for shaping are described. Two encoding methods and a decoding algorithm for Construction D' coding lattices that can be used with shaping lattices for power-constrained channels are given. We construct nested lattice codes with good coding properties, a high shaping gain, and low-complexity encoding and decoding. Convolutional code generator polynomials for Construction A lattices with the greatest shaping gain are given, as a result of an extensive search. It is shown that rate 1/3 convolutional codes provide a more favorable performance-complexity trade-off than rate 1/2 convolutional codes. Tail-biting convolutional codes have higher shaping gain than that of zero-tailed convolutional codes. A design for quasi-cyclic low-density parity-check (QC-LDPC) codes to form Construction D' lattices which have efficient encoding and indexing is presented. The resulting QC-LDPC Construction D' lattices are evaluated using four shaping lattices: the $E_8$ lattice, the $BW_{16}$ lattice, the Leech lattice and our best-found convolutional code lattice, showing a shaping gain of approximately 0.65 dB, 0.86 dB, 1.03 dB and 1.25 dB at dimension 2304.

preprint2021arXiv

Embedding Symbolic Temporal Knowledge into Deep Sequential Models

Sequences and time-series often arise in robot tasks, e.g., in activity recognition and imitation learning. In recent years, deep neural networks (DNNs) have emerged as an effective data-driven methodology for processing sequences given sufficient training data and compute resources. However, when data is limited, simpler models such as logic/rule-based methods work surprisingly well, especially when relevant prior knowledge is applied in their construction. However, unlike DNNs, these "structured" models can be difficult to extend, and do not work well with raw unstructured data. In this work, we seek to learn flexible DNNs, yet leverage prior temporal knowledge when available. Our approach is to embed symbolic knowledge expressed as linear temporal logic (LTL) and use these embeddings to guide the training of deep models. Specifically, we construct semantic-based embeddings of automata generated from LTL formula via a Graph Neural Network. Experiments show that these learnt embeddings can lead to improvements in downstream robot tasks such as sequential action recognition and imitation learning.

preprint2021arXiv

High-Order Statistical Functional Expansion and Its Application To Some Nonsmooth Problems

Let $\bx_j = \btheta +\bep_j, j=1,...,n$, be observations of an unknown parameter $\btheta$ in a Euclidean or separable Hilbert space $\scrH$, where $\bep_j$ are noises as random elements in $\scrH$ from a general distribution. We study the estimation of $f(\btheta)$ for a given functional $f:\scrH\rightarrow \RR$ based on $\bx_j$'s. The key element of our approach is a new method which we call High-Order Degenerate Statistical Expansion. It leverages the use of classical multivariate Taylor expansion and degenerate $U$-statistic and yields an elegant explicit formula. In the univariate case of $\scrH=\R$, the formula expresses the error of the proposed estimator as a sum of order $k$ degenerate $U$-products of the noises with coefficient $f^{(k)}(\btheta)/k!$ and an explicit remainder term in the form of the Riemann-Liouville integral as in the Taylor expansion around the true $\btheta$. For general $\scrH$, the formula expresses the estimation error in terms of the inner product of $f^{(k)}(\btheta)/k!$ and the average of the tensor products of $k$ noises with distinct indices and a parallel extension of the remainder term from the univariate case. This makes the proposed method a natural statistical version of the classical Taylor expansion. The proposed estimator can be viewed as a jackknife estimator of an ideal degenerate expansion of $f(\cdot)$ around the true $\btheta$ with the degenerate $U$-product of the noises, and can be approximated by bootstrap. Thus, the jackknife, bootstrap and Taylor expansion approaches all converge to the proposed estimator. We develop risk bounds for the proposed estimator and a central limit theorem under a second moment condition (even in expansions of higher than the second order). We apply this new method to generalize several existing results with smooth and nonsmooth $f$ to universal $\bep_j$'s with only minimum moment constraints.

preprint2021arXiv

Multi-task Learning by Leveraging the Semantic Information

One crucial objective of multi-task learning is to align distributions across tasks so that the information between them can be transferred and shared. However, existing approaches only focused on matching the marginal feature distribution while ignoring the semantic information, which may hinder the learning performance. To address this issue, we propose to leverage the label information in multi-task learning by exploring the semantic conditional relations among tasks. We first theoretically analyze the generalization bound of multi-task learning based on the notion of Jensen-Shannon divergence, which provides new insights into the value of label information in multi-task learning. Our analysis also leads to a concrete algorithm that jointly matches the semantic distribution and controls label distribution divergence. To confirm the effectiveness of the proposed method, we first compare the algorithm with several baselines on some benchmarks and then test the algorithms under label space shift conditions. Empirical results demonstrate that the proposed method could outperform most baselines and achieve state-of-the-art performance, particularly showing the benefits under the label shift conditions.

preprint2021arXiv

Overcoming Catastrophic Forgetting in Graph Neural Networks with Experience Replay

Graph Neural Networks (GNNs) have recently received significant research attention due to their superior performance on a variety of graph-related learning tasks. Most of the current works focus on either static or dynamic graph settings, addressing a single particular task, e.g., node/graph classification, link prediction. In this work, we investigate the question: can GNNs be applied to continuously learning a sequence of tasks? Towards that, we explore the Continual Graph Learning (CGL) paradigm and present the Experience Replay based framework ER-GNN for CGL to alleviate the catastrophic forgetting problem in existing GNNs. ER-GNN stores knowledge from previous tasks as experiences and replays them when learning new tasks to mitigate the catastrophic forgetting issue. We propose three experience node selection strategies: mean of feature, coverage maximization, and influence maximization, to guide the process of selecting experience nodes. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our ER-GNN and shed light on the incremental graph (non-Euclidean) structure learning.

preprint2020arXiv

A Heterogeneous Dynamical Graph Neural Networks Approach to Quantify Scientific Impact

Quantifying and predicting the long-term impact of scientific writings or individual scholars has important implications for many policy decisions, such as funding proposal evaluation and identifying emerging research fields. In this work, we propose an approach based on Heterogeneous Dynamical Graph Neural Network (HDGNN) to explicitly model and predict the cumulative impact of papers and authors. HDGNN extends heterogeneous GNNs by incorporating temporally evolving characteristics and capturing both structural properties of attributed graph and the growing sequence of citation behavior. HDGNN is significantly different from previous models in its capability of modeling the node impact in a dynamic manner while taking into account the complex relations among nodes. Experiments conducted on a real citation dataset demonstrate its superior performance of predicting the impact of both papers and authors.

preprint2020arXiv

Beyond $\mathcal{H}$-Divergence: Domain Adaptation Theory With Jensen-Shannon Divergence

We reveal the incoherence between the widely-adopted empirical domain adversarial training and its generally-assumed theoretical counterpart based on $\mathcal{H}$-divergence. Concretely, we find that $\mathcal{H}$-divergence is not equivalent to Jensen-Shannon divergence, the optimization objective in domain adversarial training. To this end, we establish a new theoretical framework by directly proving the upper and lower target risk bounds based on joint distributional Jensen-Shannon divergence. We further derive bi-directional upper bounds for marginal and conditional shifts. Our framework exhibits inherent flexibilities for different transfer learning problems, which is usable for various scenarios where $\mathcal{H}$-divergence-based theory fails to adapt. From an algorithmic perspective, our theory enables a generic guideline unifying principles of semantic conditional matching, feature marginal matching, and label marginal shift correction. We employ algorithms for each principle and empirically validate the benefits of our framework on real datasets.

preprint2020arXiv

Deep Active Learning: Unified and Principled Method for Query and Training

In this paper, we are proposing a unified and principled method for both the querying and training processes in deep batch active learning. We are providing theoretical insights from the intuition of modeling the interactive procedure in active learning as distribution matching, by adopting the Wasserstein distance. As a consequence, we derived a new training loss from the theoretical analysis, which is decomposed into optimizing deep neural network parameters and batch query selection through alternative optimization. In addition, the loss for training a deep neural network is naturally formulated as a min-max optimization problem through leveraging the unlabeled data information. Moreover, the proposed principles also indicate an explicit uncertainty-diversity trade-off in the query batch selection. Finally, we evaluate our proposed method on different benchmarks, consistently showing better empirical performances and a better time-efficient query strategy compared to the baselines.

preprint2020arXiv

Discriminative Active Learning for Domain Adaptation

Domain Adaptation aiming to learn a transferable feature between different but related domains has been well investigated and has shown excellent empirical performances. Previous works mainly focused on matching the marginal feature distributions using the adversarial training methods while assuming the conditional relations between the source and target domain remained unchanged, $i.e.$, ignoring the conditional shift problem. However, recent works have shown that such a conditional shift problem exists and can hinder the adaptation process. To address this issue, we have to leverage labelled data from the target domain, but collecting labelled data can be quite expensive and time-consuming. To this end, we introduce a discriminative active learning approach for domain adaptation to reduce the efforts of data annotation. Specifically, we propose three-stage active adversarial training of neural networks: invariant feature space learning (first stage), uncertainty and diversity criteria and their trade-off for query strategy (second stage) and re-training with queried target labels (third stage). Empirical comparisons with existing domain adaptation methods using four benchmark datasets demonstrate the effectiveness of the proposed approach.

preprint2020arXiv

Metric learning by Similarity Network for Deep Semi-Supervised Learning

Deep semi-supervised learning has been widely implemented in the real-world due to the rapid development of deep learning. Recently, attention has shifted to the approaches such as Mean-Teacher to penalize the inconsistency between two perturbed input sets. Although these methods may achieve positive results, they ignore the relationship information between data instances. To solve this problem, we propose a novel method named Metric Learning by Similarity Network (MLSN), which aims to learn a distance metric adaptively on different domains. By co-training with the classification network, similarity network can learn more information about pairwise relationships and performs better on some empirical tasks than state-of-art methods.

preprint2020arXiv

Relational State-Space Model for Stochastic Multi-Object Systems

Real-world dynamical systems often consist of multiple stochastic subsystems that interact with each other. Modeling and forecasting the behavior of such dynamics are generally not easy, due to the inherent hardness in understanding the complicated interactions and evolutions of their constituents. This paper introduces the relational state-space model (R-SSM), a sequential hierarchical latent variable model that makes use of graph neural networks (GNNs) to simulate the joint state transitions of multiple correlated objects. By letting GNNs cooperate with SSM, R-SSM provides a flexible way to incorporate relational information into the modeling of multi-object dynamics. We further suggest augmenting the model with normalizing flows instantiated for vertex-indexed random variables and propose two auxiliary contrastive objectives to facilitate the learning. The utility of R-SSM is empirically evaluated on synthetic and real time-series datasets.