Source author record

Jinlong Li

Jinlong Li appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Machine Learning Artificial Intelligence Computation and Language eess.IV Robotics Social and Information Networks

Catalog footprint

What is connected

8works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Adaptive Negative Scheduling for Graph Contrastive Learning

Graph contrastive learning (GCL) has become a central paradigm for self-supervised representation learning in computational intelligence, with applications spanning recommendation, anomaly detection, and personalization. A key limitation of existing methods is their reliance on static negative sampling, which fails to account for the dynamic informativeness and computational cost of negatives during training. We propose AdNGCL, an adaptive negative scheduling framework with a hardness-aware scheduler (HANS) that formulates negative selection as a loss-gated, budget-constrained process across hard, intermediate, and easy strata. The scheduler dynamically adjusts step sizes based on contrastive loss trends under both global and per-category budgets, while periodically refreshing samples to maintain diversity without exceeding compute constraints. Experiments on nine benchmark graph datasets demonstrate that AdNGCL consistently advances state-of-the-art performance, achieving the best accuracy on seven datasets and second-best on the remaining two, while offering explicit control over computational cost. These results highlight the value of budget-aware, loss-sensitive scheduling as a general strategy for improving the robustness and efficiency of representation learning in emerging computational intelligence applications.

preprint2026arXiv

P2DNav: Panorama-to-Downview Reasoning for Zero-shot Vision-and-Language Navigation

Vision-and-language navigation (VLN) requires an embodied agent to ground natural-language instructions into executable navigation actions in unseen environments. Existing zero-shot methods typically rely on additional waypoint prediction modules, which often entangle high-level directional reasoning with fine-grained local grounding, leading to error-prone and unstable decisions. In this paper, we propose P2DNav, a hierarchical framework for zero-shot vision-and-language navigation. P2DNav consists of three core components: Panorama-to-Downview (P2D), Sliding-Window Dialogue Memory (SDM), and Reflective Reorientation Mechanism (RRM). P2D explicitly decomposes navigation decision-making into two stages: panoramic direction selection and downview local grounding. It first selects the instruction-relevant direction from a 360° panorama, and then predicts a pixel-level target point from the downview RGB observation in that direction. In addition, SDM organizes navigation history as a multi-turn dialogue context and maintains recent visual observations within a sliding window to support long-horizon navigation. RRM further enables reflective reorientation by assessing the reliability of local grounding based on the downview observation and returning to panoramic direction selection when necessary. Experiments on the R2R-CE benchmark show that P2DNav achieves strong performance among zero-shot methods. In particular, compared with the state-of-the-art (SOTA) zero-shot waypoint-based and waypoint-free methods, P2DNav achieves SR gains of 146.6% and 58.9%, respectively, demonstrating the effectiveness of P2D, SDM, and RRM for zero-shot VLN. Code will be released for public use.

preprint2022arXiv

Features Based Adaptive Augmentation for Graph Contrastive Learning

Self-Supervised learning aims to eliminate the need for expensive annotation in graph representation learning, where graph contrastive learning (GCL) is trained with the self-supervision signals containing data-data pairs. These data-data pairs are generated with augmentation employing stochastic functions on the original graph. We argue that some features can be more critical than others depending on the downstream task, and applying stochastic function uniformly, will vandalize the influential features, leading to diminished accuracy. To fix this issue, we introduce a Feature Based Adaptive Augmentation (FebAA) approach, which identifies and preserves potentially influential features and corrupts the remaining ones. We implement FebAA as plug and play layer and use it with state-of-the-art Deep Graph Contrastive Learning (GRACE) and Bootstrapped Graph Latents (BGRL). We successfully improved the accuracy of GRACE and BGRL on eight graph representation learning's benchmark datasets.

preprint2022arXiv

OPV2V: An Open Benchmark Dataset and Fusion Pipeline for Perception with Vehicle-to-Vehicle Communication

Employing Vehicle-to-Vehicle communication to enhance perception performance in self-driving technology has attracted considerable attention recently; however, the absence of a suitable open dataset for benchmarking algorithms has made it difficult to develop and assess cooperative perception technologies. To this end, we present the first large-scale open simulated dataset for Vehicle-to-Vehicle perception. It contains over 70 interesting scenes, 11,464 frames, and 232,913 annotated 3D vehicle bounding boxes, collected from 8 towns in CARLA and a digital town of Culver City, Los Angeles. We then construct a comprehensive benchmark with a total of 16 implemented models to evaluate several information fusion strategies~(i.e. early, late, and intermediate fusion) with state-of-the-art LiDAR detection algorithms. Moreover, we propose a new Attentive Intermediate Fusion pipeline to aggregate information from multiple connected vehicles. Our experiments show that the proposed pipeline can be easily integrated with existing 3D LiDAR detectors and achieve outstanding performance even with large compression rates. To encourage more researchers to investigate Vehicle-to-Vehicle perception, we will release the dataset, benchmark methods, and all related codes in https://mobility-lab.seas.ucla.edu/opv2v/.

preprint2022arXiv

ROMNet: Renovate the Old Memories

Renovating the memories in old photos is an intriguing research topic in computer vision fields. These legacy images often suffer from severe and commingled degradations such as cracks, noise, and color-fading, while lack of large-scale paired old photo datasets makes this restoration task very challenging. In this work, we present a novel reference-based end-to-end learning framework that can jointly repair and colorize the degraded legacy pictures. Specifically, the proposed framework consists of three modules: a restoration sub-network for degradation restoration, a similarity sub-network for color histogram matching and transfer, and a colorization subnet that learns to predict the chroma elements of the images conditioned on chromatic reference signals. The whole system takes advantage of the color histogram priors in a given reference image, which vastly reduces the dependency on large-scale training data. Apart from the proposed method, we also create, to our knowledge, the first public and real-world old photo dataset with paired ground truth for evaluating old photo restoration models, wherein each old photo is paired with a manually restored pristine image by PhotoShop experts. Our extensive experiments conducted on both synthetic and real-world datasets demonstrate that our method significantly outperforms state-of-the-arts both quantitatively and qualitatively.

preprint2022arXiv

SMDT: Selective Memory-Augmented Neural Document Translation

Existing document-level neural machine translation (NMT) models have sufficiently explored different context settings to provide guidance for target generation. However, little attention is paid to inaugurate more diverse context for abundant context information. In this paper, we propose a Selective Memory-augmented Neural Document Translation model to deal with documents containing large hypothesis space of the context. Specifically, we retrieve similar bilingual sentence pairs from the training corpus to augment global context and then extend the two-stream attention model with selective mechanism to capture local context and diverse global contexts. This unified approach allows our model to be trained elegantly on three publicly document-level machine translation datasets and significantly outperforms previous document-level NMT models.

preprint2020arXiv

Hierarchical Modes Exploring in Generative Adversarial Networks

In conditional Generative Adversarial Networks (cGANs), when two different initial noises are concatenated with the same conditional information, the distance between their outputs is relatively smaller, which makes minor modes likely to collapse into large modes. To prevent this happen, we proposed a hierarchical mode exploring method to alleviate mode collapse in cGANs by introducing a diversity measurement into the objective function as the regularization term. We also introduced the Expected Ratios of Expansion (ERE) into the regularization term, by minimizing the sum of differences between the real change of distance and ERE, we can control the diversity of generated images w.r.t specific-level features. We validated the proposed algorithm on four conditional image synthesis tasks including categorical generation, paired and un-paired image translation and text-to-image generation. Both qualitative and quantitative results show that the proposed method is effective in alleviating the mode collapse problem in cGANs, and can control the diversity of output images w.r.t specific-level features.

preprint2020arXiv

Online Dynamic Network Embedding

Network embedding is a very important method for network data. However, most of the algorithms can only deal with static networks. In this paper, we propose an algorithm Recurrent Neural Network Embedding (RNNE) to deal with dynamic network, which can be typically divided into two categories: a) topologically evolving graphs whose nodes and edges will increase (decrease) over time; b) temporal graphs whose edges contain time information. In order to handle the changing size of dynamic networks, RNNE adds virtual node, which is not connected to any other nodes, to the networks and replaces it when new node arrives, so that the network size can be unified at different time. On the one hand, RNNE pays attention to the direct links between nodes and the similarity between the neighborhood structures of two nodes, trying to preserve the local and global network structure. On the other hand, RNNE reduces the influence of noise by transferring the previous embedding information. Therefore, RNNE can take into account both static and dynamic characteristics of the network.We evaluate RNNE on five networks and compare with several state-of-the-art algorithms. The results demonstrate that RNNE has advantages over other algorithms in reconstruction, classification and link predictions.

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint