Researcher profile

Jia Song

Jia Song contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified | Verification L1 | Unclaimed author
5 works
0 followers
6 topics
4 close collaborators

Published work

5 published items

preprint, 2026, arXiv

GenCAMO: Scene-Graph Contextual Decoupling for Environment-aware and Mask-free Camouflage Image-Dense Annotation Generation

Conceal dense prediction (CDP), especially RGB-D camouflage object detection and open-vocabulary camouflage object segmentation, plays a crucial role in advancing the understanding and reasoning of complex camouflage scenes. However, high-quality and large-scale camouflage datasets with dense annotation remain scarce due to expensive data collection and labeling costs. To address this challenge, we explore leveraging generative models to synthesize realistic camouflage image-dense data for training CDP models with fine-grained representations, prior knowledge, and auxiliary reasoning. Concretely, our contributions are threefold: (i) we introduce GenCAMO-DB, a large-scale camouflage dataset with multi-modal annotations, including depth maps, scene graphs, attribute descriptions, and text prompts; (ii) we present GenCAMO, an environment-aware and mask-free generative framework that produces high-fidelity camouflage image-dense annotations; (iii) extensive experiments across multiple modalities demonstrate that GenCAMO significantly improves dense prediction performance on complex camouflage scenes by providing high-quality synthetic data. The code and datasets will be released after paper acceptance.

preprint, 2022, arXiv

Signatures of superconducting triplet pairing in Ni-Ga-bilayer junctions

Ni-Ga bilayers are a versatile platform for exploring the competition between strongly antagonistic ferromagnetic and superconducting phases. We characterize the impact of this competition on the transport properties of highly-ballistic Al/Al2O3(/EuS)/Ni-Ga tunnel junctions from both experimental and theoretical points of view. While the conductance spectra of junctions comprising Ni (3 nm)-Ga (60 nm) bilayers can be well understood within the framework of earlier results, which associate the emerging main conductance maxima with the junction films' superconducting gaps, thinner Ni (1.6 nm)-Ga (30 nm) bilayers entail completely different physics, and give rise to novel large-bias (when compared to the superconducting gap of the thin Al film as a reference) conductance-peak subseries that we term conductance shoulders. These conductance shoulders might attract considerable attention also in similar magnetic superconducting bilayer junctions, as we predict them to offer an experimentally well-accessible transport signature of superconducting triplet pairings that are induced around the interface of the Ni-Ga bilayer. We further substantiate this claim performing complementary polarized neutron reflectometry measurements on the bilayers, from which we deduce (1) a nonuniform magnetization structure in Ga in a several nanometer-thick area around the Ni-Ga boundary and can simultaneously (2) satisfactorily fit the obtained data only considering the paramagnetic Meissner response scenario. While the latter provides independent experimental evidence of induced triplet superconductivity inside the Ni-Ga bilayer, the former might serve as the first experimental hint of its potential microscopic physical origin.

preprint, 2020, arXiv

A Novel Video Salient Object Detection Method via Semi-supervised Motion Quality Perception

Previous video salient object detection (VSOD) approaches have mainly focused on designing fancy networks to achieve their performance improvements. However, with the recent slow-down in the development of deep learning techniques, it may become more and more difficult to anticipate another breakthrough via fancy networks alone. To this end, this paper proposes a universal learning scheme to get a further 3% performance improvement for all state-of-the-art (SOTA) methods. The major highlight of our method is that we resort to "motion quality", a brand new concept, to select a sub-group of video frames from the original testing set to construct a new training set. The selected frames in this new training set should all contain high-quality motions, in which the salient objects have a large probability of being successfully detected by the "target SOTA method", the one we want to improve. Consequently, we can achieve a significant performance improvement by using this new training set to start a new round of network training. During this new round of training, the VSOD results of the target SOTA method will be applied as the pseudo training objectives. Our novel learning scheme is simple yet effective, and its semi-supervised methodology may have large potential to inspire the VSOD community in the future.
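The selection step described in this abstract can be pictured with a minimal Python sketch. It is an illustration only, not the authors' implementation: the `Frame` fields, the scalar `motion_quality` score, and the `threshold` value are all hypothetical, since the abstract does not say how motion quality is measured.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    frame_id: int
    motion_quality: float  # hypothetical score in [0, 1]; not defined in the abstract
    pseudo_label: str      # stand-in for the target SOTA method's saliency output


def build_pseudo_training_set(
    frames: List[Frame], threshold: float = 0.5
) -> List[Tuple[int, str]]:
    """Keep frames whose motion quality passes the (assumed) threshold and
    pair each one with the target method's output as a pseudo training objective."""
    return [
        (f.frame_id, f.pseudo_label)
        for f in frames
        if f.motion_quality >= threshold
    ]


# Illustrative data only; none of these values come from the paper.
test_frames = [
    Frame(0, 0.9, "mask_0"),
    Frame(1, 0.2, "mask_1"),
    Frame(2, 0.7, "mask_2"),
]
new_training_set = build_pseudo_training_set(test_frames, threshold=0.5)
```

The resulting pairs would then feed a new round of training for the target method, as the abstract describes.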

preprint, 2020, arXiv

ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data

In this paper, we introduce a new vision-language pre-trained model, ImageBERT, for image-text joint embedding. Our model is a Transformer-based model, which takes different modalities as input and models the relationship between them. The model is pre-trained on four tasks simultaneously: Masked Language Modeling (MLM), Masked Object Classification (MOC), Masked Region Feature Regression (MRFR), and Image Text Matching (ITM). To further enhance the pre-training quality, we have collected a Large-scale weAk-supervised Image-Text (LAIT) dataset from the Web. We first pre-train the model on this dataset, then conduct a second stage of pre-training on Conceptual Captions and SBU Captions. Our experiments show that a multi-stage pre-training strategy outperforms single-stage pre-training. We also fine-tune and evaluate our pre-trained ImageBERT model on image retrieval and text retrieval tasks, and achieve new state-of-the-art results on both MSCOCO and Flickr30k datasets.

preprint, 2020, arXiv

Structures and Properties of $β$-Titanium Doping Trace Transition Metal Elements: a Density Functional Theory Study

We systematically calculate the structure, formation enthalpy, formation free energy, elastic constants and electronic structure of the Ti$_{0.98}$X$_{0.02}$ system by density functional theory (DFT) simulations to explore the effect of transition metal X (X=Ag, Cd, Co, Cr, Cu, Fe, Mn, Mo, Nb, Ni, Pd, Rh, Ru, Tc, and Zn) on the stability mechanism of $β$-titanium. Based on our calculations, the results for formation enthalpy and free energy show that adding trace X is beneficial to the thermodynamic stability of $β$-titanium. This behavior is well explained by the density of states (DOS). However, the tetragonal shear moduli of the Ti$_{0.98}$X$_{0.02}$ systems are negative, indicating that $β$-titanium doped with a low concentration of X is still elastically unstable at 0 K. Therefore, we theoretically explain that $β$-titanium doped with trace transition metal X is unstable in the ground state.
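The link between a negative tetragonal shear modulus and elastic instability follows from the standard Born stability criteria for a cubic crystal, where the tetragonal shear modulus is C' = (C11 - C12)/2. The Python sketch below checks those criteria. The function names and the numerical elastic constants are hypothetical and chosen only for illustration; they are not values reported in the paper.

```python
def tetragonal_shear_modulus(c11: float, c12: float) -> float:
    """Tetragonal shear modulus C' = (C11 - C12) / 2 of a cubic crystal."""
    return 0.5 * (c11 - c12)


def is_born_stable_cubic(c11: float, c12: float, c44: float) -> bool:
    """Born elastic stability criteria for a cubic crystal:
    C11 - C12 > 0, C11 + 2*C12 > 0, and C44 > 0."""
    return (c11 - c12 > 0) and (c11 + 2 * c12 > 0) and (c44 > 0)


# Hypothetical elastic constants in GPa, for illustration only.
c11, c12, c44 = 90.0, 115.0, 40.0
c_prime = tetragonal_shear_modulus(c11, c12)
stable = is_born_stable_cubic(c11, c12, c44)
```

With C11 < C12, C' is negative and the first criterion fails, which is the kind of elastic instability the abstract reports at 0 K.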