Source author record

Jia Song

Jia Song appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Computer Vision, cond-mat.mes-hall, cond-mat.mtrl-sci, cond-mat.supr-con, Machine Learning, physics.comp-ph, physics.soc-ph

Catalog footprint

What is connected

6 works

7 topics

4 close collaborators

Preprint · 2026 · arXiv

GenCAMO: Scene-Graph Contextual Decoupling for Environment-aware and Mask-free Camouflage Image-Dense Annotation Generation

Conceal dense prediction (CDP), especially RGB-D camouflage object detection and open-vocabulary camouflage object segmentation, plays a crucial role in advancing the understanding and reasoning of complex camouflage scenes. However, high-quality and large-scale camouflage datasets with dense annotation remain scarce due to expensive data collection and labeling costs. To address this challenge, we explore leveraging generative models to synthesize realistic camouflage image-dense data for training CDP models with fine-grained representations, prior knowledge, and auxiliary reasoning. Concretely, our contributions are threefold: (i) we introduce GenCAMO-DB, a large-scale camouflage dataset with multi-modal annotations, including depth maps, scene graphs, attribute descriptions, and text prompts; (ii) we present GenCAMO, an environment-aware and mask-free generative framework that produces high-fidelity camouflage image-dense annotations; (iii) extensive experiments across multiple modalities demonstrate that GenCAMO significantly improves dense prediction performance on complex camouflage scenes by providing high-quality synthetic data. The code and datasets will be released after paper acceptance.

Preprint · 2022 · arXiv

Signatures of superconducting triplet pairing in Ni-Ga-bilayer junctions

Ni-Ga bilayers are a versatile platform for exploring the competition between strongly antagonistic ferromagnetic and superconducting phases. We characterize the impact of this competition on the transport properties of highly-ballistic Al/Al2O3(/EuS)/Ni-Ga tunnel junctions from both experimental and theoretical points of view. While the conductance spectra of junctions comprising Ni (3 nm)-Ga (60 nm) bilayers can be well understood within the framework of earlier results, which associate the emerging main conductance maxima with the junction films' superconducting gaps, thinner Ni (1.6 nm)-Ga (30 nm) bilayers entail completely different physics, and give rise to novel large-bias (when compared to the superconducting gap of the thin Al film as a reference) conductance-peak subseries that we term conductance shoulders. These conductance shoulders might also attract considerable attention in similar magnetic superconducting bilayer junctions, as we predict them to offer an experimentally well-accessible transport signature of superconducting triplet pairings that are induced around the interface of the Ni-Ga bilayer. We further substantiate this claim by performing complementary polarized neutron reflectometry measurements on the bilayers, from which we (1) deduce a nonuniform magnetization structure in Ga in a several-nanometer-thick area around the Ni-Ga boundary and (2) find that the obtained data can be fitted satisfactorily only when considering the paramagnetic Meissner response scenario. While the latter provides independent experimental evidence of induced triplet superconductivity inside the Ni-Ga bilayer, the former might serve as the first experimental hint of its potential microscopic physical origin.

Preprint · 2020 · arXiv

A Novel Video Salient Object Detection Method via Semi-supervised Motion Quality Perception

Previous video salient object detection (VSOD) approaches have mainly focused on designing fancy networks to achieve their performance improvements. However, with the recent slow-down in the development of deep learning techniques, it may become more and more difficult to anticipate another breakthrough via fancy networks alone. To this end, this paper proposes a universal learning scheme to get a further 3% performance improvement for all state-of-the-art (SOTA) methods. The major highlight of our method is that we resort to "motion quality", a brand new concept, to select a sub-group of video frames from the original testing set to construct a new training set. The selected frames in this new training set should all contain high-quality motions, in which the salient objects have a large probability of being successfully detected by the "target SOTA method", i.e., the one we want to improve. Consequently, we can achieve a significant performance improvement by using this new training set to start a new round of network training. During this new round of training, the VSOD results of the target SOTA method are applied as the pseudo training objectives. Our novel learning scheme is simple yet effective, and its semi-supervised methodology may have large potential to inspire the VSOD community in the future.

Preprint · 2020 · arXiv

ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data

In this paper, we introduce a new vision-language pre-trained model, ImageBERT, for image-text joint embedding. Our model is a Transformer-based model, which takes different modalities as input and models the relationship between them. The model is pre-trained on four tasks simultaneously: Masked Language Modeling (MLM), Masked Object Classification (MOC), Masked Region Feature Regression (MRFR), and Image Text Matching (ITM). To further enhance the pre-training quality, we have collected a Large-scale weAk-supervised Image-Text (LAIT) dataset from the Web. We first pre-train the model on this dataset, then conduct a second stage of pre-training on Conceptual Captions and SBU Captions. Our experiments show that a multi-stage pre-training strategy outperforms single-stage pre-training. We also fine-tune and evaluate our pre-trained ImageBERT model on image retrieval and text retrieval tasks, and achieve new state-of-the-art results on both the MSCOCO and Flickr30k datasets.

Preprint · 2020 · arXiv

Structures and Properties of $β$-Titanium Doping Trace Transition Metal Elements: a Density Functional Theory Study

We systematically calculate the structure, formation enthalpy, formation free energy, elastic constants and electronic structure of the Ti$_{0.98}$X$_{0.02}$ system by density functional theory (DFT) simulations to explore the effect of transition metal X (X=Ag, Cd, Co, Cr, Cu, Fe, Mn, Mo, Nb, Ni, Pd, Rh, Ru, Tc, and Zn) on the stability mechanism of $β$-titanium. Based on our calculations, the results for formation enthalpy and free energy show that adding trace X is beneficial to the thermodynamic stability of $β$-titanium. This behavior is well explained by the density of states (DOS). However, the tetragonal shear moduli of the Ti$_{0.98}$X$_{0.02}$ systems are negative, indicating that $β$-titanium doped with a low concentration of X is still elastically unstable at 0 K. Therefore, we theoretically explain that $β$-titanium doped with trace transition metal X is unstable in the ground state.
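
As a rough illustration of the elastic-stability argument in the abstract above, the sketch below checks the standard Born criteria for a cubic crystal, where the tetragonal shear modulus is C' = (C11 - C12)/2 and a negative C' signals elastic instability. The function names and the numeric elastic constants are hypothetical and chosen only for demonstration; they are not values from the paper.

```python
def tetragonal_shear_modulus(c11: float, c12: float) -> float:
    """Return C' = (C11 - C12) / 2 for a cubic crystal (same units as the inputs)."""
    return (c11 - c12) / 2.0


def is_cubic_elastically_stable(c11: float, c12: float, c44: float) -> bool:
    """Check the Born stability criteria for a cubic crystal.

    A cubic lattice is elastically stable when all three conditions hold:
    C11 - C12 > 0, C11 + 2*C12 > 0, and C44 > 0.
    """
    return (c11 - c12) > 0 and (c11 + 2.0 * c12) > 0 and c44 > 0


# Hypothetical elastic constants in GPa, for illustration only.
unstable_example = {"c11": 100.0, "c12": 120.0, "c44": 40.0}
stable_example = {"c11": 160.0, "c12": 120.0, "c44": 40.0}

for label, c in (("unstable", unstable_example), ("stable", stable_example)):
    c_prime = tetragonal_shear_modulus(c["c11"], c["c12"])
    stable = is_cubic_elastically_stable(c["c11"], c["c12"], c["c44"])
    print(f"{label}: C' = {c_prime:.1f} GPa, Born-stable = {stable}")
```

In this sketch a negative C' makes the first Born condition fail, which is the sense in which the abstract calls the doped lattice elastically unstable at 0 K.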

Preprint · 2016 · arXiv

Towards Understanding What Contributes to Forming an Opinion

The mechanism of opinion evolution can be captured by physics modeling approaches. In this context, a kinetic equation is established by defining a generalized displacement (cognitive-level), a driving force and the related generalized potential, information quantity, altitude. It is shown that the details of opinion evolution depend on the type of the driving force: self-dominated driving or environment-dominated driving. In the former case, the participants can have their altitudes changed in the process of competition between the self-driving force and the environment-driving force. In the latter case, all of the participants are pulled by the environment. Some regularity behind the dynamics of opinion is also revealed; for instance, the information entropy decays with time in a special way. The results may help us gain a deeper understanding of the formation of public opinion.
