Researcher profile

Song Yang

Song Yang contributes to research discovery and scholarly infrastructure.

Researcher; affiliation not imported; open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging; Verification L1; Unclaimed author
9 works
0 followers
13 topics
4 close collaborators

Published work

9 published items

Preprint, 2023, arXiv

Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

Temporal sentence grounding (TSG) aims to identify the temporal boundary of a specific segment from an untrimmed video by a sentence query. All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning. However, we argue that these methods have overlooked two indispensable issues: 1) Boundary-bias: The annotated target segment generally refers to two specific frames as the corresponding start and end timestamps. The video downsampling process may lose these two frames and take the adjacent irrelevant frames as new boundaries. 2) Reasoning-bias: Such incorrect new boundary frames also lead to reasoning bias during frame-query interaction, reducing the generalization ability of the model. To alleviate the above limitations, in this paper, we propose a novel Siamese Sampling and Reasoning Network (SSRN) for TSG, which introduces a siamese sampling mechanism to generate additional contextual frames to enrich and refine the new boundaries. Specifically, a reasoning strategy is developed to learn the inter-relationship among these frames and generate soft labels on boundaries for more accurate frame-query reasoning. Such a mechanism is also able to supplement the absent consecutive visual semantics to the sampled sparse frames for fine-grained activity understanding. Extensive experiments demonstrate the effectiveness of SSRN on three challenging datasets.

Preprint, 2022, arXiv

Bott-Chern hypercohomology and bimeromorphic invariants

The aim of this article is to study the geometry of Bott-Chern hypercohomology from the bimeromorphic point of view. We construct some new bimeromorphic invariants involving the cohomology for the sheaf of germs of pluriharmonic functions, the truncated holomorphic de Rham cohomology, and the de Rham cohomology. To define these invariants, using a sheaf-theoretic approach, we establish a blow-up formula together with a canonical morphism for the Bott-Chern hypercohomology. In particular, we compute the invariants of some compact complex threefolds, such as Iwasawa manifolds and quintic threefolds.

Preprint, 2022, arXiv

Hodge cohomology on blow-ups along subvarieties

We establish a blow-up formula for Hodge cohomology of locally free sheaves on smooth proper varieties over an algebraically closed field of positive characteristic. For this, we introduce a notion of relative Hodge sheaves and study their behavior under blow-ups along smooth centers. In particular, as an application, we study the blow-up invariance of the $E_2$-degeneracy of the Hochschild--Kostant--Rosenberg spectral sequence for smooth proper varieties.

Preprint, 2022, arXiv

Replacing the Framingham-based equation for prediction of cardiovascular disease risk and adverse outcome by using artificial intelligence and retinal imaging

Purpose: To create and evaluate the accuracy of an artificial intelligence deep learning platform (ORAiCLE) capable of using only retinal fundus images to predict both an individual's overall 5-year cardiovascular disease (CVD) risk and the relative contribution of the component risk factors that comprise this risk. Methods: We used 165,907 retinal images from a database of 47,236 patient visits. Initially, each image was paired with biometric data (age, ethnicity, sex, presence and duration of diabetes, and HDL/LDL ratios) as well as any CVD event within 5 years of the retinal image acquisition. A risk score based on Framingham equations was calculated. The real CVD event rate was also determined for the individuals and the overall population. Finally, ORAiCLE was trained using only age, ethnicity, and sex plus retinal images. Results: Compared to the Framingham-based score, ORAiCLE was up to 12% more accurate in predicting cardiovascular events in the next 5 years, especially for the highest-risk group of people. The reliability and accuracy of each of the restrictive models was suboptimal to ORAiCLE's performance, indicating that it was using data from both sets of data to derive its final results. Conclusion: Retinal photographs are inexpensive, and only minimal training is required to acquire them, as fully automated, inexpensive camera systems are now widely available. As such, AI-based CVD risk algorithms such as ORAiCLE promise to make CV health screening more accurate, more affordable and more accessible for all. Furthermore, ORAiCLE's unique ability to assess the relative contribution of the components that comprise an individual's overall risk would inform treatment decisions based on the specific needs of an individual, thereby increasing the likelihood of positive health outcomes.

Preprint, 2022, arXiv

Space Meets Time: Local Spacetime Neural Network For Traffic Flow Forecasting

Traffic flow forecasting is a crucial task in urban computing. The challenge arises as traffic flows often exhibit intrinsic and latent spatio-temporal correlations that cannot be identified by extracting the spatial and temporal patterns of traffic data separately. We argue that such correlations are universal and play a pivotal role in traffic flow. We put forward spacetime interval learning as a paradigm to explicitly capture these correlations through a unified analysis of both spatial and temporal features. Unlike the state-of-the-art methods, which are restricted to a particular road network, we model the universal spatio-temporal correlations that are transferable from city to city. To this end, we propose a new spacetime interval learning framework that constructs a local-spacetime context of a traffic sensor comprising the data from its neighbors within close time points. Based on this idea, we introduce the local spacetime neural network (STNN), which employs a novel spacetime convolution and attention mechanism to learn the universal spatio-temporal correlations. The proposed STNN captures local traffic patterns, which do not depend on a specific network structure. As a result, a trained STNN model can be applied to any unseen traffic network. We evaluate the proposed STNN on two public real-world traffic datasets and a simulated dataset on dynamic networks. The experiment results show that STNN not only improves prediction accuracy by 4% over state-of-the-art methods, but is also effective when the traffic network undergoes dynamic changes and shows superior generalization capability.

Preprint, 2021, arXiv

Mono-elemental saturable absorber in mode-locked fiber laser: A review

Two-dimensional mono-elemental material is an excellent saturable absorber candidate with low saturation intensity, large modulation depth, high nonlinearities, and fast recovery time of excited carriers. Typically, these mono-elemental materials with two-dimensional structure possess a tunable bandgap, from metallic to semiconducting, depending on the number of layers. The successful application of these materials as saturable absorbers has advanced the development of mode-locked fiber lasers. Therefore, this review is intended to provide up-to-date information on the development of mono-elemental saturable absorbers for advances in mode-locked fiber lasers, with emphasis on their material properties, synthesis process and material characterization. Meanwhile, issues and challenges of the reviewed research topic will be highlighted and addressed with several concrete recommendations.

Preprint, 2020, arXiv

Multimodal Learning For Classroom Activity Detection

Classroom activity detection (CAD) focuses on accurately classifying whether the teacher or student is speaking and recording the length of individual utterances during a class. A CAD solution helps teachers get instant feedback on their pedagogical instructions. This greatly improves educators' teaching skills and hence contributes to students' achievement. However, CAD is very challenging because (1) the CAD model needs to generalize well enough across different teachers and students; (2) data from both vocal and language modalities has to be wisely fused so that they can be complementary; and (3) the solution shouldn't heavily rely on additional recording devices. In this paper, we address the above challenges by using a novel attention-based neural framework. Our framework not only extracts both speech and language information, but also utilizes an attention mechanism to capture long-term semantic dependence. Our framework is device-free and is able to take any classroom recording as input. The proposed CAD learning framework is evaluated in two real-world education applications. The experimental results demonstrate the benefits of our approach on learning an attention-based neural network from classroom data with different modalities, and show our approach is able to outperform state-of-the-art baselines in terms of various evaluation metrics.

Preprint, 2019, arXiv

Chaos Phase Induced Mass-producible Monolayer Two-dimensional Material

The crystal phase is well studied and presents a periodic atom arrangement in a three-dimensional lattice, but the "amorphous phase" is poorly understood. Here, by starting from a cage-like bicyclocalix[2]arene[2]triazines building block, a brand-new 2D MOF is constructed with extremely weak interlaminar interaction existing between two adjacent 2D-crystal layers. Inter-layer slip happens under external disturbance and leads to the loss of periodicity in one dimension of the crystal lattice, resulting in an interim phase between the crystal and amorphous phase - the chaos phase, non-periodic at the microscopic scale but orderly at the mesoscopic scale. This chaos phase 2D MOF is a disordered self-assembly of black-phosphorus-like 3D layers, which have excellent mechanical strength and a thickness of 1.15 nm. The bulk 2D-MOF material can readily be exfoliated into monolayer nanosheets at gram scale with unprecedented evenness and homogeneity, as well as previously unattained lateral size (>10 μm), which represent the first mass-producible monolayer 2D material and can form wafer-scale films on a substrate.

Preprint, 2018, arXiv

Bott-Chern blow-up formula and bimeromorphic invariance of the $\partial\bar{\partial}$-Lemma for threefolds

The purpose of this paper is to study the bimeromorphic invariants of compact complex manifolds in terms of Bott-Chern cohomology. We prove a blow-up formula for Bott-Chern cohomology. As an application, we show that for compact complex threefolds the non-Kählerness degrees, introduced by Angella-Tomassini [Invent. Math. 192, (2013), 71-81], are bimeromorphic invariants. Consequently, the $\partial\bar{\partial}$-Lemma on threefolds is a bimeromorphic invariant.