Researcher profile

Cong Sun

Cong Sun contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 19 - UnverifiedVerification L1Unclaimed author
5works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

5 published item(s)

preprint2022arXiv

$μ$Dep: Mutation-based Dependency Generation for Precise Taint Analysis on Android Native Code

The existence of native code in Android apps plays an important role in triggering inconspicuous propagation of secrets and circumventing malware detection. However, the state-of-the-art information-flow analysis tools for Android apps all have limited capabilities of analyzing native code. Due to the complexity of binary-level static analysis, most static analyzers choose to build conservative models for a selected portion of native code. Though the recent inter-language analysis improves the capability of tracking information flow in native code, it is still far from attaining similar effectiveness of the state-of-the-art information-flow analyzers that focus on non-native Java methods. To overcome the above constraints, we propose a new analysis framework, $μ$Dep, to detect sensitive information flows of the Android apps containing native code. In this framework, we combine a control-flow based static binary analysis with a mutation-based dynamic analysis to model the tainting behaviors of native code in the apps. Based on the result of the analyses, $μ$Dep conducts a stub generation for the related native functions to facilitate the state-of-the-art analyzer DroidSafe with fine-grained tainting behavior summaries of native code. The experimental results show that our framework is competitive on the accuracy, and effective in analyzing the information flows in real-world apps and malware compared with the state-of-the-art inter-language static analysis.

preprint2022arXiv

A microstructure estimation Transformer inspired by sparse representation for diffusion MRI

Diffusion magnetic resonance imaging (dMRI) is an important tool in characterizing tissue microstructure based on biophysical models, which are complex and highly non-linear. Resolving microstructures with optimization techniques is prone to estimation errors and requires dense sampling in the q-space. Deep learning based approaches have been proposed to overcome these limitations. Motivated by the superior performance of the Transformer, in this work, we present a learning-based framework based on Transformer, namely, a Microstructure Estimation Transformer with Sparse Coding (METSC) for dMRI-based microstructure estimation with downsampled q-space data. To take advantage of the Transformer while addressing its limitation in large training data requirements, we explicitly introduce an inductive bias - model bias into the Transformer using a sparse coding technique to facilitate the training process. Thus, the METSC is composed with three stages, an embedding stage, a sparse representation stage, and a mapping stage. The embedding stage is a Transformer-based structure that encodes the signal to ensure the voxel is represented effectively. In the sparse representation stage, a dictionary is constructed by solving a sparse reconstruction problem that unfolds the Iterative Hard Thresholding (IHT) process. The mapping stage is essentially a decoder that computes the microstructural parameters from the output of the second stage, based on the weighted sum of normalized dictionary coefficients where the weights are also learned. We tested our framework on two dMRI models with downsampled q-space data, including the intravoxel incoherent motion (IVIM) model and the neurite orientation dispersion and density imaging (NODDI) model. The proposed method achieved up to 11.25 folds of acceleration in scan time and outperformed the other state-of-the-art learning-based methods.

preprint2022arXiv

DeepCatra: Learning Flow- and Graph-based Behaviors for Android Malware Detection

As Android malware is growing and evolving, deep learning has been introduced into malware detection, resulting in great effectiveness. Recent work is considering hybrid models and multi-view learning. However, they use only simple features, limiting the accuracy of these approaches in practice. In this paper, we propose DeepCatra, a multi-view learning approach for Android malware detection, whose model consists of a bidirectional LSTM (BiLSTM) and a graph neural network (GNN) as subnets. The two subnets rely on features extracted from statically computed call traces leading to critical APIs derived from public vulnerabilities. For each Android app, DeepCatra first constructs its call graph and computes call traces reaching critical APIs. Then, temporal opcode features used by the BiLSTM subnet are extracted from the call traces, while flow graph features used by the GNN subnet are constructed from all the call traces and inter-component communications. We evaluate the effectiveness of DeepCatra by comparing it with several state-of-the-art detection approaches. Experimental results on over 18,000 real-world apps and prevalent malware show that DeepCatra achieves considerable improvement, e.g., 2.7% to 14.6% on F1-measure, which demonstrates the feasibility of DeepCatra in practice.

preprint2022arXiv

Resource allocation for reconfigurable intelligent surface aided broadcast channels

A two-user downlink network aided by a reconfigurable intelligent surface is considered. The weighted sum signal to interference plus noise ratio maximization and the sum rate maximization models are presented, where the precoding vectors and the RIS matrix are jointly optimized. Since the optimization problem is non-convex and difficult, new approximation models are proposed. The upper bounds of the corresponding objective functions are derived and maximized. Two new algorithms based on the alternating direction method of multiplier are proposed. It is proved that the proposed algorithms converge to the KKT points of the approximation models as long as the iteration points converge. Simulation results show the good performances of the proposed models compared to state of the art algorithms.

preprint2020arXiv

Chemical-protein Interaction Extraction via Gaussian Probability Distribution and External Biomedical Knowledge

Motivation: The biomedical literature contains a wealth of chemical-protein interactions (CPIs). Automatically extracting CPIs described in biomedical literature is essential for drug discovery, precision medicine, as well as basic biomedical research. Most existing methods focus only on the sentence sequence to identify these CPIs. However, the local structure of sentences and external biomedical knowledge also contain valuable information. Effective use of such information may improve the performance of CPI extraction. Results: In this paper, we propose a novel neural network-based approach to improve CPI extraction. Specifically, the approach first employs BERT to generate high-quality contextual representations of the title sequence, instance sequence, and knowledge sequence. Then, the Gaussian probability distribution is introduced to capture the local structure of the instance. Meanwhile, the attention mechanism is applied to fuse the title information and biomedical knowledge, respectively. Finally, the related representations are concatenated and fed into the softmax function to extract CPIs. We evaluate our proposed model on the CHEMPROT corpus. Our proposed model is superior in performance as compared with other state-of-the-art models. The experimental results show that the Gaussian probability distribution and external knowledge are complementary to each other. Integrating them can effectively improve the CPI extraction performance. Furthermore, the Gaussian probability distribution can effectively improve the extraction performance of sentences with overlapping relations in biomedical relation extraction tasks. Availability: Data and code are available at https://github.com/CongSun-dlut/CPI_extraction. Contact: yangzh@dlut.edu.cn, wangleibihami@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.