Researcher profile

Le Yu

Le Yu contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
7works
0followers
7topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

7 published item(s)

preprint2022arXiv

Heterogeneous Graph Representation Learning with Relation Awareness

Representation learning on heterogeneous graphs aims to obtain meaningful node representations to facilitate various downstream tasks, such as node classification and link prediction. Existing heterogeneous graph learning methods are primarily developed by following the propagation mechanism of node representations. There are few efforts on studying the role of relations for improving the learning of more fine-grained node representations. Indeed, it is important to collaboratively learn the semantic representations of relations and discern node representations with respect to different relation types. To this end, in this paper, we propose a novel Relation-aware Heterogeneous Graph Neural Network, namely R-HGNN, to learn node representations on heterogeneous graphs at a fine-grained level by considering relation-aware characteristics. Specifically, a dedicated graph convolution component is first designed to learn unique node representations from each relation-specific graph separately. Then, a cross-relation message passing module is developed to improve the interactions of node representations across different relations. Also, the relation representations are learned in a layer-wise manner to capture relation semantics, which are used to guide the node representation learning process. Moreover, a semantic fusing module is presented to aggregate relation-aware node representations into a compact representation with the learned relation representations. Finally, we conduct extensive experiments on a variety of graph learning tasks, and experimental results demonstrate that our approach consistently outperforms existing methods among all the tasks.

preprint2020arXiv

A Structured Approach to the Analysis of Remote Sensing Images

The number of studies for the analysis of remote sensing images has been growing exponentially in the last decades. Many studies, however, only report results---in the form of certain performance metrics---by a few selected algorithms on a training and testing sample. While this often yields valuable insights, it tells little about some important aspects. For example, one might be interested in understanding the nature of a study by the interaction of algorithm, features, and the sample as these collectively contribute to the outcome; among these three, which would be a more productive direction in improving a study; how to assess the sample quality or the value of a set of features etc. With a focus on land-use classification, we advocate the use of a structured analysis. The output of a study is viewed as the result of the interplay among three input dimensions: feature, sample, and algorithm. Similarly, another dimension, the error, can be decomposed into error along each input dimension. Such a structural decomposition of the inputs or error could help better understand the nature of the problem and potentially suggest directions for improvement. We use the analysis of a remote sensing image at a study site in Guangzhou, China, to demonstrate how such a structured analysis could be carried out and what insights it generates. The structured analysis could be applied to a new study, or as a diagnosis to an existing one. We expect this will inform practice in the analysis of remote sensing images, and help advance the state-of-the-art of land-use classification.

preprint2020arXiv

An Empirical Evaluation of GDPR Compliance Violations in Android mHealth Apps

The purpose of the General Data Protection Regulation (GDPR) is to provide improved privacy protection. If an app controls personal data from users, it needs to be compliant with GDPR. However, GDPR lists general rules rather than exact step-by-step guidelines about how to develop an app that fulfills the requirements. Therefore, there may exist GDPR compliance violations in existing apps, which would pose severe privacy threats to app users. In this paper, we take mobile health applications (mHealth apps) as a peephole to examine the status quo of GDPR compliance in Android apps. We first propose an automated system, named \mytool, to bridge the semantic gap between the general rules of GDPR and the app implementations by identifying the data practices declared in the app privacy policy and the data relevant behaviors in the app code. Then, based on \mytool, we detect three kinds of GDPR compliance violations, including the incompleteness of privacy policy, the inconsistency of data collections, and the insecurity of data transmission. We perform an empirical evaluation of 796 mHealth apps. The results reveal that 189 (23.7\%) of them do not provide complete privacy policies. Moreover, 59 apps collect sensitive data through different measures, but 46 (77.9\%) of them contain at least one inconsistent collection behavior. Even worse, among the 59 apps, only 8 apps try to ensure the transmission security of collected data. However, all of them contain at least one encryption or SSL misuse. Our work exposes severe privacy issues to raise awareness of privacy protection for app users and developers.

preprint2020arXiv

Cross-regional oil palm tree counting and detection via multi-level attention domain adaptation network

Providing an accurate evaluation of palm tree plantation in a large region can bring meaningful impacts in both economic and ecological aspects. However, the enormous spatial scale and the variety of geological features across regions has made it a grand challenge with limited solutions based on manual human monitoring efforts. Although deep learning based algorithms have demonstrated potential in forming an automated approach in recent years, the labelling efforts needed for covering different features in different regions largely constrain its effectiveness in large-scale problems. In this paper, we propose a novel domain adaptive oil palm tree detection method, i.e., a Multi-level Attention Domain Adaptation Network (MADAN) to reap cross-regional oil palm tree counting and detection. MADAN consists of 4 procedures: First, we adopted a batch-instance normalization network (BIN) based feature extractor for improving the generalization ability of the model, integrating batch normalization and instance normalization. Second, we embedded a multi-level attention mechanism (MLA) into our architecture for enhancing the transferability, including a feature level attention and an entropy level attention. Then we designed a minimum entropy regularization (MER) to increase the confidence of the classifier predictions through assigning the entropy level attention value to the entropy penalty. Finally, we employed a sliding window-based prediction and an IOU based post-processing approach to attain the final detection results. We conducted comprehensive ablation experiments using three different satellite images of large-scale oil palm plantation area with six transfer tasks. MADAN improves the detection accuracy by 14.98% in terms of average F1-score compared with the Baseline method (without DA), and performs 3.55%-14.49% better than existing domain adaptation methods.

preprint2020arXiv

Hybrid Micro/Macro Level Convolution for Heterogeneous Graph Learning

Heterogeneous graphs are pervasive in practical scenarios, where each graph consists of multiple types of nodes and edges. Representation learning on heterogeneous graphs aims to obtain low-dimensional node representations that could preserve both node attributes and relation information. However, most of the existing graph convolution approaches were designed for homogeneous graphs, and therefore cannot handle heterogeneous graphs. Some recent methods designed for heterogeneous graphs are also faced with several issues, including the insufficient utilization of heterogeneous properties, structural information loss, and lack of interpretability. In this paper, we propose HGConv, a novel Heterogeneous Graph Convolution approach, to learn comprehensive node representations on heterogeneous graphs with a hybrid micro/macro level convolutional operation. Different from existing methods, HGConv could perform convolutions on the intrinsic structure of heterogeneous graphs directly at both micro and macro levels: A micro-level convolution to learn the importance of nodes within the same relation, and a macro-level convolution to distinguish the subtle difference across different relations. The hybrid strategy enables HGConv to fully leverage heterogeneous information with proper interpretability. Moreover, a weighted residual connection is designed to aggregate both inherent attributes and neighbor information of the focal node adaptively. Extensive experiments on various tasks demonstrate not only the superiority of HGConv over existing methods, but also the intuitive interpretability of our approach for graph analysis.

preprint2020arXiv

Predicting Temporal Sets with Deep Neural Networks

Given a sequence of sets, where each set contains an arbitrary number of elements, the problem of temporal sets prediction aims to predict the elements in the subsequent set. In practice, temporal sets prediction is much more complex than predictive modelling of temporal events and time series, and is still an open problem. Many possible existing methods, if adapted for the problem of temporal sets prediction, usually follow a two-step strategy by first projecting temporal sets into latent representations and then learning a predictive model with the latent representations. The two-step approach often leads to information loss and unsatisfactory prediction performance. In this paper, we propose an integrated solution based on the deep neural networks for temporal sets prediction. A unique perspective of our approach is to learn element relationship by constructing set-level co-occurrence graph and then perform graph convolutions on the dynamic relationship graphs. Moreover, we design an attention-based module to adaptively learn the temporal dependency of elements and sets. Finally, we provide a gated updating mechanism to find the hidden shared patterns in different sequences and fuse both static and dynamic information to improve the prediction performance. Experiments on real-world data sets demonstrate that our approach can achieve competitive performances even with a portion of the training data and can outperform existing methods with a significant margin.

preprint2020arXiv

STAN: Towards Describing Bytecodes of Smart Contract

More than eight million smart contracts have been deployed into Ethereum, which is the most popular blockchain that supports smart contract. However, less than 1% of deployed smart contracts are open-source, and it is difficult for users to understand the functionality and internal mechanism of those closed-source contracts. Although a few decompilers for smart contracts have been recently proposed, it is still not easy for users to grasp the semantic information of the contract, not to mention the potential misleading due to decompilation errors. In this paper, we propose the first system named STAN to generate descriptions for the bytecodes of smart contracts to help users comprehend them. In particular, for each interface in a smart contract, STAN can generate four categories of descriptions, including functionality description, usage description, behavior description, and payment description, by leveraging symbolic execution and NLP (Natural Language Processing) techniques. Extensive experiments show that STAN can generate adequate, accurate, and readable descriptions for contract's bytecodes, which have practical value for users.