Researcher profile

Bokai Cao

Bokai Cao contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 15 - Baseline
3 works
0 followers
7 topics
4 close collaborators

Published work

3 published items

preprint | 2015 | arXiv

A review of heterogeneous data mining for brain disorders

With rapid advances in neuroimaging techniques, research on brain disorder identification has become an emerging area in the data mining community. Brain disorder data poses many unique challenges for data mining research. For example, the raw data generated by neuroimaging experiments is in tensor representations, with typical characteristics of high dimensionality, structural complexity and nonlinear separability. Furthermore, brain connectivity networks can be constructed from the tensor data, embedding subtle interactions between brain regions. Other clinical measures are usually available, reflecting the disease status from different perspectives. It is expected that integrating complementary information in the tensor data and the brain network data, and incorporating other clinical parameters, will be potentially transformative for investigating disease mechanisms and for informing therapeutic interventions. Many research efforts have been devoted to this area. They have achieved great success in various applications, such as tensor-based modeling, subgraph pattern mining, and multi-view feature analysis. In this paper, we review some recent data mining methods that are used for analyzing brain disorders.

preprint | 2015 | arXiv

Mining Brain Networks using Multiple Side Views for Neurological Disorder Identification

Mining discriminative subgraph patterns from graph data has attracted great interest in recent years. It has a wide variety of applications in disease diagnosis, neuroimaging, etc. Most research on subgraph mining focuses on the graph representation alone. However, in many real-world applications, side information is available along with the graph data. For example, for neurological disorder identification, in addition to the brain networks derived from neuroimaging data, hundreds of clinical, immunologic, serologic and cognitive measures may also be documented for each subject. These measures compose multiple side views encoding a tremendous amount of supplemental information for diagnostic purposes, yet are often ignored. In this paper, we study the problem of discriminative subgraph selection using multiple side views and propose a novel solution to find an optimal set of subgraph features for graph classification by exploring a plurality of side views. We derive a feature evaluation criterion, named gSide, to estimate the usefulness of subgraph patterns based upon side views. Then we develop a branch-and-bound algorithm, called gMSV, to efficiently search for optimal subgraph features by integrating the subgraph mining process and the procedure of discriminative feature selection. Empirical studies on graph classification tasks for neurological disorders using brain networks demonstrate that subgraph patterns selected by the multi-side-view guided subgraph selection approach can effectively boost graph classification performance and are relevant to disease diagnosis.

preprint | 2013 | arXiv

Meta Path-Based Collective Classification in Heterogeneous Information Networks

Collective classification has been intensively studied due to its impact in many important applications, such as web mining, bioinformatics and citation analysis. Collective classification approaches exploit the dependencies of a group of linked objects whose class labels are correlated and need to be predicted simultaneously. In this paper, we focus on studying the collective classification problem in heterogeneous networks, which involve multiple types of data objects interconnected by multiple types of links. Intuitively, two objects are correlated if they are linked by many paths in the network. However, most existing approaches measure the dependencies among objects through direct links or indirect links without considering the different semantic meanings behind different paths. In this paper, we study the collective classification problem that is defined among the same type of objects in heterogeneous networks. Moreover, by considering different linkage paths in the network, one can capture the subtlety of different types of dependencies among objects. We introduce the concept of meta-path based dependencies among objects, where a meta path is a path consisting of a certain sequence of link types. We show that the quality of collective classification results strongly depends upon the meta paths used. To accommodate the large network size, a novel solution, called HCC (meta-path based Heterogeneous Collective Classification), is developed to effectively assign labels to a group of instances that are interconnected through different meta-paths. The proposed HCC model can capture different types of dependencies among objects with respect to different meta paths. Empirical studies on real-world networks demonstrate the effectiveness of the proposed meta path-based collective classification approach.