Source author record

Pan Hui

Pan Hui appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Catalog footprint

What is connected

55 works

30 topics

4 close collaborators

preprint · 2023 · arXiv

A Twitter Dataset for Pakistani Political Discourse

We share the largest dataset for the Pakistani Twittersphere, consisting of over 49 million tweets collected during one of the most politically active periods in the country. We collect the data after the deposition of the government by a No Confidence Vote in April 2022. This large-scale dataset can be used for several downstream tasks such as studying political bias, bot detection, trolling behavior, mis- and disinformation, and censorship related to Pakistani Twitter users. In addition, this dataset provides a large collection of tweets in Urdu and Roman Urdu that can be used for optimizing language processing tasks.

preprint · 2023 · arXiv

FRAS: Federated Reinforcement Learning empowered Adaptive Point Cloud Video Streaming

Point cloud video transmission is challenging due to high encoding/decoding complexity, high video bitrate, and low-latency requirements. Consequently, conventional adaptive streaming methods often fall short in three respects: 1) current algorithms reuse existing quality of experience (QoE) definitions while overlooking the unique features of point cloud video, thus failing to provide optimal user experience; 2) most deep learning approaches require long-span data collection to learn sufficiently varied network conditions, resulting in long training periods and capacity occupation; 3) cloud training approaches pose privacy risks caused by leakage of user-reported service usage and networking conditions. To overcome these limitations, we present FRAS, to the best of our knowledge the first federated reinforcement learning framework for adaptive point cloud video streaming. We define a new QoE model which takes the unique features of point cloud video into account. Each client uses reinforcement learning (RL) to train video quality selection with the objective of optimizing the user's QoE under multiple constraints. Then, a federated learning framework is integrated with the RL algorithm to enhance training performance with privacy preservation. Extensive simulations using real point cloud videos and network traces reveal the superiority of the proposed scheme over baseline schemes. We also implement a prototype that demonstrates the performance of FRAS via real-world tests.
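The abstract does not say how FRAS combines the clients' locally trained RL models. As a rough illustration only, the sketch below shows generic federated averaging (FedAvg-style weighting by local sample count), which is one common aggregation rule and an assumption here, not the paper's confirmed method. The names `federated_average`, `client_weights` and `client_sizes` are hypothetical.

```python
from typing import Dict, List

# Hypothetical representation: each client's model is a dict of flat parameter lists.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights], client_sizes: List[int]) -> Weights:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need at least one client and exactly one size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        averaged[name] = [
            # Weighted sum over clients for parameter i, divided by the total weight.
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two clients share only their parameters, never their raw traces.
clients = [{"policy": [1.0, 2.0]}, {"policy": [3.0, 6.0]}]
global_model = federated_average(clients, client_sizes=[1, 3])
```

Only model parameters cross the network in this scheme, which is the privacy argument the abstract makes for moving training away from the cloud.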

preprint · 2022 · arXiv

A Reddit Dataset for the Russo-Ukrainian Conflict in 2022

Reddit consists of sub-communities that each cover a focused topic. This paper provides a list of relevant subreddits for the ongoing Russo-Ukrainian crisis. We perform an exhaustive subreddit exploration using keyword search and shortlist 12 subreddits as potential candidates that contain nominal discourse related to the crisis. These subreddits contain over 300,000 posts and 8 million comments collectively. We provide an additional categorization of content into two categories, "R-U Conflict" and "Military Related", based on their primary focus. We further perform content characterization of those subreddits. The results show a surge of posts and comments soon after Russia launched the invasion. "Military Related" posts are more likely to receive more replies than "R-U Conflict" posts. Our textual analysis shows an apparent preference for the Pro-Ukraine stance in "R-U Conflict", while "Military Related" retains a neutral stance.

preprint · 2022 · arXiv

AICP: Augmented Informative Cooperative Perception

Connected vehicles, whether equipped with advanced driver-assistance systems or fully autonomous, require human driver supervision and are currently constrained to visual information in their line-of-sight. A cooperative perception system among vehicles increases their situational awareness by extending their perception range. Existing solutions focus on improving perspective transformation and fast information collection. However, such solutions fail to filter out large amounts of less relevant data and thus impose significant network and computation load. Moreover, presenting all this less relevant data can overwhelm the driver and thus actually hinder them. To address such issues, we present Augmented Informative Cooperative Perception (AICP), the first fast-filtering system which optimizes the informativeness of shared data at vehicles to improve the fused presentation. To this end, an informativeness maximization problem is presented for vehicles to select a subset of data to display to their drivers. Specifically, we propose (i) a dedicated system design with custom data structure and lightweight routing protocol for convenient data encapsulation, fast interpretation and transmission, and (ii) a comprehensive problem formulation and efficient fitness-based sorting algorithm to select the most valuable data to display at the application layer. We implement a proof-of-concept prototype of AICP with a bandwidth-hungry, latency-constrained real-life augmented reality application. The prototype adds only 12.6 milliseconds of latency to a current informativeness-unaware system. Next, we test the networking performance of AICP at scale and show that AICP effectively filters out less relevant packets and decreases the channel busy time.
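The abstract names a fitness-based sorting step but gives no details of the fitness function or constraints. The sketch below is therefore only a generic illustration of the idea: rank candidate items by a score and keep the best ones that fit a budget. The `SharedItem` type, the `fitness` score, the byte budget and the sample values are all hypothetical and not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SharedItem:
    name: str
    fitness: float    # hypothetical informativeness score; higher is more valuable
    size_bytes: int   # hypothetical cost of transmitting or displaying the item


def select_by_fitness(items: List[SharedItem], budget_bytes: int) -> List[SharedItem]:
    """Greedy selection: visit items from highest to lowest fitness, keep those that fit."""
    chosen: List[SharedItem] = []
    used = 0
    for item in sorted(items, key=lambda it: it.fitness, reverse=True):
        if used + item.size_bytes <= budget_bytes:
            chosen.append(item)
            used += item.size_bytes
    return chosen


candidates = [
    SharedItem("pedestrian", fitness=0.9, size_bytes=400),
    SharedItem("truck", fitness=0.5, size_bytes=700),
    SharedItem("road sign", fitness=0.2, size_bytes=300),
]
shown = select_by_fitness(candidates, budget_bytes=800)
```

A greedy pass like this is fast, which suits a latency-constrained display path, though it does not guarantee an optimal subset under a budget.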

preprint · 2022 · arXiv

Beyond the Blue Sky of Multimodal Interaction: A Centennial Vision of Interplanetary Virtual Spaces in Turn-based Metaverse

Human habitation across multiple planets requires communication and social connection between planets. When the infrastructure of a deep space network becomes mature, immersive cyberspace, known as the Metaverse, can exchange diversified user data and host multitudinous virtual worlds. Nevertheless, such immersive cyberspace unavoidably encounters latency in minutes, and thus operates in a turn-taking manner. This Blue Sky paper illustrates a vision of an interplanetary Metaverse that connects Earthian and Martian users in a turn-based Metaverse. Accordingly, we briefly discuss several grand challenges to catalyze research initiatives for the 'Digital Big Bang' on Mars.

preprint · 2022 · arXiv

Decentralized, not Dehumanized in the Metaverse: Bringing Utility to NFTs through Multimodal Interaction

User interaction for NFTs (Non-fungible Tokens) is gaining increasing attention. Although NFTs have been traditionally single-use and monolithic, recent applications aim to connect multimodal interaction with human behaviour. This paper reviews the related technological approaches and business practices in NFT art. We highlight that multimodal interaction is a currently under-studied issue in mainstream NFT art, and conjecture that multimodal interaction is a crucial enabler for decentralization in the NFT community. We propose a framework that combines a bottom-up approach with an AI multimodal process. Through this framework, we put forward integrating human behaviour data into generative NFT units, as "multimodal interactive NFTs". Our work displays the possibilities of NFTs in the art world, beyond the traditional 2D and 3D static content.

preprint · 2022 · arXiv

DiOS -- An Extended Reality Operating System for the Metaverse

Driven by the recent improvements in device and network capabilities, Extended Reality (XR) is becoming more pervasive; industry and academia alike envision ambitious projects such as the metaverse. However, XR is still limited by the current architecture of mobile systems. This paper makes the case for an XR-specific operating system (XROS). Such an XROS integrates hardware support, computer vision algorithms, and XR-specific networking as the primitives supporting XR technology. These primitives represent the physical-digital world as a single shared resource among applications. Such an XROS allows for the development of coherent and system-wide interaction and display methods, systematic privacy preservation on sensor data, and performance improvement while simplifying application development.

preprint · 2022 · arXiv

Distributed Vehicular Computing at the Dawn of 5G: a Survey

Recent advances in information technology have revolutionized the automotive industry, paving the way for next-generation smart vehicular mobility. Vehicles, roadside units, and other road users can collaborate to deliver novel services and applications. These services and applications require 1) massive volumes of heterogeneous and continuous data to perceive the environment, 2) reliable and low-latency communication networks, and 3) real-time data processing that provides decision support under application-specific constraints. Addressing such constraints introduces significant challenges for current communication and computing technologies. Relatedly, the fifth generation of cellular networks (5G) was developed to respond to communication challenges by providing low-latency, high-reliability, and high-bandwidth communications. As a major part of 5G, edge computing allows data offloading and computation at the edge of the network, ensuring low latency, context awareness, and 5G efficiency. In this work, we aim at providing a comprehensive overview of the state of research on vehicular computing in the emerging age of 5G and big data.

preprint · 2022 · arXiv

Exploring System Performance of Continual Learning for Mobile and Embedded Sensing Applications

Continual learning approaches help deep neural network models adapt and learn incrementally by trying to solve catastrophic forgetting. However, whether these existing approaches, applied traditionally to image-based tasks, work with the same efficacy on the sequential time series data generated by mobile or embedded sensing systems remains an unanswered question. To address this void, we conduct the first comprehensive empirical study that quantifies the performance of three predominant continual learning schemes (i.e., regularization, replay, and replay with exemplars) on six datasets from three mobile and embedded sensing applications in a range of scenarios having different learning complexities. More specifically, we implement an end-to-end continual learning framework on edge devices. Then we investigate the generalizability and the trade-offs between performance, storage, computational costs, and memory footprint of different continual learning methods. Our findings suggest that replay with exemplars-based schemes such as iCaRL have the best performance trade-offs, even in complex scenarios, at the expense of some storage space (a few MBs) for training examples (1% to 5%). We also demonstrate for the first time that it is feasible and practical to run continual learning on-device with a limited memory budget. In particular, the latency on two types of mobile and embedded devices suggests that both incremental learning time (a few seconds to 4 minutes) and training time (1 to 75 minutes) across datasets are acceptable, as training could happen on the device when the embedded device is charging, thereby ensuring complete data privacy. Finally, we present some guidelines for practitioners who want to apply a continual learning paradigm for mobile sensing tasks.

preprint · 2022 · arXiv

Federated Split GANs

Mobile devices and the immense amount and variety of data they generate are key enablers of machine learning (ML)-based applications. Traditional ML techniques have shifted toward new paradigms such as federated learning (FL) and split learning (SL) to improve the protection of users' data privacy. However, these paradigms often rely on server(s) located in the edge or cloud to train computationally heavy parts of an ML model to avoid draining the limited resources of client devices, which exposes device data to such third parties. This work proposes an alternative approach to train computationally heavy ML models on users' devices themselves, where the corresponding device data resides. Specifically, we focus on GANs (generative adversarial networks) and leverage their inherent privacy-preserving attribute. We train the discriminative part of a GAN with raw data on users' devices, whereas the generative model is trained remotely (e.g., on a server), which does not need access to the true sensor data. Moreover, our approach ensures that the computational load of training the discriminative model is shared among users' devices, in proportion to their computation capabilities, by means of SL. We implement our proposed collaborative training scheme of a computationally heavy GAN model on real resource-constrained devices. The results show that our system preserves data privacy, keeps training time short, and yields the same accuracy as model training on unconstrained devices (e.g., the cloud). Our code can be found at https://github.com/YukariSonz/FSL-GAN

preprint · 2022 · arXiv

HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign Supermask

Federated learning alleviates the privacy risk in distributed learning by transmitting only the local model updates to the central server. However, it faces challenges including statistical heterogeneity of clients' datasets and resource constraints of client devices, which severely impact the training performance and user experience. Prior works have tackled these challenges by combining personalization with model compression schemes including quantization and pruning. However, the pruning is data-dependent and thus must be done on the client side, which incurs considerable computation cost. Moreover, the pruning normally trains a binary supermask $\in \{0, 1\}$, which significantly limits the model capacity yet brings no computation benefit. Consequently, the training requires high computation cost and a long time to converge while the model performance does not pay off. In this work, we propose HideNseek, which employs one-shot data-agnostic pruning at initialization to get a subnetwork based on weights' synaptic saliency. Each client then optimizes a sign supermask $\in \{-1, +1\}$ multiplied by the unpruned weights to allow faster convergence with the same compression rates as state-of-the-art. Empirical results from three datasets demonstrate that, compared to state-of-the-art, HideNseek improves inference accuracy by up to 40.6% while reducing the communication cost and training time by up to 39.7% and 46.8% respectively.
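To make the masking step concrete, here is a minimal sketch of the elementwise product the abstract describes: pruned positions are zeroed, and each surviving weight is multiplied by a sign in {-1, +1}. It assumes flat lists of weights and does not reproduce how HideNseek scores saliency or trains the signs. The function and argument names are hypothetical.

```python
from typing import List


def apply_sign_supermask(
    weights: List[float], keep_mask: List[int], sign_mask: List[int]
) -> List[float]:
    """Return effective weights: zero where pruned, weight times sign where kept."""
    if not (len(weights) == len(keep_mask) == len(sign_mask)):
        raise ValueError("weights and both masks must have the same length")
    effective: List[float] = []
    for w, keep, sign in zip(weights, keep_mask, sign_mask):
        if keep not in (0, 1):
            raise ValueError("keep_mask entries must be 0 or 1")
        if sign not in (-1, 1):
            raise ValueError("sign_mask entries must be -1 or +1")
        # The weight value itself stays fixed; only the sign is a learnable choice.
        effective.append(w * sign if keep else 0.0)
    return effective


# keep_mask would come from one-shot pruning; sign_mask is what each client optimizes.
effective = apply_sign_supermask([0.5, -0.2, 0.8], keep_mask=[1, 0, 1], sign_mask=[-1, 1, 1])
```

A sign needs one bit per kept weight, the same as a binary mask, so this formulation is consistent with the abstract's claim of keeping compression rates unchanged.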

preprint · 2022 · arXiv

Life, the Metaverse and Everything: An Overview of Privacy, Ethics, and Governance in Metaverse

The metaverse is expected to be the next major evolution phase of the internet. The metaverse will have an impact on human society, production, and life. In this work, we analyze the current trends and challenges that building such a virtual environment will face. We focus on three major pillars to guide the development of the metaverse: privacy, governance, and ethical design. Finally, we propose a preliminary modular-based framework for an ethical design of the metaverse.

preprint · 2022 · arXiv

Nebula: Reliable Low-latency Video Transmission for Mobile Cloud Gaming

Mobile cloud gaming enables high-end games on constrained devices by streaming the game content from powerful servers through mobile networks. Mobile networks suffer from highly variable bandwidth, latency, and losses that affect the gaming experience. This paper introduces Nebula, an end-to-end cloud gaming framework to minimize the impact of network conditions on the user experience. Nebula relies on an end-to-end distortion model adapting the video source rate and the amount of frame-level redundancy based on the measured network conditions. As a result, it minimizes the motion-to-photon (MTP) latency while protecting the frames from losses. We fully implement Nebula and evaluate its performance against the state of the art techniques and latest research in real-time mobile cloud gaming transmission on a physical testbed over emulated and real wireless networks. Nebula consistently balances MTP latency (<140 ms) and visual quality (>31 dB) even in highly variable environments. A user experiment confirms that Nebula maximizes the user experience with high perceived video quality, playability, and low user load.

preprint · 2022 · arXiv

Towards Reproducible Evaluations for Flying Drone Controllers in Virtual Environments

Research attention on natural user interfaces (NUIs) for drone flights is rising. Nevertheless, NUIs are highly diversified and primarily evaluated in different physical environments, making performance hard to compare between such solutions. We propose a virtual environment, namely VRFlightSim, enabling comparative evaluations with enriched drone flight details to address this issue. We first replicated a state-of-the-art (SOTA) interface and designed two tasks (crossing and pointing) in our virtual environment. Then, two user studies with 13 participants demonstrate the necessity of VRFlightSim and further highlight the potential of open-data interface designs.

preprint · 2022 · arXiv

Towards User-Centered Metrics for Trustworthy AI in Immersive Cyberspace

AI plays a key role in current cyberspace and future immersive ecosystems that pinpoint user experiences. Thus, the trustworthiness of such AI systems is vital, as failures in these systems can cause serious user harm. Although there are related works exploring trustworthy AI (TAI) metrics in the current cyberspace, ecosystems oriented towards user-centered services, such as the metaverse, are much more complicated in terms of system performance and user experience assessment, thus posing challenges for the applicability of existing approaches. We therefore give an overview of fairness, privacy and robustness, across the historical path from existing approaches. Finally, we propose a research agenda towards systematic yet user-centered TAI in immersive ecosystems.

preprint · 2022 · arXiv

Twitter Dataset for 2022 Russo-Ukrainian Crisis

Online Social Networks (OSNs) play a significant role in information sharing during a crisis. The data collected during such a crisis can reflect the large scale public opinions and sentiment. In addition, OSN data can also be used to study different campaigns that are employed by various entities to engineer public opinions. Such information sharing campaigns can range from spreading factual information to propaganda and misinformation. We provide a Twitter dataset of the 2022 Russo-Ukrainian conflict. In the first release, we share over 1.6 million tweets shared during the 1st week of the crisis.

preprint · 2022 · arXiv

VibroWeight: Simulating Weight and Center of Gravity Changes of Objects in Virtual Reality for Enhanced Realism

Haptic feedback in virtual reality (VR) allows users to perceive the physical properties of virtual objects (e.g., their weight and motion patterns). However, the lack of haptic sensations deteriorates users' immersion and overall experience. In this work, we designed and implemented a low-cost hardware prototype with liquid metal, VibroWeight, which can complement commercial VR handheld controllers. VibroWeight is characterized by bimodal feedback cues in VR, driven by adaptive absolute mass (weights) and gravity shift. To our knowledge, liquid metal is used in a VR haptic device for the first time. Results from our 29 participants show that VibroWeight delivers significantly better VR experiences in realism and comfort.

preprint · 2022 · arXiv

What is the Metaverse? An Immersive Cyberspace and Open Challenges

The Metaverse refers to a virtual-physical blended space in which multiple users can concurrently interact with a unified computer-generated environment and other users, which can be regarded as the next significant milestone of the current cyberspace. This article primarily discusses the development and challenges of the Metaverse. We first briefly describe the development of cyberspace and the necessity of technology enablers. Accordingly, our bottom-up approach highlights three critical technology enablers for the Metaverse: networks, systems, and users. Also, we highlight a number of indispensable issues, under technological and ecosystem perspectives, that build and sustain the Metaverse.

preprint · 2022 · arXiv

When Gamification Spoils Your Learning: A Qualitative Case Study of Gamification Misuse in a Language-Learning App

More and more learning apps like Duolingo are using some form of gamification (e.g., badges, points, and leaderboards) to enhance user learning. However, they are not always successful. Gamification misuse is a phenomenon that occurs when users become too fixated on gamification and get distracted from learning. This undesirable phenomenon wastes users' precious time and negatively impacts their learning performance. However, there has been little research in the literature to understand gamification misuse and inform future gamification designs. Therefore, this paper aims to fill this knowledge gap by conducting the first extensive qualitative research on gamification misuse in a popular learning app called Duolingo. Duolingo is currently the world's most downloaded learning app used to learn languages. This study consists of two phases: (I) a content analysis of data from Duolingo forums (from the past nine years) and (II) semi-structured interviews with 15 international Duolingo users. Our research contributes to the Human-Computer Interaction (HCI) and Learning at Scale (L@S) research communities in three ways: (1) elaborating the ramifications of gamification misuse on user learning, well-being, and ethics, (2) identifying the most common reasons for gamification misuse (e.g., competitiveness, overindulgence in playfulness, and herding), and (3) providing designers with practical suggestions to prevent (or mitigate) the occurrence of gamification misuse in their future designs of gamified learning apps.

preprint · 2021 · arXiv

Characterizing Student Engagement Moods for Dropout Prediction in Question Pool Websites

Problem-Based Learning (PBL) is a popular approach to instruction that supports students to get hands-on training by solving problems. Question Pool websites (QPs) such as LeetCode, Code Chef, and Math Playground help PBL by supplying authentic, diverse, and contextualized questions to students. Nonetheless, empirical findings suggest that 40% to 80% of students registered in QPs drop out in less than two months. This research is the first attempt to understand and predict student dropouts from QPs via exploiting students' engagement moods. Adopting a data-driven approach, we identify five different engagement moods for QP students, which are namely challenge-seeker, subject-seeker, interest-seeker, joy-seeker, and non-seeker. We find that students have collective preferences for answering questions in each engagement mood, and deviation from those preferences increases their probability of dropping out significantly. Last but not least, this paper contributes by introducing a new hybrid machine learning model (we call Dropout-Plus) for predicting student dropouts in QPs. The test results on a popular QP in China, with nearly 10K students, show that Dropout-Plus can exceed the rival algorithms' dropout prediction performance in terms of accuracy, F1-measure, and AUC. We wrap up our work by giving some design suggestions to QP managers and online learning professionals to reduce their student dropouts.

preprint · 2021 · arXiv

DRLE: Decentralized Reinforcement Learning at the Edge for Traffic Light Control in the IoV

The Internet of Vehicles (IoV) enables real-time data exchange among vehicles and roadside units and thus provides a promising solution to alleviate traffic jams in urban areas. Meanwhile, better traffic management via efficient traffic light control can benefit the IoV as well by enabling a better communication environment and decreasing the network load. As such, IoV and efficient traffic light control can form a virtuous cycle. Edge computing, an emerging technology to provide low-latency computation capabilities at the edge of the network, can further improve the performance of this cycle. However, while the collected information is valuable, an efficient solution for better utilization and faster feedback has yet to be developed for edge-empowered IoV. To this end, we propose Decentralized Reinforcement Learning at the Edge for traffic light control in the IoV (DRLE). DRLE exploits the ubiquity of the IoV to accelerate the collection of traffic data and its interpretation towards alleviating congestion and providing better traffic light control. DRLE operates within the coverage of the edge servers and uses aggregated data from neighboring edge servers to provide city-scale traffic light control. DRLE decomposes the highly complex problem of large area control into a decentralized multi-agent problem. We prove its global optima with concrete mathematical reasoning. The proposed decentralized reinforcement learning algorithm running at each edge node adapts the traffic lights in real time. We conduct extensive evaluations and demonstrate the superiority of this approach over several state-of-the-art algorithms.

preprint · 2021 · arXiv

Genetic Meta-Structure Search for Recommendation on Heterogeneous Information Network

In the past decade, the heterogeneous information network (HIN) has become an important methodology for modern recommender systems. To fully leverage its power, manually designed network templates, i.e., meta-structures, are introduced to filter out semantic-aware information. Hand-crafted meta-structures rely on intense expert knowledge, and designing them is both laborious and data-dependent. On the other hand, the number of meta-structures grows exponentially with their size and the number of node types, which prohibits brute-force search. To address these challenges, we propose Genetic Meta-Structure Search (GEMS) to automatically optimize meta-structure designs for recommendation on HINs. Specifically, GEMS adopts a parallel genetic algorithm to search meaningful meta-structures for recommendation, and designs dedicated rules and a meta-structure predictor to efficiently explore the search space. Finally, we propose an attention-based multi-view graph convolutional network module to dynamically fuse information from different meta-structures. Extensive experiments on three real-world datasets suggest the effectiveness of GEMS, which consistently outperforms all baseline methods in HIN recommendation. Compared with a simplified GEMS which utilizes hand-crafted meta-paths, GEMS achieves over 6% performance gain on most evaluation metrics. More importantly, we conduct an in-depth analysis of the identified meta-structures, which sheds light on HIN-based recommender system design.

preprint · 2021 · arXiv

Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence

Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with physical environments for mobile devices. MAR systems enable users to interact with MAR devices, such as smartphones and head-worn wearables, and perform seamless transitions from the physical world to a mixed world with digital entities. These MAR systems support user experiences by using MAR devices to provide universal accessibility to digital contents. Over the past 20 years, a number of MAR systems have been developed; however, the studies and design of MAR frameworks have not yet been systematically reviewed from the perspective of user-centric design. This article presents the first effort of surveying existing MAR frameworks (count: 37) and further discusses the latest studies on MAR through a top-down approach: 1) MAR applications; 2) MAR visualisation techniques adaptive to user mobility and contexts; 3) systematic evaluation of MAR frameworks including supported platforms and corresponding features such as tracking, feature extraction plus sensing capabilities; and 4) underlying machine learning approaches supporting intelligent operations within MAR systems. Finally, we summarise the development of emerging research fields and the current state-of-the-art, and discuss the important open challenges and possible theoretical and technical directions. This survey aims to benefit both researchers and MAR system developers alike.

preprint · 2021 · arXiv

Predicting Hyperkalemia in the ICU and Evaluation of Generalizability and Interpretability

Hyperkalemia is a potentially life-threatening condition that can lead to fatal arrhythmias. Early identification of high risk patients can inform clinical care to mitigate the risk. While hyperkalemia is often a complication of acute kidney injury (AKI), it also occurs in the absence of AKI. We developed predictive models to identify intensive care unit (ICU) patients at risk of developing hyperkalemia by using the Medical Information Mart for Intensive Care (MIMIC) and the eICU Collaborative Research Database (eICU-CRD). Our methodology focused on building multiple models, optimizing for interpretability through model selection, and simulating various clinical scenarios. In order to determine if our models perform accurately on patients with and without AKI, we evaluated the following clinical cases: (i) predicting hyperkalemia after AKI within 14 days of ICU admission, (ii) predicting hyperkalemia within 14 days of ICU admission regardless of AKI status, and compared different lead times for (i) and (ii). Both clinical scenarios were modeled using logistic regression (LR), random forest (RF), and XGBoost. Using observations from the first day in the ICU, our models were able to predict hyperkalemia with an AUC of (i) 0.79, 0.81, 0.81 and (ii) 0.81, 0.85, 0.85 for LR, RF, and XGBoost respectively. We found that 4 out of the top 5 features were consistent across the models. AKI stage was significant in the models that included all patients with or without AKI, but not in the models which only included patients with AKI. This suggests that while AKI is important for hyperkalemia, the specific stage of AKI may not be as important. Our findings require further investigation and confirmation.

preprint · 2021 · arXiv

Towards Mobile Distributed Ledgers

Advances in mobile computing have paved the way for new types of distributed applications that can be executed solely by mobile devices on device-to-device (D2D) ecosystems (e.g., crowdsensing). Sophisticated applications, like cryptocurrencies, need distributed ledgers to function. Distributed ledgers, such as blockchains and directed acyclic graphs (DAGs), employ consensus protocols to add data in the form of blocks. However, such protocols are designed for resourceful devices that are interconnected via the Internet. Moreover, existing distributed ledgers are not deployable to D2D ecosystems since their storage needs are continuously increasing. In this work, we introduce and analyse Mneme, a DAG-based distributed ledger that can be maintained solely by mobile devices. Mneme utilizes two novel consensus protocols: Proof-of-Context (PoC) and Proof-of-Equivalence (PoE). PoC employs users' context to add data on Mneme. PoE is executed periodically to summarize data and produce equivalent blocks that require less storage. We analyze Mneme's security and justify the ability of PoC and PoE to guarantee the characteristics of distributed ledgers: persistence and liveness. Furthermore, we analyze potential attacks from malicious users and prove that the probability of a successful attack is inversely proportional to the square of the number of mobile users who maintain Mneme.
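The scaling claim in the last sentence can be written out as follows. The symbols $n$ (number of mobile users maintaining Mneme), $P_{\text{attack}}$ and the constant $c$ are notation assumed here for illustration; the abstract gives neither the constant nor the exact expression.

```latex
P_{\text{attack}}(n) = \frac{c}{n^{2}}, \qquad c > 0
\quad\Longrightarrow\quad
\frac{P_{\text{attack}}(2n)}{P_{\text{attack}}(n)} = \frac{c/(2n)^{2}}{c/n^{2}} = \frac{1}{4}
```

Under this reading, doubling the number of participating users would cut the success probability of an attack to one quarter.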

preprint · 2020 · arXiv

A Survey on Computational Politics

Computational Politics is the study of computational methods to analyze and moderate users' behaviors related to political activities such as election campaign persuasion, political affiliation, and opinion mining. With the rapid development and ease of access to the Internet, Information Communication Technologies (ICT) have given rise to massive numbers of users joining online communities and the digitization of analog data such as political debates. These communities and digitized data contain both explicit and latent information about users and their behaviors related to politics. For researchers, it is essential to utilize data from these sources to develop and design systems that not only provide solutions to computational politics but also help other businesses, such as marketers, to increase users' participation and interactions. In this survey, we attempt to categorize main areas in computational politics and summarize the prominent studies in one place to better understand computational politics across different and multidimensional platforms, e.g., online social networks, online forums, and political debates. We then conclude this study by highlighting future research directions, opportunities, and challenges.

preprint2020arXiv

DeepHealth: Review and challenges of artificial intelligence in health informatics

Artificial intelligence has opened up a whole new era of research. As more data and better computational power become available, the approach is being implemented in various fields. The demand for it in health informatics is also increasing, and we can expect to see the potential benefits of its applications in healthcare. It can help clinicians diagnose disease, identify drug effects for each patient, understand the relationship between genotypes and phenotypes, explore new phenotypes or treatment recommendations, and predict infectious disease outbreaks with high accuracy. In contrast to traditional models, recent artificial intelligence approaches do not require domain-specific data pre-processing, and they are expected to ultimately change life in the future. Despite its notable advantages, there are some key challenges on data (high dimensionality, heterogeneity, time dependency, sparsity, irregularity, lack of labels, bias) and model (reliability, interpretability, feasibility, security, scalability) for practical use. This article presents a comprehensive review of research applying artificial intelligence in health informatics, focusing on the last seven years in the fields of medical imaging, electronic health records, genomics, sensing, and online communication health, as well as challenges and promising directions for future research. We highlight research on currently popular approaches and identify several challenges in building models.

preprint2020arXiv

Edge Intelligence: Architectures, Challenges, and Applications

Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in locations close to where data is captured based on artificial intelligence. The aim of edge intelligence is to enhance the quality and speed of data processing and protect the privacy and security of the data. Although recently emerged, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this paper, we present a thorough and comprehensive survey on the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, namely edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the solutions by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate, compare and analyse the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, etc. This survey article provides a comprehensive introduction to edge intelligence and its application areas. In addition, we summarise the development of the emerging research field and the current state-of-the-art and discuss the important open issues and possible theoretical and technical solutions.

preprint2020arXiv

Evaluating Transport Protocols on 5G for Mobile Augmented Reality

Mobile Augmented Reality (MAR) mixes physical environments with user-interactive virtual annotations. Immersive MAR experiences are supported by computation-intensive tasks which rely on offloading mechanisms to ease device workloads. However, this introduces additional network traffic which in turn influences the motion-to-photon latency (a determinant of user-perceived quality of experience). Therefore, a proper transport protocol is crucial to minimise transmission latency and ensure sufficient throughput to support MAR performance. Relatedly, 5G, a potential MAR supporting technology, is widely believed to be smarter, faster, and more efficient than its predecessors. However, the suitability and performance of existing transport protocols in MAR in the 5G context have not been explored. Therefore, we present an evaluation of popular transport protocols, including UDP, TCP, MPEG-TS, RTP, and QUIC, with a MAR system on a real-world 5G testbed. We also compare their 5G performance with LTE and WiFi. Our evaluation results indicate that TCP has the lowest round-trip-time on 5G, with a median of $15.09\pm0.26$ ms, while QUIC appears to perform better on LTE. Through an additional test with varying signal quality (specifically, degrading secondary synchronisation signal reference signal received quality), we discover that protocol performance appears to be significantly impacted by signal quality.

preprint2020arXiv

Marketplace for AI Models

Artificial intelligence shows promise for solving many practical societal problems in areas such as healthcare and transportation. However, the current mechanisms for AI model diffusion such as GitHub code repositories, academic project webpages, and commercial AI marketplaces have some limitations; for example, a lack of monetization methods, model traceability, and model auditability. In this work, we sketch guidelines for a new AI diffusion method based on a decentralized online marketplace. We consider the technical, economic, and regulatory aspects of such a marketplace including a discussion of solutions for problems in these areas. Finally, we include a comparative analysis of several current AI marketplaces that are already available or in development. We find that most of these marketplaces are centralized commercial marketplaces with relatively few models.

preprint2020arXiv

Trustworthy AI in the Age of Pervasive Computing and Big Data

The era of pervasive computing has resulted in countless devices that continuously monitor users and their environment, generating an abundance of user behavioural data. Such data may support improving the quality of service, but may also lead to adverse usages such as surveillance and advertisement. In parallel, Artificial Intelligence (AI) systems are being applied to sensitive fields such as healthcare, justice, or human resources, raising multiple concerns on the trustworthiness of such systems. Trust in AI systems is thus intrinsically linked to ethics, including the ethics of algorithms, the ethics of data, or the ethics of practice. In this paper, we formalise the requirements of trustworthy AI systems through an ethics perspective. We specifically focus on the aspects that can be integrated into the design and development of AI systems. After discussing the state of research and the remaining challenges, we show how a concrete use-case in smart cities can benefit from these methods.

preprint2020arXiv

Urban Anomaly Analytics: Description, Detection, and Prediction

Urban anomalies may result in loss of life or property if not handled properly. Automatically alerting on anomalies in their early stage, or even predicting anomalies before they happen, is of great value for populations. Recently, data-driven urban anomaly analysis frameworks have been forming, which utilize urban big data and machine learning algorithms to detect and predict urban anomalies automatically. In this survey, we make a comprehensive review of the state-of-the-art research on urban anomaly analytics. We first give an overview of four main types of urban anomalies: traffic anomaly, unexpected crowds, environment anomaly, and individual anomaly. Next, we summarize various types of urban datasets obtained from diverse devices, i.e., trajectory, trip records, CDRs, urban sensors, event records, environment data, social media and surveillance cameras. Subsequently, a comprehensive survey of issues on detecting and predicting techniques for urban anomalies is presented. Finally, research challenges and open problems are discussed.

preprint2016arXiv

Dissecting the End-to-end Latency of Interactive Mobile Video Applications

In this paper we measure the step-wise latency in the pipeline of three kinds of interactive mobile video applications that are rapidly gaining popularity, namely Remote Graphics Rendering (RGR) of which we focus on mobile cloud gaming, Mobile Augmented Reality (MAR), and Mobile Virtual Reality (MVR). The applications differ from each other by the way in which the user interacts with the application, i.e., video I/O and user controls, but they all share in common the fact that their user experience is highly sensitive to end-to-end latency. Long latency between a user control event and display update renders the application unusable. Hence, understanding the nature and origins of latency of these applications is of paramount importance. We show through extensive measurements that control input and display buffering have a substantial effect on the overall delay. Our results shed light on the latency bottlenecks and the maturity of technology for seamless user experience with these applications.

preprint2016arXiv

Navigation by anomalous random walks on complex networks

Anomalous random walks having long-range jumps are a critical branch of dynamical processes on networks, which can model a number of search and transport processes. However, traditional measurements based on mean first passage time are not useful as they fail to characterize the cost associated with each jump. Here we introduce a new concept of mean first traverse distance (MFTD) to characterize anomalous random walks that represents the expected traverse distance taken by walkers searching from source node to target node, and we provide a procedure for calculating the MFTD between two nodes. We use Levy walks on networks as an example, and demonstrate that the proposed approach can unravel the interplay between diffusion dynamics of Levy walks and the underlying network structure. Interestingly, applying our framework to the famous PageRank search, we can explain why its damping factor is empirically chosen to be around 0.85. The framework for analyzing anomalous random walks on complex networks offers a new useful paradigm to understand the dynamics of anomalous diffusion processes, and provides a unified scheme to characterize search and transport processes on networks.
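To make the role of the damping factor concrete, here is a minimal sketch of standard PageRank computed by power iteration. It is not the MFTD framework from the abstract; it only shows where the value 0.85 enters the walk (with probability `damping` the walker follows a link, otherwise it jumps to a uniformly random node). The function name, the dict-based graph format, and the toy graph are assumptions made for illustration.

```python
def pagerank(out_links, damping=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank on a dict {node: [successor, ...]}.

    Every successor must itself be a key of out_links.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n
                    for v in nodes}
        # Each node passes a damped, equal share of its rank to successors.
        for v in nodes:
            for w in out_links[v]:
                new_rank[w] += damping * rank[v] / len(out_links[v])
        change = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if change < tol:
            break
    return rank


# Toy directed graph (assumption, for illustration only).
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
toy_rank = pagerank(toy_graph)
```

Re-running the sketch with other values of `damping` shows how the balance between link-following and long-range random jumps changes the ranking, which is the trade-off the abstract refers to.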

preprint2016arXiv

Security Pricing as an Enabler of Cyber-Insurance: A First Look at Differentiated Pricing Markets

Despite the promising potential of network risk management services (e.g., cyber-insurance) to improve information security, their deployment is relatively scarce, primarily due to such service companies being unable to guarantee profitability. As a novel approach to making cyber-insurance services more viable, we explore a symbiotic relationship between security vendors (e.g., Symantec) capable of price differentiating their clients, and cyber-insurance agencies having possession of information related to the security investments of their clients. The goal of this relationship is to (i) allow security vendors to price differentiate their clients based on security investment information from insurance agencies, (ii) allow the vendors to make more profit than in homogeneous pricing settings, and (iii) subsequently transfer some of the extra profit to cyber-insurance agencies to make insurance services more viable. In this paper, we perform a theoretical study of a market for differentiated security product pricing, primarily with a view to ensuring that security vendors (SVs) make more profit in the differentiated pricing case as compared to the case of non-differentiated pricing. In order to practically realize such pricing markets, we propose novel and computationally efficient consumer differentiated pricing mechanisms for SVs based on (i) the market structure, (ii) the communication network structure of SV consumers captured via a consumer's Bonacich centrality in the network, and (iii) security investment amounts made by SV consumers.

preprint2015arXiv

Explaining the Power-law Distribution of Human Mobility Through Transportation Modality Decomposition

Human mobility has been empirically observed to exhibit Levy flight characteristics and behaviour with power-law distributed jump size. The fundamental mechanisms behind this behaviour have not yet been fully explained. In this paper, we analyze urban human mobility and we propose to explain the Levy walk behaviour observed in human mobility patterns by decomposing them into different classes according to the different transportation modes, such as Walk/Run, Bicycle, Train/Subway or Car/Taxi/Bus. Our analysis is based on two real-life GPS datasets containing approximately 10 and 20 million GPS samples with transportation mode information. We show that human mobility can be modelled as a mixture of different transportation modes, and that these single movement patterns can be approximated by a lognormal distribution rather than a power-law distribution. Then, we demonstrate that the mixture of the decomposed lognormal flight distributions associated with each modality is a power-law distribution, providing an explanation for the emergence of Levy walk patterns that characterize human mobility patterns.
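The mixture idea can be explored numerically with a short simulation. The sketch below draws flight lengths from a mixture of lognormal components, one per transportation mode, and computes the empirical tail probability at a few thresholds. All weights and lognormal parameters are hypothetical placeholders, not values fitted in the paper, and the sketch does not by itself establish that the mixture is a power law.

```python
import random


def sample_mixture(n, modes, seed=0):
    """Draw n flight lengths from a mixture of lognormal components.

    modes: list of (weight, mu, sigma) tuples, one per transportation mode.
    """
    rng = random.Random(seed)
    weights = [w for w, _, _ in modes]
    samples = []
    for _ in range(n):
        # Pick a mode according to its weight, then draw a flight length.
        _, mu, sigma = rng.choices(modes, weights=weights, k=1)[0]
        samples.append(rng.lognormvariate(mu, sigma))
    return samples


def ccdf(samples, thresholds):
    """Empirical P(flight length > t) for each threshold t."""
    n = len(samples)
    return [sum(1 for s in samples if s > t) / n for t in thresholds]


# Hypothetical (weight, mu, sigma) per mode: walk, bicycle, car, train.
toy_modes = [(0.5, 4.0, 1.0), (0.2, 6.0, 1.0), (0.2, 7.5, 1.0), (0.1, 9.0, 1.0)]
flights = sample_mixture(20000, toy_modes)
tail = ccdf(flights, [10, 100, 1000, 10000, 100000])
```

Plotting `tail` against the thresholds on log-log axes is one way to judge how close the mixture comes to a straight line, which is the visual signature of a power-law tail.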

preprint2014arXiv

An Upper Bound on the Convergence Time for Quantized Consensus of Arbitrary Static Graphs

We analyze a class of distributed quantized consensus algorithms for arbitrary static networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimation by communicating with their neighbors in a limited capacity channel in an asynchronous clock setting. Eventually, all nodes reach consensus with quantized precision. We analyze the expected convergence time for the general quantized consensus algorithm proposed by Kashyap et al. We use the theory of electric networks, random walks, and couplings of Markov chains to derive an $O(N^3\log N)$ upper bound for the expected convergence time on an arbitrary graph of size $N$, improving on the state-of-the-art bound of $O(N^5)$ for quantized consensus algorithms. Our result is not dependent on graph topology. An example with complete graphs is given to show how to extend the analysis to graphs of a given topology.

preprint2014arXiv

Cooperative Caching based on File Popularity Ranking in Delay Tolerant Networks

Increasing storage sizes and WiFi/Bluetooth capabilities of mobile devices have made them a good platform for opportunistic content sharing. In this work we propose a network model to study this in a setting with two characteristics: 1. delay tolerant; 2. lack of infrastructure. Mobile users generate requests and opportunistically download from other users they meet, via Bluetooth or WiFi. The difference in popularity of different web content induces a non-uniform request distribution, which is usually a Zipf's law distribution. We evaluate the performance of different caching schemes and derive the optimal scheme using convex optimization techniques. The optimal solution is found efficiently using a binary search method. It is shown that as the network mobility increases, the performance of the optimal scheme far exceeds the traditional caching scheme. To the best of our knowledge, our work is the first to consider popularity ranking in performance evaluation.
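For readers unfamiliar with the request model, the following minimal sketch builds a Zipf popularity distribution over ranked files, where the probability of requesting the file of rank i is proportional to 1 / i^s. The function name `zipf_pmf` and the exponent value are assumptions for illustration; the abstract does not state which exponent the paper uses, and the caching optimization itself is not reproduced here.

```python
def zipf_pmf(num_files, exponent=1.0):
    """Request probabilities for files ranked 1..num_files under Zipf's law."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_files + 1)]
    total = sum(weights)
    # Normalize so the probabilities sum to one.
    return [w / total for w in weights]


# Five files with an assumed exponent of 1.0.
toy_popularity = zipf_pmf(5)
```

With exponent 1.0 the most popular file is requested twice as often as the second, three times as often as the third, and so on, which is why popularity ranking matters when deciding what to cache.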

preprint2014arXiv

Privacy Leakage in Mobile Computing: Tools, Methods, and Characteristics

The number of smartphones, tablets, sensors, and connected wearable devices is rapidly increasing. Today, in many parts of the globe, the penetration of mobile computers has overtaken the number of traditional personal computers. This trend and the always-on nature of these devices have resulted in increasing concerns over the intrusive nature of these devices and the privacy risks that they impose on users or those associated with them. In this paper, we survey the current state of the art on mobile computing research, focusing on privacy risks and data leakage effects. We then discuss a number of methods, recommendations, and ongoing research in limiting the privacy leakages and associated risks by mobile computing.

preprint2014arXiv

When Augmented Reality Meets Big Data

With computing and sensing woven into the fabric of everyday life, we live in an era where we are awash in a flood of data from which we can gain rich insights. Augmented reality (AR) is able to collect and help analyze the growing torrent of data about user engagement metrics within our personal mobile and wearable devices. This enables us to blend information from our senses and the digitalized world in a myriad of ways that were not possible before. AR and big data have reached a maturity that makes their convergence inevitable. The trend of harnessing AR and big data to breed new interesting applications is starting to have a tangible presence. In this paper, we explore the potential to capture value from the marriage between AR and big data technologies, followed by several challenges that must be addressed to fully realize this potential.

preprint2013arXiv

A Random Walk Based Model Incorporating Social Information for Recommendations

Collaborative filtering (CF) is one of the most popular approaches to build a recommendation system. In this paper, we propose a hybrid collaborative filtering model based on a Markovian random walk to address the data sparsity and cold start problems in recommendation systems. More precisely, we construct a directed graph whose nodes consist of items and users, together with item content, user profile and social network information. We incorporate users' ratings into edge settings in the graph model. The model provides personalized recommendations and predictions to individuals and groups. The proposed algorithms are evaluated on MovieLens and Epinions datasets. Experimental results show that the proposed methods perform well compared with other graph-based methods, especially in the cold start case.

preprint2013arXiv

An Upper Bound on the Convergence Time for Distributed Binary Consensus

The problem addressed in this paper is the analysis of a distributed consensus algorithm for arbitrary networks, proposed by Bénézit et al. In the initial setting, each node in the network has one of two possible states ("yes" or "no"). Nodes can update their states by communicating with their neighbors via a 2-bit message in an asynchronous clock setting. Eventually, all nodes reach consensus on the majority state. We use the theory of electric networks, random walks, and couplings of Markov chains to derive an O(N^4 log N) upper bound for the expected convergence time on an arbitrary graph of size N.

preprint2013arXiv

An Upper Bound on the Convergence Time for Quantized Consensus

We analyze a class of distributed quantized consensus algorithms for arbitrary networks. In the initial setting, each node in the network has an integer value. Nodes exchange their current estimate of the mean value in the network, and then update their estimation by communicating with their neighbors in a limited capacity channel in an asynchronous clock setting. Eventually, all nodes reach consensus with quantized precision. We start the analysis with a special case of a distributed binary voting algorithm, then proceed to the expected convergence time for the general quantized consensus algorithm proposed by Kashyap et al. We use the theory of electric networks, random walks, and couplings of Markov chains to derive an O(N^3 log N) upper bound for the expected convergence time on an arbitrary graph of size N, improving on the state-of-the-art bound of O(N^4 log N) for binary consensus and O(N^5) for quantized consensus algorithms. Our result is not dependent on graph topology. Simulations on special graphs such as star networks, line graphs, lollipop graphs, and Erdős-Rényi random graphs are performed to validate the analysis. This work has applications to load balancing, coordination of autonomous agents, estimation and detection, decision-making networks, peer-to-peer systems, etc.
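The update rule described above can be illustrated with a small simulation. The sketch below is one simple member of the quantized gossip family, written under stated assumptions: at each step a single edge is picked uniformly at random (a stand-in for the asynchronous clocks), the two endpoints split their combined value into floor and ceiling halves, and the endpoint that held the larger value receives the smaller half so that values differing by one are swapped. The function name, the ring topology, and the initial values are illustrative and are not taken from the paper.

```python
import random


def quantized_gossip(values, edges, max_steps=100_000, seed=0):
    """Simulate pairwise quantized averaging on an undirected graph.

    Returns (final_values, steps_used). The sum of the values is preserved
    at every step; the run stops once all values differ by at most 1.
    """
    rng = random.Random(seed)
    x = list(values)
    for step in range(max_steps):
        if max(x) - min(x) <= 1:
            return x, step
        i, j = rng.choice(edges)
        total = x[i] + x[j]
        low, high = total // 2, total - total // 2
        # The node that held the larger value gets the smaller half, so a
        # pair differing by exactly 1 simply swaps its values.
        if x[i] >= x[j]:
            x[i], x[j] = low, high
        else:
            x[i], x[j] = high, low
    return x, max_steps


# Ring of six nodes with made-up integer values (sum 30, mean 5).
ring_edges = [(k, (k + 1) % 6) for k in range(6)]
start_values = [0, 10, 3, 7, 1, 9]
final_values, steps_used = quantized_gossip(start_values, ring_edges)
```

Averaging `steps_used` over many seeds and graph sizes is one way to compare empirical convergence times against the upper bound stated in the abstract.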

preprint2013arXiv

Mobile augmented reality survey: a bottom-up approach

Augmented Reality (AR) is becoming mobile. Mobile devices have many constraints but also rich new features that traditional desktop computers do not have. There are several survey papers on AR, but none is dedicated to Mobile Augmented Reality (MAR). Our work serves the purpose of closing this gap. The contents are organized with a bottom-up approach. We first present the state-of-the-art in system components including hardware platforms, software frameworks and display devices, followed by enabling technologies such as tracking and data management. We then survey the latest technologies and methods to improve run-time performance and energy efficiency for practical implementation. On top of these, we further introduce the application fields and several typical MAR applications. Finally, we conclude the survey with several challenging problems, which are under exploration and require great research efforts in the future.

preprint2013arXiv

Motion and audio analysis in mobile devices for remote monitoring of physical activities and user authentication

In this article we propose the use of the accelerometer embedded by default in smartphones as a cost-effective, reliable and efficient way to provide remote physical activity monitoring for the elderly and people requiring healthcare services. Mobile phones are regularly carried by users during their day-to-day work routine, so physical movement information can be captured by the mobile phone accelerometer, processed and sent to a remote server for monitoring. The acceleration pattern can deliver information related to the pattern of physical activities the user is engaged in. We further show how this technique can be extended to provide implicit real-time security by analysing unexpected movements captured by the phone accelerometer, and automatically locking the phone in such situations to prevent unauthorised access. This technique is also shown to provide implicit continuous user authentication, by capturing regular user movements such as walking, and requesting re-authentication whenever it detects a non-regular movement.

preprint2013arXiv

Privacy Preserving Recommendation System Based on Groups

Recommendation systems have received considerable attention in recent decades. Yet with the development of information technology and social media, the risk in revealing private data to service providers has been a growing concern to more and more users. Trade-offs between quality and privacy in recommendation systems naturally arise. In this paper, we present a privacy preserving recommendation framework based on groups. The main idea is to use groups as a natural middleware to preserve users' privacy. A distributed preference exchange algorithm is proposed to ensure the anonymity of data, wherein the effective size of the anonymity set asymptotically approaches the group size with time. We construct a hybrid collaborative filtering model based on Markov random walks to provide recommendations and predictions to group members. Experimental results on the MovieLens and Epinions datasets show that our proposed methods outperform the baseline methods, L+ and ItemRank, two state-of-the-art personalized recommendation algorithms, for both recommendation precision and hit rate despite the absence of personal preference information.

preprint2013arXiv

Wisdom of the Crowd: Incorporating Social Influence in Recommendation Models

Recommendation systems have received considerable attention recently. However, most research has been focused on improving the performance of collaborative filtering (CF) techniques. Social networks, indispensably, provide us extra information on people's preferences, and should be considered and deployed to improve the quality of recommendations. In this paper, we propose two recommendation models, for individuals and for groups respectively, based on social contagion and social influence network theory. In the recommendation model for individuals, we improve the result of collaborative filtering prediction with social contagion outcome, which simulates the result of information cascade in the decision-making process. In the recommendation model for groups, we apply social influence network theory to take interpersonal influence into account to form a settled pattern of disagreement, and then aggregate opinions of group members. By introducing the concept of susceptibility and interpersonal influence, the settled rating results are flexible, and inclined to members whose ratings are "essential".

preprint2012arXiv

Comparing Background Subtraction Algorithms and Method of Car Counting

In this paper, we compare various image background subtraction algorithms with the ground truth of cars counted. We are given a sample of a thousand images, which are snapshots of current traffic as recorded at various intersections and highways. We have also counted an approximate number of cars that are visible in these images. In order to ascertain the accuracy of algorithms to be used for the processing of millions of images, we compare them on several metrics that include (i) scalability, (ii) accuracy, and (iii) processing time.

preprint2012arXiv

The Impact of Secure OSs on Internet Security: What Cyber-Insurers Need to Know

In recent years, researchers have proposed cyber-insurance as a suitable risk-management technique for enhancing security in Internet-like distributed systems. However, amongst other factors, information asymmetry between the insurer and the insured, and the inter-dependent and correlated nature of cyber risks have contributed in a big way to the failure of cyber-insurance markets. Security experts have argued in favor of operating system (OS) platform switching (e.g., from Windows to Unix-based OSs) or secure OS adoption as being one of the techniques that can potentially mitigate the problems posing a challenge to successful cyber-insurance markets. In this regard we model OS platform switching dynamics using a social gossip mechanism and study three important questions related to the nature of the dynamics, for Internet-like distributed systems: (i) which type of networks should cyber-insurers target for insuring?, (ii) what are the bounds on the asymptotic performance level of a network, where the performance parameter is an average function of the long-run individual user willingness to adopt secure OSs?, and (iii) how can cyber-insurers use the topological information of their clients to incentivize/reward them when offering contracts? Our analysis is important to a profit-minded cyber-insurer, who wants to target the right network, design optimal contracts to resolve information asymmetry problems, and at the same time promote the increase of overall network security through increasing secure OS adoption amongst users.

preprint2011arXiv

Intra-City Urban Network and Traffic Flow Analysis from GPS Mobility Trace

We analyse two large-scale intra-city urban networks and traffic flows therein measured by GPS traces of taxis in San Francisco and Shanghai. Our results coincide with previous findings that purely topological measures are often insufficient to characterise traffic flow. Traditional shortest-path betweenness analysis, where shortest paths are calculated for each pair of nodes, carries an unrealistic implicit assumption that each node or junction in the urban network generates and attracts an equal amount of traffic. We also argue that weighting edges based only on Euclidean distance is inadequate, as primary roads are commonly favoured over secondary roads due to the perceived and actual travel time required. We show that betweenness traffic analysis can be improved by a simple extended framework which incorporates both the notions of node weights and fastest-path betweenness. We demonstrate that the framework is superior to traditional methods based solely on simple topological perspectives.
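One plausible reading of "node weights and fastest-path betweenness" can be sketched as follows. Edges carry travel times rather than distances, routes are found with Dijkstra's algorithm, and each origin-destination pair contributes the product of its two node weights to every intermediate junction on its fastest route. The product rule, the single-route simplification (ties are not split), and all names and numbers below are assumptions for illustration; the abstract does not specify the paper's exact formulation.

```python
import heapq


def fastest_path(graph, source, target):
    """Dijkstra on travel times; returns the node list or None if unreachable.

    graph: dict {node: {neighbor: travel_time, ...}} with positive times.
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, travel_time in graph[u].items():
            nd = d + travel_time
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in done:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def weighted_betweenness(graph, node_weight):
    """Sum origin-destination demand over intermediate nodes of fastest routes."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        for t in graph:
            if s == t:
                continue
            path = fastest_path(graph, s, t)
            if path is None:
                continue
            # Assumed demand model: product of origin and destination weights.
            demand = node_weight[s] * node_weight[t]
            for v in path[1:-1]:
                score[v] += demand
    return score


# Toy road network: the direct A-C road is slow, so traffic routes via B.
toy_roads = {
    "A": {"B": 2.0, "C": 10.0},
    "B": {"A": 2.0, "C": 3.0},
    "C": {"A": 10.0, "B": 3.0},
}
toy_demand = {"A": 1.0, "B": 1.0, "C": 4.0}
toy_scores = weighted_betweenness(toy_roads, toy_demand)
```

In the toy network, junction B collects all through traffic between A and C even though a direct A-C edge exists, which is the kind of effect a purely topological betweenness would miss.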

preprint2011arXiv

Modeling Internet Security Investments: The Case of Dealing with Information Uncertainty

Modern distributed communication networks like the Internet and censorship-resistant networks (also a part of the Internet) are characterized by nodes (users) interconnected with one another via communication links. In this regard, the security of individual nodes depends not only on their own efforts, but also on the efforts and underlying connectivity structure of neighboring network nodes. By the term 'effort', we mean the amount of investment made by a user in security mechanisms like antivirus software, firewalls, etc., to improve its security. However, often due to the large magnitude of such networks, it is not always possible for nodes to have complete effort and connectivity structure information about all their neighbor nodes. Added to this is the fact that in many applications, Internet users are selfish and are not willing to co-operate with other users on sharing effort information. In this paper, we adopt a non-cooperative game-theoretic approach to analyze individual user security in a communication network by accounting for both the partial information that a network node possesses about its underlying neighborhood connectivity structure and the presence of positive externalities arising from efforts exerted by neighboring nodes. We investigate the equilibrium behavior of nodes and show 1) the existence of symmetric Bayesian Nash equilibria of efforts and 2) that better connected nodes choose lower efforts to exert but earn higher utilities with respect to security improvement, irrespective of the nature of node degree correlations amongst the neighboring nodes. Our results provide ways for Internet users to appropriately invest in security mechanisms under realistic environments of information uncertainty.

preprint2011arXiv

Offloadable Apps using SmartDiet: Towards an analysis toolkit for mobile application developers

Offloading work to the cloud is one of the proposed solutions for increasing the battery life of mobile devices. Most prior research has focused on computation-intensive applications, even though such applications are not the most popular ones. In this paper, we first study the feasibility of method-level offloading in network-intensive applications, using an open source Twitter client as an example. Our key observation is that implementing offloading transparently to the developer is difficult: various constraints heavily limit the offloading possibilities, and estimation of the potential benefit is challenging. We then propose a toolkit, SmartDiet, to assist mobile application developers in creating code which is suitable for energy-efficient offloading. SmartDiet provides fine-grained offloading constraint identification and energy usage analysis for Android applications. In addition to outlining the overall functionality of the toolkit, we study some of its key mechanisms and identify the remaining challenges.

preprint2011arXiv

On the Economics of Cloud Markets

Cloud computing is a paradigm that has the potential to transform and revolutionize the next-generation IT industry by making software available to end-users as a service. A cloud, also commonly known as a cloud network, typically comprises hardware (a network of servers) and a collection of software that is made available to end-users in a pay-as-you-go manner. Multiple public cloud providers (e.g., Amazon) co-existing in a cloud computing market provide similar services (software as a service) to their clients, both in the nature of the application and in quality of service (QoS) provision. Whether a cloud hosts (or finds it profitable to host) a service in the long term depends jointly on the price it sets, the QoS guarantees it provides to its customers, and the satisfaction of the advertised guarantees. In this paper, we devise and analyze three inter-organizational economic models relevant to cloud networks. We formulate our problems as non-cooperative price and QoS games between multiple cloud providers existing in a cloud market. We prove that a unique pure strategy Nash equilibrium (NE) exists in two of the three models. Our analysis paves the way for each cloud provider to 1) know what prices and QoS level to set for end-users of a given service type, so that the provider can exist in the cloud market, and 2) practically and dynamically provision appropriate capacity for satisfying advertised QoS guarantees.

preprint2011arXiv

Towards Realistic Vehicular Network Modeling Using Planet-scale Public Webcams

Realistic modeling of vehicular mobility has been particularly challenging due to a lack of large libraries of measurements in the research community. In this paper we introduce a novel method for large-scale monitoring, analysis, and identification of spatio-temporal models for vehicular mobility using the freely available online webcams in cities across the globe. We collect vehicular mobility traces from 2,700 traffic webcams in 10 different cities for several months and generate a mobility dataset of 7.5 terabytes consisting of 125 million images. To the best of our knowledge, this is the largest data set ever used in such a study. To process and analyze this data, we propose an efficient and scalable algorithm to estimate traffic density based on background image subtraction. Initial results show that at least 82% of individual cameras from four cities follow a log-logistic distribution with less than 5% deviation, and that 94% of cameras from Toronto follow a gamma distribution. The aggregate results from each city also demonstrate that the log-logistic and gamma distributions pass the KS test with 95% confidence. Furthermore, many of the camera traces exhibit long-range dependence, with self-similarity evident in the aggregates of traffic (per city). We believe our novel data collection method and dataset provide a much-needed contribution to the research community for realistic modeling of vehicular networks and mobility.
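
The abstract above names background image subtraction as the basis for its traffic density estimate but gives no algorithmic detail. The following is only a minimal sketch of the general idea, not the authors' algorithm: it assumes grayscale frames stored as nested lists, a per-pixel median as the background model, and a fixed difference threshold. The function names `estimate_background` and `traffic_density`, the threshold value, and the toy frames are all hypothetical.

```python
from statistics import median


def estimate_background(frames):
    """Per-pixel median over grayscale frames (each frame is a list of rows)."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def traffic_density(frame, background, threshold=30):
    """Fraction of pixels that differ from the background by more than threshold."""
    total = 0
    foreground = 0
    for row, bg_row in zip(frame, background):
        for pixel, bg_pixel in zip(row, bg_row):
            total += 1
            if abs(pixel - bg_pixel) > threshold:
                foreground += 1
    return foreground / total


# Toy 4x4 "road" of uniform brightness; vehicles are brighter pixels (assumed data).
road = [[100] * 4 for _ in range(4)]


def with_vehicle(cells):
    """Return a copy of the empty road with the given (row, col) cells occupied."""
    frame = [row[:] for row in road]
    for y, x in cells:
        frame[y][x] = 200
    return frame


frames = [
    with_vehicle([(0, 0), (0, 1)]),
    with_vehicle([(1, 2)]),
    with_vehicle([(3, 3)]),
    road,
    with_vehicle([(2, 0), (2, 1), (2, 2), (2, 3)]),
]

background = estimate_background(frames)
densities = [traffic_density(frame, background) for frame in frames]
print(densities)
```

Because each pixel is occupied in at most one of the five toy frames, the median recovers the empty road, and each density is simply the share of occupied pixels in that frame. Real webcam footage would need handling for lighting changes, shadows, and camera noise, none of which this sketch attempts.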

preprint2011arXiv

Unleashing the Power of Mobile Cloud Computing using ThinkAir

Smartphones have exploded in popularity in recent years, becoming ever more sophisticated and capable. As a result, developers worldwide are building increasingly complex applications that require ever-increasing amounts of computational power and energy. In this paper we propose ThinkAir, a framework that makes it simple for developers to migrate their smartphone applications to the cloud. ThinkAir exploits the concept of smartphone virtualization in the cloud and provides method-level computation offloading. Advancing on previous work, it focuses on the elasticity and scalability of the server side and enhances the power of mobile cloud computing by parallelizing method execution using multiple Virtual Machine (VM) images. We evaluate the system using a range of benchmarks, from simple micro-benchmarks to more complex applications. First, we show that, using cloud offloading, the execution time and energy consumption decrease by two orders of magnitude for the N-queens puzzle and by one order of magnitude for a face detection application and a virus scan application. We then show that if a task is parallelizable, the user can request more than one VM to execute it, and these VMs will be provided dynamically. In fact, by exploiting parallelization, we achieve a greater reduction in execution time and energy consumption for the previous applications. Finally, we use a memory-hungry image combiner tool to demonstrate that applications can dynamically request VMs with more computational power in order to meet their computational requirements.
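
To make the parallelization idea in the abstract above concrete, here is a hedged sketch of how an N-queens count could be split into independent subproblems that separate workers execute. It does not use or reproduce ThinkAir's API: the thread pool merely stands in for multiple VM clones, and `solve_subproblem`, `solve_local`, and `solve_offloaded` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def count_from(n, row, cols, diag1, diag2):
    """Count ways to complete a placement in which rows below `row` are filled."""
    if row == n:
        return 1
    total = 0
    for col in range(n):
        d1, d2 = row - col, row + col
        if col in cols or d1 in diag1 or d2 in diag2:
            continue  # square is attacked by an earlier queen
        total += count_from(n, row + 1, cols | {col}, diag1 | {d1}, diag2 | {d2})
    return total


def solve_subproblem(n, first_col):
    """Count solutions with the first-row queen fixed at `first_col`."""
    return count_from(n, 1, {first_col}, {0 - first_col}, {0 + first_col})


def solve_local(n):
    """Baseline: count all solutions in a single call."""
    return count_from(n, 0, set(), set(), set())


def solve_offloaded(n, workers):
    """Split by first-row column; each worker stands in for one remote VM."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda col: solve_subproblem(n, col), range(n)))


local_count = solve_local(6)
offloaded_count = solve_offloaded(6, workers=3)
print(local_count, offloaded_count)
```

The subproblems share no state, which is what makes the task parallelizable in the sense the abstract describes. Threads inside one Python process will not speed up this CPU-bound work; the sketch only shows the decomposition, with real gains depending on dispatching each subproblem to separate machines.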

55 published item(s)

A Twitter Dataset for Pakistani Political Discourse

FRAS: Federated Reinforcement Learning empowered Adaptive Point Cloud Video Streaming

A Reddit Dataset for the Russo-Ukrainian Conflict in 2022

AICP: Augmented Informative Cooperative Perception

Beyond the Blue Sky of Multimodal Interaction: A Centennial Vision of Interplanetary Virtual Spaces in Turn-based Metaverse

Decentralized, not Dehumanized in the Metaverse: Bringing Utility to NFTs through Multimodal Interaction

DiOS -- An Extended Reality Operating System for the Metaverse

Distributed Vehicular Computing at the Dawn of 5G: a Survey

Exploring System Performance of Continual Learning for Mobile and Embedded Sensing Applications

Federated Split GANs

HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign Supermask

Life, the Metaverse and Everything: An Overview of Privacy, Ethics, and Governance in Metaverse

Nebula: Reliable Low-latency Video Transmission for Mobile Cloud Gaming

Towards Reproducible Evaluations for Flying Drone Controllers in Virtual Environments

Towards User-Centered Metrics for Trustworthy AI in Immersive Cyberspace

Twitter Dataset for 2022 Russo-Ukrainian Crisis

VibroWeight: Simulating Weight and Center of Gravity Changes of Objects in Virtual Reality for Enhanced Realism

What is the Metaverse? An Immersive Cyberspace and Open Challenges

When Gamification Spoils Your Learning: A Qualitative Case Study of Gamification Misuse in a Language-Learning App

Characterizing Student Engagement Moods for Dropout Prediction in Question Pool Websites

DRLE: Decentralized Reinforcement Learning at the Edge for Traffic Light Control in the IoV

Genetic Meta-Structure Search for Recommendation on Heterogeneous Information Network

Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence

Predicting Hyperkalemia in the ICU and Evaluation of Generalizability and Interpretability

Towards Mobile Distributed Ledgers

A Survey on Computational Politics

DeepHealth: Review and challenges of artificial intelligence in health informatics

Edge Intelligence: Architectures, Challenges, and Applications

Evaluating Transport Protocols on 5G for Mobile Augmented Reality

Marketplace for AI Models

Trustworthy AI in the Age of Pervasive Computing and Big Data

Urban Anomaly Analytics: Description, Detection, and Prediction

Dissecting the End-to-end Latency of Interactive Mobile Video Applications

Navigation by anomalous random walks on complex networks

Security Pricing as an Enabler of Cyber-Insurance: A First Look at Differentiated Pricing Markets

Explaining the Power-law Distribution of Human Mobility Through Transportation Modality Decomposition

An Upper Bound on the Convergence Time for Quantized Consensus of Arbitrary Static Graphs

Cooperative Caching based on File Popularity Ranking in Delay Tolerant Networks

Privacy Leakage in Mobile Computing: Tools, Methods, and Characteristics

When Augmented Reality Meets Big Data

A Random Walk Based Model Incorporating Social Information for Recommendations

An Upper Bound on the Convergence Time for Distributed Binary Consensus

An Upper Bound on the Convergence Time for Quantized Consensus

Mobile augmented reality survey: a bottom-up approach

Motion and audio analysis in mobile devices for remote monitoring of physical activities and user authentication

Privacy Preserving Recommendation System Based on Groups

Wisdom of the Crowd: Incorporating Social Influence in Recommendation Models

Comparing Background Subtraction Algorithms and Method of Car Counting

The Impact of Secure OSs on Internet Security: What Cyber-Insurers Need to Know

Intra-City Urban Network and Traffic Flow Analysis from GPS Mobility Trace

Modeling Internet Security Investments: The Case of Dealing with Information Uncertainty

Offloadable Apps using SmartDiet: Towards an analysis toolkit for mobile application developers

On the Economics of Cloud Markets

Towards Realistic Vehicular Network Modeling Using Planet-scale Public Webcams

Unleashing the Power of Mobile Cloud Computing using ThinkAir