Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
30 works
0 followers
27 topics
4 close collaborators


Published work

30 published item(s)

preprint 2026 arXiv

Separate First, Fuse Later: Mitigating Cross-Modal Interference in Audio-Visual LLMs Reasoning with Modality-Specific Chain-of-Thought

Audio and vision provide complementary evidence for audio-visual question answering, yet current audio-visual large language models may suffer from cross-modal interference: information from one modality misguides the interpretation of another, thereby inducing hallucinations. We attribute this issue to uncontrolled cross-modal interactions during intermediate reasoning. To mitigate this, we propose Separate First, Fuse Later (SFFL), an audio-visual reasoning framework designed to reduce cross-modal interference. SFFL enforces modality-specific chain-of-thought reasoning, producing separate audio and visual reasoning traces and integrating evidence for answering. We construct modality-preference labels via a data pipeline under different modality input settings. We use these labels as an auxiliary reward in reinforcement learning to encourage an instance-dependent preference for modality cues when answering. We further introduce a modality-specific reasoning mechanism that preserves modality isolation during the separated reasoning stage while enabling full access to cross-modal information at the evidence fusion stage. Experiments demonstrate consistent improvements in both accuracy and robustness, yielding an average relative gain of 5.16% on general AVQA benchmarks and 11.17% on a cross-modal hallucination benchmark.

preprint 2022 arXiv

Activation Map Adaptation for Effective Knowledge Distillation

Model compression has become a recent trend due to the need to deploy neural networks on embedded and mobile devices. Hence, both accuracy and efficiency are of critical importance. To explore a balance between them, a knowledge distillation strategy is proposed for general visual representation learning. It uses an activation map adaptive module to replace some blocks of the teacher network, adaptively exploring the most appropriate supervisory features during training. The teacher's hidden-layer output is used to guide the training of the student network so as to transfer effective semantic information. To verify the effectiveness of the strategy, the method is applied to the CIFAR-10 dataset. Results demonstrate that the method can boost the accuracy of the student network by 0.6% with a 6.5% loss reduction, and significantly improve its training speed.

preprint 2022 arXiv

Computer Vision for Road Imaging and Pothole Detection: A State-of-the-Art Review of Systems and Algorithms

Computer vision algorithms have been prevalently utilized for 3-D road imaging and pothole detection for over two decades. Nonetheless, there is a lack of systematic survey articles on state-of-the-art (SoTA) computer vision techniques, especially deep learning models, developed to tackle these problems. This article first introduces the sensing systems employed for 2-D and 3-D road data acquisition, including camera(s), laser scanners, and Microsoft Kinect. Afterward, it thoroughly and comprehensively reviews the SoTA computer vision algorithms, including (1) classical 2-D image processing, (2) 3-D point cloud modeling and segmentation, and (3) machine/deep learning, developed for road pothole detection. This article also discusses the existing challenges and future development trends of computer vision-based road pothole detection approaches: classical 2-D image processing-based and 3-D point cloud modeling and segmentation-based approaches have already become history; and Convolutional neural networks (CNNs) have demonstrated compelling road pothole detection results and are promising to break the bottleneck with the future advances in self/un-supervised learning for multi-modal semantic segmentation. We believe that this survey can serve as practical guidance for developing the next-generation road condition assessment systems.

preprint 2022 arXiv

Market Design for Tradable Mobility Credits

Tradable mobility credit (TMC) schemes are an approach to travel demand management that have received significant attention in recent years. This paper proposes and analyzes alternative market models for a TMC system -- focusing on market design aspects such as allocation/expiration of tokens, rules governing trading, transaction fees, and regulator intervention -- and develops a methodology to explicitly model the dis-aggregate behavior of individuals within the market. Extensive simulation experiments are conducted within a combined mode and departure time context for the morning commute problem to compare the performance of the alternative designs relative to congestion pricing and a no-control scenario. The simulation experiments employ a day-to-day assignment framework wherein transportation demand is modeled using a logit-mixture model with income effects and supply is modeled using a standard bottleneck model. The results indicate that small fixed transaction fees can effectively mitigate undesirable behavior in the market without a significant loss in efficiency (total welfare) whereas proportional transaction fees are less effective both in terms of efficiency and in avoiding undesirable market behavior. Further, an allocation of tokens in continuous time can be beneficial in dealing with non-recurrent events and avoiding concentrated trading activity. In the presence of income effects, despite small fixed transaction fees, the TMC system yields a marginally higher social welfare than congestion pricing while attaining revenue neutrality. Further, it is more robust in the presence of forecasting errors and non-recurrent events due to the adaptiveness of the market. Finally, as expected, the TMC scheme is more equitable (when revenues from congestion pricing are not redistributed) although it is not guaranteed to be Pareto-improving when tokens are distributed equally.

preprint 2022 arXiv

Medication Error Detection Using Contextual Language Models

Medication errors most commonly occur at the ordering or prescribing stage, potentially leading to medical complications and poor health outcomes. While it is possible to catch these errors using different techniques, the focus of this work is on textual and contextual analysis of prescription information to detect and prevent potential medication errors. In this paper, we demonstrate how to use BERT-based contextual language models to detect anomalies in written or spoken text based on a data set extracted from real-world medical data of thousands of patient records. The proposed models are able to learn patterns of text dependency and predict erroneous output based on contextual information such as patient data. The experimental results yield accuracy up to 96.63% for text input and up to 79.55% for speech input, which is satisfactory for most real-world applications.

preprint 2022 arXiv

Minimizing Fleet Size and Improving Bike Allocation of Bike Sharing under Future Uncertainty

As a rapidly expanding service, bike sharing is facing severe problems of bike over-supply and demand fluctuation in many Chinese cities. This study develops a large-scale method to determine the minimum fleet size under uncertainty, based on the bike sharing data of millions of trips in Nanjing. It is found that the algorithm of minimizing fleet size under the incomplete-information scenario is effective in handling future uncertainty. For a dockless bike sharing system, supplying 14.5% of the original fleet could meet 96.8% of trip demands. Meanwhile, the results suggest that providing an integrated service platform that combines multiple companies can significantly reduce the total fleet size by 44.6%. Moreover, in view of the COVID-19 pandemic, this study proposes a social distancing policy that maintains a suitable usage interval. These findings provide useful insights for improving the resource efficiency and operational service of bike sharing and shared mobility.

preprint 2022 arXiv

Parameters identification for an inverse problem arising from a binary option using a Bayesian inference approach

The no-arbitrage property provides a simple method for pricing financial derivatives. However, arbitrage opportunities exist among different markets in various fields, even for a very short time. By knowing that an arbitrage opportunity exists, we can adopt a financial trading strategy. This paper investigates the inverse option problems (IOP) in the extended Black-Scholes model. We identify the model coefficients from the measured data and attempt to find arbitrage opportunities in different financial markets using a Bayesian inference approach, which is presented as an IOP solution. The posterior probability density function of the parameters is computed from the measured data. The statistics of the unknown parameters are estimated by a Markov Chain Monte Carlo (MCMC) algorithm, which exploits the posterior state space. The efficient sampling strategy of the MCMC algorithm enables us to solve inverse problems by the Bayesian inference technique. Our numerical results indicate that the Bayesian inference approach can simultaneously estimate the unknown trend and volatility coefficients from the measured data.

preprint 2022 arXiv

Skeleton-based Action Recognition via Temporal-Channel Aggregation

Skeleton-based action recognition methods are limited by the semantic extraction of spatio-temporal skeletal maps. However, current methods have difficulty in effectively combining features from both temporal and spatial graph dimensions and tend to emphasize one dimension at the expense of the other. In this paper, we propose a Temporal-Channel Aggregation Graph Convolutional Network (TCA-GCN) to learn spatial and temporal topologies dynamically and efficiently aggregate topological features in different temporal and channel dimensions for skeleton-based action recognition. We use the Temporal Aggregation module to learn temporal dimensional features and the Channel Aggregation module to efficiently combine spatial dynamic channel-wise topological features with temporal dynamic topological features. In addition, we extract multi-scale skeletal features on temporal modeling and fuse them with an attention mechanism. Extensive experiments show that our model outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.

preprint 2022 arXiv

Terrain-based vehicle localization using an active suspension system

This paper, for the first time, presents a terrain-based localization approach using sensor data from an active suspension system. The contribution is four-fold. First, it is shown that a location dependent road height profile can be created from sensor data of the active suspension system. Second, an algorithm is developed to extract a pitch profile from the road height profile data. The ideal pitch profile is vehicle-independent and only depends on the road. This pitch profile generated from an on-board computer is matched with a known terrain map to achieve real-time positioning. Third, a crowd-sourced map creation algorithm is developed to create and improve the terrain map that contains pitch profile. Fourth, experiments have been conducted to validate the accuracy and robustness of the proposed localization approach.

preprint 2022 arXiv

Transfer learning for cross-modal demand prediction of bike-share and public transit

The urban transportation system is a combination of multiple transport modes, and interdependencies exist across those modes. This means that the travel demand across different travel modes could be correlated as one mode may receive demand from or create demand for another mode, not to mention natural correlations between different demand time series due to general demand flow patterns across the network. It is expected that cross-modal ripple effects will become more prevalent with Mobility as a Service. Therefore, by propagating demand data across modes, a better demand prediction could be obtained. To this end, this study explores various machine learning models and transfer learning strategies for cross-modal demand prediction. The trip data of bike-share, metro, and taxi are processed as the station-level passenger flows, and then the proposed prediction method is tested in the large-scale case studies of Nanjing and Chicago. The results suggest that prediction models with transfer learning perform better than unimodal prediction models. Furthermore, the stacked Long Short-Term Memory model performs particularly well in cross-modal demand prediction. These results verify our combined method's forecasting improvement over existing benchmarks and demonstrate good transferability for cross-modal demand prediction in multiple cities.

preprint 2021 arXiv

Industry Practice of Coverage-Guided Enterprise-Level DBMS Fuzzing

As an infrastructure for data persistence and analysis, Database Management Systems (DBMSs) are the cornerstones of modern enterprise software. To improve their correctness, the industry has been applying blackbox fuzzing for decades. Recently, the research community achieved impressive fuzzing gains using coverage guidance. However, due to the complexity and distributed nature of enterprise-level DBMSs, this research is seldom applied in industry. In this paper, we apply coverage-guided fuzzing to enterprise-level DBMSs from Huawei and Bloomberg LP. In our practice of testing GaussDB and Comdb2, we found major challenges in all three testing stages. The challenges are collecting precise coverage, optimizing fuzzing performance, and analyzing root causes. In search of a general method to overcome these challenges, we propose Ratel, a coverage-guided fuzzer for enterprise-level DBMSs. With its industry-oriented design, Ratel improves the feedback precision, enhances the robustness of input generation, and performs an on-line investigation on the root cause of bugs. As a result, Ratel outperformed other fuzzers in terms of coverage and bugs. Compared to industrial black box fuzzers SQLsmith and SQLancer, as well as coverage-guided academic fuzzer Squirrel, Ratel covered 38.38%, 106.14%, and 583.05% more basic blocks than the best results of the other three fuzzers in GaussDB, PostgreSQL, and Comdb2, respectively. More importantly, Ratel has discovered 32, 42, and 5 unknown bugs in GaussDB, Comdb2, and PostgreSQL.

preprint 2021 arXiv

IntelliGen: Automatic Driver Synthesis for FuzzTesting

Fuzzing is a technique widely used in vulnerability detection. The process usually involves writing effective fuzz driver programs, which, when done manually, can be extremely labor intensive. Previous attempts at automation leave much to be desired, in either degree of automation or quality of output. In this paper, we propose IntelliGen, a framework that constructs valid fuzz drivers automatically. First, IntelliGen determines a set of entry functions and evaluates their respective chance of exhibiting a vulnerability. Then, IntelliGen generates fuzz drivers for the entry functions through hierarchical parameter replacement and type inference. We implemented IntelliGen and evaluated its effectiveness on real-world programs selected from the Android Open-Source Project, Google's fuzzer-test-suite and industrial collaborators. IntelliGen covered on average 1.08X-2.03X more basic blocks and 1.36X-2.06X more paths over state-of-the-art fuzz driver synthesizers FUDGE and FuzzGen. IntelliGen performed on par with manually written drivers and found 10 more bugs.

preprint 2021 arXiv

Managing network congestion with a tradable credit scheme: a trip-based MFD approach

This study investigates the efficiency and effectiveness of an area-based tradable credit scheme (TCS) using the trip-based Macroscopic Fundamental Diagram model for the morning commute problem. In the proposed TCS, the regulator distributes initial credits to all travelers and designs a time-varying and trip-length-specific credit tariff. Credits are traded between travelers and the regulator via a credit market, and the credit price is determined by the demand and supply of credits. The heterogeneity of travelers is considered in terms of desired arrival time, trip length and departure-time choice preferences. The TCS is incorporated into a day-to-day modelling framework to examine the travelers' learning process, the evolution of the network, and the properties of the credit market. The existence of an equilibrium solution and the uniqueness of the credit price at the equilibrium state are established analytically. Furthermore, an open-source simulation framework is developed to validate the analytical properties of the proposed TCS and compare it with alternative control strategies in terms of mobility, network performance, and social welfare. Bayesian optimization is then adopted to optimize the credit toll scheme. The numerical results demonstrate that the proposed TCS outperforms the no-control case and matches the performance of the time-of-day pricing strategy, while remaining revenue-neutral.

preprint 2021 arXiv

On some trivial source Specht modules

The paper presented here focuses on the classification of trivial source Specht modules. We completely classify the trivial source Specht modules labelled by hook partitions. We also classify the trivial source Specht modules labelled by two-part partitions in the odd characteristic case. Moreover, in the even characteristic case, we prove a result for the classification of the trivial source Specht modules labelled by partitions with 2-weight 2, which justifies a conjecture of [16].

preprint 2021 arXiv

On Terwilliger $\mathbb{F}$-algebras of quasi-thin association schemes

In [3], Hanaki defined the Terwilliger algebras of association schemes over a commutative unital ring. In this paper, we call the Terwilliger algebras of association schemes over a field $\mathbb{F}$ the Terwilliger $\mathbb{F}$-algebras of association schemes and study the Terwilliger $\mathbb{F}$-algebras of quasi-thin association schemes. As main results, we determine the $\mathbb{F}$-dimensions, the semisimplicity, the Jacobson radicals, and the algebraic structures of the Terwilliger $\mathbb{F}$-algebras of quasi-thin association schemes. We also obtain some results of independent interest.

preprint 2021 arXiv

RNN-Test: Towards Adversarial Testing for Recurrent Neural Network Systems

While massive efforts have been invested in adversarial testing of convolutional neural networks (CNN), testing for recurrent neural networks (RNN) is still limited and leaves threats for vast sequential application domains. In this paper, we propose an adversarial testing framework, RNN-Test, for RNN systems, focusing on the main sequential domains, not only classification tasks. First, we design a novel search methodology customized for RNN models by maximizing the inconsistency of RNN states to produce adversarial inputs. Next, we introduce two state-based coverage metrics according to the distinctive structure of RNNs to explore more inference logic. Finally, RNN-Test solves the joint optimization problem to maximize state inconsistency and state coverage, and crafts adversarial inputs for various tasks of different kinds of inputs. For evaluations, we apply RNN-Test on three sequential models of common RNN structures. On the tested models, the RNN-Test approach is demonstrated to be competitive in generating adversarial inputs, outperforming FGSM-based and DLFuzz-based methods by reducing the model performance more sharply with a 2.78% to 32.5% higher success (or generation) rate. RNN-Test could also achieve a 52.65% to 66.45% higher adversary rate on the MNIST-LSTM model than the related work testRNN. Compared with neuron coverage, the proposed state coverage metrics as guidance excel with a 4.17% to 97.22% higher success (or generation) rate.

preprint 2021 arXiv

V-Gas: Generating High Gas Consumption Inputs to Avoid Out-of-Gas Vulnerability

The out-of-gas error occurs when smart contract programs are provided with inputs that cause excessive gas consumption, and could easily be exploited to mount a DoS attack. Multiple approaches have been proposed to estimate the gas limit of a function in smart contracts to avoid such errors. However, underestimation often happens when the contract is complicated. In this work, we propose V-Gas, which automatically generates inputs that maximize the gas cost and reduces the underestimation cases. V-Gas is designed based on feedback-directed mutational fuzz testing. First, V-Gas builds the gas-weighted control flow graph (CFG) of functions in smart contracts. Then, V-Gas develops gas-consumption-guided selection and mutation strategies to generate the input that maximizes the gas consumption. For evaluation, we implement V-Gas based on js-evm, a widely used Ethereum virtual machine written in JavaScript, and conduct experiments on 736 real-world transactions recorded on Ethereum. 44.02% of the transactions would have out-of-gas errors under the estimation results given by solc, meaning that the recorded real gas consumption for those transactions is larger than the gas limit value estimated by solc, while V-Gas reduces the underestimation ratio to 13.86%. Furthermore, V-Gas has exposed 25 previously unknown out-of-gas vulnerabilities in those widely used smart contracts, 5 of which have been assigned unique CVE identifiers in the US National Vulnerability Database.

preprint 2021 arXiv

WRICNet: A Weighted Rich-scale Inception Coder Network for Multi-Resolution Remote Sensing Image Change Detection

Most remote sensing image change detection models perform well only on data sets of a specific resolution. To improve change detection effectiveness on multi-resolution data sets, this article proposes a weighted rich-scale inception coder network (WRICNet), which effectively fuses shallow multi-scale features and deep multi-scale features. The weighted rich-scale inception module of the proposed network obtains shallow multi-scale features, and the weighted rich-scale coder module obtains deep multi-scale features. The weighted scale block assigns appropriate weights to features of different scales, which strengthens the expressive ability at the edge of the changing area. Performance experiments on the multi-resolution data set demonstrate that, compared to the comparative methods, the proposed network further reduces false alarms outside the change area and missed alarms inside the change area, and that the edge of the change area is more accurate. The ablation study shows that the training strategy and the improvements of this article improve the effectiveness of change detection.

preprint 2020 arXiv

A Time Delay Dynamic System with External Source for the Local Outbreak of 2019-nCoV

How to model the spread of the 2019 coronavirus (2019-nCoV) in China is one of the most urgent and interesting problems in applied mathematics. In this paper, we propose a novel time delay dynamic system with an external source to describe the trend of the local outbreak of 2019-nCoV. The external source introduced in the newly proposed dynamic system can be regarded as suspected people traveling to different areas. The numerical simulations show that the dynamic system with the external source is more reliable than the one without it, and that the rate of isolation is extremely important for controlling the increase in the cumulative number of confirmed people with 2019-nCoV. Based on our numerical simulation results with the public data, we suggest that the local government should adopt stricter measures to maintain the rate of isolation. Otherwise, the local cumulative number of confirmed people with 2019-nCoV might get out of control.

preprint 2020 arXiv

A Time Delay Dynamical Model for Outbreak of 2019-nCoV and the Parameter Identification

In this paper, we propose a novel dynamical system with time delay to describe the outbreak of 2019-nCoV in China. One typical feature of this epidemic is that it can spread during the latent period, which is therefore described by the time delay process in the differential equations. The accumulated numbers of classified populations are employed as variables, which is consistent with the official data and facilitates the parameter identification. Numerical methods for predicting the outbreak of 2019-nCoV and for parameter identification are provided, and the numerical results show that the novel dynamic system predicts the outbreak trend so far well. Based on the numerical simulations, we suggest that the government should strictly control the transmission of individuals with a high isolation rate.

preprint 2020 arXiv

Debris cloud of India Anti-Satellite Test to Microsat-R Satellite

Understanding the motion of the debris cloud produced by an anti-satellite test can help us understand the danger of these tests. This study presents the orbit status of 57 fragments observed by CelesTrak and presented in the NORAD Two-Line Element Sets of the India anti-satellite test. Ten of these observed fragments have apogee altitudes larger than 1000.0 km; the maximum is 1725.7 km. We also numerically calculated the number of debris fragments: the results show that the number of debris with a diameter larger than 0.2 m is 14, the number with a diameter larger than 0.01 m is 6587, and the number with a diameter larger than 0.001 m is 7.22e+5. Secondary collisions of the debris will produce more fragments in space. The lifetime of the fragments depends on the initial orbit parameters and the sizes of the debris.

preprint 2020 arXiv

End-to-End Vision-Based Adaptive Cruise Control (ACC) Using Deep Reinforcement Learning

This paper presented a deep reinforcement learning method named Double Deep Q-networks to design an end-to-end vision-based adaptive cruise control (ACC) system. A simulation environment of a highway scene was set up in Unity, which is a game engine that provided both physical models of vehicles and feature data for training and testing. Well-designed reward functions associated with the following distance and throttle/brake force were implemented in the reinforcement learning model for both internal combustion engine (ICE) vehicles and electric vehicles (EV) to perform adaptive cruise control. The gap statistics and total energy consumption are evaluated for different vehicle types to explore the relationship between reward functions and powertrain characteristics. Compared with the traditional radar-based ACC systems or human-in-the-loop simulation, the proposed vision-based ACC system can generate either a better gap regulated trajectory or a smoother speed trajectory depending on the preset reward function. The proposed system can be well adaptive to different speed trajectories of the preceding vehicle and operated in real-time.

preprint 2020 arXiv

Equitable Transit Network Design Under Uncertainty

This paper proposes a bilevel transit network design problem considering supply-side uncertainty. The upper-level problem determines frequency settings to simultaneously maximize the efficiency and equity measures, which are defined by the reduction in the total effective travel cost and the minimum reduction in the effective travel cost of all OD pairs, respectively. The lower-level problem is the reliability-based transit assignment problem that captures the effects of supply-side uncertainty on passengers' route choice behavior. Numerical studies demonstrate that 1) the Pareto frontier may not be convex; 2) it is possible to improve the efficiency and equity objectives simultaneously; 3) increasing the frequency could worsen the equity measure; 4) passengers' risk attitude affects the rate of substitution between the two objectives.

preprint 2020 arXiv

Joint Optimization of Transfer Location and Capacity in a Multimodal Transport Network: Bilevel Modeling and Paradoxes

With the growing attention towards developing the multimodal transport system to enhance urban mobility, there is an increasing need to construct new infrastructure, or rebuild or expand existing infrastructure, to serve existing travel demand and accommodate newly generated demand. Therefore, this paper develops a bilevel model to simultaneously determine the location and capacity of the transfer infrastructure to be built considering elastic demand in a multimodal transport network. The upper-level problem is formulated as a mixed integer linear programming problem, while the lower-level problem is the capacitated combined trip distribution assignment model that depicts both destination and route choices of travelers via the multinomial logit formula. To solve the model, the paper develops a matheuristic algorithm that integrates a Genetic Algorithm and a successive linear programming solution approach. Numerical studies are conducted to demonstrate the existence of, and examine, two Braess-like paradox phenomena in a multimodal transport network. The first one states that, under fixed demand, constructing parking spaces to stimulate the usage of the Park and Ride service could deteriorate the system performance, measured by the total passenger travel time, while the second one reveals that, under variable demand, increasing the parking capacity of the Park and Ride service to promote its usage may fail, as reflected in the decline in its modal share. Meanwhile, the last experiment suggests that constructing transfer infrastructures at distributed stations outperforms building a large transfer center in terms of attracting travelers to sustainable transit modes.

preprint 2020 arXiv

LEOPARD: Identifying Vulnerable Code for Vulnerability Assessment through Program Metrics

Identifying potentially vulnerable locations in a code base is critical as a pre-step for effective vulnerability assessment; i.e., it can greatly help security experts put their time and effort where it is needed most. Metric-based and pattern-based methods have been presented for identifying vulnerable code. The former relies on machine learning and cannot work well due to the severe imbalance between non-vulnerable and vulnerable code or lack of features to characterize vulnerabilities. The latter needs prior knowledge of known vulnerabilities and can only identify similar but not new types of vulnerabilities. In this paper, we propose and implement a generic, lightweight and extensible framework, LEOPARD, to identify potentially vulnerable functions through program metrics. LEOPARD requires no prior knowledge about known vulnerabilities. It works in two steps, combining two sets of systematically derived metrics. First, it uses complexity metrics to group the functions in a target application into a set of bins. Then, it uses vulnerability metrics to rank the functions in each bin and identifies the top ones as potentially vulnerable. Our experimental results on 11 real-world projects have demonstrated that LEOPARD can cover 74.0% of vulnerable functions by identifying 20% of functions as vulnerable and outperforms machine learning-based and static analysis-based techniques. We further propose three applications of LEOPARD for manual code review and fuzzing, through which we discovered 22 new bugs in real applications like PHP, radare2 and FFmpeg, and eight of them are new vulnerabilities.
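
The two-step procedure described in this abstract (bin by complexity, then rank by vulnerability score within each bin) can be illustrated with a minimal sketch. This is not the LEOPARD implementation: the abstract does not list the metrics, so the `complexity` and `vulnerability` values, the function names in `sample`, and the `rank_functions` helper are all hypothetical placeholders.

```python
import math
from collections import defaultdict

def rank_functions(functions, top_fraction=0.2):
    """Bin-then-rank sketch.

    functions: list of (name, complexity, vulnerability) tuples, where both
    scores are hypothetical integers standing in for real program metrics.
    Returns the names flagged as potentially vulnerable.
    """
    # Step 1: group functions into bins that share the same complexity score.
    bins = defaultdict(list)
    for name, complexity, vulnerability in functions:
        bins[complexity].append((name, vulnerability))

    # Step 2: inside each bin, rank by vulnerability score (highest first)
    # and keep the top fraction, with at least one function per bin.
    flagged = []
    for complexity in sorted(bins):
        ranked = sorted(bins[complexity], key=lambda item: item[1], reverse=True)
        keep = max(1, math.ceil(len(ranked) * top_fraction))
        flagged.extend(name for name, _ in ranked[:keep])
    return flagged

# Made-up example data, for illustration only.
sample = [
    ("parse_header", 1, 3),
    ("copy_buf", 1, 9),
    ("init", 1, 0),
    ("decode", 2, 7),
    ("log", 2, 1),
]
print(rank_functions(sample, top_fraction=0.5))
```

Selecting within each bin, rather than from one global ranking, means that functions of low complexity are not crowded out of the flagged set by functions of high complexity.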

preprint2020arXiv

On some $p$-transitive association schemes

In this paper, for any prime $p$, we propose the notion of a $p$-transitive association scheme. This notion aims to generalize the fact that the regular module of a group algebra of a finite group has a unique trivial submodule to the case of the regular modules of modular adjacency algebras. We completely determine the $p$-transitive quasi-thin association schemes and the $p$-transitive association schemes with thin thin residue by their structure theory properties.

preprint2020arXiv

Prediction and analysis of Coronavirus Disease 2019

In December 2019, a novel coronavirus was found in a seafood wholesale market in Wuhan, China. WHO officially named this coronavirus COVID-19. Since the first patient was hospitalized on December 12, 2019, China has reported a total of 78,824 confirmed COVID-19 cases and 2,788 deaths as of February 28, 2020. Wuhan's cumulative confirmed cases and deaths accounted for 61.1% and 76.5% of the totals for mainland China, respectively, making it the priority center for epidemic prevention and control. Meanwhile, 51 countries and regions outside China have reported 4,879 confirmed cases and 79 deaths as of February 28, 2020. The COVID-19 epidemic does great harm to people's daily lives and to countries' economic development. This paper adopts three kinds of mathematical models, i.e., the Logistic model, the Bertalanffy model and the Gompertz model. The epidemic trends of SARS were first fitted and analyzed in order to prove the validity of the existing mathematical models. The results were then used to fit and analyze the situation of COVID-19. The prediction results of the three mathematical models differ for different parameters and in different regions. In general, the fitting effect of the Logistic model may be the best among the three models studied in this paper, while the fitting effect of the Gompertz model may be better than that of the Bertalanffy model. According to the current trend, based on the three models, the total number of people expected to be infected is 49852-57447 in Wuhan, 12972-13405 in non-Hubei areas and 80261-85140 in China, respectively. The total death toll is 2502-5108 in Wuhan, 107-125 in non-Hubei areas and 3150-6286 in China, respectively. COVID-19 will probably be over in late April 2020 in Wuhan and before late March 2020 in other areas.
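As a rough illustration of the three growth curves named in the abstract, the sketch below uses common textbook parameterizations of the Logistic, Gompertz and Bertalanffy models. The exact forms and parameter names (`a`, `b`, `c`) are assumptions and may differ from those fitted in the paper; the numeric values are made up for demonstration only.

```python
import math

def logistic(t, a, b, c):
    """Logistic curve: cumulative count approaches the ceiling a."""
    return a / (1.0 + math.exp(b - c * t))

def gompertz(t, a, b, c):
    """Gompertz curve: asymmetric S-shape with ceiling a."""
    return a * math.exp(-b * math.exp(-c * t))

def bertalanffy(t, a, b, c):
    """Bertalanffy-type curve: starts at zero and saturates at a."""
    return a * (1.0 - math.exp(-b * t)) ** c

# Hypothetical parameters: ceiling of 1000 cumulative cases, t in days.
for day in (0, 10, 20, 40, 80):
    print(day,
          round(logistic(day, 1000, 4.0, 0.2), 1),
          round(gompertz(day, 1000, 5.0, 0.1), 1),
          round(bertalanffy(day, 1000, 0.1, 3.0), 1))
```

In all three forms the parameter `a` is the eventual cumulative total, which is why fitting such curves to early case counts yields a predicted final epidemic size.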

preprint2020arXiv

Robust Bayesian variable selection for gene-environment interactions

Gene-environment (G$\times$E) interactions have important implications for elucidating the etiology of complex diseases beyond the main genetic and environmental effects. Outliers and data contamination in the disease phenotypes of G$\times$E studies are commonly encountered, which has led to the development of a broad spectrum of robust regularization methods. Nevertheless, within the Bayesian framework, the issue has not been addressed in existing studies. We develop a fully Bayesian robust variable selection method for G$\times$E interaction studies. The proposed Bayesian method can effectively accommodate heavy-tailed errors and outliers in the response variable while conducting variable selection by accounting for structural sparsity. In particular, for robust sparse group selection, spike-and-slab priors are imposed on both the individual and group levels to identify important main and interaction effects robustly. An efficient Gibbs sampler has been developed to facilitate fast computation. Extensive simulation studies and analysis of both the diabetes data with SNP measurements from the Nurses' Health Study and TCGA melanoma data with gene expression measurements demonstrate the superior performance of the proposed method over multiple competing alternatives.

preprint2019arXiv

Online Predictive Optimization Framework for Stochastic Demand-Responsive Transit Services

This study develops an online predictive optimization framework for dynamically operating a transit service in an area of crowd movements. The proposed framework integrates demand prediction and supply optimization to periodically redesign the service routes based on recently observed demand. To predict demand for the service, we use Quantile Regression to estimate the marginal distribution of movement counts between each pair of serviced locations. The framework then combines these marginals into a joint demand distribution by constructing a Gaussian copula, which captures the correlation structure between the marginals. For supply optimization, we devise a linear programming model, which simultaneously determines the route structure and the service frequency according to the predicted demand. Importantly, our framework both preserves the uncertainty structure of future demand and leverages it for robust route optimization, while keeping the two components decoupled. We evaluate our framework using a real-world case study of autonomous mobility on a university campus in Denmark. The results show that our framework often obtains the ground-truth optimal solution and can outperform conventional methods for route optimization, which do not leverage full predictive distributions.
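The step of combining marginals into a joint distribution with a Gaussian copula can be sketched for two origin-destination pairs as follows. This is a minimal illustration, not the authors' code: the quantile levels, the demand values, the correlation `rho`, and the linear interpolation between estimated quantiles are all assumptions.

```python
import bisect
import math
import random
from statistics import NormalDist

def interp_quantile(u, levels, values):
    """Map a uniform draw u to a demand value by interpolating between
    estimated quantiles (levels and values are sorted, equal-length lists)."""
    if u <= levels[0]:
        return values[0]
    if u >= levels[-1]:
        return values[-1]
    i = bisect.bisect_right(levels, u)
    weight = (u - levels[i - 1]) / (levels[i] - levels[i - 1])
    return values[i - 1] + weight * (values[i] - values[i - 1])

def sample_joint_demand(rho, marginal_a, marginal_b, n, seed=0):
    """Draw n joint demand samples for two location pairs via a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    samples = []
    for _ in range(n):
        # Correlated standard normals with correlation rho.
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = g1, rho * g1 + math.sqrt(1.0 - rho * rho) * g2
        # The normal CDF turns them into correlated uniforms (the copula),
        # which the marginal quantile functions map to demand counts.
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        samples.append((interp_quantile(u1, *marginal_a),
                        interp_quantile(u2, *marginal_b)))
    return samples

# Hypothetical quantile estimates at levels 5%, 50% and 95%.
levels = [0.05, 0.5, 0.95]
pair_a = (levels, [2.0, 10.0, 25.0])
pair_b = (levels, [0.0, 4.0, 12.0])
print(sample_joint_demand(0.9, pair_a, pair_b, n=5))
```

Because the copula only governs dependence, each marginal can be estimated separately and swapped out without changing how the correlation between location pairs is modeled.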