Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
29 works
0 followers
18 topics
4 close collaborators


Published work

29 published item(s)

preprint 2026 arXiv

BioTool: A Comprehensive Tool-Calling Dataset for Enhancing Biomedical Capabilities of Large Language Models

Despite the success of large language models (LLMs) on general-purpose tasks, their performance in highly specialized domains such as biomedicine remains unsatisfactory. A key limitation is the inability of LLMs to effectively leverage biomedical tools, which clinical experts and biomedical researchers rely on extensively in daily workflows. While recent general-domain tool-calling datasets have substantially improved the capabilities of LLM agents, existing efforts in the biomedical domain largely rely on in-context learning and restrict models to a small set of tools. To address this gap, we introduce BioTool, a comprehensive biomedical tool-calling dataset designed for fine-tuning LLMs. BioTool comprises 34 frequently used tools collected from the NCBI, Ensembl, and UniProt databases, along with 7,040 high-quality, human-verified query-API call pairs spanning variation, genomics, proteomics, evolution, and general biology. Fine-tuning a 4-billion-parameter LLM on BioTool yields substantial improvements in biomedical tool-calling performance, outperforming cutting-edge commercial LLMs such as GPT-5.1. Furthermore, human expert evaluations demonstrate that integrating a BioTool-fine-tuned tool caller significantly improves downstream answer quality compared to the same LLM without tool usage, highlighting the effectiveness of BioTool in enhancing the biomedical capabilities of LLMs. The full dataset and evaluation code are available at https://github.com/gxx27/BioTool
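To make the "query-API call pair" idea concrete, here is a minimal sketch of what one such pair might look like. The record layout, the field names (`query`, `api_call`), the helper `build_esearch_call`, and the example query are assumptions for illustration, not BioTool's actual schema. The only external detail relied on is the public NCBI E-utilities `esearch` endpoint with its `db`, `term`, `retmax`, and `retmode` parameters.

```python
from urllib.parse import urlencode

# Base URL of the public NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_call(db: str, term: str, retmax: int = 5) -> str:
    """Build an E-utilities esearch URL for a database and a search term."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical query-API call pair; the field names are illustrative only
# and are not taken from the BioTool dataset.
example_pair = {
    "query": "Find the human gene record for BRCA1.",
    "api_call": build_esearch_call(
        "gene", "BRCA1[Gene Name] AND Homo sapiens[Organism]"
    ),
}

if __name__ == "__main__":
    print(example_pair["api_call"])
```

A fine-tuned tool caller would be trained to map the natural-language query to the call string; the sketch only builds the URL and does not send a request.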

preprint 2023 arXiv

Drug Synergistic Combinations Predictions via Large-Scale Pre-Training and Graph Structure Learning

Drug combination therapy is a well-established strategy for disease treatment with better effectiveness and less safety degradation. However, identifying novel drug combinations through wet-lab experiments is resource intensive due to the vast combinatorial search space. Recently, computational approaches, specifically deep learning models, have emerged as an efficient way to discover synergistic combinations. While previous methods reported fair performance, their models usually do not take advantage of multi-modal data and are unable to handle new drugs or cell lines. In this study, we collected data from multiple datasets covering various drug-related aspects. Then, we take advantage of large-scale pre-training models to generate informative representations and features for drugs, proteins, and diseases. Based on that, a message-passing graph is built on top to propagate information together with graph structure learning flexibility. This is first introduced in the biological networks and enables us to generate pseudo-relations in the graph. Our framework achieves state-of-the-art results in comparison with other deep learning-based methods on synergistic prediction benchmark datasets. We are also capable of inferring new drug combination data in a test on an independent set released by AstraZeneca, where a 10% improvement over previous methods is observed. In addition, our framework is robust against unseen drugs and surpasses the second-best model by almost 15% in AUROC. We believe our framework contributes to both the future wet-lab discovery of novel drugs and the building of promising guidance for precise combination medicine.

preprint 2023 arXiv

Follow the Timeline! Generating Abstractive and Extractive Timeline Summary in Chronological Order

Nowadays, time-stamped web documents related to a general news query flood the Internet, and timeline summarization aims to concisely summarize the evolution trajectory of events along the timeline. Unlike traditional document summarization, timeline summarization needs to model the time series information of the input events and summarize important events in chronological order. To tackle this challenge, in this paper, we propose a Unified Timeline Summarizer (UTS) that can generate abstractive and extractive timeline summaries in time order. Concretely, in the encoder part, we propose a graph-based event encoder that relates multiple events according to their content dependency and learns a global representation of each event. In the decoder part, to ensure the chronological order of the abstractive summary, we propose to extract the feature of event-level attention in its generation process with sequential information retained and use it to simulate the evolutionary attention of the ground truth summary. The event-level attention can also be used to assist in extracting a summary, where the extracted summary also comes in time sequence. We augment the previous Chinese large-scale timeline summarization dataset and collect a new English timeline dataset. Extensive experiments conducted on these datasets and on the out-of-domain Timeline 17 dataset show that UTS achieves state-of-the-art performance in terms of both automatic and human evaluations.

preprint 2023 arXiv

Multi-Target Landmark Detection with Incomplete Images via Reinforcement Learning and Shape Prior

Medical images are generally acquired with limited field-of-view (FOV), which could lead to incomplete regions of interest (ROI), and thus impose a great challenge on medical image analysis. This is particularly evident for learning-based multi-target landmark detection, where algorithms could be misled into learning primarily the variation of background due to the varying FOV, failing the detection of targets. Based on learning a navigation policy, instead of predicting targets directly, reinforcement learning (RL)-based methods have the potential to tackle this challenge in an efficient manner. Inspired by this, in this work we propose a multi-agent RL framework for simultaneous multi-target landmark detection. This framework aims to learn from incomplete and/or complete images to form an implicit knowledge of global structure, which is consolidated during the training stage for the detection of targets from either complete or incomplete test images. To further explicitly exploit the global structural information from incomplete images, we propose to embed a shape model into the RL process. With this prior knowledge, the proposed RL model can not only localize dozens of targets simultaneously, but also work effectively and robustly in the presence of incomplete images. We validated the applicability and efficacy of the proposed method on various multi-target detection tasks with incomplete images from practical clinics, using body dual-energy X-ray absorptiometry (DXA), cardiac MRI and head CT datasets. Results showed that our method could predict the whole set of landmarks with incomplete training images up to 80% missing proportion (average distance error 2.29 cm on body DXA), and could detect unseen landmarks in regions with missing image information outside FOV of target images (average distance error 6.84 mm on 3D half-head CT).

preprint 2022 arXiv

Applying machine learning to the Calabi-Yau orientifolds with string vacua

We use machine learning techniques to search for polytopes that can result in an orientifold Calabi-Yau hypersurface and the "naive Type IIB string vacua". We show that neural networks can be trained to classify the orientifold property and vacua with high accuracy, based on the newly generated orientifold Calabi-Yau database with $h^{1,1}(X) \leq 6$ (arXiv:2111.03078). This indicates that the orientifold symmetry may already be encoded in the polytope structure. Finally, we try to use the trained neural network model to go beyond the database and predict the orientifold signal of polytopes for higher $h^{1,1}(X)$.

preprint 2022 arXiv

Context Attention Network for Skeleton Extraction

Skeleton extraction is a task focused on providing a simple representation of an object by extracting the skeleton from the given binary or RGB image. In recent years, much attractive work has been done in skeleton extraction. However, as far as we know, there is little research on how to utilize the context information in the binary shape of objects. In this paper, we propose an attention-based model called Context Attention Network (CANet), which integrates the context extraction module in a UNet architecture and can effectively improve the ability of the network to extract the skeleton pixels. Meanwhile, we also use some novel techniques, including distance transform and weight focal loss, to achieve good results on the given dataset. Finally, without model ensemble and with only 80% of the training images, our method achieves 0.822 F1 score during the development phase and 0.8507 F1 score during the final phase of the Pixel SkelNetOn Competition, ranking 1st place on the leaderboard.

preprint 2022 arXiv

Evolution of barchan dune interactions investigated by a downscaled water tunnel experiment: the temporal characteristics and a soliton-like behavior

This paper reports a downscaled water tunnel experiment to study the temporal characteristics of a double dune interaction system and the new pattern of dune interaction when the initial mass ratio of the two dunes is large. These topics are useful for a comprehensive understanding of the dune interaction system but were rarely covered before. The turnover time scale under dune interaction is defined, and its time averaged value is found to have a nonmonotonic relationship with the initial mass ratio. A nonmonotonic relationship is also found between the convexity of the downstream dune tip and the initial mass ratio. The stationary points of the two nonmonotonic curves above correspond to the same dune interaction pattern named 'exchange-chasing', which is considered indispensable in the classification map of dune interactions. The upstream dune acts as an energy transmitter between fluid flow and the downstream dune. A soliton-like behavior occurs when the downstream dune enlarges, where a small dune is detached from the downstream dune tip and gets passed by the upstream dune approximately without mass exchange. The activity of such temporary soliton is found to be negatively related with the initial dune spacing and positively related with the initial mass ratio.

preprint 2022 arXiv

Learning Towards the Largest Margins

One of the main challenges for feature representation in deep learning-based classification is the design of appropriate loss functions that exhibit strong discriminative power. The classical softmax loss does not explicitly encourage discriminative learning of features. A popular direction of research is to incorporate margins in well-established losses in order to enforce extra intra-class compactness and inter-class separability, which, however, were developed through heuristic means, as opposed to rigorous mathematical principles. In this work, we attempt to address this limitation by formulating the principled optimization objective as learning towards the largest margins. Specifically, we firstly define the class margin as the measure of inter-class separability, and the sample margin as the measure of intra-class compactness. Accordingly, to encourage discriminative representation of features, the loss function should promote the largest possible margins for both classes and samples. Furthermore, we derive a generalized margin softmax loss to draw general conclusions for the existing margin-based losses. Not only does this principled framework offer new perspectives to understand and interpret existing margin-based losses, but it also provides new insights that can guide the design of new tools, including sample margin regularization and largest margin softmax loss for the class-balanced case, and zero-centroid regularization for the class-imbalanced case. Experimental results demonstrate the effectiveness of our strategy on a variety of tasks, including visual classification, imbalanced classification, person re-identification, and face verification.

preprint 2022 arXiv

Modeling COVID-19 vaccine-induced immunological memory development and its links to antibody level and infectiousness

COVID-19 vaccines have proven to be effective against SARS-CoV-2 infection. However, the dynamics of vaccine-induced immunological memory development and neutralizing antibody generation are not fully understood, limiting vaccine development and vaccination regimen determination. Herein, we constructed a mathematical model to characterize the vaccine-induced immune response based on fitting the viral infection and vaccination datasets. With the example of CoronaVac, we revealed the association between vaccine-induced immunological memory development and neutralizing antibody levels. The establishment of the intact immunological memory requires more than 6 months after the first and second doses, after which a booster shot can induce high levels of neutralizing antibodies. By introducing the maximum viral load and recovery time after viral infection, we quantitatively studied the protective effect of vaccines against viral infection. Accordingly, we optimized the vaccination regimen, including dose and vaccination timing, and predicted the effect of the fourth dose. Last, by combining the viral transmission model, we showed the suppression of virus transmission by vaccination, which may be instructive for the development of public health policies.

preprint 2022 arXiv

Multimodal Machine Learning for Automated ICD Coding

This study presents a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that can handle data from different modalities, including unstructured text, semi-structured text and structured tabular data. We further employed an ensemble method to integrate all modality-specific models to generate ICD-10 codes. Key evidence was also extracted to make our prediction more convincing and explainable. We used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset to validate our approach. For ICD code prediction, our best-performing model (micro-F1 = 0.7633, micro-AUC = 0.9541) significantly outperforms other baseline models including TF-IDF (micro-F1 = 0.6721, micro-AUC = 0.7879) and Text-CNN model (micro-F1 = 0.6569, micro-AUC = 0.9235). For interpretability, our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780 and 0.5002 respectively.
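For readers unfamiliar with the two metrics quoted above, the following sketch shows the standard set-based definitions of micro-averaged F1 and the Jaccard similarity coefficient. The function names and the toy code sets are assumptions for illustration; how the paper itself applies JSC to extracted evidence is not specified in the abstract.

```python
from typing import Iterable, List, Set


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; taken as 1.0 when both are empty."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return 1.0 if not union else len(sa & sb) / len(union)


def micro_f1(true_sets: List[Set[str]], pred_sets: List[Set[str]]) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all samples before computing F1."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


if __name__ == "__main__":
    # Made-up label sets for two records; not data from the paper.
    truth = [{"I10", "E11.9"}, {"J18.9"}]
    preds = [{"I10"}, {"J18.9", "N17.9"}]
    print(micro_f1(truth, preds))        # 2 TP, 1 FP, 1 FN
    print(jaccard(truth[0], preds[0]))   # one shared code out of two
```

Micro-averaging pools counts across all codes, so frequent codes dominate the score, which is a common choice when the label distribution is heavily skewed.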

preprint 2022 arXiv

Orientifold Calabi-Yau Threefolds with Divisor Involutions and String Landscape

We establish an orientifold Calabi-Yau threefold database for $h^{1,1}(X) \leq 6$ by considering non-trivial $\mathbb{Z}_{2}$ divisor exchange involutions, using a toric Calabi-Yau database (http://www.rossealtman.com/toriccy/). We first determine the topology for each individual divisor (Hodge diamond), then identify and classify the proper involutions which are globally consistent across all disjoint phases of the Kähler cone for each unique geometry. Each of the proper involutions will result in an orientifold Calabi-Yau manifold. Then we clarify all possible fixed loci under the proper involution, thereby determining the locations of different types of $O$-planes. It is shown that under the proper involutions, one typically ends up with a system of $O3/O7$-planes, and most of these will further admit naive Type IIB string vacua. The geometries with freely acting involutions are also determined. We further determine the splitting of the Hodge numbers into odd/even parity in the orbifold limit. The final result is a class of orientifold Calabi-Yau threefolds with non-trivial odd class cohomology $h^{1,1}_{-}(X / σ^*) \neq 0$.

preprint 2022 arXiv

Prototype-Anchored Learning for Learning with Imperfect Annotations

The success of deep neural networks greatly relies on the availability of large amounts of high-quality annotated data, which however are difficult or expensive to obtain. The resulting labels may be class imbalanced, noisy or human biased. It is challenging to learn unbiased classification models from imperfectly annotated datasets, on which we usually suffer from overfitting or underfitting. In this work, we thoroughly investigate the popular softmax loss and margin-based loss, and offer a feasible approach to tighten the generalization error bound by maximizing the minimal sample margin. We further derive the optimality condition for this purpose, which indicates how the class prototypes should be anchored. Motivated by theoretical analysis, we propose a simple yet effective method, namely prototype-anchored learning (PAL), which can be easily incorporated into various learning-based classification schemes to handle imperfect annotation. We verify the effectiveness of PAL on class-imbalanced learning and noise-tolerant learning by extensive experiments on synthetic and real-world datasets.

preprint 2022 arXiv

Target-aware Abstractive Related Work Generation with Contrastive Learning

The related work section is an important component of a scientific paper, which highlights the contribution of the target paper in the context of the reference papers. Authors can save their time and effort by using the automatically generated related work section as a draft to complete the final related work. Most of the existing related work section generation methods rely on extracting off-the-shelf sentences to make a comparative discussion about the target work and the reference papers. However, such sentences need to be written in advance and are hard to obtain in practice. Hence, in this paper, we propose an abstractive target-aware related work generator (TAG), which can generate related work sections consisting of new sentences. Concretely, we first propose a target-aware graph encoder, which models the relationships between reference papers and the target paper with target-centered attention mechanisms. In the decoding process, we propose a hierarchical decoder that attends to the nodes of different levels in the graph with keyphrases as semantic indicators. Finally, to generate a more informative related work, we propose multi-level contrastive optimization objectives, which aim to maximize the mutual information between the generated related work with the references and minimize that with non-references. Extensive experiments on two public scholar datasets show that the proposed model brings substantial improvements over several strong baselines in terms of automatic and tailored human evaluations.

preprint 2022 arXiv

Towards artificial general intelligence via a multimodal foundation model

The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of humans. Despite tremendous success in AI research, most existing methods have only a single cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained with huge multimodal data, which can be quickly adapted for various downstream cognitive tasks. To achieve this goal, we propose to pre-train our foundation model by self-supervised learning with weak semantic correlation data crawled from the Internet and show that promising results can be obtained on a wide range of downstream tasks. Particularly, with the developed model-interpretability tools, we demonstrate that strong imagination ability is now possessed by our foundation model. We believe that our work makes a transformative stride towards AGI, from our common practice of "weak or narrow AI" to that of "strong or generalized AI".

preprint 2020 arXiv

Bayesian model selection approach for colored graphical Gaussian models

We consider a class of colored graphical Gaussian models obtained by placing symmetry constraints on the precision matrix in a Bayesian framework. The prior distribution on the precision matrix is the colored $G$-Wishart prior which is the Diaconis-Ylvisaker conjugate prior. In this paper, we develop a computationally efficient model search algorithm which combines linear regression with a double reversible jump Markov chain Monte Carlo (MCMC) method. The latter is to estimate the Bayes factors expressed as the ratio of posterior probabilities of two competing models. We also establish the asymptotic consistency property of the model selection procedure based on the Bayes factors. Our procedure avoids an exhaustive search which is computationally impossible. Our method is illustrated with simulations and a real-world application with a protein signalling data set.

preprint 2020 arXiv

Computational Drug Repositioning and Elucidation of Mechanism of Action of Compounds against SARS-CoV-2

The COVID-19 crisis called for rapid reaction from all the fields of biomedical research. Traditional drug development involves time-consuming pipelines that conflict with the urgency of identifying effective therapies during a health and economic emergency. Drug repositioning, that is, the discovery of new clinical applications for drugs already approved for different therapeutic contexts, could provide an effective shortcut to bring COVID-19 treatments to the bedside in a timely manner. Moreover, computational approaches can help accelerate the process even further. Here we present the application of computational drug repositioning tools based on transcriptomics data to identify drugs that are potentially able to counteract SARS-CoV-2 infection, and also to provide insights on their mode of action. We believe that mucolytics and HDAC inhibitors warrant further investigation. In addition, we found that the DNA Mismatch repair pathway is strongly modulated by drugs with experimental in vitro activity against SARS-CoV-2 infection. Both full results and methods are publicly available.

preprint 2020 arXiv

Data-Free Knowledge Amalgamation via Group-Stack Dual-GAN

Recent advances in deep learning have provided procedures for learning one network to amalgamate multiple streams of knowledge from the pre-trained Convolutional Neural Network (CNN) models, thus reducing the annotation cost. However, almost all existing methods demand massive training data, which may be unavailable due to privacy or transmission issues. In this paper, we propose a data-free knowledge amalgamation strategy to craft a well-behaved multi-task student network from multiple single/multi-task teachers. The main idea is to construct the group-stack generative adversarial networks (GANs) which have two dual generators. First, one generator is trained to collect the knowledge by reconstructing the images approximating the original dataset utilized for pre-training the teachers. Then a dual generator is trained by taking the output from the former generator as input. Finally, we treat the dual part generator as the target network and regroup it. As demonstrated on several benchmarks of multi-label classification, the proposed method without any training data achieves surprisingly competitive results, even compared with some fully supervised methods.

preprint 2020 arXiv

Decomposition of the Total Effect for Two Mediators: A Natural Counterfactual Interaction Effect Framework

Mediation analysis has been used in many disciplines to explain the mechanism or process that underlies an observed relationship between an exposure variable and an outcome variable via the inclusion of mediators. Decompositions of the total causal effect of an exposure variable into effects characterizing mediation pathways and interactions have gained an increasing amount of interest in the last decade. In this work, we develop decompositions for scenarios where the two mediators are causally sequential or non-sequential. Current developments in this area have primarily focused on either decompositions without interaction components or with interactions but assuming no causally sequential order between the mediators. We propose a new concept called natural counterfactual interaction effect that captures the two-way and three-way interactions for both scenarios that extend the two-way mediated interactions in literature. We develop a unified approach for decomposing the total effect into the effects that are due to mediation only, interaction only, both mediation and interaction, neither mediation nor interaction within the counterfactual framework. Finally, we illustrate the proposed decomposition method using a real data analysis where the two mediators are causally sequential.

preprint 2020 arXiv

Decomposition of Total Effect with the Notion of Natural Counterfactual Interaction Effect

Mediation analysis serves as a crucial tool to obtain causal inference based on directed acyclic graphs, which has been widely employed in the areas of biomedical science, social science, epidemiology and psychology. Decomposition of total effect provides a deep insight to fully understand the causal contribution from each path and interaction term. Since the four-way decomposition method was proposed to identify the mediated interaction effect in counterfactual framework, the idea has been extended to a more sophisticated scenario with non-sequential multiple mediators. However, the method exhibits limitations when the causal structure contains direct causal edges between mediators, such as inappropriate modeling of dependence and non-identifiability. We develop the notion of natural counterfactual interaction effect and find that the decomposition of total effect can be consistently realized with our proposed notion. Furthermore, natural counterfactual interaction effect overcomes the drawbacks and possesses a clear and significant interpretation, which may largely improve the capacity of researchers to analyze highly complex causal structures.

preprint 2020 arXiv

Disassembling Object Representations without Labels

In this paper, we study a new representation-learning task, which we term disassembling object representations. Given an image featuring multiple objects, the goal of disassembling is to acquire a latent representation, of which each part corresponds to one category of objects. Disassembling thus finds its application in a wide domain such as image editing and few- or zero-shot learning, as it enables category-specific modularity in the learned representations. To this end, we propose an unsupervised approach to achieving disassembling, named Unsupervised Disassembling Object Representation (UDOR). UDOR follows a double auto-encoder architecture, in which a fuzzy classification and an object-removing operation are imposed. The fuzzy classification constrains each part of the latent representation to encode features of up to one object category, while the object-removing, combined with a generative adversarial network, enforces the modularity of the representations and integrity of the reconstructed image. Furthermore, we devise two metrics to respectively measure the modularity of disassembled representations and the visual integrity of reconstructed images. Experimental results demonstrate that the proposed UDOR, despite being unsupervised, achieves truly encouraging results on par with those of supervised methods.

preprint 2020 arXiv

Green Offloading in Fog-Assisted IoT Systems: An Online Perspective Integrating Learning and Control

In fog-assisted IoT systems, it is a common practice to offload tasks from IoT devices to their nearby fog nodes to reduce task processing latencies and energy consumptions. However, the design of an online energy-efficient scheme is still an open problem because of various uncertainties in system dynamics such as processing capacities and transmission rates. Moreover, the decision-making process is constrained by resource limits on fog nodes and IoT devices, making the design even more complicated. In this paper, we formulate such a task offloading problem with unknown system dynamics as a combinatorial multi-armed bandit (CMAB) problem with long-term constraints on time-averaged energy consumptions. Through an effective integration of online learning and online control, we propose a Learning-Aided Green Offloading (LAGO) scheme. In LAGO, we employ bandit learning methods to handle the exploitation-exploration tradeoff and utilize virtual queue techniques to deal with the long-term constraints. Our theoretical analysis shows that LAGO can reduce the average task latency with a tunable sublinear regret bound over a finite time horizon and satisfy the long-term time-averaged energy constraints. We conduct extensive simulations to verify such theoretical results.

preprint 2020 arXiv

Intermittent Pulling with Local Compensation for Communication-Efficient Federated Learning

Federated Learning is a powerful machine learning paradigm to cooperatively train a global model with highly distributed data. A major bottleneck on the performance of the distributed Stochastic Gradient Descent (SGD) algorithm for large-scale Federated Learning is the communication overhead of pushing local gradients and pulling the global model. In this paper, to reduce the communication complexity of Federated Learning, a novel approach named Pulling Reduction with Local Compensation (PRLC) is proposed. Specifically, each training node intermittently pulls the global model from the server in SGD iterations, so it is sometimes unsynchronized with the server. In such a case, it will use its local update to compensate for the gap between the local model and the global model. Our rigorous theoretical analysis of PRLC achieves two important findings. First, we prove that the convergence rate of PRLC preserves the same order as the classical synchronous SGD for both strongly-convex and non-convex cases with good scalability due to the linear speedup with respect to the number of training nodes. Second, we show that PRLC admits lower pulling frequency than the existing pulling reduction method without local compensation. We also conduct extensive experiments on various machine learning models to validate our theoretical results. Experimental results show that our approach achieves a significant pulling reduction over the state-of-the-art methods, e.g., PRLC requires only half of the pulling operations of LAG.
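The pull-or-compensate step described in this abstract can be illustrated with a toy parameter-server loop. This is only a sketch under simplifying assumptions, not the authors' PRLC algorithm: the scalar quadratic objective, the fixed pulling period `pull_every`, the exact (non-stochastic) gradients, and the function name are all hypothetical.

```python
from typing import List


def run_intermittent_pulling(targets: List[float], steps: int = 200,
                             lr: float = 0.1, pull_every: int = 4) -> float:
    """Toy loop in which worker i minimizes 0.5 * (w - targets[i]) ** 2."""
    server_w = 0.0
    local_w = [server_w for _ in targets]  # each worker's copy of the model
    for step in range(steps):
        # Every worker computes a gradient on its (possibly stale) local copy.
        grads = [w - c for w, c in zip(local_w, targets)]
        # Push: the server applies the averaged gradient in every iteration.
        server_w -= lr * sum(grads) / len(grads)
        for i, g in enumerate(grads):
            if step % pull_every == 0:
                local_w[i] = server_w   # pull the fresh global model
            else:
                local_w[i] -= lr * g    # skip the pull, compensate locally
    return server_w


if __name__ == "__main__":
    # The global optimum of the averaged objective is the mean of the targets.
    print(run_intermittent_pulling([1.0, 2.0, 6.0]))
```

In this toy setting the server model still approaches the mean of the targets even though workers pull only every fourth iteration; the local gradient step keeps each stale copy moving between pulls.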

preprint 2020 arXiv

Learning to Stop While Learning to Predict

There is a recent surge of interest in designing deep architectures based on the update steps in traditional algorithms, or learning neural networks to improve and replace traditional algorithms. While traditional algorithms have certain stopping criteria for outputting results at different iterations, many algorithm-inspired deep models are restricted to a "fixed-depth" for all inputs. Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances, either to avoid "over-thinking", or because we want to compute less for operations converged already. In this paper, we tackle this varying depth problem using a steerable architecture, where a feed-forward deep model and a variational stopping policy are learned together to sequentially determine the optimal number of layers for each input instance. Training such architecture is very challenging. We provide a variational Bayes perspective and design a novel and effective training procedure which decomposes the task into an oracle model learning stage and an imitation stage. Experimentally, we show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks, including learning sparse recovery, few-shot meta learning, and computer vision tasks.

preprint 2020 arXiv

Online User-AP Association with Predictive Scheduling in Wireless Caching Networks

For wireless caching networks, designing a content delivery scheme is non-trivial because of the following tradeoff. On one hand, to optimize overall throughput, users can associate with nearby APs that have great channel capacities; however, this may lead to unstable queue backlogs on APs and prolong request delays. On the other hand, to ensure queue stability, some users may have to associate with APs that have inferior channel states, which would incur throughput loss. Moreover, for such systems, how to conduct predictive scheduling to reduce delays, and the fundamental limits of its benefits, remain unexplored. In this paper, we formulate the problem of online user-AP association and resource allocation for content delivery with predictive scheduling under a fixed content placement as a stochastic network optimization problem. By exploiting its unique structure, we transform the problem into a series of modular maximization sub-problems with matroid constraints. Then we devise PUARA, a Predictive User-AP Association and Resource Allocation scheme which achieves provably near-optimal throughput with queue stability. Our theoretical analysis and simulation results show that PUARA can not only provide tunable control between throughput maximization and queue stability but also achieve a notable delay reduction with predicted information.

preprint 2020 arXiv

Online VNF Chaining and Predictive Scheduling: Optimality and Trade-offs

For NFV systems, the key design space includes function chaining for network requests and resource scheduling for servers. The problem is challenging since NFV systems usually must meet multiple (often conflicting) design objectives while making real-time decisions efficiently with limited information. Furthermore, the benefits of predictive scheduling to NFV systems remain unexplored. In this paper, we propose POSCARS, an efficient predictive and online service chaining and resource scheduling scheme that achieves tunable trade-offs among various system metrics with a queue stability guarantee. Through a careful choice of granularity in system modeling, we acquire a better understanding of the trade-offs in our design space. By a non-trivial transformation, we decouple the complex optimization problem into a series of online sub-problems to achieve optimality with only limited information. By employing randomized load balancing techniques, we propose three variants of POSCARS to reduce the overhead of decision making. Theoretical analysis and simulations show that POSCARS and its variants require only a mild amount of future information to achieve near-optimal system cost with an ultra-low request response time.

preprint 2020 arXiv

RNA Secondary Structure Prediction By Learning Unrolled Algorithms

In this paper, we propose an end-to-end deep learning model, called E2Efold, for RNA secondary structure prediction which can effectively take into account the inherent constraints in the problem. The key idea of E2Efold is to directly predict the RNA base-pairing matrix, and use an unrolled algorithm for constrained programming as the template for deep architectures to enforce constraints. With comprehensive experiments on benchmark datasets, we demonstrate the superior performance of E2Efold: it predicts significantly better structures compared to previous SOTA (especially for pseudoknotted structures), while being as efficient as the fastest algorithms in terms of inference time.

preprint 2020 arXiv

SenWave: Monitoring the Global Sentiments under the COVID-19 Pandemic

Since the first alert issued by the World Health Organization (5 January, 2020), COVID-19 has spread to over 180 countries and territories. As of June 18, 2020, there are over 8,400,000 cases and over 450,000 related deaths in total. This has caused massive losses in the economy and jobs globally and confined about 58% of the global population. In this paper, we introduce SenWave, a novel sentiment analysis work using 105+ million collected tweets and Weibo messages to evaluate the global rises and falls of sentiments during the COVID-19 pandemic. To make a fine-grained analysis of the feelings people have when facing this global health crisis, we annotate 10K tweets in English and 10K tweets in Arabic in 10 categories, including optimistic, thankful, empathetic, pessimistic, anxious, sad, annoyed, denial, official report, and joking. We then utilize an integrated transformer framework, called simpletransformer, to conduct multi-label sentiment classification by fine-tuning the pre-trained language model on the labeled data. Meanwhile, for a more complete analysis, we also translate the annotated English tweets into different languages (Spanish, Italian, and French) to generate training data for building sentiment analysis models for these languages. SenWave thus reveals the sentiment of the global conversation on COVID-19 in six different languages (English, Spanish, French, Italian, Arabic and Chinese), following the spread of the epidemic. The conversation showed a remarkably similar pattern of rapid rise and slow decline over time across all nations, as well as on special topics like the herd immunity strategies, to which the global conversation reacts strongly negatively. Overall, SenWave shows that optimistic and positive sentiments increased over time, foretelling a desire to seek, together, a reset for an improved COVID-19 world.

preprint 2020 arXiv

Service Chain Composition with Failures in NFV Systems: A Game-Theoretic Perspective

For state-of-the-art network function virtualization (NFV) systems, it remains a key challenge to conduct effective service chain composition for different network services (NSs) with ultra-low request latencies and minimum network congestion. Existing solutions often require full knowledge of the network state, while ignoring privacy issues and overlooking the non-cooperative behaviors of users. Moreover, they may fall short in the face of unexpected failures such as user unavailability and virtual machine breakdown. In this paper, we formulate the problem of service chain composition in NFV systems with failures as a non-cooperative game. By showing that such a game is a weighted potential game and exploiting the unique problem structure, we propose two effective distributed schemes that guide the service chain compositions of different NSs towards the Nash equilibrium (NE) state with both near-optimal latencies and minimum congestion. In addition, we develop two novel learning-aided schemes for comparison, which are based on deep reinforcement learning (DRL) and Monte Carlo tree search (MCTS) techniques, respectively. Our theoretical analysis and simulation results demonstrate the effectiveness of our proposed schemes, as well as their adaptivity when faced with failures.

preprint 2019 arXiv

Extending the Geometry of Heterotic Spectral Cover Constructions

In this work we extend the well-known spectral cover construction first developed by Friedman, Morgan, and Witten to describe more general vector bundles on elliptically fibered Calabi-Yau geometries. In particular, we consider the case in which the Calabi-Yau fibration is not in Weierstrass form, but can instead contain fibral divisors or multiple sections (i.e. a higher rank Mordell-Weil group). In these cases, general vector bundles defined over such Calabi-Yau manifolds cannot be described by ordinary spectral data. To describe them, we employ well-established tools from the mathematics literature on Fourier-Mukai functors. We also generalize existing tools for explicitly computing Fourier-Mukai transforms of stable bundles on elliptic Calabi-Yau manifolds. As an application of these new tools we produce novel examples of chirality-changing small instanton transitions. The goal of this work is to provide a geometric formalism that can substantially increase the understood regimes of heterotic/F-theory duality.