Trust snapshot

Quick read

Trust 21 - Emerging
Verification L1
Unclaimed author
27 works
0 followers
31 topics
4 close collaborators

Published work

27 published item(s)

preprint, 2026, arXiv

CMKL: Modality-Aware Continual Learning for Evolving Biomedical Knowledge Graphs

Biomedical knowledge graphs are increasingly large, dynamic, and multimodal, driven by rapid advances in biotechnology such as high-throughput sequencing. Machine learning models can infer previously unobserved biomedical relationships and characterize biomedical entities in these graphs, but existing knowledge graph embedding methods and their continual learning extensions either assume static graph structure or fail to exploit multimodal information under evolving data distributions. They also apply uniform regularization across all model parameters, ignoring that different modalities may exhibit distinct forgetting dynamics as the graph evolves. We propose the Continual Multimodal Knowledge Graph Learner (CMKL), a CL framework for biomedical KGs that natively encodes structure, text, and molecules, fuses them through a Mixture-of-Experts (MoE) router, and protects previously learned knowledge with standard EWC regularization and a K-means-diverse multimodal replay buffer. We evaluate CMKL on a 129K-entity biomedical continual benchmark with 10 tasks. On continual biomedical entity classification, CMKL reaches AP 0.591 versus 0.370 for the strongest structural baseline, a 60% gain that is driven by access to multimodal features and preserved across the sequence with near-zero forgetting (AF 0.008). On continual relationship prediction, CMKL reaches AP $0.062$, matching Naive Sequential and EWC (0.058) within seed noise and outperforming Joint Training (0.047, p=0.045) and LKGE (0.039). A frozen-text ablation reaches AP 0.136, more than double any jointly trained model, yet that signal is unreachable by margin-ranking gradients: the greedy-modality asymmetry lives at the representation level, not the fusion level, and MoE routing manages it by suppressing the unreachable modality without forcing it through a learned bottleneck. Code: github.com/yradwan147/cmkl-neurips2026
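The "standard EWC regularization" named in the abstract usually refers to a quadratic penalty that anchors each parameter to its value after the previous task, weighted by a diagonal Fisher information estimate. The sketch below illustrates that general penalty only; it is not taken from the CMKL code, and the function name, argument names, and toy numbers are assumptions chosen for illustration.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam=1.0):
    """Generic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        -- current parameter values (hypothetical flat list)
    anchor_params -- parameter values saved after the previous task
    fisher_diag   -- diagonal Fisher information estimates, one per parameter
    lam           -- regularization strength (assumed hyperparameter)
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2
        for p, a, f in zip(params, anchor_params, fisher_diag)
    )


# Toy example: the first parameter has high Fisher weight, so a small drift
# costs as much as a large drift in the weakly weighted second parameter.
theta_star = [1.0, -2.0, 0.5]
fisher = [10.0, 0.1, 1.0]
theta = [1.1, -1.0, 0.5]
penalty = ewc_penalty(theta, theta_star, fisher, lam=2.0)
```

In training, a term like this would be added to the loss of the new task, so parameters with large Fisher weight are held close to their earlier values.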

preprint, 2026, arXiv

PrimeKG-CL: A Continual Graph Learning Benchmark on Evolving Biomedical Knowledge Graphs

Biomedical knowledge graphs underwrite drug repurposing and clinical decision support, yet the upstream ontologies they depend on update on independent cycles that add millions of edges and deprecate hundreds of thousands more between releases. Existing continual graph learning, however, has been studied almost exclusively on synthetic random splits of static, generic KGs, a regime that cannot reproduce the asynchronous, structured evolution real biomedical KGs undergo. To this end, we introduce PrimeKG-CL, a CGL benchmark built from nine authoritative biomedical databases (129K+ nodes, 8.1M+ edges, 10 node types, 30 relation types) with two genuine temporal snapshots (June 2021, July 2023; 5.83M edges added, 889K removed, 7.21M persistent), 10 entity-type-grouped tasks, multimodal node features, and a per-task persistent/added/removed test stratification. On three tasks (biomedical relationship prediction, entity classification, KGQA), we evaluate six CL strategies across four KGE decoders, plus LKGE, an LLM-RAG agent, and CMKL. We find that decoder choice and continual learning strategy interact strongly: no single strategy performs best across all decoders, and mismatched combinations can significantly degrade performance. Moreover, only DistMult exhibits a clear separation between persistent and deprecated knowledge, indicating that standard metrics conflate retention of still-valid facts with failure to forget outdated ones; this effect is absent under RotatE. In addition, multimodal features improve entity-level tasks by up to 60%, and a recent CKGE framework (IncDE) failed to scale to our 5.67M-triple base task across five attempts up to 350GB RAM. Data, pipeline, baselines, and the stratified split are released openly. Dataset: huggingface.co/datasets/yradwan147/PrimeKGCL | Code: github.com/yradwan147/primekg-cl-neurips2026

preprint, 2024, arXiv

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

The rapid development of open-source large language models (LLMs) has been truly remarkable. However, the scaling law described in previous literature presents varying conclusions, which casts a dark cloud over scaling LLMs. We delve into the study of scaling laws and present our distinctive findings that facilitate scaling of large scale models in two commonly used open-source configurations, 7B and 67B. Guided by the scaling laws, we introduce DeepSeek LLM, a project dedicated to advancing open-source language models with a long-term perspective. To support the pre-training phase, we have developed a dataset that currently consists of 2 trillion tokens and is continuously expanding. We further conduct supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) on DeepSeek LLM Base models, resulting in the creation of DeepSeek Chat models. Our evaluation results demonstrate that DeepSeek LLM 67B surpasses LLaMA-2 70B on various benchmarks, particularly in the domains of code, mathematics, and reasoning. Furthermore, open-ended evaluations reveal that DeepSeek LLM 67B Chat exhibits superior performance compared to GPT-3.5.

preprint, 2022, arXiv

$\rm ^{83}Rb$/$\rm ^{83m}Kr$ production and cross-section measurement with 3.4 MeV and 20 MeV proton beams

$\rm ^{83m}Kr$, with a short lifetime, is an ideal calibration source for liquid xenon or liquid argon detectors. The $\rm ^{83m}Kr$ isomer can be generated through the decay of $\rm ^{83} Rb$ isotope which is usually produced by proton beams bombarding natural krypton atoms. In this paper, we report a successful production of $\rm ^{83}Rb/^{83m}Kr$ with a proton beam energy of 3.4 MeV, and the first measurement of the production rate with such low energy proton beams. Another production attempt is performed using the newly available 20 MeV proton beam in China, and the measured production rate is consistent with previous measurements. The produced $\rm ^{83m}Kr$ source has been successfully injected into the PandaX-II liquid xenon detector, yielding enough statistics for detector calibration.

preprint, 2022, arXiv

Forecasting SQL Query Cost at Twitter

With the advent of the Big Data era, it is usually computationally expensive to calculate the resource usage of a SQL query with traditional DBMS approaches. Can we estimate the cost of each query more efficiently without any computation in a SQL engine kernel? Can machine learning techniques help to estimate SQL query resource utilization? The answers are yes. We propose a SQL query cost predictor service, which employs machine learning techniques to train models from historical query request logs and rapidly forecasts the CPU and memory resource usage of online queries without any computation in a SQL engine. At Twitter, infrastructure engineers are maintaining a large-scale SQL federation system across on-premises and cloud data centers for serving ad-hoc queries. The proposed service can help to improve query scheduling by relieving the issue of imbalanced online analytical processing (OLAP) workloads in the SQL engine clusters. It can also assist in enabling preemptive scaling. Additionally, the proposed approach uses plain SQL statements for the model training and online prediction, indicating it is both hardware and software-agnostic. The method can be generalized to broader SQL systems and heterogeneous environments. The models can achieve 97.9% accuracy for CPU usage prediction and 97% accuracy for memory usage prediction.

preprint, 2022, arXiv

Improving Pedestrian Priority via Grouping and Virtual Lanes

The shared space design is applied in urban streets to support barrier-free movement and integrate traffic participants (such as pedestrians, cyclists and vehicles) into a common road space. Despite the low-speed environment, sharing space with motor vehicles can make vulnerable road users feel uneasy. Yet walking in groups increases their confidence and influences the yielding behavior of drivers. Therefore, we propose an innovative approach to support pedestrian crossing by grouping pedestrians and projecting virtual lanes in shared spaces. This paper presents the important components of the crowd steering system, discusses the enablers and gaps in the current approach, and illustrates the proposed idea with concept diagrams.

preprint, 2022, arXiv

LAMOST MRS-N Observations of the W80 Region

We present spectral observations and analysis of the W80 Region using data from the Medium-Resolution Spectroscopic Survey of Nebulae (MRS-N) with the Large Sky Area Multi-Object Fiber Spectroscopy Telescope (LAMOST). A total of 2982 high-quality nebular spectra have been obtained in the 20 square degree field of view (FoV) covering the W80 complex, and the largest sample of spectral data has been established for the first time. The relative intensities, radial velocities (RVs), and Full Widths at Half Maximum (FWHMs) are measured with the high spectral resolution of LAMOST MRS for the Hα λ6563 Å, [N II] λλ6548, 6584 Å, and [S II] λλ6716, 6731 Å emission lines. Across the field of view of the whole W80 Region, the strongest line emissions coincide with the bright nebulae NGC 7000, IC 5070, and LBN 391, and weak line emission is also present in the Middle Region, where no bright nebulae are detected by wide-band optical observations. The large-scale spectral observations of the W80 Region reveal systematic spatial variations of RVs and FWHMs, and several unique structural features. A 'curved feature' to the east of NGC 7000 and a 'jet feature' to the west of LBN 391 are detected with larger radial velocities. A 'wider FWHM region' is identified in the eastern part of NGC 7000. The [S II]/Hα ratios display a gradient from southwest to northeast in the NGC 7000 region and form a ring shape around the 'W80 bubble' ionized by an O-type star in L935. Further spectral and multi-band observations are warranted to investigate these structural features in detail.

preprint, 2022, arXiv

N-Cloth: Predicting 3D Cloth Deformation with Mesh-Based Networks

We present a novel mesh-based learning approach (N-Cloth) for plausible 3D cloth deformation prediction. Our approach is general and can handle cloth or obstacles represented by triangle meshes with arbitrary topologies. We use graph convolution to transform the cloth and object meshes into a latent space to reduce the non-linearity in the mesh space. Our network can predict the target 3D cloth mesh deformation based on the initial state of the cloth mesh template and the target obstacle mesh. Our approach can handle complex cloth meshes with up to 100K triangles and scenes with various objects corresponding to SMPL humans, non-SMPL humans or rigid bodies. In practice, our approach can be used to generate plausible cloth simulation at 30-45 fps on an NVIDIA GeForce RTX 3090 GPU. We highlight its benefits over prior learning-based methods and physically-based cloth simulators.

preprint, 2022, arXiv

On the improved conditions for some primal-dual algorithms

The convex minimization of $f(\mathbf{x})+g(\mathbf{x})+h(\mathbf{A}\mathbf{x})$ over $\mathbb{R}^n$ with differentiable $f$ and linear operator $\mathbf{A}: \mathbb{R}^n\rightarrow \mathbb{R}^m$, has been well-studied in the literature. By considering the primal-dual optimality of the problem, many algorithms are proposed from different perspectives such as monotone operator scheme and fixed point theory. In this paper, we start with a base algorithm to reveal the connection between several algorithms such as AFBA, PD3O and Chambolle-Pock. Then, we prove its convergence under a relaxed assumption associated with the linear operator and characterize the general constraint on primal and dual stepsizes. The result improves the upper bound of stepsizes of AFBA and indicates that Chambolle-Pock, as the special case of the base algorithm when $f=0$, can take the stepsize of the dual iteration up to $4/3$ of the previously proven one.

preprint, 2022, arXiv

Program Adverbs and Tlön Embeddings

Free monads (and their variants) have become a popular general-purpose tool for representing the semantics of effectful programs in proof assistants. These data structures support the compositional definition of semantics parameterized by uninterpreted events, while admitting a rich equational theory of equivalence. But monads are not the only way to structure effectful computation, so why should we limit ourselves? In this paper, inspired by applicative functors, selective functors, and other structures, we define a collection of data structures and theories, which we call program adverbs, that capture a variety of computational patterns. Program adverbs are themselves composable, allowing them to be used to specify the semantics of languages with multiple computation patterns. We use program adverbs as the basis for a new class of semantic embeddings called Tlön embeddings. Compared with embeddings based on free monads, Tlön embeddings allow more flexibility in computational modeling of effects, while retaining more information about the program's syntactic structure.

preprint, 2022, arXiv

Scattering Amplitudes of Kaluza-Klein Strings and Extended Massive Double-Copy

We study the scattering amplitudes of massive Kaluza-Klein (KK) states of open and closed bosonic strings under toroidal compactification. We analyze the structure of vertex operators for the KK strings and derive an extended massive KLT-like relation which connects the $N$-point KK closed-string amplitude to the products of two KK open-string amplitudes at tree level. Taking the low energy field-theory limit of vanishing Regge slope, we derive a double-copy construction formula for the $N$-point massive KK graviton amplitude from the sum of proper products of the corresponding KK gauge boson amplitudes. Then, using the string-based massive double-copy formula, we derive the exact tree-level four-point KK gauge boson amplitudes and KK graviton amplitudes, which fully agree with those given by the KK field-theory calculations. With these, we give an explicit prescription for constructing the exact four-point KK graviton amplitudes from the sum of proper products of the corresponding color-ordered KK gauge boson amplitudes. We further analyze the string-based double-copy construction of five-point and six-point scattering amplitudes of massive KK gauge bosons and KK gravitons.

preprint, 2022, arXiv

Taming Hybrid-Cloud Fast and Scalable Graph Analytics at Twitter

We have witnessed a boosted demand for graph analytics at Twitter in recent years, and graph analytics has become one of the key parts of Twitter's large-scale data analytics and machine learning for driving engagement, serving the most relevant content, and promoting healthier conversations. However, infrastructure for graph analytics has historically not been an area of investment at Twitter, resulting in a long timeline and huge engineering effort for each project to deal with graphs at the Twitter scale. How do we build a unified graph analytics user experience to fulfill modern data analytics on various graph scales spanning from thousands to hundreds of billions of vertices and edges? To bring fast and scalable graph analytics capability into production, we investigate the challenges we are facing in large-scale graph analytics at Twitter and propose a unified graph analytics platform for efficient, scalable, and reliable graph analytics across on-premises and cloud, to fulfill the requirements of diverse graph use cases and challenging scales. We also conduct quantitative benchmarking on Twitter's production-level graph use cases between popular graph analytics frameworks to certify our solution.

preprint, 2022, arXiv

The Data Processing of the LAMOST Medium-Resolution Spectral Survey of Galactic Nebulae (LAMOST MRS-N Pipeline)

The Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) medium-resolution spectral survey of Galactic Nebulae (MRS-N) has been conducted for three years since Sep. 2018 and has observed more than 190 thousand nebular spectra and 20 thousand stellar spectra. However, there is not yet a data processing pipeline for nebular data. To significantly improve the accuracy of nebulae classification and their physical parameters, we developed the MRS-N Pipeline. This article presents in detail each data processing step of the MRS-N Pipeline, such as removing cosmic rays, merging single exposures, fitting sky light emission lines, subtracting skylight, wavelength recalibration, measuring nebular parameters, creating catalogs and packing spectra. Finally, a description of the data products, including nebular spectra files and parameter catalogs, is provided.

preprint, 2021, arXiv

Data-driven computation methods for quasi-stationary distribution and sensitivity analysis

This paper studies computational methods for quasi-stationary distributions (QSDs). We first propose a data-driven solver that solves Fokker-Planck equations for QSDs. As in the case of Fokker-Planck equations for invariant probability measures, we set up an optimization problem that minimizes the distance from a low-accuracy reference solution, under the constraint of satisfying the linear relation given by the discretized Fokker-Planck operator. Then we use a coupling method to study the sensitivity of a QSD against a change of either the boundary condition or the diffusion coefficient. The 1-Wasserstein distance between a QSD and the corresponding invariant probability measure can be quantitatively estimated. Some numerical results about both computation of QSDs and their sensitivity analysis are provided.

preprint, 2021, arXiv

Exploring the Regulatory Function of the N-terminal Domain of SARS-CoV-2 Spike Protein Through Molecular Dynamics Simulation

SARS-CoV-2 is the virus that caused the COVID-19 pandemic. Early viral infection is mediated by the SARS-CoV-2 homo-trimeric Spike (S) protein with its receptor binding domains (RBDs) in the receptor-accessible state. We performed molecular dynamics simulation on the S protein with a focus on the function of its N-terminal domains (NTDs). Our study reveals that the NTD acts as a "wedge" and plays a crucial regulatory role in the conformational changes of the S protein. The complete RBD structural transition is allowed only when the neighboring NTD, which typically blocks the RBD's movements as a wedge, detaches and swings away. Based on this NTD "wedge" model, we propose that the NTD-RBD interface should be a potential drug target.

preprint, 2021, arXiv

Flexible daytime radiative cooling enhanced by enabling three-phase composites with scattering interfaces between silica-microspheres and hierarchical porous coatings

Daytime radiative cooling has attracted considerable attention recently due to its tremendous potential for passively exploiting the coldness of deep sky as clean and renewable energy. Many advanced materials with novel photonic micro-nanostructures have already been developed to enable highly efficient daytime radiative coolers, among which the flexible hierarchical porous coatings (HPCs) are a distinguished category. However, it is still hard to precisely control the size distribution of the randomized pores within the HPCs, which usually results in deficient solar reflection in the near-infrared optical regime under diverse fabrication conditions of the coatings. We report here a three-phase (i.e., air pore-phase, microsphere-phase and polymer-phase) self-assembled hybrid porous composite coating which dramatically increases the average solar reflectance and yields a remarkable temperature drop of ~10 °C and 30 °C compared to the ambient environment and black paint, respectively, according to the rooftop measurements. Mie theory and Monte Carlo simulations reveal the origin of the low reflectivity of as-prepared two-phase porous HPCs, and the optical cooling improvement of the three-phase porous composite coatings is attributed to the newly generated interfaces, which have high scattering efficiency, between the hierarchical pores and silica microspheres hybridized with appropriate mass fractions. As a result, the hybrid porous composite approach enhances the overall performance of the coatings, which provides a promising alternative to the flexible daytime radiative cooler.

preprint, 2021, arXiv

On linear convergence of two decentralized algorithms

Decentralized algorithms solve multi-agent problems over a connected network, where the information can only be exchanged with the accessible neighbors. Though there exist several decentralized optimization algorithms, there are still gaps in convergence conditions and rates between decentralized and centralized algorithms. In this paper, we fill some gaps by considering two decentralized algorithms: EXTRA and NIDS. They both converge linearly with strongly convex objective functions. We will answer two questions regarding them. What are the optimal upper bounds for their stepsizes? Do decentralized algorithms require more properties on the functions for linear convergence than centralized ones? More specifically, we relax the required conditions for linear convergence of both algorithms. For EXTRA, we show that the stepsize is comparable to that of centralized algorithms. For NIDS, the upper bound of the stepsize is shown to be exactly the same as the centralized ones. In addition, we relax the requirement for the objective functions and the mixing matrices. We provide the linear convergence results for both algorithms under the weakest conditions.

preprint, 2021, arXiv

Passive radiative temperature regulator: principles and absorption-emission manipulation

As a representative device exploiting both solar energy and the radiative cooling of deep sky, the radiative temperature regulator (RTR) could switch between heating and cooling modes self-adaptively at different temperatures. However, the concept of RTR is challenging to implement due to the intense parasitic absorption in phase-changing layers. Here, based on the theoretical framework of energy conservation, we quantitatively reveal the intrinsic relationships between solar heating and radiative cooling, especially addressing the fundamental limiting factors, including the parasitic absorption and the spectral emission selectivity, as well as the dynamic responses of the phase-changing device under various operating conditions. The investigation presents more insight into the underlying physics of RTRs and provides feasible architectures for realizing this new kind of functional device.

preprint, 2021, arXiv

Switching off microcavity polariton condensate near the exceptional point

Gain and loss modulation are ubiquitous in nature. An exceptional point arises when both the eigenvectors and eigenvalues coalesce, which in a physical system can be achieved by engineering the gain and loss coefficients, leading to a wide variety of counter-intuitive phenomena. In this work we demonstrate the existence of an exceptional point in an exciton polariton condensate in a double-well potential. Remarkably, near the exceptional point, the polariton condensate localized in one potential well can be switched off by an additional optical excitation in the other well with very low (far below threshold) laser power which surprisingly induces additional loss into the system. Increasing the power of the additional laser leads to a situation in which gain dominates in both wells again, such that the polaritons re-condense with almost the same density in the two potential wells. Our results offer a simple way to optically manipulate the polariton lasing process in a double-well potential structure. Extending such configuration to complex potential well lattices offers exciting prospects to explore high-order exceptional points and non-Hermitian topological photonics in a non-equilibrium many-body system.

preprint, 2020, arXiv

Exciton interaction induced spin splitting in MoS$_2$ monolayer

By nonresonantly pumping a MoS$_2$ monolayer at $13$ K with a circularly polarized cw laser, we observe exciton energy redshifts that break the degeneracy between B excitons with opposite spin. The energy splitting increases monotonically with the laser power, reaching as much as $18$ meV, while it diminishes with the temperature. The phenomenon can be explained theoretically by considering simultaneously the bandgap renormalization, which gives rise to the redshift, and the exciton-exciton Coulomb exchange interaction, which is responsible for the spin-dependent splitting. Our results offer a simple scheme to control the valley degree of freedom in MoS$_2$ monolayers and provide an accessible method for investigating the many-body exciton-exciton interaction in such materials.

preprint, 2020, arXiv

From deterministic dynamics to thermodynamic laws II: Fourier's law and mesoscopic limit equation

This paper considers the mesoscopic limit of a stochastic energy exchange model that is numerically derived from deterministic dynamics. The law of large numbers and the central limit theorems are proved. We show that the limit of the stochastic energy exchange model is a discrete heat equation that satisfies Fourier's law. In addition, when the system size (number of particles) is large, the stochastic energy exchange is approximated by a stochastic differential equation, called the mesoscopic limit equation.

preprint, 2020, arXiv

Massive suppression of proximity pairing in topological (Bi$_{1-x}$Sb$_{x})_2$Te$_3$ films on niobium

Interfacing bulk conducting topological Bi$_2$Se$_3$ films with s-wave superconductors initiates strong superconducting order in the nontrivial surface states. However, bulk insulating topological (Bi$_{1-x}$Sb$_{x})_2$Te$_3$ films on bulk Nb instead exhibit a giant attenuation of surface superconductivity, even for films only two-layers thick. This massive suppression of proximity pairing is evidenced by ultrahigh-resolution band mappings and by contrasting quantified superconducting gaps with those of heavily n-doped topological Bi$_2$Se$_3$/Nb. The results underscore the limitations of using superconducting proximity effects to realize topological superconductivity in nearly intrinsic systems.

preprint, 2020, arXiv

Merger of Dark Matter Axion Clumps and Resonant Photon Emission

A portion of light scalar dark matter, especially axions, may organize into gravitationally bound clumps (stars) and be present in large numbers in the galaxy today. It is therefore of utmost interest to determine if there are novel observational signatures of this scenario. Work has shown that for moderately large axion-photon couplings, such clumps can undergo parametric resonance into photons, for clumps above a critical mass $M^{\star}_c$ determined precisely by some of us in Ref. [1]. Obtaining a clump above the critical mass in the galaxy today would require mergers. In this work we perform full 3-dimensional simulations of pairs of axion clumps and determine the conditions under which mergers take place through the emission of scalar waves, including analyzing head-on and non-head-on collisions, phase dependence, and relative velocities. Consistent with other work in the literature, we find that the final mass from the merger $M^{\star}_{\text{final}}\approx 0.7(M^{\star}_1+M^{\star}_2)$ is larger than each of the original clump masses (for $M^{\star}_1\sim M^{\star}_2$). Hence, it is possible for sub-critical mass clumps to merge and become super-critical and therefore undergo parametric resonance into photons. We find that mergers are expected to be kinematically allowed in the galaxy today for high Peccei-Quinn scales, which is strongly suggested by unification ideas, although the collision rate is small. Mergers can also happen for axions with lower Peccei-Quinn scales, due to statistical fluctuations in relative velocities, and these have a high collision rate. We estimate the collision and merger rates within the Milky Way galaxy today. We find that a merger leads to a flux of energy on Earth that can be appreciable, and we mention observational search strategies.
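As a worked reading of the mass relation quoted in this abstract (an illustration under the simplifying assumption of exactly equal clump masses, not a result taken from the paper), the equal-mass case shows how two sub-critical clumps can produce a super-critical one:

```latex
% Assumption: M_1 = M_2 = M (exactly equal masses, for illustration only).
\begin{align*}
  M^{\star}_{\text{final}} &\approx 0.7\,(M^{\star}_1 + M^{\star}_2) = 1.4\,M, \\
  M^{\star}_{\text{final}} > M^{\star}_c
    &\iff M > \frac{M^{\star}_c}{1.4} \approx 0.71\,M^{\star}_c .
\end{align*}
```

Under that assumption, two clumps each between roughly $0.71\,M^{\star}_c$ and $M^{\star}_c$ are individually sub-critical but merge into a clump above the critical mass.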

preprint, 2020, arXiv

Theoretical evidence for new adsorption sites of CO$_2$ on the Ag electrode surface

Electrochemical reduction of CO$_2$ is now considered an effective method for addressing the problem of global warming. The primary challenge in studying the mechanism is to determine the adsorption states of CO$_2$, since complicated metal surfaces often result in many different adsorption sites. Based on density functional theory (DFT) calculations, we performed a theoretical study on the adsorption of CO$_2$ on the Ag electrode surface. The results show that the adsorption populations of CO$_2$ are extremely sensitive to the adsorption sites. Importantly, we found that the preferable adsorption positions are the terrace sites, rather than the previously reported step sites. The adsorption populations were found to follow the order (211) > (110) > (111) > (100). Subsequently, the adsorption characteristics were correlated with the d-band theory and the charge transfers between Ag surfaces and CO$_2$.

preprint, 2020, arXiv

Towards Better Opioid Antagonists Using Deep Reinforcement Learning

Naloxone, an opioid antagonist, has been widely used to save lives from opioid overdose, a leading cause of death in the opioid epidemic. However, naloxone has short brain retention ability, which limits its therapeutic efficacy. Developing better opioid antagonists is critical in combating the opioid epidemic. Instead of exhaustively searching in a huge chemical space for better opioid antagonists, we adopt reinforcement learning, which allows efficient gradient-based search towards molecules with desired physicochemical and/or biological properties. Specifically, we implement a deep reinforcement learning framework to discover potential lead compounds as better opioid antagonists with enhanced brain retention ability. A customized multi-objective reward function is designed to bias the generation towards molecules with both sufficient opioid antagonistic effect and enhanced brain retention ability. Thorough evaluation demonstrates that with this framework, we are able to identify valid, novel and feasible molecules with multiple desired properties, which have high potential in drug discovery.

preprint, 2019, arXiv

A Double Residual Compression Algorithm for Efficient Distributed Learning

Large-scale machine learning models are often trained by parallel stochastic gradient descent algorithms. However, the communication cost of gradient aggregation and model synchronization between the master and worker nodes becomes the major obstacle for efficient learning as the number of workers and the dimension of the model increase. In this paper, we propose DORE, a DOuble REsidual compression stochastic gradient descent algorithm, to reduce over $95\%$ of the overall communication such that the obstacle can be immensely mitigated. Our theoretical analyses demonstrate that the proposed strategy has superior convergence properties for both strongly convex and nonconvex objective functions. The experimental results validate that DORE achieves the best communication efficiency while maintaining similar model accuracy and convergence speed in comparison with state-of-the-art baselines.

preprint, 2018, arXiv

From C to Interaction Trees: Specifying, Verifying, and Testing a Networked Server

We present the first formal verification of a networked server implemented in C. Interaction trees, a general structure for representing reactive computations, are used to tie together disparate verification and testing tools (Coq, VST, and QuickChick) and to axiomatize the behavior of the operating system on which the server runs (CertiKOS). The main theorem connects a specification of acceptable server behaviors, written in a straightforward "one client at a time" style, with the CompCert semantics of the C program. The variability introduced by low-level buffering of messages and interleaving of multiple TCP connections is captured using network refinement, a variant of observational refinement.