Source author record

Andrew White

Andrew White appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher (unclaimed source record)

hep-ex, hep-ph, Machine Learning, physics.flu-dyn, Artificial Intelligence, physics.acc-ph, physics.chem-ph, physics.ins-det, Populations and Evolution

Catalog footprint

What is connected

9 works

9 topics

4 close collaborators

preprint, 2026, arXiv

OptimusKG: Unifying biomedical knowledge in a modern multimodal graph

Biomedical knowledge graphs (KGs) are widely used in the life sciences, yet many are derived from unstructured documents and therefore lack schema-level constraints, whereas graphs assembled from structured resources are difficult to harmonize into a unified representation. We present OptimusKG, a multimodal biomedical labeled property graph (LPG) built from structured and semi-structured resources to preserve factual, type-specific metadata across molecular, anatomical, clinical, and environmental domains. OptimusKG contains 190,531 nodes across 10 entity types, 21,813,816 edges across 26 relation types, and 67,249,863 property instances encoding 110,276,843 values across 150 distinct property keys, derived from 18 ontologies and controlled vocabularies. The graph enforces a top-level schema for nodes and edges and retains granular, type-specific properties, cross-references, and provenance across molecular, anatomical, clinical, and environmental domains. We assessed the validity of OptimusKG by evaluating whether graph relationships are supported by evidence from the scientific literature using a multimodal agent, PaperQA3. PaperQA3 identified supporting evidence for 70.0% of sampled edges, whereas 83.4% of sampled false edges received no supporting evidence. Edges without literature support were concentrated in associations derived from experimental and functional genomics resources, suggesting that OptimusKG captures biomedical knowledge that may precede synthesis in the scientific literature. OptimusKG is distributed as Apache Parquet files, providing a standardized resource for graph-based machine learning, knowledge-grounded retrieval with large language models, and biomedical discovery use cases such as hypothesis generation.
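The counts and rates quoted in this abstract can be turned into a few derived figures. The sketch below is only illustrative arithmetic on the numbers stated above; the derived quantities (edges per node, values per property instance, the rate at which false edges still received support, and the mean of the two reported rates) are not reported in the abstract, and the function name `summarize_optimuskg` is hypothetical.

```python
# Figures copied from the abstract above.
NODES = 190_531
EDGES = 21_813_816
PROPERTY_INSTANCES = 67_249_863
PROPERTY_VALUES = 110_276_843
SUPPORTED_SAMPLED_EDGES = 0.700   # share of sampled graph edges with literature support
UNSUPPORTED_FALSE_EDGES = 0.834   # share of sampled false edges with no support


def summarize_optimuskg():
    """Return derived ratios; none of these appear in the abstract itself."""
    return {
        "edges_per_node": EDGES / NODES,
        "values_per_property_instance": PROPERTY_VALUES / PROPERTY_INSTANCES,
        # False edges that nevertheless received supporting evidence.
        "false_edge_support_rate": 1.0 - UNSUPPORTED_FALSE_EDGES,
        # Simple mean of the two reported rates (an illustrative summary only).
        "mean_of_reported_rates": (SUPPORTED_SAMPLED_EDGES + UNSUPPORTED_FALSE_EDGES) / 2,
    }


if __name__ == "__main__":
    for name, value in summarize_optimuskg().items():
        print(f"{name}: {value:.3f}")
```

Under these numbers the graph is dense, at roughly 114 edges per node, and each property instance carries between one and two values on average.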

preprint, 2023, arXiv

The International Linear Collider: Report to Snowmass 2021

The International Linear Collider (ILC) is on the table now as a new global energy-frontier accelerator laboratory taking data in the 2030s. The ILC addresses key questions for our current understanding of particle physics. It is based on a proven accelerator technology. Its experiments will challenge the Standard Model of particle physics and will provide a new window to look beyond it. This document brings the story of the ILC up to date, emphasizing its strong physics motivation, its readiness for construction, and the opportunity it presents to the US and the global particle physics community.

preprint, 2022, arXiv

Federated Learning of Molecular Properties with Graph Neural Networks in a Heterogeneous Setting

Chemistry research has both high material and computational costs to conduct experiments. Institutions thus consider chemical data to be valuable and there have been few efforts to construct large public datasets for machine learning. Another challenge is that different institutions are interested in different classes of molecules, creating heterogeneous data that cannot be easily joined by conventional distributed training. In this work, we introduce federated heterogeneous molecular learning to address these challenges. Federated learning allows end-users to build a global model collaboratively while keeping the training data distributed over isolated clients. Due to the lack of related research, we first simulate a heterogeneous federated learning benchmark (FedChem) by jointly performing scaffold splitting and latent Dirichlet allocation on existing datasets for heterogeneously distributed client data. Our results on FedChem show that significant learning challenges arise when working with heterogeneous molecules across clients. We then propose a method to alleviate the problem, namely Federated Learning by Instance reweighTing (FLIT(+)). FLIT(+) can align the local training across heterogeneous clients by improving the performance for uncertain samples. Comprehensive experiments conducted on our new benchmark FedChem validate the advantages of this method over other federated learning schemes. FedChem should enable a new type of collaboration for improving AI in chemistry that mitigates concerns about valuable chemical data.
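To make the setting in this abstract concrete, the following is a minimal sketch of two generic ingredients of heterogeneous federated learning: a Dirichlet-based split of grouped samples across clients, and size-weighted parameter averaging (the standard FedAvg aggregation step). It is not the FedChem construction and not FLIT(+); the group labels stand in for scaffold ids, and the names `partition_by_group` and `fedavg`, as well as the concentration parameter `alpha`, are assumptions made for illustration.

```python
import random


def dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k categories."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def partition_by_group(groups, n_clients, alpha, seed=0):
    """Assign sample indices to clients using one Dirichlet draw per group.

    groups: one group label per sample (for example, a scaffold id per molecule).
    A smaller alpha concentrates each group on fewer clients, which makes the
    client datasets more heterogeneous.
    """
    rng = random.Random(seed)
    by_group = {}
    for idx, label in enumerate(groups):
        by_group.setdefault(label, []).append(idx)

    clients = [[] for _ in range(n_clients)]
    for label in sorted(by_group):
        probs = dirichlet(alpha, n_clients, rng)
        for idx in by_group[label]:
            chosen = rng.choices(range(n_clients), weights=probs)[0]
            clients[chosen].append(idx)
    return clients


def fedavg(client_params, client_sizes):
    """Average client parameter vectors, weighting each by its sample count."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[j] * size for params, size in zip(client_params, client_sizes)) / total
        for j in range(dim)
    ]


if __name__ == "__main__":
    scaffold_ids = [i % 5 for i in range(40)]          # toy group labels
    split = partition_by_group(scaffold_ids, n_clients=3, alpha=0.5)
    print([len(c) for c in split])
    print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a real experiment each client would train locally on its own indices and send parameters back to the server; the averaging step shown here is where a reweighting scheme would change how clients or samples contribute.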

preprint, 2014, arXiv

An Easy to Use Repository for Comparing and Improving Machine Learning Algorithm Usage

The results from most machine learning experiments are used for a specific purpose and then discarded. This results in a significant loss of information and requires rerunning experiments to compare learning algorithms. This also requires implementation of another algorithm for comparison, which may not always be correctly implemented. By storing the results from previous experiments, machine learning algorithms can be compared easily and the knowledge gained from them can be used to improve their performance. The purpose of this work is to provide easy access to previous experimental results for learning and comparison. These stored results are comprehensive -- storing the prediction for each test instance as well as the learning algorithm, hyperparameters, and training set that were used. Previous results are particularly important for meta-learning, which, in a broad sense, is the process of learning from previous machine learning results such that the learning process is improved. While other experiment databases do exist, one of our focuses is on easy access to the data. We provide meta-learning data sets that are ready to be downloaded for meta-learning experiments. In addition, queries to the underlying database can be made if specific information is desired. We also differ from previous experiment databases in that our database is designed at the instance level, where an instance is an example in a data set. We store the predictions of a learning algorithm trained on a specific training set for each instance in the test set. Data set level information can then be obtained by aggregating the results from the instances. The instance level information can be used for many tasks such as determining the diversity of a classifier or algorithmically determining the optimal subset of training instances for a learning algorithm.
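The instance-level design described in this abstract can be illustrated with a small sketch. The record layout, the toy data, and the function names `dataset_accuracy` and `disagreement` below are hypothetical and are not taken from the repository itself; the sketch only shows how per-instance predictions could be aggregated into data-set-level accuracy and into a simple pairwise diversity measure.

```python
from collections import defaultdict

# Hypothetical record layout: (algorithm, dataset, instance_id, predicted, actual).
RECORDS = [
    ("tree", "toy", 0, "a", "a"),
    ("tree", "toy", 1, "b", "a"),
    ("tree", "toy", 2, "b", "b"),
    ("knn", "toy", 0, "a", "a"),
    ("knn", "toy", 1, "a", "a"),
    ("knn", "toy", 2, "a", "b"),
]


def dataset_accuracy(records):
    """Aggregate instance-level predictions into accuracy per (algorithm, dataset)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for algo, dataset, _inst, predicted, actual in records:
        totals[(algo, dataset)] += 1
        hits[(algo, dataset)] += int(predicted == actual)
    return {key: hits[key] / totals[key] for key in totals}


def disagreement(records, algo_a, algo_b, dataset):
    """Fraction of shared instances on which two algorithms predict differently."""
    preds = defaultdict(dict)
    for algo, ds, inst, predicted, _actual in records:
        if ds == dataset and algo in (algo_a, algo_b):
            preds[inst][algo] = predicted
    shared = [p for p in preds.values() if algo_a in p and algo_b in p]
    if not shared:
        return 0.0
    return sum(p[algo_a] != p[algo_b] for p in shared) / len(shared)


if __name__ == "__main__":
    print(dataset_accuracy(RECORDS))
    print(disagreement(RECORDS, "tree", "knn", "toy"))
```

In this toy data both algorithms reach the same accuracy while disagreeing on two of the three instances, which is the kind of information that is lost when only data-set-level scores are stored.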

preprint, 2013, arXiv

Instrumentation for the Energy Frontier

The Instrumentation Frontier was set up as a part of the Snowmass 2013 Community Summer Study to examine the instrumentation R&D needed to support particle physics research over the coming decade. This report summarizes the findings of the Energy Frontier subgroup of the Instrumentation Frontier.

preprint, 2011, arXiv

APS DFD 2011 video submission V045

Inhomogeneous fluid mixing in a tilted-rotating cylindrical tank (radius a = 3.5 cm) is shown at Re(17-40) and low capillary numbers. A water and surfactant solution (1% by mass sodium oleate) is dispersed in soybean oil (95% by volume); by varying the rotation rate and angle of inclination, the rate of mixing is observed. A planar laser is directed down the tank axis to highlight a cross-sectional area of the fluid volume, and as the water droplets begin to break up to sizes on the order of the beam width and less, more light is refracted and the mixture is illuminated. Initially, the water breaks up into large droplets that exhibit approximate solid-body rotation about the bottom of the tank. When the total combined volume is below the critical volume of the tank Vcrit = a^3 tan vortex transport of the water occurs more rapidly, breaking up the water into continually smaller droplets in a process that resembles periodic shearing. When the fluid volume is above critical, the water will break up and rotate about the bottom of the tank, and vortex-induced mixing is much more reticent, if occurring at all. It is noted that shallower angles with respect to the horizontal produce faster mixing while allowing a greater volume of fluid to be mixed at the sub-critical volume given a constant tank size.

preprint, 2011, arXiv

Levitating Drop in a Tilted Rotating Tank - Gallery of Fluid Motion Entry V044

A cylindrical acrylic tank with inner diameter D = 4 in. is mounted such that its axis of symmetry is at some angle measured from the vertical plane. The mixing tank is identical to that described in [1]. The tank is filled with 200 mL of 1000 cSt silicone oil and a 5 mL drop of de-ionized water is placed in the oil volume. The water drop is allowed to come to rest and then a motor rotates the tank about its axis of symmetry at a fixed frequency of 0.3 Hz. Therefore the Reynolds number is fixed at about Re ~ 5, yielding laminar flow conditions. A CCD camera (PixeLink) is used to capture video of each experiment.
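The quoted Re ~ 5 can be checked from the stated tank diameter, oil viscosity and rotation frequency. The abstract does not say which definition of the Reynolds number was used; the sketch below assumes a rotational definition, Re = omega * R^2 / nu, with R the tank radius, omega = 2 * pi * f the angular speed, and nu the kinematic viscosity of the oil. The function name `rotational_reynolds` is hypothetical.

```python
import math

INCH_TO_M = 0.0254        # metres per inch
CST_TO_M2_PER_S = 1e-6    # square metres per second per centistoke


def rotational_reynolds(diameter_m, frequency_hz, kinematic_viscosity_m2_s):
    """Re = omega * R^2 / nu (an assumed definition, not stated in the abstract)."""
    radius = diameter_m / 2.0
    omega = 2.0 * math.pi * frequency_hz
    return omega * radius ** 2 / kinematic_viscosity_m2_s


if __name__ == "__main__":
    re = rotational_reynolds(
        diameter_m=4 * INCH_TO_M,                          # D = 4 in.
        frequency_hz=0.3,                                  # rotation frequency
        kinematic_viscosity_m2_s=1000 * CST_TO_M2_PER_S,   # 1000 cSt silicone oil
    )
    print(f"Re = {re:.2f}")
```

Under this assumed definition the result is a little under 5, consistent with the value quoted in the abstract.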

preprint, 2010, arXiv

Life history and mating systems select for male biased parasitism mediated through natural selection and ecological feedbacks

Males are often the "sicker" sex, with male-biased parasitism found in a taxonomically diverse range of species. There is considerable interest in the processes that could underlie the evolution of sex-biased parasitism. Mating system differences along with differences in lifespan may play a key role. We examine whether these factors are likely to lead to male-biased parasitism through natural selection, taking into account the critical role that ecological feedbacks play in the evolution of defence. We use a host-parasite model with two sexes and the techniques of adaptive dynamics to investigate how mating system and sexual differences in competitive ability and longevity can select for a bias in the rates of parasitism. Male-biased parasitism is selected for when males have a shorter average lifespan or when males are subject to greater competition for resources. Male-biased parasitism evolves as a consequence of sexual differences in life history that produce a greater proportion of susceptible females than males and therefore reduce the cost of avoiding parasitism in males. Different mating systems such as monogamy, polygamy or polyandry did not produce a bias in parasitism through these ecological feedbacks but may accentuate an existing bias.

preprint, 1996, arXiv

Summary of the Supersymmetry Working Group

We summarize the results obtained by the Supersymmetry Working Group at the 1996 Snowmass Workshop.