Source author record

Sean Mooney

Sean Mooney appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Catalog footprint

What is connected

3 works
5 topics
4 close collaborators

Published work

3 published item(s)

Preprint, 2022, arXiv

The NLP Sandbox: an efficient model-to-data system to enable federated and unbiased evaluation of clinical NLP models

Objective: The evaluation of natural language processing (NLP) models for clinical text de-identification relies on the availability of clinical notes, which is often restricted due to privacy concerns. The NLP Sandbox is an approach for alleviating the lack of data and evaluation frameworks for NLP models by adopting a federated, model-to-data approach. This enables unbiased federated model evaluation without the need for sharing sensitive data from multiple institutions.

Materials and Methods: We leveraged the Synapse collaborative framework, containerization software, and OpenAPI generator to build the NLP Sandbox (nlpsandbox.io). We evaluated two state-of-the-art NLP de-identification focused annotation models, Philter and NeuroNER, using data from three institutions. We further validated model performance using data from an external validation site.

Results: We demonstrated the usefulness of the NLP Sandbox through de-identification clinical model evaluation. The external developer was able to incorporate their model into the NLP Sandbox template and provide user experience feedback.

Discussion: We demonstrated the feasibility of using the NLP Sandbox to conduct a multi-site evaluation of clinical text de-identification models without the sharing of data. Standardized model and data schemas enable smooth model transfer and implementation. To generalize the NLP Sandbox, work is required on the part of data owners and model developers to develop suitable and standardized schemas and to adapt their data or model to fit the schemas.

Conclusions: The NLP Sandbox lowers the barrier to utilizing clinical data for NLP model evaluation and facilitates federated, multi-site, unbiased evaluation of NLP models.
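The model-to-data idea described in this abstract can be illustrated with a minimal sketch. It is not the NLP Sandbox implementation: the `Site` class, the `toy_model` function, and the span-level scoring are all hypothetical names and choices made for illustration. The only point it shows is that the model travels to each site and only aggregate metrics leave.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Span = Tuple[int, int]               # (start, end) character offsets
Model = Callable[[str], Set[Span]]   # a submitted model: note text -> predicted spans


@dataclass
class Site:
    """Hypothetical data site: notes and gold annotations never leave it."""
    name: str
    notes: List[str]
    gold: List[Set[Span]]

    def evaluate(self, model: Model) -> Dict[str, object]:
        """Run the submitted model locally and return only aggregate metrics."""
        tp = fp = fn = 0
        for note, gold_spans in zip(self.notes, self.gold):
            predicted = model(note)
            tp += len(predicted & gold_spans)   # exact-match true positives
            fp += len(predicted - gold_spans)   # spurious predictions
            fn += len(gold_spans - predicted)   # missed gold spans
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        return {"site": self.name, "precision": precision,
                "recall": recall, "f1": f1}


def toy_model(note: str) -> Set[Span]:
    """Illustrative stand-in for a de-identification model: flags digit runs."""
    return {(m.start(), m.end()) for m in re.finditer(r"\d+", note)}


if __name__ == "__main__":
    sites = [
        Site("site_a", ["Seen on 2020 by Dr Lee"], [{(8, 12), (19, 22)}]),
        Site("site_b", ["MRN 12345"], [{(4, 9)}]),
    ]
    # The developer only ever receives these per-site summaries.
    for site in sites:
        print(site.evaluate(toy_model))
```

In a real deployment the model would presumably be shipped as a container behind a standardized API rather than passed as a Python callable; the callable here only stands in for that interface.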

Preprint, 2021, arXiv

Sub-arcsecond imaging with the International LOFAR Telescope: II. Completion of the LOFAR Long-Baseline Calibrator Survey

The Low-Frequency Array (LOFAR) Long-Baseline Calibrator Survey (LBCS) was conducted between 2014 and 2019 in order to obtain a set of suitable calibrators for the LOFAR array. In this paper we present the complete survey, building on the preliminary analysis published in 2016 which covered approximately half the survey area. The final catalogue consists of 30006 observations of 24713 sources in the northern sky, selected for a combination of high low-frequency radio flux density and flat spectral index using existing surveys (WENSS, NVSS, VLSS, and MSSS). Approximately one calibrator per square degree, suitable for calibration of $\geq$ 200 km baselines, is identified by the detection of compact flux density, for declinations north of 30 degrees and away from the Galactic plane, with a considerably lower density south of this point due to relative difficulty in selecting flat-spectrum candidate sources in this area of the sky. Use of the VLBA calibrator list, together with statistical arguments by comparison with flux densities from lower-resolution catalogues, allows us to establish a rough flux density scale for the LBCS observations, so that LBCS statistics can be used to estimate compact flux densities on scales between 300 mas and 2 arcsec, for sources observed in the survey. The LBCS can be used to assess the structures of point sources in lower-resolution surveys, with significant reductions in the degree of coherence in these sources on scales between 2 arcsec and 300 mas. The LBCS sources show a greater incidence of compact flux density in quasars than in radio galaxies, consistent with unified schemes of radio sources. Comparison with samples of sources from interplanetary scintillation (IPS) studies with the Murchison Widefield Array (MWA) shows consistent patterns of detection of compact structure in sources observed both interferometrically with LOFAR and using IPS.
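As a rough sanity check on the angular scales quoted in this abstract, the diffraction relation θ ≈ λ/B links baseline length to resolvable scale. The sketch below assumes an observing frequency of 150 MHz, which this abstract does not state; the function names are illustrative, and the relation ignores order-unity factors that depend on weighting and array layout.

```python
import math

C_M_PER_S = 299_792_458.0                    # speed of light
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0     # radians -> arcseconds


def resolution_arcsec(freq_hz: float, baseline_m: float) -> float:
    """Approximate angular scale (arcsec) probed by a baseline: theta ~ lambda / B."""
    wavelength_m = C_M_PER_S / freq_hz
    return wavelength_m / baseline_m * RAD_TO_ARCSEC


def baseline_km_for_scale(freq_hz: float, scale_arcsec: float) -> float:
    """Baseline length (km) whose lambda / B matches the given angular scale."""
    wavelength_m = C_M_PER_S / freq_hz
    return wavelength_m / (scale_arcsec / RAD_TO_ARCSEC) / 1000.0


if __name__ == "__main__":
    ASSUMED_FREQ_HZ = 150e6  # assumption: not given in the abstract above
    print(resolution_arcsec(ASSUMED_FREQ_HZ, 200e3))    # scale for a 200 km baseline
    print(baseline_km_for_scale(ASSUMED_FREQ_HZ, 0.3))  # baseline for 300 mas
```

Under that assumed frequency, a 200 km baseline corresponds to roughly 2 arcsec and 300 mas to a baseline of order 1000 km or more, in line with the range of scales the abstract quotes.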

Preprint, 2021, arXiv

The resolved jet of 3C 273 at 150 MHz

Since its discovery in 1963, 3C273 has become one of the most widely studied quasars with investigations spanning the electromagnetic spectrum. While much has been discovered about this historically notable source, its low-frequency emission is far less well understood. Observations in the MHz regime have traditionally lacked the resolution required to explore small-scale structures that are key to understanding the processes that result in the observed emission. In this paper we use the first sub-arcsecond images of 3C273 at MHz frequencies to investigate the morphology of the compact jet structures and the processes that result in the observed spectrum. Using the full complement of LOFAR's international stations, we produce $0.31 \times 0.21$ arcsec images of 3C273 at 150 MHz to determine the jet's kinetic power, place constraints on the bulk speed and inclination angle of the jets, and look for evidence of the elusive counter-jet at 150 MHz. Using ancillary data at GHz frequencies, we fit free-free absorption (FFA) and synchrotron self-absorption (SSA) models to determine their validity in explaining the observed spectra. The images presented display for the first time that robust, high-fidelity imaging of low-declination complex sources is now possible with the LOFAR international baselines. We show that the main small-scale structures of 3C273 match those seen at higher frequencies and that absorption is present in the observed emission. We determine the kinetic power of the jet to be in the range of $3.5 \times 10^{43}$ - $1.5 \times 10^{44}$ erg s$^{-1}$ which agrees with estimates made using higher frequency observations. We derive lower limits for the bulk speed and Lorentz factor of $\beta \gtrsim 0.55$ and $\Gamma \geq 1.2$ respectively. The counter-jet remains undetected at $150$ MHz, placing a limit on the peak brightness of $S_\mathrm{cj\_150} < 40$ mJy beam$^{-1}$.
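The two lower limits quoted in this abstract are related by the standard special-relativistic definition Γ = 1/√(1 − β²). The short sketch below only checks that the quoted numbers are mutually consistent under that definition; it does not reproduce how the paper derived the limits, and the function names are illustrative.

```python
import math


def lorentz_factor(beta: float) -> float:
    """Lorentz factor for a bulk speed beta = v/c, with 0 <= beta < 1."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return 1.0 / math.sqrt(1.0 - beta * beta)


def beta_from_lorentz(gamma: float) -> float:
    """Inverse relation: bulk speed v/c for a Lorentz factor gamma >= 1."""
    if gamma < 1.0:
        raise ValueError("gamma must be >= 1")
    return math.sqrt(1.0 - 1.0 / (gamma * gamma))


if __name__ == "__main__":
    print(lorentz_factor(0.55))     # Lorentz factor at the quoted speed limit
    print(beta_from_lorentz(1.2))   # speed at the quoted Lorentz-factor limit
```

A speed of β = 0.55 gives a Lorentz factor just under 1.2, and Γ = 1.2 corresponds to β of about 0.55, so the two limits agree to the precision quoted.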