Source author record

Hanspeter Pfister

Hanspeter Pfister appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computer Vision Human-Computer Interaction Machine Learning Neurons and Cognition Graphics Artificial Intelligence eess.IV astro-ph.IM Computation and Language Computational Engineering, Finance, and Science Cryptography and Security cs.CY Distributed, Parallel, and Cluster Computing Other Quantitative Biology Quantitative Methods

Catalog footprint

What is connected

29works

15topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2026arXiv

Virtual Multiplex Staining for Histological Images using a Marker-wise Conditioned Diffusion Model

Multiplex imaging is revolutionizing pathology by enabling the simultaneous visualization of multiple biomarkers within tissue samples, providing molecular-level insights that traditional hematoxylin and eosin (H&E) staining cannot provide. However, the complexity and cost of multiplex data acquisition have hindered its widespread adoption. Additionally, most existing large repositories of H&E images lack corresponding multiplex images, limiting opportunities for multimodal analysis. To address these challenges, we leverage recent advances in latent diffusion models (LDMs), which excel at modeling complex data distributions by utilizing their powerful priors for fine-tuning to a target domain. In this paper, we introduce a novel framework for virtual multiplex staining that utilizes pretrained LDM parameters to generate multiplex images from H&E images using a conditional diffusion model. Our approach enables marker-by-marker generation by conditioning the diffusion model on each marker, while sharing the same architecture across all markers. To tackle the challenge of varying pixel value distributions across different marker stains and to improve inference speed, we fine-tune the model for single-step sampling, enhancing both color contrast fidelity and inference efficiency through pixel-level loss functions. We validate our framework on two publicly available datasets, notably demonstrating its effectiveness in generating up to 18 different marker types with improved accuracy, a substantial increase over the 2-3 marker types achieved in previous approaches. This validation highlights the potential of our framework, pioneering virtual multiplex staining. Finally, this paper bridges the gap between H&E and multiplex imaging, potentially enabling retrospective studies and large-scale analyses of existing H&E image repositories.

preprint2022arXiv

Discrete Cosine Transform Network for Guided Depth Map Super-Resolution

Guided depth super-resolution (GDSR) is an essential topic in multi-modal image processing, which reconstructs high-resolution (HR) depth maps from low-resolution ones collected with suboptimal conditions with the help of HR RGB images of the same scene. To solve the challenges in interpreting the working mechanism, extracting cross-modal features and RGB texture over-transferred, we propose a novel Discrete Cosine Transform Network (DCTNet) to alleviate the problems from three aspects. First, the Discrete Cosine Transform (DCT) module reconstructs the multi-channel HR depth features by using DCT to solve the channel-wise optimization problem derived from the image domain. Second, we introduce a semi-coupled feature extraction module that uses shared convolutional kernels to extract common information and private kernels to extract modality-specific information. Third, we employ an edge attention mechanism to highlight the contours informative for guided upsampling. Extensive quantitative and qualitative evaluations demonstrate the effectiveness of our DCTNet, which outperforms previous state-of-the-art methods with a relatively small number of parameters. The code is available at \url{https://github.com/Zhaozixiang1228/GDSR-DCTNet}.

preprint2022arXiv

Instance Segmentation of Unlabeled Modalities via Cyclic Segmentation GAN

Instance segmentation for unlabeled imaging modalities is a challenging but essential task as collecting expert annotation can be expensive and time-consuming. Existing works segment a new modality by either deploying a pre-trained model optimized on diverse training data or conducting domain translation and image segmentation as two independent steps. In this work, we propose a novel Cyclic Segmentation Generative Adversarial Network (CySGAN) that conducts image translation and instance segmentation jointly using a unified framework. Besides the CycleGAN losses for image translation and supervised losses for the annotated source domain, we introduce additional self-supervised and segmentation-based adversarial objectives to improve the model performance by leveraging unlabeled target domain images. We benchmark our approach on the task of 3D neuronal nuclei segmentation with annotated electron microscopy (EM) images and unlabeled expansion microscopy (ExM) data. Our CySGAN outperforms both pretrained generalist models and the baselines that sequentially conduct image translation and segmentation. Our implementation and the newly collected, densely annotated ExM nuclei dataset, named NucExM, are available at https://connectomics-bazaar.github.io/proj/CySGAN/index.html.

preprint2022arXiv

Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models

State-of-the-art neural language models can now be used to solve ad-hoc language tasks through zero-shot prompting without the need for supervised training. This approach has gained popularity in recent years, and researchers have demonstrated prompts that achieve strong accuracy on specific NLP tasks. However, finding a prompt for new tasks requires experimentation. Different prompt templates with different wording choices lead to significant accuracy differences. PromptIDE allows users to experiment with prompt variations, visualize prompt performance, and iteratively optimize prompts. We developed a workflow that allows users to first focus on model feedback using small data before moving on to a large data regime that allows empirical grounding of promising prompts using quantitative measures of the task. The tool then allows easy deployment of the newly created ad-hoc models. We demonstrate the utility of PromptIDE (demo at http://prompt.vizhub.ai) and our workflow using several real-world use cases.

preprint2022arXiv

Learning to Generate Realistic Noisy Images via Pixel-level Noise-aware Adversarial Training

Existing deep learning real denoising methods require a large amount of noisy-clean image pairs for supervision. Nonetheless, capturing a real noisy-clean dataset is an unacceptable expensive and cumbersome procedure. To alleviate this problem, this work investigates how to generate realistic noisy images. Firstly, we formulate a simple yet reasonable noise model that treats each real noisy pixel as a random variable. This model splits the noisy image generation problem into two sub-problems: image domain alignment and noise domain alignment. Subsequently, we propose a novel framework, namely Pixel-level Noise-aware Generative Adversarial Network (PNGAN). PNGAN employs a pre-trained real denoiser to map the fake and real noisy images into a nearly noise-free solution space to perform image domain alignment. Simultaneously, PNGAN establishes a pixel-level adversarial training to conduct noise domain alignment. Additionally, for better noise fitting, we present an efficient architecture Simple Multi-scale Network (SMNet) as the generator. Qualitative validation shows that noise generated by PNGAN is highly similar to real noise in terms of intensity and distribution. Quantitative experiments demonstrate that a series of denoisers trained with the generated noisy images achieve state-of-the-art (SOTA) results on four real denoising benchmarks. Part of codes, pre-trained models, and results are available at https://github.com/caiyuanhao1998/PNGAN for comparisons.

preprint2022arXiv

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

Existing leading methods for spectral reconstruction (SR) focus on designing deeper or wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB image to its hyperspectral image (HSI). These CNN-based methods achieve impressive restoration performance while showing limitations in capturing the long-range dependencies and self-similarity prior. To cope with this problem, we propose a novel Transformer-based method, Multi-stage Spectral-wise Transformer (MST++), for efficient spectral reconstruction. In particular, we employ Spectral-wise Multi-head Self-attention (S-MSA) that is based on the HSI spatially sparse while spectrally self-similar nature to compose the basic unit, Spectral-wise Attention Block (SAB). Then SABs build up Single-stage Spectral-wise Transformer (SST) that exploits a U-shaped structure to extract multi-resolution contextual information. Finally, our MST++, cascaded by several SSTs, progressively improves the reconstruction quality from coarse to fine. Comprehensive experiments show that our MST++ significantly outperforms other state-of-the-art methods. In the NTIRE 2022 Spectral Reconstruction Challenge, our approach won the First place. Code and pre-trained models are publicly available at https://github.com/caiyuanhao1998/MST-plus-plus.

preprint2022arXiv

Revisiting RCAN: Improved Training for Image Super-Resolution

Image super-resolution (SR) is a fast-moving field with novel architectures attracting the spotlight. However, most SR models were optimized with dated training strategies. In this work, we revisit the popular RCAN model and examine the effect of different training options in SR. Surprisingly (or perhaps as expected), we show that RCAN can outperform or match nearly all the CNN-based SR architectures published after RCAN on standard benchmarks with a proper training strategy and minimal architecture change. Besides, although RCAN is a very large SR architecture with more than four hundred convolutional layers, we draw a notable conclusion that underfitting is still the main problem restricting the model capability instead of overfitting. We observe supportive evidence that increasing training iterations clearly improves the model performance while applying regularization techniques generally degrades the predictions. We denote our simply revised RCAN as RCAN-it and recommend practitioners to use it as baselines for future research. Code is publicly available at https://github.com/zudi-lin/rcan-it.

preprint2022arXiv

The Pattern is in the Details: An Evaluation of Interaction Techniques for Locating, Searching, and Contextualizing Details in Multivariate Matrix Visualizations

Matrix visualizations are widely used to display large-scale network, tabular, set, or sequential data. They typically only encode a single value per cell, e.g., through color. However, this can greatly limit the visualizations' utility when exploring multivariate data, where each cell represents a data point with multiple values (referred to as details). Three well-established interaction approaches can be applicable in multivariate matrix visualizations (or MMV): focus+context, pan&zoom, and overview+detail. However, there is little empirical knowledge of how these approaches compare in exploring MMV. We report on two studies comparing them for locating, searching, and contextualizing details in MMV. We first compared four focus+context techniques and found that the fisheye lens overall outperformed the others. We then compared the fisheye lens, to pan&zoom and overview+detail. We found that pan&zoom was faster in locating and searching details, and as good as overview+detail in contextualizing details.

preprint2022arXiv

The Quest for Omnioculars: Embedded Visualization for Augmenting Basketball Game Viewing Experiences

Sports game data is becoming increasingly complex, often consisting of multivariate data such as player performance stats, historical team records, and athletes' positional tracking information. While numerous visual analytics systems have been developed for sports analysts to derive insights, few tools target fans to improve their understanding and engagement of sports data during live games. By presenting extra data in the actual game views, embedded visualization has the potential to enhance fans' game-viewing experience. However, little is known about how to design such kinds of visualizations embedded into live games. In this work, we present a user-centered design study of developing interactive embedded visualizations for basketball fans to improve their live game-watching experiences. We first conducted a formative study to characterize basketball fans' in-game analysis behaviors and tasks. Based on our findings, we propose a design framework to inform the design of embedded visualizations based on specific data-seeking contexts. Following the design framework, we present five novel embedded visualization designs targeting five representative contexts identified by the fans, including shooting, offense, defense, player evaluation, and team comparison. We then developed Omnioculars, an interactive basketball game-viewing prototype that features the proposed embedded visualizations for fans' in-game data analysis. We evaluated Omnioculars in a simulated basketball game with basketball fans. The study results suggest that our design supports personalized in-game data analysis and enhances game understanding and engagement.

preprint2022arXiv

Three approaches to facilitate DNN generalization to objects in out-of-distribution orientations and illuminations

The training data distribution is often biased towards objects in certain orientations and illumination conditions. While humans have a remarkable capability of recognizing objects in out-of-distribution (OoD) orientations and illuminations, Deep Neural Networks (DNNs) severely suffer in this case, even when large amounts of training examples are available. In this paper, we investigate three different approaches to improve DNNs in recognizing objects in OoD orientations and illuminations. Namely, these are (i) training much longer after convergence of the in-distribution (InD) validation accuracy, i.e., late-stopping, (ii) tuning the momentum parameter of the batch normalization layers, and (iii) enforcing invariance of the neural activity in an intermediate layer to orientation and illumination conditions. Each of these approaches substantially improves the DNN's OoD accuracy (more than 20% in some cases). We report results in four datasets: two datasets are modified from the MNIST and iLab datasets, and the other two are novel (one of 3D rendered cars and another of objects taken from various controlled orientations and illumination conditions). These datasets allow to study the effects of different amounts of bias and are challenging as DNNs perform poorly in OoD conditions. Finally, we demonstrate that even though the three approaches focus on different aspects of DNNs, they all tend to lead to the same underlying neural mechanism to enable OoD accuracy gains --individual neurons in the intermediate layers become more selective to a category and also invariant to OoD orientations and illuminations. We anticipate this study to be a basis for further improvement of deep neural networks' OoD generalization performance, which is highly demanded to achieve safe and fair AI applications.

preprint2021arXiv

Ask Me or Tell Me? Enhancing the Effectiveness of Crowdsourced Design Feedback

Crowdsourced design feedback systems are emerging resources for getting large amounts of feedback in a short period of time. Traditionally, the feedback comes in the form of a declarative statement, which often contains positive or negative sentiment. Prior research has shown that overly negative or positive sentiment can strongly influence the perceived usefulness and acceptance of feedback and, subsequently, lead to ineffective design revisions. To enhance the effectiveness of crowdsourced design feedback, we investigate a new approach for mitigating the effects of negative or positive feedback by combining open-ended and thought-provoking questions with declarative feedback statements. We conducted two user studies to assess the effects of question-based feedback on the sentiment and quality of design revisions in the context of graphic design. We found that crowdsourced question-based feedback contains more neutral sentiment than statement-based feedback. Moreover, we provide evidence that presenting feedback as questions followed by statements leads to better design revisions than question- or statement-based feedback alone.

preprint2021arXiv

Consistent Recurrent Neural Networks for 3D Neuron Segmentation

We present a recurrent network for the 3D reconstruction of neurons that sequentially generates binary masks for every object in an image with spatio-temporal consistency. Our network models consistency in two parts: (i) local, which allows exploring non-occluding and temporally-adjacent object relationships with bi-directional recurrence. (ii) non-local, which allows exploring long-range object relationships in the temporal domain with skip connections. Our proposed network is end-to-end trainable from an input image to a sequence of object masks, and, compared to methods relying on object boundaries, its output does not require post-processing. We evaluate our method on three benchmarks for neuron segmentation and achieved state-of-the-art performance on the SNEMI3D challenge.

preprint2020arXiv

A Generic Framework and Library for Exploration of Small Multiples through Interactive Piling

Small multiples are miniature representations of visual information used generically across many domains. Handling large numbers of small multiples imposes challenges on many analytic tasks like inspection, comparison, navigation, or annotation. To address these challenges, we developed a framework and implemented a library called Piling.js for designing interactive piling interfaces. Based on the piling metaphor, such interfaces afford flexible organization, exploration, and comparison of large numbers of small multiples by interactively aggregating visual objects into piles. Based on a systematic analysis of previous work, we present a structured design space to guide the design of visual piling interfaces. To enable designers to efficiently build their own visual piling interfaces, Piling.js provides a declarative interface to avoid having to write low-level code and implements common aspects of the design space. An accompanying GUI additionally supports the dynamic configuration of the piling interface. We demonstrate the expressiveness of Piling.js with examples from machine learning, immunofluorescence microscopy, genomics, and public health.

preprint2020arXiv

A New Age of Computing and the Brain

The history of computer science and brain sciences are intertwined. In his unfinished manuscript "The Computer and the Brain," von Neumann debates whether or not the brain can be thought of as a computing machine and identifies some of the similarities and differences between natural and artificial computation. Turing, in his 1950 article in Mind, argues that computing devices could ultimately emulate intelligence, leading to his proposed Turing test. Herbert Simon predicted in 1957 that most psychological theories would take the form of a computer program. In 1976, David Marr proposed that the function of the visual system could be abstracted and studied at computational and algorithmic levels that did not depend on the underlying physical substrate. In December 2014, a two-day workshop supported by the Computing Community Consortium (CCC) and the National Science Foundation's Computer and Information Science and Engineering Directorate (NSF CISE) was convened in Washington, DC, with the goal of bringing together computer scientists and brain researchers to explore these new opportunities and connections, and develop a new, modern dialogue between the two research communities. Specifically, our objectives were: 1. To articulate a conceptual framework for research at the interface of brain sciences and computing and to identify key problems in this interface, presented in a way that will attract both CISE and brain researchers into this space. 2. To inform and excite researchers within the CISE research community about brain research opportunities and to identify and explain strategic roles they can play in advancing this initiative. 3. To develop new connections, conversations and collaborations between brain sciences and CISE researchers that will lead to highly relevant and competitive proposals, high-impact research, and influential publications.

preprint2020arXiv

A Topological Nomenclature for 3D Shape Analysis in Connectomics

One of the essential tasks in connectomics is the morphology analysis of neurons and organelles like mitochondria to shed light on their biological properties. However, these biological objects often have tangled parts or complex branching patterns, which make it hard to abstract, categorize, and manipulate their morphology. In this paper, we develop a novel topological nomenclature system to name these objects like the appellation for chemical compounds to promote neuroscience analysis based on their skeletal structures. We first convert the volumetric representation into the topology-preserving reduced graph to untangle the objects. Next, we develop nomenclature rules for pyramidal neurons and mitochondria from the reduced graph and finally learn the feature embedding for shape manipulation. In ablation studies, we quantitatively show that graphs generated by our proposed method align with the perception of experts. On 3D shape retrieval and decomposition tasks, we qualitatively demonstrate that the encoded topological nomenclature features achieve better results than state-of-the-art shape descriptors. To advance neuroscience, we will release a 3D segmentation dataset of mitochondria and pyramidal neurons reconstructed from a 100um cube electron microscopy volume with their reduced graph and topological nomenclature annotations. Code is publicly available at https://github.com/donglaiw/ibexHelper.

preprint2020arXiv

Automated Measurements of Key Morphological Features of Human Embryos for IVF

A major challenge in clinical In-Vitro Fertilization (IVF) is selecting the highest quality embryo to transfer to the patient in the hopes of achieving a pregnancy. Time-lapse microscopy provides clinicians with a wealth of information for selecting embryos. However, the resulting movies of embryos are currently analyzed manually, which is time consuming and subjective. Here, we automate feature extraction of time-lapse microscopy of human embryos with a machine-learning pipeline of five convolutional neural networks (CNNs). Our pipeline consists of (1) semantic segmentation of the regions of the embryo, (2) regression predictions of fragment severity, (3) classification of the developmental stage, and object instance segmentation of (4) cells and (5) pronuclei. Our approach greatly speeds up the measurement of quantitative, biologically relevant features that may aid in embryo selection.

preprint2020arXiv

Fast Mitochondria Detection for Connectomics

High-resolution connectomics data allows for the identification of dysfunctional mitochondria which are linked to a variety of diseases such as autism or bipolar. However, manual analysis is not feasible since datasets can be petabytes in size. We present a fully automatic mitochondria detector based on a modified U-Net architecture that yields high accuracy and fast processing times. We evaluate our method on multiple real-world connectomics datasets, including an improved version of the EPFL mitochondria benchmark. Our results show an Jaccard index of up to 0.90 with inference times lower than 16ms for a 512x512px image tile. This speed is faster than the acquisition speed of modern electron microscopes, enabling mitochondria detection in real-time. Our detector ranks first for real-time detection when compared to previous works and data, results, and code are openly available.

preprint2020arXiv

Monocular Reconstruction of Neural Face Reflectance Fields

The reflectance field of a face describes the reflectance properties responsible for complex lighting effects including diffuse, specular, inter-reflection and self shadowing. Most existing methods for estimating the face reflectance from a monocular image assume faces to be diffuse with very few approaches adding a specular component. This still leaves out important perceptual aspects of reflectance as higher-order global illumination effects and self-shadowing are not modeled. We present a new neural representation for face reflectance where we can estimate all components of the reflectance responsible for the final appearance from a single monocular image. Instead of modeling each component of the reflectance separately using parametric models, our neural representation allows us to generate a basis set of faces in a geometric deformation-invariant space, parameterized by the input light direction, viewpoint and face geometry. We learn to reconstruct this reflectance field of a face just from a monocular image, which can be used to render the face from any viewpoint in any light condition. Our method is trained on a light-stage training dataset, which captures 300 people illuminated with 150 light conditions from 8 viewpoints. We show that our method outperforms existing monocular reflectance reconstruction methods, in terms of photorealism due to better capturing of physical premitives, such as sub-surface scattering, specularities, self-shadows and other higher-order effects.

preprint2018arXiv

Synthetically Trained Icon Proposals for Parsing and Summarizing Infographics

Widely used in news, business, and educational media, infographics are handcrafted to effectively communicate messages about complex and often abstract topics including `ways to conserve the environment' and `understanding the financial crisis'. Composed of stylistically and semantically diverse visual and textual elements, infographics pose new challenges for computer vision. While automatic text extraction works well on infographics, computer vision approaches trained on natural images fail to identify the stand-alone visual elements in infographics, or `icons'. To bridge this representation gap, we propose a synthetic data generation strategy: we augment background patches in infographics from our Visually29K dataset with Internet-scraped icons which we use as training data for an icon proposal mechanism. On a test set of 1K annotated infographics, icons are located with 38% precision and 34% recall (the best model trained with natural images achieves 14% precision and 7% recall). Combining our icon proposals with icon classification and text extraction, we present a multi-modal summarization application. Our application takes an infographic as input and automatically produces text tags and visual hashtags that are textually and visually representative of the infographic's topics respectively.

preprint2016arXiv

Context-guided diffusion for label propagation on graphs

Existing approaches for diffusion on graphs, e.g., for label propagation, are mainly focused on isotropic diffusion, which is induced by the commonly-used graph Laplacian regularizer. Inspired by the success of diffusivity tensors for anisotropic diffusion in image processing, we presents anisotropic diffusion on graphs and the corresponding label propagation algorithm. We develop positive definite diffusivity operators on the vector bundles of Riemannian manifolds, and discretize them to diffusivity operators on graphs. This enables us to easily define new robust diffusivity operators which significantly improve semi-supervised learning performance over existing diffusion algorithms.

preprint2016arXiv

Icon: An Interactive Approach to Train Deep Neural Networks for Segmentation of Neuronal Structures

We present an interactive approach to train a deep neural network pixel classifier for the segmentation of neuronal structures. An interactive training scheme reduces the extremely tedious manual annotation task that is typically required for deep networks to perform well on image segmentation problems. Our proposed method employs a feedback loop that captures sparse annotations using a graphical user interface, trains a deep neural network based on recent and past annotations, and displays the prediction output to users in almost real-time. Our implementation of the algorithm also allows multiple users to provide annotations in parallel and receive feedback from the same classifier. Quick feedback on classifier performance in an interactive setting enables users to identify and label examples that are more important than others for segmentation purposes. Our experiments show that an interactively-trained pixel classifier produces better region segmentation results on Electron Microscopy (EM) images than those generated by a network of the same architecture trained offline on exhaustive ground-truth labels.

preprint2016arXiv

Local High-order Regularization on Data Manifolds

The common graph Laplacian regularizer is well-established in semi-supervised learning and spectral dimensionality reduction. However, as a first-order regularizer, it can lead to degenerate functions in high-dimensional manifolds. The iterated graph Laplacian enables high-order regularization, but it has a high computational complexity and so cannot be applied to large problems. We introduce a new regularizer which is globally high order and so does not suffer from the degeneracy of the graph Laplacian regularizer, but is also sparse for efficient computation in semi-supervised learning applications. We reduce computational complexity by building a local first-order approximation of the manifold as a surrogate geometry, and construct our high-order regularizer based on local derivative evaluations therein. Experiments on human body shape and pose analysis demonstrate the effectiveness and efficiency of our method.

preprint2016arXiv

RhoanaNet Pipeline: Dense Automatic Neural Annotation

Reconstructing a synaptic wiring diagram, or connectome, from electron microscopy (EM) images of brain tissue currently requires many hours of manual annotation or proofreading (Kasthuri and Lichtman, 2010; Lichtman and Sanes, 2008; Seung, 2009). The desire to reconstruct ever larger and more complex networks has pushed the collection of ever larger EM datasets. A cubic millimeter of raw imaging data would take up 1 PB of storage and present an annotation project that would be impractical without relying heavily on automatic segmentation methods. The RhoanaNet image processing pipeline was developed to automatically segment large volumes of EM data and ease the burden of manual proofreading and annotation. Based on (Kaynig et al., 2015), we updated every stage of the software pipeline to provide better throughput performance and higher quality segmentation results. We used state of the art deep learning techniques to generate improved membrane probability maps, and Gala (Nunez-Iglesias et al., 2014) was used to agglomerate 2D segments into 3D objects. We applied the RhoanaNet pipeline to four densely annotated EM datasets, two from mouse cortex, one from cerebellum and one from mouse lateral geniculate nucleus (LGN). All training and test data is made available for benchmark comparisons. The best segmentation results obtained gave $V^\text{Info}_\text{F-score}$ scores of 0.9054 and 09182 for the cortex datasets, 0.9438 for LGN, and 0.9150 for Cerebellum. The RhoanaNet pipeline is open source software. All source code, training data, test data, and annotations for all four benchmark datasets are available at www.rhoana.org.

preprint2016arXiv

Semi-supervised Learning with Explicit Relationship Regularization

In many learning tasks, the structure of the target space of a function holds rich information about the relationships between evaluations of functions on different data points. Existing approaches attempt to exploit this relationship information implicitly by enforcing smoothness on function evaluations only. However, what happens if we explicitly regularize the relationships between function evaluations? Inspired by homophily, we regularize based on a smooth relationship function, either defined from the data or with labels. In experiments, we demonstrate that this significantly improves the performance of state-of-the-art algorithms in semi-supervised classification and in spectral data embedding for constrained clustering and dimensionality reduction.

preprint2015arXiv

VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer vision at Large Scale

An open challenge problem at the forefront of modern neuroscience is to obtain a comprehensive mapping of the neural pathways that underlie human brain function; an enhanced understanding of the wiring diagram of the brain promises to lead to new breakthroughs in diagnosing and treating neurological disorders. Inferring brain structure from image data, such as that obtained via electron microscopy (EM), entails solving the problem of identifying biological structures in large data volumes. Synapses, which are a key communication structure in the brain, are particularly difficult to detect due to their small size and limited contrast. Prior work in automated synapse detection has relied upon time-intensive biological preparations (post-staining, isotropic slice thicknesses) in order to simplify the problem. This paper presents VESICLE, the first known approach designed for mammalian synapse detection in anisotropic, non-post-stained data. Our methods explicitly leverage biological context, and the results exceed existing synapse detection methods in terms of accuracy and scalability. We provide two different approaches - one a deep learning classifier (VESICLE-CNN) and one a lightweight Random Forest approach (VESICLE-RF) to offer alternatives in the performance-scalability space. Addressing this synapse detection challenge enables the analysis of high-throughput imaging data soon expected to reach petabytes of data, and provide tools for more rapid estimation of brain-graphs. Finally, to facilitate community efforts, we developed tools for large-scale object detection, and demonstrated this framework to find $\approx$ 50,000 synapses in 60,000 $μm ^3$ (220 GB on disk) of electron microscopy data.

preprint2013arXiv

Large-Scale Automatic Reconstruction of Neuronal Processes from Electron Microscopy Images

Automated sample preparation and electron microscopy enables acquisition of very large image data sets. These technical advances are of special importance to the field of neuroanatomy, as 3D reconstructions of neuronal processes at the nm scale can provide new insight into the fine grained structure of the brain. Segmentation of large-scale electron microscopy data is the main bottleneck in the analysis of these data sets. In this paper we present a pipeline that provides state-of-the art reconstruction performance while scaling to data sets in the GB-TB range. First, we train a random forest classifier on interactive sparse user annotations. The classifier output is combined with an anisotropic smoothing prior in a Conditional Random Field framework to generate multiple segmentation hypotheses per image. These segmentations are then combined into geometrically consistent 3D objects by segmentation fusion. We provide qualitative and quantitative evaluation of the automatic segmentation and demonstrate large-scale 3D reconstructions of neuronal processes from a $\mathbf{27,000}$ $\mathbf{μm^3}$ volume of brain tissue over a cube of $\mathbf{30 \; μm}$ in each dimension corresponding to 1000 consecutive image sections. We also introduce Mojo, a proofreading tool including semi-automated correction of merge errors based on sparse user scribbles.

preprint2012arXiv

Verifiable Computation with Massively Parallel Interactive Proofs

As the cloud computing paradigm has gained prominence, the need for verifiable computation has grown increasingly urgent. The concept of verifiable computation enables a weak client to outsource difficult computations to a powerful, but untrusted, server. Protocols for verifiable computation aim to provide the client with a guarantee that the server performed the requested computations correctly, without requiring the client to perform the computations herself. By design, these protocols impose a minimal computational burden on the client. However, existing protocols require the server to perform a large amount of extra bookkeeping in order to enable a client to easily verify the results. Verifiable computation has thus remained a theoretical curiosity, and protocols for it have not been implemented in real cloud computing systems. Our goal is to leverage GPUs to reduce the server-side slowdown for verifiable computation. To this end, we identify abundant data parallelism in a state-of-the-art general-purpose protocol for verifiable computation, originally due to Goldwasser, Kalai, and Rothblum, and recently extended by Cormode, Mitzenmacher, and Thaler. We implement this protocol on the GPU, obtaining 40-120x server-side speedups relative to a state-of-the-art sequential implementation. For benchmark problems, our implementation reduces the slowdown of the server to factors of 100-500x relative to the original computations requested by the client. Furthermore, we reduce the already small runtime of the client by 100x. Similarly, we obtain 20-50x server-side and client-side speedups for related protocols targeted at specific streaming problems. We believe our results demonstrate the immediate practicality of using GPUs for verifiable computation, and more generally that protocols for verifiable computation have become sufficiently mature to deploy in real cloud computing systems.

preprint2012arXiv

Visualization in Connectomics

Connectomics is a field of neuroscience that analyzes neuronal connections. A connectome is a complete map of a neuronal system, comprising all neuronal connections between its structures. The term "connectome" is close to the word "genome" and implies completeness of all neuronal connections, in the same way as a genome is a complete listing of all nucleotide sequences. The goal of connectomics is to create a complete representation of the brain's wiring. Such a representation is believed to increase our understanding of how functional brain states emerge from their underlying anatomical structure. Furthermore, it can provide important information for the cure of neuronal dysfunctions like schizophrenia or autism. In this paper, we review the current state-of-the-art of visualization and image processing techniques in the field of connectomics and describe some remaining challenges.

preprint2010arXiv

The Murchison Widefield Array

It is shown that the excellent Murchison Radio-astronomy Observatory site allows the Murchison Widefield Array to employ a simple RFI blanking scheme and still calibrate visibilities and form images in the FM radio band. The techniques described are running autonomously in our calibration and imaging software, which is currently being used to process an FM-band survey of the entire southern sky.

Hanspeter Pfister

What is connected

Connect this record

See the researcher in context

Building this map preview

29 published item(s)

Virtual Multiplex Staining for Histological Images using a Marker-wise Conditioned Diffusion Model

Discrete Cosine Transform Network for Guided Depth Map Super-Resolution

Instance Segmentation of Unlabeled Modalities via Cyclic Segmentation GAN

Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models

Learning to Generate Realistic Noisy Images via Pixel-level Noise-aware Adversarial Training

MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction

Revisiting RCAN: Improved Training for Image Super-Resolution

The Pattern is in the Details: An Evaluation of Interaction Techniques for Locating, Searching, and Contextualizing Details in Multivariate Matrix Visualizations

The Quest for Omnioculars: Embedded Visualization for Augmenting Basketball Game Viewing Experiences

Three approaches to facilitate DNN generalization to objects in out-of-distribution orientations and illuminations

Ask Me or Tell Me? Enhancing the Effectiveness of Crowdsourced Design Feedback

Consistent Recurrent Neural Networks for 3D Neuron Segmentation

A Generic Framework and Library for Exploration of Small Multiples through Interactive Piling

A New Age of Computing and the Brain

A Topological Nomenclature for 3D Shape Analysis in Connectomics

Automated Measurements of Key Morphological Features of Human Embryos for IVF

Fast Mitochondria Detection for Connectomics

Monocular Reconstruction of Neural Face Reflectance Fields

Synthetically Trained Icon Proposals for Parsing and Summarizing Infographics

Context-guided diffusion for label propagation on graphs

Icon: An Interactive Approach to Train Deep Neural Networks for Segmentation of Neuronal Structures

Local High-order Regularization on Data Manifolds

RhoanaNet Pipeline: Dense Automatic Neural Annotation

Semi-supervised Learning with Explicit Relationship Regularization

VESICLE: Volumetric Evaluation of Synaptic Interfaces using Computer vision at Large Scale

Large-Scale Automatic Reconstruction of Neuronal Processes from Electron Microscopy Images

Verifiable Computation with Massively Parallel Interactive Proofs

Visualization in Connectomics

The Murchison Widefield Array