Researcher profile

Austin Silveria

Austin Silveria contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
6topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2026arXiv

Search Your Block Floating Point Scales!

Quantization has emerged as a standard technique for accelerating inference for generative models by enabling faster low-precision computations and reduced memory transfers. Recently, GPU accelerators have added first-class support for microscaling Block Floating Point (BFP) formats. Standard BFP algorithms use a fixed scale based on the maximum magnitude of the block. We observe that this scale choice can be suboptimal with respect to quantization errors. In this work, we propose ScaleSearch, an alternative strategy for selecting these scale factors: using a fine-grained search leveraging the mantissa bits in microscaling formats to minimize the quantization error for the given distribution. ScaleSearch can be integrated with existing quantization methods such as Post Training Quantization and low-precision attention, and is shown to improve their performance. Additionally, we introduce ScaleSearchAttention, an accelerated NVFP4-based attention algorithm, which uses ScaleSearch and adapted prior techniques to ensure near-0 performance loss for causal language modeling. Experiments show that ScaleSearch reduces quantization error by 27% for NVFP4 and improves language model PTQ by up to 15 points for MATH500 (Qwen3-8B), while ScaleSearchAttention improves Wikitext-2 PPL by upto 0.77 points for Llama 3.1 70B. The proposed methods closely match baseline performance while providing quantization accuracy improvements.

preprint2022arXiv

Projection: A Mixed-Initiative Research Process

Communication of dense information between humans and machines is relatively low bandwidth. Many modern search and recommender systems operate as machine learning black boxes, giving little insight as to how they represent information or why they take certain actions. We present Projection, a mixed-initiative interface that aims to increase the bandwidth of communication between humans and machines throughout the research process. The interface supports adding context to searches and visualizing information in multiple dimensions with techniques such as hierarchical clustering and spatial projections. Potential customers have shown interest in the application integrating their research outlining and search processes, enabling them to structure their searches in hierarchies, and helping them visualize related spaces of knowledge.

preprint2020arXiv

Exploratory Search with Sentence Embeddings

Exploratory search aims to guide users through a corpus rather than pinpointing exact information. We propose an exploratory search system based on hierarchical clusters and document summaries using sentence embeddings. With sentence embeddings, we represent documents as the mean of their embedded sentences, extract summaries containing sentences close to this document representation and extract keyphrases close to the document representation. To evaluate our search system, we scrape our personal search history over the past year and report our experience with the system. We then discuss motivating use cases of an exploratory search system of this nature and conclude with possible directions of future work.

preprint2020arXiv

GraphQL Live Querying with DynamoDB

We present a method of implementing GraphQL live queries at the database level. Our DynamoDB simulation in Go mimics a distributed key-value store and implements live queries to expose possible pitfalls. Two key components for implementing live queries are storing fields selected in a live query and determining which object fields have been updated in each database write. A stream(key, fields) request to the system contains fields to include in the live query stream and on subsequent put(key, object) operations, the database asynchronously determines which fields were updated and pushes a new query view to the stream if those fields overlap with the stream() request. Following a discussion of our implementation, we explore motivations for using live queries such as simplifying software communication, minimizing data transfer, and enabling real-time data and describe an architecture for building software with GraphQL and live queries.