Researcher profile

Jorge Ortiz

Jorge Ortiz contributes to research discovery and scholarly infrastructure.

Researcher
Affiliation not imported
Open to collaborate

Trust snapshot

Quick read

Trust 19 - Unverified
Verification L1
Unclaimed author
5 works
0 followers
7 topics
4 close collaborators

Published work

5 published items

Preprint, 2026, arXiv

Explicit Abstention Knobs for Predictable Reliability in Video Question Answering

High-stakes deployment of vision-language models (VLMs) requires selective prediction, where systems abstain when uncertain rather than risk costly errors. We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. Using NExT-QA and Gemini 2.0 Flash, we establish two findings. First, confidence thresholding provides mechanistic control in-distribution. Sweeping threshold epsilon produces smooth risk-coverage tradeoffs, reducing error rates f

Preprint, 2026, arXiv

TraceFix: Repairing Agent Coordination Protocols with TLA+ Counterexamples

We present TraceFix, a verification-first pipeline for Large Language Model (LLM) multi-agent coordination. An agent synthesizes a protocol topology as a structured intermediate representation (IR) from a task description, generates PlusCal coordination logic, and iteratively repairs the protocol using counterexamples from the TLA+ model checker (TLC) until verification succeeds. Verified process bodies are compiled into per-agent system prompts and executed under a runtime monitor that rejects out-of-topology coordination operations. On 48 tasks spanning 16 scenario families, all tasks reach full TLC verification; 62.5% pass on the first attempt and none requires more than four repair iterations. State spaces span six orders of magnitude yet verification completes in under 60 s for every task. A 3,456-run runtime comparison shows that topology-monitored execution achieves the highest task completion (89.4% average, 81.5% full) and that runtimes using the verified protocol degrade at roughly half the rate of prompt-only and chat-only baselines when model capability is reduced. A paired ablation under a fixed runtime shows that TLC-verified protocols cut deadlock/livelock (DL/LL) from 31.1% to 14.1%, with the largest separation under fault injection.

Preprint, 2022, arXiv

Cadence: A Practical Time-series Partitioning Algorithm for Unlabeled IoT Sensor Streams

Time-series partitioning is an essential step in most machine-learning-driven, sensor-based IoT applications. This paper introduces a sample-efficient, robust time-series segmentation model and algorithm. We show that by learning a representation specifically with a segmentation objective based on maximum mean discrepancy (MMD), our algorithm can robustly detect time-series events across different applications. Our loss function allows us to infer whether consecutive sequences of samples are drawn from the same distribution (null hypothesis) and determines the change-point between pairs that reject the null hypothesis (i.e., come from different distributions). We demonstrate its applicability in a real-world IoT deployment for ambient-sensing-based activity recognition. Moreover, while many works on change-point detection exist in the literature, our model is significantly simpler and can be fully trained in 9-93 seconds on average with little variation in hyperparameters for data across different applications. We empirically evaluate Cadence on four popular change-point detection (CPD) datasets, where Cadence matches or outperforms existing CPD techniques.
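
To make the MMD idea in this abstract concrete, here is a minimal sketch of a kernel two-sample comparison between consecutive windows. It is not the Cadence algorithm: Cadence learns a representation, whereas this sketch applies a fixed RBF kernel to raw 1-D values. The function names (`rbf`, `mmd2`, `best_split`), the `gamma` and `window` parameters, and the toy signal are all assumptions made for illustration.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel between two scalars (gamma is an assumed bandwidth)."""
    return math.exp(-gamma * (a - b) ** 2)


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two 1-D samples."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy


def best_split(series, window, gamma=1.0):
    """Return the index whose left and right windows differ most under MMD."""
    scores = {
        t: mmd2(series[t - window:t], series[t:t + window], gamma)
        for t in range(window, len(series) - window + 1)
    }
    return max(scores, key=scores.get)


# Toy signal (hypothetical data): the level shifts from about 0 to about 5.
signal = [0.0, 0.1, -0.1, 0.05, 0.0, -0.05, 5.0, 5.1, 4.9, 5.05, 5.0, 4.95]
split = best_split(signal, window=4)
```

A score near zero suggests the two windows come from the same distribution; a large score marks a candidate change point. A real detector would also need a rejection threshold, which this sketch leaves out.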

Preprint, 2020, arXiv

SECRET: Semantically Enhanced Classification of Real-world Tasks

Supervised machine learning (ML) algorithms are aimed at maximizing classification performance under available energy and storage constraints. They try to map the training data to the corresponding labels while ensuring generalizability to unseen data. However, they do not integrate meaning-based relationships among labels in the decision process. On the other hand, natural language processing (NLP) algorithms emphasize the importance of semantic information. In this paper, we synthesize the complementary advantages of supervised ML and NLP algorithms into one method that we refer to as SECRET (Semantically Enhanced Classification of REal-world Tasks). SECRET performs classifications by fusing the semantic information of the labels with the available data: it combines the feature space of the supervised algorithms with the semantic space of the NLP algorithms and predicts labels based on this joint space. Experimental results indicate that, compared to traditional supervised learning, SECRET achieves up to 14.0% accuracy and 13.1% F1 score improvements. Moreover, compared to ensemble methods, SECRET achieves up to 12.7% accuracy and 13.3% F1 score improvements. This points to a new research direction for supervised classification based on incorporation of semantic information.

Preprint, 2013, arXiv

Flexibility of Commercial Building HVAC Fan as Ancillary Service for Smart Grid

In this paper, we model energy use in commercial buildings using empirical data captured through sMAP, a campus building data portal at UC Berkeley. We conduct at-scale experiments in a newly constructed building on campus. By modulating the supply duct static pressure (SDSP) for the main supply air duct, we induce a response on the main supply fan and determine how much ancillary power flexibility can be provided by a typical commercial building. We show that the consequent intermittent fluctuations in the air mass flow into the building do not influence the building climate in a human-noticeable way. We estimate that at least 4 GW of regulation reserve is readily available through commercial buildings alone in the US. Based on predictions, this value will reach 5.6 GW in 2035. We also show how thermal slack can be leveraged to provide an ancillary service to deal with transient frequency fluctuations in the grid. We consider a simplified model of the grid power system with time-varying demand and generation and present a simple control scheme to direct the ancillary service power flow from buildings to improve on the classical automatic generation control (AGC)-based approach. Simulation results are provided to show the effectiveness of the proposed methodology for enhancing grid frequency regulation.