Source author record

Jorge Ortiz

Jorge Ortiz appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher · Unclaimed source record

Machine Learning; Artificial Intelligence; Computer Vision; Systems and Control; Computation and Language; Distributed, Parallel, and Cluster Computing; eess.SY; Multiagent Systems; Performance

Catalog footprint

What is connected

7 works

9 topics

4 close collaborators

preprint · 2026 · arXiv

Explicit Abstention Knobs for Predictable Reliability in Video Question Answering

High-stakes deployment of vision-language models (VLMs) requires selective prediction, where systems abstain when uncertain rather than risk costly errors. We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. Using NExT-QA and Gemini 2.0 Flash, we establish two findings. First, confidence thresholding provides mechanistic control in-distribution. Sweeping threshold epsilon produces smooth risk-coverage tradeoffs, reducing error rates f
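The risk-coverage tradeoff mentioned in this abstract can be illustrated with a minimal sketch. It assumes the system abstains whenever its confidence falls below the threshold; the function name `risk_coverage` and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def risk_coverage(confidences: List[float], correct: List[bool],
                  threshold: float) -> Tuple[float, float]:
    """Return (coverage, risk) when answering only if confidence >= threshold.

    Assumption: abstain below the threshold. The paper's exact rule may differ.
    """
    answered = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    coverage = len(answered) / len(confidences)
    # Risk is the error rate among answered questions (0.0 if nothing is answered).
    risk = sum(1 for ok in answered if not ok) / len(answered) if answered else 0.0
    return coverage, risk


# Hypothetical toy data, not results from the paper.
confidences = [0.95, 0.90, 0.80, 0.60, 0.55, 0.30]
correct = [True, True, False, True, False, False]

for eps in (0.0, 0.5, 0.7, 0.85):
    cov, risk = risk_coverage(confidences, correct, eps)
    print(f"threshold={eps:.2f} coverage={cov:.2f} risk={risk:.2f}")
```

On this toy data, raising the threshold lowers both coverage and the error rate among answered questions, which is the behavior a risk-coverage curve summarizes.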

preprint · 2026 · arXiv

TraceFix: Repairing Agent Coordination Protocols with TLA+ Counterexamples

We present TraceFix, a verification-first pipeline for Large Language Model (LLM) multi-agent coordination. An agent synthesizes a protocol topology as a structured intermediate representation (IR) from a task description, generates PlusCal coordination logic, and iteratively repairs the protocol using counterexamples from the TLA+ model checker (TLC) until verification succeeds. Verified process bodies are compiled into per-agent system prompts and executed under a runtime monitor that rejects out-of-topology coordination operations. On 48 tasks spanning 16 scenario families, all tasks reach full TLC verification; 62.5% pass on the first attempt and none requires more than four repair iterations. State spaces span six orders of magnitude yet verification completes in under 60 s for every task. A 3,456-run runtime comparison shows that topology-monitored execution achieves the highest task completion (89.4% average, 81.5% full) and that runtimes using the verified protocol degrade at roughly half the rate of prompt-only and chat-only baselines when model capability is reduced. A paired ablation under a fixed runtime shows that TLC-verified protocols cut deadlock/livelock (DL/LL) from 31.1% to 14.1%, with the largest separation under fault injection.
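The idea of a runtime monitor that rejects out-of-topology coordination operations can be sketched as follows. This is only an illustration under simplifying assumptions: the topology is modeled as a set of allowed (sender, receiver) pairs, and the class name `TopologyMonitor`, its methods, and the agent names are hypothetical rather than TraceFix's actual interface.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (sender, receiver); a simplified stand-in for a protocol topology


class TopologyMonitor:
    """Accept a send only if its (sender, receiver) pair is in the allowed topology."""

    def __init__(self, allowed: Set[Edge]) -> None:
        self.allowed = set(allowed)
        self.delivered: List[Tuple[str, str, str]] = []
        self.rejected: List[Tuple[str, str, str]] = []

    def send(self, sender: str, receiver: str, message: str) -> bool:
        # Out-of-topology operations are recorded and rejected, never delivered.
        if (sender, receiver) not in self.allowed:
            self.rejected.append((sender, receiver, message))
            return False
        self.delivered.append((sender, receiver, message))
        return True


# Hypothetical three-agent topology: planner -> worker -> reviewer.
monitor = TopologyMonitor({("planner", "worker"), ("worker", "reviewer")})
print(monitor.send("planner", "worker", "task-1"))    # allowed edge
print(monitor.send("reviewer", "planner", "bypass"))  # not in the topology
```

A real monitor would likely also track channels, locks, or message types; the sketch only shows the accept-or-reject decision.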

preprint · 2022 · arXiv

Cadence: A Practical Time-series Partitioning Algorithm for Unlabeled IoT Sensor Streams

Time-series partitioning is an essential step in most machine-learning-driven, sensor-based IoT applications. This paper introduces a sample-efficient, robust time-series segmentation model and algorithm. We show that by learning a representation specifically with the segmentation objective based on maximum mean discrepancy (MMD), our algorithm can robustly detect time-series events across different applications. Our loss function allows us to infer whether consecutive sequences of samples are drawn from the same distribution (null hypothesis) and determines the change-point between pairs that reject the null hypothesis (i.e., come from different distributions). We demonstrate its applicability in a real-world IoT deployment for ambient-sensing-based activity recognition. Moreover, while many works on change-point detection exist in the literature, our model is significantly simpler and can be fully trained in 9-93 seconds on average with little variation in hyperparameters for data across different applications. We empirically evaluate Cadence on four popular change point detection (CPD) datasets, where Cadence matches or outperforms existing CPD techniques.
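The maximum mean discrepancy used as the segmentation objective can be illustrated with a small sketch. It computes the standard biased estimate of squared MMD between two windows with a fixed Gaussian kernel on raw 1-D samples. Cadence instead learns a representation, so this shows only the underlying statistic; the function names, bandwidth, and window values are assumptions.

```python
import math
from typing import Sequence


def rbf(x: float, y: float, bandwidth: float) -> float:
    """Gaussian (RBF) kernel between two scalar samples."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[float], ys: Sequence[float],
                bandwidth: float = 1.0) -> float:
    """Biased estimate of squared MMD between two 1-D samples."""
    def mean_kernel(a: Sequence[float], b: Sequence[float]) -> float:
        return sum(rbf(u, v, bandwidth) for u in a for v in b) / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical sensor windows: a and b look alike, c is shifted.
window_a = [0.0, 0.1, -0.1, 0.05]
window_b = [0.02, -0.05, 0.08, 0.0]
window_c = [3.0, 3.1, 2.9, 3.05]

print("a vs b:", mmd_squared(window_a, window_b))
print("a vs c:", mmd_squared(window_a, window_c))
```

A larger value between consecutive windows suggests they come from different distributions, which is the signal a change-point detector would threshold or test.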

preprint · 2020 · arXiv

SECRET: Semantically Enhanced Classification of Real-world Tasks

Supervised machine learning (ML) algorithms are aimed at maximizing classification performance under available energy and storage constraints. They try to map the training data to the corresponding labels while ensuring generalizability to unseen data. However, they do not integrate meaning-based relationships among labels in the decision process. On the other hand, natural language processing (NLP) algorithms emphasize the importance of semantic information. In this paper, we synthesize the complementary advantages of supervised ML and NLP algorithms into one method that we refer to as SECRET (Semantically Enhanced Classification of REal-world Tasks). SECRET performs classifications by fusing the semantic information of the labels with the available data: it combines the feature space of the supervised algorithms with the semantic space of the NLP algorithms and predicts labels based on this joint space. Experimental results indicate that, compared to traditional supervised learning, SECRET achieves up to 14.0% accuracy and 13.1% F1 score improvements. Moreover, compared to ensemble methods, SECRET achieves up to 12.7% accuracy and 13.3% F1 score improvements. This points to a new research direction for supervised classification based on incorporation of semantic information.

preprint · 2015 · arXiv

Get More With Less: Near Real-Time Image Clustering on Mobile Phones

Machine learning algorithms, in conjunction with user data, hold the promise of revolutionizing the way we interact with our phones, and indeed their widespread adoption in the design of apps bears testimony to this promise. However, currently, the computationally expensive segments of the learning pipeline, such as feature extraction and model training, are offloaded to the cloud, resulting in an over-reliance on the network and under-utilization of computing resources available on mobile platforms. In this paper, we show that by combining the computing power distributed over a number of phones, judicious optimization choices, and contextual information, it is possible to execute the end-to-end pipeline efficiently and entirely on the phones at the edge of the network. We also show that by harnessing the power of this combination, it is possible to execute a computationally expensive pipeline in near real time. To demonstrate our approach, we implement an end-to-end image-processing pipeline -- that includes feature extraction, vocabulary learning, vectorization, and image clustering -- on a set of mobile phones. Our results show a 75% improvement over the standard, full pipeline implementation running on the phones without modification -- reducing the time to one minute under certain conditions. We believe that this result is a promising indication that fully distributed, infrastructure-less computing is possible on networks of mobile phones, enabling a new class of mobile applications that are less reliant on the cloud.

preprint · 2015 · arXiv

Sensor-Type Classification in Buildings

Many sensors/meters are deployed in commercial buildings to monitor and optimize their performance. However, because sensor metadata is inconsistent across buildings, software-based solutions are tightly coupled to the sensor metadata conventions (i.e., schemas and naming) for each building. Running the same software across buildings requires significant integration effort. Metadata normalization is critical for scaling the deployment process and allows us to decouple building-specific conventions from the code written for building applications. It also allows us to deal with missing metadata. One important aspect of normalization is to differentiate sensors by the type of phenomena being observed. In this paper, we propose a general, simple, yet effective classification scheme to differentiate sensors in buildings by type. We perform ensemble learning on data collected from over 2000 sensor streams in two buildings. Our approach is able to achieve more than 92% accuracy for classification within buildings and more than 82% accuracy across buildings. We also introduce a method for identifying potentially misclassified streams. This is important because it allows us to identify opportunities to attain more input from experts -- input that could help improve classification accuracy when ground truth is unavailable. We show that by adjusting a threshold value we are able to identify at least 30% of the misclassified instances.

preprint · 2013 · arXiv

Flexibility of Commercial Building HVAC Fan as Ancillary Service for Smart Grid

In this paper, we model energy use in commercial buildings using empirical data captured through sMAP, a campus building data portal at UC Berkeley. We conduct at-scale experiments in a newly constructed building on campus. By modulating the supply duct static pressure (SDSP) for the main supply air duct, we induce a response on the main supply fan and determine how much ancillary power flexibility can be provided by a typical commercial building. We show that the consequent intermittent fluctuations in the air mass flow into the building do not influence the building climate in a human-noticeable way. We estimate that at least 4 GW of regulation reserve is readily available through commercial buildings alone in the US. Based on predictions, this value will reach 5.6 GW in 2035. We also show how thermal slack can be leveraged to provide an ancillary service to deal with transient frequency fluctuations in the grid. We consider a simplified model of the grid power system with time-varying demand and generation and present a simple control scheme to direct the ancillary service power flow from buildings to improve on the classical automatic generation control (AGC)-based approach. Simulation results are provided to show the effectiveness of the proposed methodology for enhancing grid frequency regulation.
