Source author record

Oktay Goktas

Oktay Goktas appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Artificial Intelligence, Information Retrieval, quant-ph, Software Engineering

Catalog footprint

What is connected

2 works

4 topics

3 close collaborators


Preprint, 2026, arXiv

Improving BM25 Code Retrieval Under Fixed Generic Tokenization: Adaptive q-Log Odds as a Drop-In BM25 Fix

In retrieval-augmented coding, failures often begin when the relevant file is absent from the retrieved context. Under frozen generic tokenization, where a BM25 index has been built by a search system whose analyzer the practitioner does not control, this failure is routine: BM25's logarithmic RSJ-odds IDF under-separates the identifier tail that distinguishes one function from another. We replace the outer logarithm of the Robertson-Spärck-Jones odds with a q-logarithm. At q=1 the transform recovers BM25 exactly by L'Hôpital's rule, and for q<1 it is a Box-Cox transform of the RSJ odds with lambda = 1-q. On CoIR CodeSearchNet Go (182K documents), oracle-tuned NDCG@10 rises from 0.2575 to 0.4874 (absolute +0.2299; +89.3% relative; zero sign reversals in 10,000 paired-bootstrap resamples, reported as p <= 10^-4). The effect is graded across code languages and is near-zero on BEIR text. A one-parameter closed form estimates a corpus-level q from hapax density and stays near q=1 on corpora where BM25 is already optimal. The index-time cost is a single pass over the sparse score matrix and query latency is unchanged. A tokenizer ablation shows that identifier-aware tokenization largely removes the incremental gain from q-IDF.
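
The transform described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the standard Tsallis q-logarithm, ln_q(x) = (x^(1-q) - 1) / (1 - q), and the common 0.5-smoothed RSJ odds, (N - n + 0.5) / (n + 0.5). The function names `q_log`, `rsj_odds` and `q_idf` are hypothetical, and the abstract does not state the exact smoothing constants or the closed form that estimates q from hapax density, so neither is reproduced here.

```python
import math


def q_log(x: float, q: float) -> float:
    """Tsallis q-logarithm (assumed form); equals the natural log at q = 1."""
    if x <= 0:
        raise ValueError("q_log is defined for x > 0")
    if abs(q - 1.0) < 1e-12:
        # Limit of (x**lam - 1) / lam as lam -> 0, by L'Hopital's rule.
        return math.log(x)
    lam = 1.0 - q  # Box-Cox exponent, lambda = 1 - q
    return (x ** lam - 1.0) / lam


def rsj_odds(n_docs: int, doc_freq: int) -> float:
    """Robertson-Sparck-Jones odds with the usual 0.5 smoothing (an assumption)."""
    return (n_docs - doc_freq + 0.5) / (doc_freq + 0.5)


def q_idf(n_docs: int, doc_freq: int, q: float = 1.0) -> float:
    """IDF with the outer logarithm replaced by a q-logarithm."""
    return q_log(rsj_odds(n_docs, doc_freq), q)


if __name__ == "__main__":
    n_docs = 1000
    for df in (1, 10, 100):
        print(df, q_idf(n_docs, df, q=1.0), q_idf(n_docs, df, q=0.5))
```

At q = 1 the function returns the usual BM25 IDF. For q < 1 the power-law transform stretches the gap between very rare and moderately rare terms, which is the separation of the identifier tail that the abstract refers to.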

Preprint, 2020, arXiv

Quantum Amplitude Estimation in the Presence of Noise

Quantum Amplitude Estimation (QAE) -- a technique by which the amplitude of a given quantum state can be estimated with quadratically fewer queries than by standard sampling -- is a key sub-routine in several important quantum algorithms, including Grover search and Quantum Monte-Carlo methods. An obstacle to implementing QAE in near-term noisy intermediate-scale quantum (NISQ) devices has been the need to perform Quantum Phase Estimation (QPE) -- a costly procedure -- as a sub-routine. This impediment was lifted with various QPE-free methods of QAE, wherein Grover queries of varying depths / powers (often according to a "schedule") are followed immediately by measurements and classical post-processing techniques like maximum likelihood estimation (MLE). Existing analyses as to the optimality of various query schedules in these QPE-free QAE schemes have hitherto assumed noise-free systems. In this work, we analyse QPE-free QAE under common noise models that may afflict NISQ devices and report on the optimality of various query schedules in the noisy regime. We demonstrate that, given an accurate noise characterization of one's system, one must choose a schedule that balances the trade-off between the greater ideal performance achieved by higher-depth circuits, and the correspondingly greater accumulation of noise-induced error.
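
The QPE-free scheme described in the abstract can be illustrated with a small classical sketch. It assumes the standard noiseless model in which, for amplitude a = sin^2(theta), measuring after m Grover iterations yields the "good" outcome with probability sin^2((2m + 1) theta). The optional `decay` parameter is a hypothetical stand-in for noise, damping the signal exponentially toward 1/2 with circuit depth; it is not taken from the paper, whose noise models are not specified in the abstract. The grid-search estimator and all function names are likewise illustrative.

```python
import math


def hit_probability(theta: float, m: int, decay: float = 0.0) -> float:
    """Probability of the 'good' outcome after m Grover iterations.

    decay = 0 gives the noiseless sin^2((2m+1) theta); decay > 0 applies a
    hypothetical exponential damping toward 1/2 that grows with depth.
    """
    depth = 2 * m + 1
    ideal = math.sin(depth * theta) ** 2
    damping = math.exp(-decay * depth)
    return 0.5 + damping * (ideal - 0.5)


def log_likelihood(theta, schedule, hits, shots, decay=0.0):
    """Binomial log-likelihood of the observed hit counts over the schedule."""
    total = 0.0
    for m, h in zip(schedule, hits):
        p = hit_probability(theta, m, decay)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total += h * math.log(p) + (shots - h) * math.log(1.0 - p)
    return total


def mle_amplitude(schedule, hits, shots, decay=0.0, grid=20000):
    """Grid-search maximum likelihood estimate of a = sin^2(theta)."""
    step = (math.pi / 2) / grid
    thetas = ((i + 0.5) * step for i in range(grid))
    best = max(
        thetas,
        key=lambda t: log_likelihood(t, schedule, hits, shots, decay),
    )
    return math.sin(best) ** 2


if __name__ == "__main__":
    true_theta = math.asin(math.sqrt(0.3))
    schedule = [0, 1, 2, 4, 8]  # exponentially growing Grover depths
    shots = 1000
    # Idealised counts (expected values), used here instead of random samples.
    hits = [round(shots * hit_probability(true_theta, m)) for m in schedule]
    print(mle_amplitude(schedule, hits, shots))
```

Under the assumed damping, deeper circuits carry more information per shot in the ideal case, but their measured probabilities drift toward 1/2, which is the trade-off between ideal performance and accumulated noise that the abstract describes.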

Institution

Affiliation not imported yet

This author record came from a source that does not expose affiliation metadata. Once the author claims the profile or we enrich the record from another provider, this section will link to the concrete institution.

Topic footprint