Source author record

Conrado Martínez

Conrado Martínez appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

math.PR Data Structures and Algorithms

Catalog footprint

What is connected

3works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2015arXiv

Analysis of Pivot Sampling in Dual-Pivot Quicksort

The new dual-pivot Quicksort by Vladimir Yaroslavskiy - used in Oracle's Java runtime library since version 7 - features intriguing asymmetries. They make a basic variant of this algorithm use less comparisons than classic single-pivot Quicksort. In this paper, we extend the analysis to the case where the two pivots are chosen as fixed order statistics of a random sample. Surprisingly, dual-pivot Quicksort then needs more comparisons than a corresponding version of classic Quicksort, so it is clear that counting comparisons is not sufficient to explain the running time advantages observed for Yaroslavskiy's algorithm in practice. Consequently, we take a more holistic approach and give also the precise leading term of the average number of swaps, the number of executed Java Bytecode instructions and the number of scanned elements, a new simple cost measure that approximates I/O costs in the memory hierarchy. We determine optimal order statistics for each of the cost measures. It turns out that the asymmetries in Yaroslavskiy's algorithm render pivots with a systematic skew more efficient than the symmetric choice. Moreover, we finally have a convincing explanation for the success of Yaroslavskiy's algorithm in practice: Compared with corresponding versions of classic single-pivot Quicksort, dual-pivot Quicksort needs significantly less I/Os, both with and without pivot sampling.

preprint2014arXiv

Analysis of Branch Misses in Quicksort

The analysis of algorithms mostly relies on counting classic elementary operations like additions, multiplications, comparisons, swaps etc. This approach is often sufficient to quantify an algorithm's efficiency. In some cases, however, features of modern processor architectures like pipelined execution and memory hierarchies have significant impact on running time and need to be taken into account to get a reliable picture. One such example is Quicksort: It has been demonstrated experimentally that under certain conditions on the hardware the classically optimal balanced choice of the pivot as median of a sample gets harmful. The reason lies in mispredicted branches whose rollback costs become dominating. In this paper, we give the first precise analytical investigation of the influence of pipelining and the resulting branch mispredictions on the efficiency of (classic) Quicksort and Yaroslavskiy's dual-pivot Quicksort as implemented in Oracle's Java 7 library. For the latter it is still not fully understood why experiments prove it 10% faster than a highly engineered implementation of a classic single-pivot version. For different branch prediction strategies, we give precise asymptotics for the expected number of branch misses caused by the aforementioned Quicksort variants when their pivots are chosen from a sample of the input. We conclude that the difference in branch misses is too small to explain the superiority of the dual-pivot algorithm.

preprint2010arXiv

Psi-series method in random trees and moments of high orders

An unusual and surprising expansion of the form \[ p_n = ρ^{-n-1}(6n +\tfrac{18}5+ \tfrac{336}{3125} n^{-5}+\tfrac{1008}{3125} n^{-6} +\text{smaller order terms}), \] as $n\to\infty$, is derived for the probability $p_n$ that two randomly chosen binary search trees are identical (in shape and in labels of all corresponding nodes). A quantity arising in the analysis of phylogenetic trees is also proved to have a similar asymptotic expansion. Our method of proof is new in the literature of discrete probability and analysis of algorithms, and based on the psi-series expansions for nonlinear differential equations. Such an approach is very general and applicable to many other problems involving nonlinear differential equations; many examples are discussed and several attractive phenomena are discovered.