Researcher profile

Kimmo Fredriksson

Kimmo Fredriksson contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 17 - UnverifiedVerification L1Unclaimed author
4works
0followers
3topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

4 published item(s)

preprint2014arXiv

Motif matching using gapped patterns

We present new algorithms for the problem of multiple string matching of gapped patterns, where a gapped pattern is a sequence of strings such that there is a gap of fixed length between each two consecutive strings. The problem has applications in the discovery of transcription factor binding sites in DNA sequences when using generalized versions of the Position Weight Matrix model to describe transcription factor specificities. In these models a motif can be matched as a set of gapped patterns with unit-length keywords. The existing algorithms for matching a set of gapped patterns are worst-case efficient but not practical, or vice versa, in this particular case. The novel algorithms that we present are based on dynamic programming and bit-parallelism, and lie in a middle-ground among the existing algorithms. In fact, their time complexity is close to the best existing bound and, yet, they are also practical. We also provide experimental results which show that the presented algorithms are fast in practice, and preferable if all the strings in the patterns have unit-length.

preprint2013arXiv

Approximate pattern matching with k-mismatches in packed text

Given strings $P$ of length $m$ and $T$ of length $n$ over an alphabet of size $σ$, the string matching with $k$-mismatches problem is to find the positions of all the substrings in $T$ that are at Hamming distance at most $k$ from $P$. If $T$ can be read only one character at the time the best known bounds are $O(n\sqrt{k\log k})$ and $O(n + n\sqrt{k/w}\log k)$ in the word-RAM model with word length $w$. In the RAM models (including $AC^0$ and word-RAM) it is possible to read up to $\floor{w / \log σ}$ characters in constant time if the characters of $T$ are encoded using $\ceil{\log σ}$ bits. The only solution for $k$-mismatches in packed text works in $O((n \logσ/\log n)\ceil{m \log (k + \log n / \logσ) / w} + n^{\varepsilon})$ time, for any $\varepsilon > 0$. We present an algorithm that runs in time $O(\frac{n}{\floor{w/(m\logσ)}} (1 + \log \min(k,σ) \log m / \logσ))$ in the $AC^0$ model if $m=O(w / \logσ)$ and $T$ is given packed. We also describe a simpler variant that runs in time $O(\frac{n}{\floor{w/(m\logσ)}}\log \min(m, \log w / \logσ))$ in the word-RAM model. The algorithms improve the existing bound for $w = Ω(\log^{1+ε}n)$, for any $ε> 0$. Based on the introduced technique, we present algorithms for several other approximate matching problems.

preprint2013arXiv

On a compact encoding of the swap automaton

Given a string $P$ of length $m$ over an alphabet $Σ$ of size $σ$, a swapped version of $P$ is a string derived from $P$ by a series of local swaps, i.e., swaps of adjacent symbols, such that each symbol can participate in at most one swap. We present a theoretical analysis of the nondeterministic finite automaton for the language $\bigcup_{P'\inΠ_P}Σ^*P'$ (swap automaton for short), where $Π_P$ is the set of swapped versions of $P$. Our study is based on the bit-parallel simulation of the same automaton due to Fredriksson, and reveals an interesting combinatorial property that links the automaton to the one for the language $Σ^*P$. By exploiting this property and the method presented by Cantone et al. (2010), we obtain a bit-parallel encoding of the swap automaton which takes $O(σ^2\ceil{k/w})$ space and allows one to simulate the automaton on a string of length $n$ in time $O(n\ceil{k/w})$, where $\ceil{m/σ}\le k\le m$.

preprint2010arXiv

On building minimal automaton for subset matching queries

We address the problem of building an index for a set $D$ of $n$ strings, where each string location is a subset of some finite integer alphabet of size $σ$, so that we can answer efficiently if a given simple query string (where each string location is a single symbol) $p$ occurs in the set. That is, we need to efficiently find a string $d \in D$ such that $p[i] \in d[i]$ for every $i$. We show how to build such index in $O(n^{\log_{σ/Δ}(σ)}\log(n))$ average time, where $Δ$ is the average size of the subsets. Our methods have applications e.g.\ in computational biology (haplotype inference) and music information retrieval.