# Søren Vind

I am a PhD student in the Algorithms, Logic and Graphs Section in the Department of Applied Mathematics and Computer Science at the Technical University of Denmark, from where I also received my BSc and MSc. As part of my BSc/MSc I visited California Institute of Technology as an exchange student and participated in an Erasmus Intensive Programme at Universidad de Alcalá.

My supervisors are Philip Bille and Inge Li Gørtz. I work closely together with fellow PhD students Patrick Hagge Cording and Hjalte Wedel Vildhøj.

### Contact

DTU Compute
Building 322, Office 021
DK-2800 Kongens Lyngby
Denmark

Email: sovi@dtu.dk

## Research

I am associated with the Managed Video as a Service Advanced Technology Foundation project. The goal is to research and develop compressed data structures for indexing massive amounts of meta data that supports efficient queries.

My research interests are within fundamental algorithms and data structure problems, stringology, compression, computational geometry and distributed computing.

### Publications

• Theory of Computing Systems: String Indexing for Patterns with Wildcards with Philip Bille, Inge Li Gørtz, and Hjalte Wedel Vildhøj.
Arxiv version, Abstract, Conference version:
We consider the problem of indexing a string $t$ of length $n$ to report the occurrences of a query pattern $p$ containing $m$ characters and $j$ wildcards. Let $occ$ be the number of occurrences of $p$ in $t$, and $\sigma$ the size of the alphabet. We obtain the following results.
• A linear space index with query time $O(m+\sigma^j \log \log n + occ)$. This significantly improves the previously best known linear space index by Lam et al. [ISAAC 2007], which requires query time $\Theta(jn)$ in the worst case.
• An index with query time $O(m+j+occ)$ using space $O(\sigma^{k^2} n \log^k \log n)$, where $k$ is the maximum number of wildcards allowed in the pattern. This is the first non-trivial bound with this query time.
• A time-space trade-off, generalizing the index by Cole et al. [STOC 2004].
Our results are obtained using a novel combination of well-known and new techniques, which could be of independent interest.
• WADS 2013: Fingerprints in Compressed Strings with Philip Bille, Patrick Hagge Cording, Inge Li Gørtz, Benjamin Sach, and Hjalte Wedel Vildhøj.
Full version (draft), Abstract
The Karp-Rabin fingerprint of a string is a type of hash value that due to its strong properties has been used in many string algorithms. In this paper we show how to construct a data structure for a string $S$ of size $N$ compressed by a context-free grammar of size $n$ that answers fingerprint queries. That is, given indices $i$ and $j$, the answer to a query is the fingerprint of the substring $S[i,j]$. We present the first $O(n)$ space data structures that answer fingerprint queries without decompressing any characters. For Straight Line Programs (SLP) we get $O(\log N)$ query time, and for Linear SLPs (an SLP derivative that captures LZ78 compression and its variations) we get $O(\log \log N)$ query time. Hence, our data structures has the same time and space complexity as for random access in SLPs. We utilize the fingerprint data structures to solve the longest common extension problem in query time $O(\log N\log \ell)$ and $O(\log \ell \log\log \ell + \log\log N)$ for SLPs and Linear SLPs, respectively. Here, $\ell$ denotes the length of the LCE.
• Master's Thesis: String Indexing for Patterns with Wildcards with Hjalte Wedel Vildhøj.
Supervised by Philip Bille and Inge Li Gørtz.
Technical University of Denmark, August 8, 2011.

## Events

Created using JQuery, Markdown and Bootstrap sprinkled with magical bits.