Klim Efremenko
From Irreducible Representations to Locally Decodable Codes.
Electronic Colloquium on Computational Complexity (ECCC) (2011)
[Abstract]
[BiBTeX]
[Paper: PDF]
Locally Decodable Code (LDC) is a code that encodes a message in
a way that one can decode any particular symbol of the message by reading only a constant number of locations,
even if a constant fraction of the encoded message is adversarially corrupted.
In this paper we present a new approach for the construction of LDCs. We show that if there exists an irreducible representation $(\rho, V)$ of $G$ and $q$ elements $g_1,g_2,\ldots, g_q$
in $G$ such that there exists a linear combination of matrices $\rho(g_i)$ that is of rank one,
then we can construct a $q$-query Locally Decodable Code
$C:V-> F^G$.
We show the potential of this approach by constructing constant query LDCs of sub-exponential length matching the parameters of the best known constructions.
@article{Efremenko11,
author = {
Klim Efremenko
},
title = { From Irreducible Representations to Locally Decodable Codes.},
journal = {Electronic Colloquium on Computational Complexity (ECCC)},
year = {2011}
}
Avraham Ben-Aroya, Klim Efremenko and Amnon Ta-Shma
Local List Decoding with a Constant Number of Queries.
51th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2010
[Abstract]
[BiBTeX]
[Paper: PDF]
Recently Efremenko showed locally-decodable codes of sub-exponential
length. That result showed that these codes can handle up to
$\frac{1}{3} $ fraction of errors. In this paper we show that the
same codes can be locally unique-decoded from error rate
$\half-\alpha$ for any $\alpha>0$ and locally list-decoded from
error rate $1-\alpha$ for any $\alpha>0$, with only a constant
number of queries and a constant alphabet size. This gives the first
sub-exponential codes that can be locally list-decoded with a
constant number of queries.
@inproceedings{BET10,
author = {Avraham Ben-Aroya and
Klim Efremenko and
Amnon Ta-Shma},
title = {Local List Decoding with a Constant Number of Queries},
booktitle = {FOCS},
year = {2010},
pages = {715-722},
ee = {http://dx.doi.org/10.1109/FOCS.2010.88}
}
Avraham Ben-Aroya, Klim Efremenko and Amnon Ta-Shma
A Note on Amplifying the Error-Tolerance of Locally Decodable Codes.
Electronic Colloquium on Computational Complexity (ECCC) 17: 134 (2010)
[Abstract]
[BiBTeX]
[Paper: PDF]
Trevisan [Tre03] suggested a transformation that allows amplifying the error rate a
code can handle. We observe that this transformation, that was suggested in the non-local
setting, works also in the local setting and thus gives a generic, simple way to amplify
the error-tolerance of locally decodable codes. Specifically, this shows how to transform a
locally decodable code that can tolerate a constant fraction of errors to a locally decodable
code that can recover from a much higher error-rate, and how to transform such locally
decodable codes to locally list-decodable codes.
The transformation of [Tre03] involves a simple composition with an approximately
locally (list) decodable code. Using a construction of such codes by Impagliazzo et
al. [IJKW10], the transformation incurs only a negligible growth in the length of the code
and in the query complexity.
@article{DBLP:journals/eccc/Ben-AroyaET10a,
author = {Avraham Ben-Aroya and
Klim Efremenko and
Amnon Ta-Shma},
title = {A Note on Amplifying the Error-Tolerance of Locally Decodable
Codes},
journal = {Electronic Colloquium on Computational Complexity (ECCC)},
volume = {17},
year = {2010},
pages = {134},
ee = {http://eccc.hpi-web.de/report/2010/134},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Klim Efremenko
3-Query Locally Decodable Codes of Subexponential Length
The 41st ACM Symposium on Theory of Computing,(STOC) 2009.
[Abstract]
[BiBTeX]
[Paper: PDF]
Locally Decodable Codes (LDC) allow one to decode any particular symbol of the input message by making a constant number of queries to a codeword, even if a constant fraction of the codeword is damaged. In a recent work ~\cite{Yekhanin08} Yekhanin constructs a 3-query LDC with sub-exponential length. However, this construction requires a conjecture that there are infinitely many Mersenne primes. In this paper, we give the first unconditional constant query LDC construction with subexponantial codeword length. In addition, our construction reduces codeword length.
@article{Efremenko09,
author = {Klim Efremenko},
title = {3-Query Locally Decodable Codes of Subexponential Length },
year = {2009},
booktitle= {The 41st ACM Symposium on Theory of Computing}
}
Klim Efremenko and Omer Reingold
How Well Do Random Walks Parallelize?
APPROX-RANDOM, 2009
[Abstract]
[BiBTeX]
[Paper: PDF]
A random walk on a graph is a process that explores the graph in a random way: at each step the walk is at a vertex of the graph, and at each step it moves to a uniformly selected neighbor of this vertex. Random walks are extremely useful in computer science and in other fields. A very natural problem that was recently raised by Alon, Avin, Koucky, Kozma, Lotker, and Tuttle (though it was implicit in several previous papers) is to analyze the behavior of $k$ independent walks in comparison with the behavior of a single walk. In particular, Alon et al. showed that in various settings (e.g., for expander graphs), $k$ random walks cover the graph (i.e., visit all its nodes), $\Omega(k)$-times faster (in expectation) than a single walk. In other words, in such cases $k$ random walks efficiently ``parallelize" a single random walk. Alon et al.\ also demonstrated that, depending on the specific setting, this ``speedup" can vary from logarithmic to exponential in $k$.
In this paper we initiate a more systematic study of multiple random walks. We give lower and upper bounds both on
the cover time {\em and on the hitting time} (the time it takes to hit one specific node) of multiple random walks. Our study revolves over three alternatives for the starting vertices of the random walks: the worst starting vertices (those who maximize the hitting/cover time), the best starting vertices, and starting vertices selected from the stationary distribution. Among our results, we show that the speedup when starting the walks at the worst vertices cannot be too large - the hitting time cannot improve by more than an $O(k)$ factor and the cover time cannot improve by more than $\min\{k \log n,k^2\}$ (where $n$ is the number of vertices). These results should be contrasted with the fact that there was no previously known upper-bound on the speedup and that the speedup can even be {\em exponential} in $k$ for random starting vertices. We further show that for $k$ that is not too large (as a function of various parameters of the graph), the speedup in cover time is $O(k)$ {\em even for walks that start from the best vertices} (those that minimize the cover time). As a rather surprising corollary of our theorems, we obtain a new bound which relates
the cover time $C$ and the mixing time $\mix$ of a graph. Specifically, we show that $C=O(m \sqrt{\mix}\log^2 n)$ (where $m$ is the number of edges).
@inproceedings{DBLP:conf/approx/EfrReingold09,
author = {Klim Efremenko and
Omer Reingold},
title = {How Well Do Random Walks Parallelize?},
booktitle = {APPROX-RANDOM},
year = {2009},
pages = {476-489},
crossref = {DBLP:conf/approx/2009},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Raffaell Clifford , Klim Efremenko, Ely Porat and Amir Rothschild
From Coding Theory to Efficient Pattern Matching
20 nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2009
[Abstract]
[BiBTeX]
[Paper: PDF]
We consider the classic problem of pattern matching with few mismatches in the presence of promiscuously matching wildcard symbols. Given a text $t$ of length $n$ and a pattern $p$ of length $m$ with optional wildcard symbols and a bound $k$, our algorithm finds all the alignments for which the pattern matches the text with Hamming distance at most $k$ and also returns the location and identity of each mismatch. The algorithm we present is deterministic and runs in $\tilde{O}(kn)$ time, matching the best known randomised time complexity to within logarithmic factors. The solutions we develop borrow from the tool set of algebraic coding theory and provide a new framework in which to tackle approximate pattern matching problems.
@inproceedings{DBLP:conf/soda/CliffordEPR09,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Ely Porat and
Amir Rothschild},
title = {From coding theory to efficient pattern matching},
booktitle = {SODA},
year = {2009},
pages = {778-784},
ee = {http://doi.acm.org/10.1145/1496770.1496855},
crossref = {DBLP:conf/soda/2009},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Klim Efremenko and Ely Porat
Approximating General Metric Distances Between a Pattern and a Text
19nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA),2008
[Abstract]
[BiBTeX]
[Paper: PDF]
Let $T=t_0 \ldots t_{n-1}$ be a text and $P = p_0 \ldots p_{m-1}$
a pattern taken from some finite alphabet set $\Sigma$, and let
$\dist$ be a metric on $\Sigma$. We consider the problem of
calculating the sum of distances between the symbols of $P$ and the
symbols of substrings of $T$ of length $m$ for all possible offsets.
We present an $\varepsilon$-approximation algorithm for this problem
which runs in time $O(\frac{1}{\varepsilon^2}n\cdot \mathrm{
polylog}(n,\abs{\Sigma}))$. This algorithm is based on a low
distortion embedding of metric spaces into normed spaces (especially, into $\ell_{\infty}$), which is done as a preprocessing
stage. The algorithm is also based on a technique of sampling.
@inproceedings{PoratE08,
author = {Ely Porat and
Klim Efremenko},
title = {Approximating general metric distances between a pattern
and a text},
booktitle = {SODA},
year = {2008},
pages = {419-427}
}
Raffaell Clifford , Klim Efremenko, Ely Porat and Amir Rothschild
Pattern Matching with Don't Cares and Few Errors
Journal of Computer and System Sciences (JCSS), 2010
[Abstract]
[BiBTeX]
[Paper: PDF]
We present solutions for the k-mismatch pattern matching problem with don't cares. Given a text t of length n and a pattern pof length m with don't care symbols and a bound k, our algorithms find all the places that the pattern matches the text with at most k mismatches. We first give a ?(n(k+logmlogk)logn) time randomised algorithm which finds the correct answer with high probability. We then present a new deterministic ?(nk2log2m) time solution that uses tools originally developed for group testing. Taking our derandomisation approach further we develop an approach based on k-selectors that runs in ?(nkpolylogm) time. Further, in each case the location of the mismatches at each alignment is also given at no extra cost.
@article{DBLP:journals/jcss/CliffordEPR10,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Ely Porat and
Amir Rothschild},
title = {Pattern matching with don't cares and few errors},
journal = {J. Comput. Syst. Sci.},
volume = {76},
number = {2},
year = {2010},
pages = {115-124},
ee = {http://dx.doi.org/10.1016/j.jcss.2009.06.002},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Raffaell Clifford , Klim Efremenko, Benny Porat and Ely Porat
A Black Box for Online Approximate Pattern Matching
Combinatorial Pattern Matching (CPM), 2008
[Abstract]
[BiBTeX]
[Paper: PDF]
We present a deterministic black box solution for online approximate matching.
Given a pattern of length $m$ and a streaming text of length $n$ that
arrives one character at a time, the task is to report the distance
between the pattern and a sliding window of the text as soon as the
new character arrives. Our solution requires $O(\Sigma_{j=1}^{\log_2{m}} T(n,2^{j-1})/n)$ time for each input character, where $T(n,m)$ is the total running time of the best offline
algorithm. The types of approximation that are supported include exact matching with wildcards, matching under the Hamming norm, approximating the Hamming norm, $k$-mismatch and numerical measures such as the $L_2$ and $L_1$ norms. For these examples, the resulting online algorithms take $O(\log^2{m})$, $O(\sqrt{m\log{m}})$, $O(\log^2{m}/{\epsilon}^2)$, $O(\sqrt{k \log k} \log{m})$, $O(\log^2{m})$ and $O(\sqrt{m\log{m}})$ time per character respectively. The space overhead is $O(m)$ which we show is optimal.
@inproceedings{DBLP:conf/cpm/CliffordEPP08,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Benny Porat and
Ely Porat},
title = {A Black Box for Online Approximate Pattern Matching},
booktitle = {CPM},
year = {2008},
pages = {143-151},
ee = {http://dx.doi.org/10.1007/978-3-540-69068-9_15},
crossref = {DBLP:conf/cpm/2008},
bibsource = {DBLP, http://dblp.uni-trier.de}
}
Raffaell Clifford , Klim Efremenko, Benny Porat, Ely Porat and Amir Rothschild
Mismatch sampling
String Processing and Information Retrieval, (SPIRE), 2008
[Abstract]
[BiBTeX]
[Paper: PDF]
We consider the well known problem of pattern matching under the
Hamming distance. Previous approaches have shown how to count the
number of mismatches efficiently especially, when a bound is known
for the maximum Hamming distance. Our interest is different in
that we wish collect a random sample of mismatches of fixed size
at each position in the text. Given a pattern $p$ of length $m$
and a text $t$ of length $n$,
we show how to sample with high probability $c$ mismatches where possible from every alignment of $p$ and $t$ in time.
Further, we guarantee that the mismatches are sampled uniformly and that they can therefore be seen as representative
of the types of mismatches that occur.
@inproceedings{DBLP:conf/spire/CliffordEPPR08,
author = {Rapha{\"e}l Clifford and
Klim Efremenko and
Benny Porat and
Ely Porat and
Amir Rothschild},
title = {Mismatch Sampling},
booktitle = {SPIRE},
year = {2008},
pages = {99-108},
ee = {http://dx.doi.org/10.1007/978-3-540-89097-3_11},
crossref = {DBLP:conf/spire/2008},
bibsource = {DBLP, http://dblp.uni-trier.de}
}