Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Abstract:In the noisy population recovery problem of Dvir et al., the goal is to learn an unknown distribution $f$ on binary strings of length $n$ from noisy samples. For some parameter $\mu \in [0,1]$, a noisy sample is generated by flipping each coordinate of a sample from $f$ independently with probability $(1-\mu)/2$. We assume an upper bound $k$ on the size of the support of the distribution, and the goal is to estimate the probability of any string to within some given error $\varepsilon$. It is known that the algorithmic complexity and sample complexity of this problem are polynomially related to each other. We show that for $\mu > 0$, the sample complexity (and hence the algorithmic complexity) is bounded by a polynomial in $k$, $n$ and $1/\varepsilon$ improving upon the previous best result of $\mathsf{poly}(k^{\log\log k},n,1/\varepsilon)$ due to Lovett and Zhang. Our proof combines ideas from Lovett and Zhang with a \emph{noise attenuated} version of M\"{o}bius inversion. In turn, the latter crucially uses the construction of \emph{robust local inverse} due to Moitra and Saks.