Abstract: We resolve a long-standing open question about the existence of a constant-factor approximation algorithm for the average-case \textsc{Decision Tree} problem with a uniform probability distribution over the hypotheses. We answer the question in the affirmative by providing a simple polynomial-time algorithm with an approximation ratio of $\frac{2}{1-\sqrt{(e+1)/(2e)}}+ε<11.57$. This improves upon the best previously known algorithm, a greedy one that achieves an $O(\log n/\log\log n)$-approximation. The first key ingredient in our analysis is a decomposition technique known from problems related to \textsc{Hierarchical Clustering} [SODA '17, WALCOM '26], which allows us to decompose the optimal decision tree into a series of objects called separating subfamilies. The second crucial idea is to reduce the subproblem of finding a \textsc{Separating Subfamily} to an instance of the \textsc{Maximum Coverage} problem. To do so, we analyze the properties of cutting cliques into small pieces, which represent pairs of hypotheses to be separated. This allows us to obtain a good approximation for the \textsc{Separating Subfamily} problem, which in turn enables the design of the approximation algorithm for the original problem.
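As a quick sanity check of the stated constant, the closed-form ratio can be evaluated numerically. The sketch below is only an illustration of the arithmetic; the function name `approximation_ratio` and its `eps` parameter are hypothetical and not taken from the paper.

```python
import math


def approximation_ratio(eps: float = 0.0) -> float:
    """Evaluate 2 / (1 - sqrt((e + 1) / (2e))) + eps."""
    inner = (math.e + 1) / (2 * math.e)  # (e + 1) / (2e), roughly 0.684
    return 2 / (1 - math.sqrt(inner)) + eps


if __name__ == "__main__":
    ratio = approximation_ratio()
    # With eps = 0 the value falls between 11.56 and 11.57,
    # consistent with the bound quoted in the abstract.
    print(f"{ratio:.4f}")
```

With `eps = 0` the value lies strictly below 11.57, so any sufficiently small positive `eps` keeps the total under the quoted bound.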
Abstract: We consider the following generalization of the classic Binary Search Problem: a searcher is required to find a hidden target vertex $x$ in a graph $G$ by iteratively querying vertices. A query to $v$ incurs a cost $c(v, x)$ and reports whether $v=x$; if not, it returns the connected component of $G-v$ containing $x$. The goal is to design a search strategy that minimizes the average-case search cost. First, we consider the case when the cost of querying a vertex is independent of the target. We develop a $(4+ε)$-approximation FPTAS for trees running in $O(n^4/ε^2)$ time and an $O(\sqrt{\log n})$-approximation for general graphs. Additionally, we give an FPTAS parametrized by the number of non-leaf vertices of the graph. On the hardness side, we prove that the problem is NP-hard even when the input is a tree with bounded degree or bounded diameter. Second, we consider trees and assume $c(v, x)$ to be a monotone non-decreasing function with respect to $x$, i.e.\ if $u \in P_{v, x}$ then $c(u, x) \leq c(v, x)$. We give a $2$-approximation algorithm, which can also be easily adapted to the worst-case variant. This is the first constant-factor approximation algorithm for both criteria. Previously known results concern only the worst-case search cost and include a parametrized PTAS as well as a $4$-approximation for paths. Finally, we show that when the cost is an arbitrary function of the queried vertex and the target, the problem does not admit any constant-factor approximation under the Unique Games Conjecture (UGC), even when the input tree is a star.
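The query model described in this abstract can be made concrete with a small simulation. The following is a minimal sketch, assuming the graph is given as an adjacency dictionary; the function name `query` and the example graph are hypothetical, and query costs $c(v, x)$ are deliberately not modeled.

```python
from collections import deque


def query(graph, v, x):
    """Simulate one query to vertex v when the hidden target is x.

    Returns (True, None) if v is the target. Otherwise returns
    (False, component), where component is the vertex set of the
    connected component of G - v that contains x.
    """
    if v == x:
        return True, None
    # Breadth-first search from x that never steps onto v,
    # i.e. a traversal of the graph with v removed.
    seen = {x}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        for w in graph[u]:
            if w != v and w not in seen:
                seen.add(w)
                queue.append(w)
    return False, seen


# Hypothetical example: the path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
found, component = query(path, 2, 4)  # component is {3, 4}
```

On a path, querying the middle vertex reproduces one step of classic binary search: the answer narrows the candidates to one side of the queried vertex.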
Abstract: This work considers a number of optimization problems and reductive relations between them. The two main problems we are interested in are the \emph{Optimal Decision Tree} and \emph{Set Cover}. We study these two fundamental tasks under precedence constraints, that is, if a test (or set) $X$ is a predecessor of $Y$, then in any feasible decision tree $X$ needs to be an ancestor of $Y$ (or, respectively, if $Y$ is added to the set cover, then so must be $X$). For the Optimal Decision Tree we consider two optimization criteria: the worst-case identification time (the height of the tree) and the average identification time. Similarly, for the Set Cover we study two cost measures: the size of the cover and the average cover time. Our approach is to develop a number of algorithmic reductions, where an approximation algorithm for one problem provides an approximation for another via a black-box usage of a procedure for the former. En route we introduce other optimization problems, either to complete the `reduction landscape' or because they hold the essence of the combinatorial structure of our problems. The latter is captured by the problem of finding a maximum-density precedence-closed subfamily, where the density is defined as the ratio of the number of items the family covers to its size. In this way, we provide $\mathcal{O}^*(\sqrt{m})$-approximation algorithms for all of the aforementioned problems. The picture is complemented by a number of hardness reductions that provide $o(m^{1/12-ε})$-inapproximability results for the decision tree and covering problems. Besides giving a complete set of results for general precedence constraints, we also provide polylogarithmic approximation guarantees for the two most commonly studied and applicable precedence types, outforests and inforests. By providing corresponding hardness results, we show these results to be tight.
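The notion of density used here (items covered divided by the number of sets in the family) is easy to illustrate on a toy instance. The sketch below assumes sets are stored in a dictionary keyed by name and that `predecessors` maps each set to its direct predecessors; all names and the example data are hypothetical, and the code only evaluates a given subfamily rather than searching for a maximum-density one.

```python
def is_precedence_closed(family, predecessors):
    """Check that every predecessor of a chosen set is also chosen."""
    return all(
        p in family
        for s in family
        for p in predecessors.get(s, ())
    )


def density(family, sets):
    """Number of items covered by the family divided by its size."""
    if not family:
        raise ValueError("density is undefined for an empty family")
    covered = set().union(*(sets[s] for s in family))
    return len(covered) / len(family)


# Hypothetical toy instance: set "A" must be taken before set "B".
sets = {"A": {1, 2}, "B": {2, 3, 4}, "C": {5}}
predecessors = {"B": ["A"]}

closed_family = {"A", "B"}  # closed; covers {1, 2, 3, 4}, density 4 / 2
open_family = {"B"}         # not closed: predecessor "A" is missing
```

In this toy instance `{"B"}` alone would have density 3, but it is infeasible because it is not precedence closed; the feasible family `{"A", "B"}` has density 2.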