On reconstructing a string from its substring compositions
Proceedings of the IEEE International Symposium on Information Theory, Austin, TX, USA, June 2010
Abstract

Motivated by protein sequencing, we consider the problem of reconstructing a string from the compositions of its substrings. We provide several results, including: general classes of strings that cannot be distinguished based on their substring compositions; an almost complete characterization of the lengths for which reconstruction is possible; bounds on the number of strings with the same substring compositions, expressed in terms of the number of divisors of the string length plus one; and a relation to the turnpike problem together with a bivariate polynomial formulation of string reconstruction.
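
To make the central object concrete, the following minimal Python sketch computes the multiset of compositions of all contiguous substrings of a string, where a composition records how many times each symbol occurs but not the order of the symbols. The function name `substring_compositions` and the representation of a composition as a sorted tuple of (symbol, count) pairs are illustrative assumptions, not notation from the paper.

```python
from collections import Counter


def substring_compositions(s):
    """Return the multiset of compositions of all contiguous substrings of s.

    Hypothetical helper for illustration: a composition is represented as a
    sorted tuple of (symbol, count) pairs, so symbol order is discarded.
    """
    result = Counter()
    for i in range(len(s)):
        counts = Counter()
        for j in range(i, len(s)):
            # Extend the substring s[i..j] by one symbol and record
            # the composition of the extended substring.
            counts[s[j]] += 1
            result[tuple(sorted(counts.items()))] += 1
    return result


# A string and its reversal always yield the same multiset, so
# reconstruction can only be expected up to reversal.
same_as_reversal = substring_compositions("aab") == substring_compositions("baa")

# Two strings with the same symbols in a different arrangement
# can still be told apart by their substring compositions.
distinguishable = substring_compositions("aab") != substring_compositions("aba")
```

A string of length n has n(n+1)/2 contiguous substrings, so the counts in the returned multiset sum to that value. This brute-force enumeration is only meant to illustrate the definition and is practical for short strings.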