An Empirical Comparison of Preservation Methods for Synthetic DNA Data Storage

Small Methods

Synthetic DNA has recently emerged as a viable alternative for long‐term digital data storage. To ensure that information can be safely recovered after storage, it is essential to appropriately preserve the physical DNA molecules encoding the data. While preservation of biological DNA has been studied previously, synthetic DNA differs in that it is typically much shorter, it has different sequence profiles with fewer, if any, repeats (or homopolymers), and it has different contaminants. In this paper, nine different methods used to preserve data files encoded in synthetic DNA are evaluated by accelerated aging of nearly 29 000 DNA sequences. In addition to a molecular count comparison, the DNA is also sequenced and analyzed after aging. The results show that errors and erasures are stochastic and that their distributions do not differ in any practical sense between preservation methods. Finally, the physical density of these methods is compared, and the trade‐off between stability and density is discussed.