Learning Formatting Style Transfer and Structure Extraction for Spreadsheet Tables with a Hybrid Neural Network Architecture

CIKM'20 (applied research track)

Spreadsheet users routinely format tables to make table structures and data relationships easier to see. Formatting tables quickly and effectively is nonetheless a challenge: it requires many manual operations, especially for complex tables. In this paper, we propose techniques for table formatting style transfer, i.e., automatically formatting a target table according to the style of a reference table. Considering the latent many-to-many mappings between table structures and formats, we propose CellNet, a novel end-to-end, multi-task model that leverages conditional Generative Adversarial Networks (cGANs) and has three key components, which (1) model and recognize table structures; (2) encode formatting styles; and (3) learn and apply the latent mapping based on the recognized table structure and the encoded style. Moreover, we build a spreadsheet table corpus containing 5,226 tables with high-quality formats and 784 tables with human-labeled structures. Our evaluation, which compares CellNet with heuristic-based and other learning-based methods, shows that it is highly effective according to both quantitative metrics and human perception studies.
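
To make the task definition concrete, the following is a minimal sketch of formatting style transfer as an input/output contract. It is not CellNet: the learned cGAN components are replaced by a simple lookup, and the structural roles are assumed to be given rather than recognized. All names here (`CellFormat`, `encode_style`, `transfer_style`, and the role labels) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CellFormat:
    """A toy formatting record; real spreadsheet formats have many more fields."""
    bold: bool = False
    fill: str = "none"


# Hypothetical structural roles. The label set used in the paper is not
# stated in the abstract, so these are assumptions.
HEADER, DATA, TOTAL = "header", "data", "total"


def encode_style(ref_roles, ref_formats):
    """Summarize a reference table as a role -> most frequent format mapping.

    This stands in for a learned style encoder; it is a heuristic, not the
    paper's method.
    """
    counts = defaultdict(Counter)
    for role_row, fmt_row in zip(ref_roles, ref_formats):
        for role, fmt in zip(role_row, fmt_row):
            counts[role][fmt] += 1
    return {role: c.most_common(1)[0][0] for role, c in counts.items()}


def transfer_style(target_roles, style, default=CellFormat()):
    """Format each target cell according to its role and the encoded style."""
    return [[style.get(role, default) for role in row] for row in target_roles]


# Reference table: 3 rows x 2 columns, with roles and formats per cell.
ref_roles = [[HEADER, HEADER], [DATA, DATA], [TOTAL, TOTAL]]
ref_formats = [
    [CellFormat(bold=True, fill="gray")] * 2,
    [CellFormat()] * 2,
    [CellFormat(bold=True)] * 2,
]

# Target table: a different shape (4 rows x 3 columns), roles assumed known.
target_roles = [[HEADER] * 3, [DATA] * 3, [DATA] * 3, [TOTAL] * 3]

style = encode_style(ref_roles, ref_formats)
styled = transfer_style(target_roles, style)
```

The sketch only shows that the reference and target tables may differ in shape, and that the output is one format per target cell. A fixed role-to-format lookup cannot capture the many-to-many mappings between structures and formats that motivate the learned model.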