MLog: Towards Declarative In-Database Machine Learning

  • Bin Cui ,
  • Xupeng Li ,
  • Yiru Chen ,
  • ,
  • Ce Zhang

Proceedings of the VLDB Endowment (VLDB 2017) |

We demonstrate MLOG, a high-level language that integrates machine learning into data management systems. Unlike existing machine learning frameworks (e.g., TensorFlow, Theano, and Caffe), MLOG is declarative, in the sense that the system manages all data movement, data persistency, and machine-learning related optimizations (such as data batching) automatically. Our interactive demonstration will show audience how this is achieved based on the novel notion of tensoral views (TViews), which are similar to relational views but operate over tensors with linear algebra. With MLOG, users can succinctly specify not only simple models such as SVM (in just two lines), but also sophisticated deep learning models that are not supported by existing in-database analytics systems (e.g., MADlib, PAL, and SciDB), as a series of cascaded TViews. Given the declarative nature of MLOG, we further demonstrate how query/program optimization techniques can be leveraged to translate MLOG programs into native TensorFlow programs. The performance of the automatically generated TensorFlow programs is comparable to that of hand-optimized ones.