Downloads
LayoutLM
January 2021
LayoutLM is a simple but effective multi-modal pre-training method of text, layout, and image for visually-rich document understanding and information extraction tasks, such as form understanding and receipt understanding. LayoutLM archives the SOTA results on multiple datasets. For more details,…
CodeXGLUE
September 2020
CodeXGLUE is a benchmark dataset and open challenge for code intelligence. It includes a collection of code intelligence tasks and a platform for model evaluation and comparison. CodeXGLUE stands for General Language Understanding Evaluation benchmark for CODE. It includes 14…
FB15K-237 Knowledge Base Completion Dataset
October 2015
This dataset contains knowledge base relation triples and textual mentions of Freebase entity pairs, as used in the work published in (Toutanova and Chen CVSM-2015) and (Toutanova et al. EMNLP-2015). The knowledge base triples are a subset of the FB15K…