- Colloquial to Arabic Converter
- Diacritizer
- Named Entity Recognizer (NER)
- Parser
- Part of Speech Tagger
- SARF (morphological analyzer)
- Speller
- Transliterator
Natural Language Processing (NLP) provides the foundational infrastructure for analyzing and understanding written text. It underpins many sophisticated tasks, such as text search, document management, automatic translation, proofreading, and text summarization. The Advanced Technology Lab in Cairo has developed the Arabic Toolkit Service (ATKS), a set of NLP components targeting the Arabic language.
The component suite includes a full-fledged morphological analyzer (SARF), a spell-checker, an autocorrector, a diacritizer, a named entity recognizer (NER), a colloquial-to-Arabic converter, and a part-of-speech (POS) tagger. These components are integrated into multiple Microsoft products and services, such as Windows, Office, Bing, Exchange, SharePoint, and Windows Phone. ATKS makes these components available as web services, with associated APIs, hosted on Windows Azure.
ATKS facilitates advanced research in Arabic NLP technologies by providing reliable, high-quality foundational components.