Glider: A reinforcement learning approach to extract UI scripts from websites

Web automation scripts (tasklets) are used by personal AI assistants to carry out human tasks such as reserving a car or buying movie tickets. Generating tasklets today is tedious and requires considerable manual effort. We propose Glider, an automated and scalable approach to generate tasklets from a natural language task query and a website URL. A major advantage of Glider is that it does not require any pre-training. Glider models tasklet extraction as a state space search, in which agents explore a website’s UI and are rewarded for making progress towards task completion.

The reward is computed from the agent’s navigation pattern and the similarity between its trajectory and the task query. A hierarchical reinforcement learning policy is used to efficiently find the action sequences that maximize the reward. To evaluate Glider, we used it to extract tasklets for tasks in various categories (shopping, real estate, flights, etc.); it generated a correct tasklet in 79% of cases.
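To make the reward idea concrete, here is a minimal sketch of a reward that combines trajectory-to-query similarity with a navigation-pattern term. It is an illustration under stated assumptions, not Glider's actual reward: the token-overlap similarity, the repetition penalty, the `penalty_weight` parameter, and the UI step descriptions are all hypothetical choices made for this example.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Return the lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def task_similarity(trajectory: List[str], task_query: str) -> float:
    """Fraction of task-query tokens that appear somewhere in the trajectory.

    Assumption: plain token overlap stands in for whatever similarity
    measure the real system uses.
    """
    query = tokens(task_query)
    if not query:
        return 0.0
    seen: Set[str] = set()
    for step in trajectory:
        seen |= tokens(step)
    return len(query & seen) / len(query)


def repetition_penalty(trajectory: List[str]) -> float:
    """Fraction of steps that repeat an earlier step (0.0 means no repeats).

    Assumption: repeated steps are used as a crude proxy for a poor
    navigation pattern.
    """
    if not trajectory:
        return 0.0
    return 1.0 - len(set(trajectory)) / len(trajectory)


def reward(trajectory: List[str], task_query: str,
           penalty_weight: float = 0.5) -> float:
    """Similarity to the task minus a weighted penalty for repeated steps."""
    return (task_similarity(trajectory, task_query)
            - penalty_weight * repetition_penalty(trajectory))


# Hypothetical step descriptions; the real page's element labels may differ.
task = "convert length 7333 from inch to foot"
direct = [
    "click Length",
    "type 7333 into From value",
    "select inch in From unit",
    "select foot in To unit",
]
looping = direct + ["click Length"]  # revisits a step already taken

print(reward(direct, task) > reward(looping, task))
```

Under this toy scoring, a trajectory that touches the elements named in the query and avoids revisiting steps scores higher than one that loops, which is the kind of signal a search over action sequences can maximize.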

Publication Downloads

Glider

May 3, 2021

We release Glider, a reinforcement learning agent that extracts UI scripts from websites. Given a website URL (e.g., http://www.unitconversion.org/) and a task description in natural language (e.g., “convert length 7333 from inch to foot”), Glider explores the website’s UI to identify a sequence of UI actions (click, type, select, etc.) that completes the given task. Glider works with real websites and does not require any pre-training.
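As an illustration of the input and output described above, the following sketch shows one way a tasklet could be represented and serialized. The `UIAction` and `Tasklet` classes, their field names, and the element descriptions are assumptions made for this example; they are not the format Glider itself emits.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class UIAction:
    kind: str        # one of "click", "type", "select"
    target: str      # hypothetical description of the UI element acted on
    value: str = ""  # text to type or option to select; empty for clicks


@dataclass
class Tasklet:
    url: str                 # website the script runs against
    task: str                # natural language task description
    actions: List[UIAction] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the tasklet, including nested actions, to JSON."""
        return json.dumps(asdict(self), indent=2)


# The URL and task come from the example above; the action targets are
# hypothetical and may not match the real page's element labels.
tasklet = Tasklet(
    url="http://www.unitconversion.org/",
    task="convert length 7333 from inch to foot",
    actions=[
        UIAction("click", "Length"),
        UIAction("type", "From value", "7333"),
        UIAction("select", "From unit", "inch"),
        UIAction("select", "To unit", "foot"),
    ],
)

print(tasklet.to_json())
```

Keeping the task query and URL alongside the action sequence means a stored script records both what it does and which task it was extracted for.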