Caption Crawler

Overview

Caption Crawler is a plug-in for the Edge and Chrome web browsers that provides additional information about images for screen reader users. Many images on the web lack captions (i.e., alt text). When a webpage loads, Caption Crawler identifies images that are missing captions and checks whether they are captioned elsewhere on the web; if so, the discovered captions are made available to the screen reader.

Download Information

Caption Crawler is a prototype developed by the Microsoft Research Ability team to explore ways to scale web accessibility. You can download the beta release for Edge at http://aka.ms/captioncrawleredge and for Chrome at http://aka.ms/captioncrawlerchrome. If you have any feedback or questions about Caption Crawler, please share them with us at captioncrawler@microsoft.com.

Usage Guide

Once you have downloaded and installed the browser extension, you may browse the web as you typically would; Caption Crawler works invisibly behind the scenes, backfilling missing captions when available. Note that Caption Crawler is only able to supply captions for images that appear in multiple places on the web and have been captioned in at least one place; our research suggests this covers about 10% to 15% of images on popular websites. We do not filter discovered captions for content or accuracy.

When your screen reader reads image descriptions aloud, any descriptions that have been supplied by Caption Crawler will be preceded by the words “Auto Alt:” so that you know they have been automatically added to the page for you. Some images may have more than one caption available; you can check for additional captions by using the keyboard shortcut Ctrl+Shift+U to advance through the list of captions, and Ctrl+Shift+Y to move backwards in that list.
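The prefixing and caption-cycling behavior described above can be sketched as follows. This is a minimal illustration in Python, not the extension's actual code: the class name `CaptionCycler`, its method names, and the choice to stop at either end of the list rather than wrap around are all assumptions made for the example.

```python
from dataclasses import dataclass

AUTO_ALT_PREFIX = "Auto Alt: "  # spoken marker described in the usage guide


@dataclass
class CaptionCycler:
    """Hypothetical holder for the captions discovered for one image."""

    captions: list[str]
    index: int = 0

    def __post_init__(self) -> None:
        # Only images with at least one discovered caption get a cycler.
        if not self.captions:
            raise ValueError("at least one caption is required")

    def current_alt(self) -> str:
        """Alt text as the screen reader would announce it, prefix included."""
        return AUTO_ALT_PREFIX + self.captions[self.index]

    def forward(self) -> str:
        """Ctrl+Shift+U: advance one caption (assumed to stop at the last)."""
        self.index = min(self.index + 1, len(self.captions) - 1)
        return self.current_alt()

    def backward(self) -> str:
        """Ctrl+Shift+Y: go back one caption (assumed to stop at the first)."""
        self.index = max(self.index - 1, 0)
        return self.current_alt()
```

In this sketch, pressing the forward shortcut on the last caption simply repeats it; an implementation that wraps around to the first caption would be equally consistent with the description above.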

There are a few advanced settings that are meant for sighted debugging of the tool by people not using screen readers. You can access these options by right-clicking on the Caption Crawler icon in the browser.

  1. The first option controls whether the state of each image on the page is made visible by drawing a colored border around it. A yellow border means that the image has been recognized as an image element on the page. If an image has a caption in the original webpage, then its border will turn green. Note that not all non-captioned images cause the extension to request a caption; our plug-in has heuristics to ignore small images such as logos. When our plug-in identifies a missing caption and requests it from our cloud service, the border changes to red. When the web service can tell that it has not received any captions from the Bing Image Search service, the border changes to orange. Finally, when a caption is successfully returned for an image, its border changes to blue.
  2. The second option controls what happens when the navigation keyboard commands (Ctrl+Shift+U and Ctrl+Shift+Y) are pressed. When the option is on, a popup dialog shows the newly assigned caption (alt text).
  3. The third option controls whether the keyboard command Ctrl+Shift+H is active. When it is, a popup will appear providing statistics on the current state of the images on the page.
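The border colors in the first option amount to a small state mapping, sketched below in Python for illustration. The colors and their meanings come from the list above; everything else is an assumption, including the function names, the 50-pixel size cutoff (the actual heuristics for ignoring small images are not documented here), and the idea that an ignored small image keeps its yellow border.

```python
from enum import Enum
from typing import Optional


class ImageState(Enum):
    """Debug border colors, as listed in the first option above."""

    RECOGNIZED = "yellow"        # found as an image element on the page
    ALREADY_CAPTIONED = "green"  # the original page supplies a caption
    REQUESTED = "red"            # caption requested from the cloud service
    NO_CAPTION_FOUND = "orange"  # service reports it received no captions
    CAPTION_RETURNED = "blue"    # a caption was successfully returned


MIN_SIDE_PX = 50  # hypothetical cutoff; the real heuristic is not specified


def initial_state(alt: Optional[str], width: int, height: int) -> ImageState:
    """Classify an image when the page is first scanned."""
    if alt and alt.strip():
        return ImageState.ALREADY_CAPTIONED
    if min(width, height) < MIN_SIDE_PX:
        # Assumed: small images such as logos are skipped and stay yellow.
        return ImageState.RECOGNIZED
    return ImageState.REQUESTED


def state_after_response(captions: list[str]) -> ImageState:
    """Classify an image once the cloud service has answered."""
    if captions:
        return ImageState.CAPTION_RETURNED
    return ImageState.NO_CAPTION_FOUND
```

For example, `initial_state(None, 400, 300)` yields the red "requested" state, and `state_after_response([])` yields orange.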

Implementation Details

Once loaded into the browser, Caption Crawler monitors each open web page and looks for images that do not have captions (also known as “alt text”). When such an image is found, its URL is sent to the Caption Crawler Web Service, which runs in Azure. That web service in turn uses Bing Image Search, which indexes captions for some images. Bing Image Search may also return a list of web pages on which the same image may be found. The Caption Crawler web service then retrieves those web pages and searches them to see if the image has a caption on those pages. All found captions are returned to the plug-in, in first-found order. The first returned caption is set as the image's “alt text” attribute, and will be read by the screen reader when appropriate. If more than one caption has been retrieved, a user can choose to hear the additional captions using keyboard commands, as described in the usage guide above.
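The page-searching step can be illustrated with a short Python sketch that uses only the standard library. It is not the service's actual code: the names `AltTextFinder` and `collect_captions` are hypothetical, the pages are passed in as already-downloaded HTML so the example runs offline, and two simplifications are assumed, namely that an image is matched by exact URL after resolving relative paths, and that duplicate captions are dropped.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AltTextFinder(HTMLParser):
    """Collects non-empty alt text from <img> tags whose src is target_url."""

    def __init__(self, page_url: str, target_url: str) -> None:
        super().__init__()
        self.page_url = page_url
        self.target_url = target_url
        self.captions: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        alt = (attr.get("alt") or "").strip()
        # Resolve relative src values against the page URL before comparing.
        if alt and urljoin(self.page_url, src) == self.target_url:
            self.captions.append(alt)


def collect_captions(target_url: str, pages: dict[str, str]) -> list[str]:
    """Return captions for target_url in first-found order.

    `pages` maps each candidate page URL to its HTML, in the order the
    pages were returned by the search step.
    """
    found: list[str] = []
    for page_url, html in pages.items():
        finder = AltTextFinder(page_url, target_url)
        finder.feed(html)
        for caption in finder.captions:
            if caption not in found:  # assumed: duplicates are skipped
                found.append(caption)
    return found
```

The first element of the returned list corresponds to the caption that would be set as the image's alt text; the remaining elements are the additional captions a user could step through with the keyboard commands.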