By Rob Knies, Managing Editor, Microsoft Research
You’ve got photos—lots and lots of photos. Everybody does these days, thanks to the digital-photography revolution. Hundreds, thousands, a veritable treasure trove of image-based memories.
Sometimes, though, you don’t need thousands of photos. Sometimes, you just need that one representative shot to convey the fun you had during your day at the beach. But which one? The kids frolicking in their new swimsuits? The adults soaking in some rays? Your family’s nonpareil sandcastle? To tell the story adequately, no single one will suffice. You need them all combined into a defining composite.
Microsoft Research Cambridge is at your service.
AutoCollage, an easy, novel framework for the automatic creation of representative collages from collections of photos, became available to the general public on Sept. 4. Built on a collection of sophisticated computer-vision techniques, AutoCollage is simple to use, produces attractive imagery, and, perhaps most important, is a whole lot of fun.
A free, 30-day trial version of the software is available worldwide, and a full, unrestricted version can be purchased in the United States and in European Union nations. AutoCollage is one of the first Microsoft Research products to be made available to consumers.
It works like this: AutoCollage, which runs on either Windows Vista or Windows XP Service Pack 2 and above, cuts out interesting parts of photos and combines them, following natural features as boundaries between images. The selected pieces are sized similarly and assembled into a pleasing whole.
“The most significant feature that differentiates AutoCollage is that it offers exceptionally sophisticated blending technology for photographs, powered by state-of-the-art computer-vision techniques,” says Alisson Sol, development manager for the Incubation and Tech Transfer team at the England lab. “It’s great that we can give everyone the opportunity to play with and use this compelling technology, and we’re looking forward to seeing what collages they come up with.”
The application is a direct result of months of incubation efforts at Microsoft Research Cambridge.
“While the majority of the work undertaken at Microsoft Research is longer-term, pure research,” says Mitch Goldberg, director of the Incubation and Tech Transfer team, “compelling innovations are also brought to market through a mix of technology transfer into Microsoft products, licensing our technology, and creating new ventures. Cambridge Incubation is proud to make AutoCollage available worldwide to trial as a Microsoft Research download and a full version available for purchase in the United Kingdom and the United States via the online Microsoft store.”
While AutoCollage might be simple and intuitive to use, behind the scenes, the most advanced technologies extant are doing all the heavy lifting.
The most sophisticated tool of its kind available to consumers, the software combines object recognition, face detection, image blending, and other computer-vision and -graphics techniques to provide a seamless summary of the most interesting images within a group of photos.
Just ask Carsten Rother. A researcher in the Machine Learning and Perception group within Microsoft Research Cambridge, Rother expanded an earlier research project called Tapestry to the point that the Cambridge Incubation team became intrigued with the possibilities.
“People have a lot of images,” Rother says, “and the first goal was to ask, ‘Can we create a representation of these images as compact as possible?’ ”
The answer, as you might have surmised, is a resounding yes. A pair of papers outlines the evolution of the project. Digital Tapestry, written by Rother, Vladimir Kolmogorov, and Andrew Blake of Microsoft Research Cambridge, in conjunction with Sanjiv Kumar of Carnegie Mellon University, was presented during the Institute of Electrical and Electronics Engineers’ Computer Vision and Pattern Recognition conference in 2005. A second, entitled AutoCollage, written by Rother and Microsoft Research Cambridge colleagues Lucas Bordeaux, Youssef Hamadi, and Blake, was featured in 2006 during the Association for Computing Machinery’s annual conference on Computer Graphics and Interactive Techniques.
“We’ve tested tens of thousands of different collages in the course of our research,” Rother says, “and it’s really exciting that the positive feedback we’ve received from our user studies shows we’ve answered these challenges successfully.”
One of AutoCollage’s most impressive features is its ease of use. You simply point to a folder containing a collection of photos, click a button, and the system creates a collage, using the most representative images within the collection and artfully placing, via computer-vision techniques, the most interesting portions of those photos into a rectangular format.
Once the collage is complete—usually taking only a few seconds—the resultant composite can be printed, e-mailed, or set as the background of a PC desktop. Many of those who have used the software find that one of the most useful ways to employ AutoCollage is to place a collage at the front of a collection of images, such as a photo album, thereby summarizing the contents to come.
The program’s creators suggest a range of 7-30 photos for optimal performance, with a default set at 12. A slider enables the user to adjust according to the scope of the collection. Set your desired number of images, press the button, and voilà. Wait till Mom sees this!
Although the process might seem simple, significant technological achievements combine to produce the final collage.
“You have one big objective function that you want to optimize,” Rother says. “If you have a lot of input images, what should be in there? The most important images should be in there. The images should be as different as possible.
“From each image, you take the most interesting part, what we call the ‘region of interest.’ Then, these images should be arranged in a nice way.”
AutoCollage also makes use of face-detection technology and clever rules of thumb born of experimentation, such as placing images that include sky at the top of a collage, where the sky isn’t as jarring as if it appeared in the middle of the collage.
Rother cites five steps the software uses to produce a collage.
“The first task,” he says, “is to rank all the images, where the top-ranked image is the most likely to end up in the collage and the last one is the least interesting. If there are a lot of faces in a group shot, it’s more likely to be in. If there are two duplicate images, then only one should be in.”
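To make the ranking idea concrete, here is a minimal sketch, not AutoCollage's actual code. It assumes each photo already comes with a face count and a coarse descriptor; the `Photo` class, the `signature` field, and both weights are hypothetical names chosen for illustration. Photos with more faces score higher, and a photo that closely resembles one already ranked is pushed down.

```python
from dataclasses import dataclass

@dataclass
class Photo:
    name: str
    faces: int         # number of detected faces (assumed to be precomputed)
    signature: tuple   # hypothetical coarse descriptor, e.g. a tiny color histogram

def similarity(a, b):
    """1.0 for identical signatures, falling toward 0.0 as they diverge."""
    dist = sum(abs(x - y) for x, y in zip(a.signature, b.signature))
    return 1.0 / (1.0 + dist)

def rank(photos, face_weight=1.0, dup_weight=5.0):
    """Greedy ranking: reward faces, penalize similarity to photos already ranked."""
    remaining = list(photos)
    ranked = []
    while remaining:
        def score(p):
            penalty = max((similarity(p, q) for q in ranked), default=0.0)
            return face_weight * p.faces - dup_weight * penalty
        best = max(remaining, key=score)
        ranked.append(best)
        remaining.remove(best)
    return ranked

photos = [
    Photo("group.jpg", faces=4, signature=(2, 5, 3)),
    Photo("group_copy.jpg", faces=4, signature=(2, 5, 3)),   # exact duplicate
    Photo("castle.jpg", faces=0, signature=(7, 2, 1)),
    Photo("kids.jpg", faces=2, signature=(4, 4, 2)),
]
order = [p.name for p in rank(photos)]
print(order)
```

With these made-up inputs, the group shot ranks first and its duplicate drops to last place, behind even the sandcastle photo with no faces in it.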
Next is an analysis of the top-ranked images, the detection of regions of interest.
“What is the prominent area?” Rother asks. “We know that a face is likely to be interesting for the user to have in the collage. It’s less important to have a lot of sky regions.
“There are a lot of other internal features, like image contrast, that decide what is an interesting region. There has been a lot of research on that, and we exploit that body of knowledge.”
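As a rough illustration of contrast-driven region detection, the sketch below scores each pixel of a tiny grayscale grid by how far it deviates from the image's mean brightness, then slides a fixed-size window to find the patch with the highest total score. The contrast measure, the fixed window, and the function names are all assumptions made for this example; the article does not describe which features AutoCollage actually uses beyond faces, sky, and contrast.

```python
def saliency_map(img):
    """Crude contrast cue: absolute deviation of each pixel from the mean brightness."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [[abs(p - mean) for p in row] for row in img]

def region_of_interest(img, win_h, win_w):
    """Return (top, left) of the win_h x win_w window with the highest total saliency."""
    sal = saliency_map(img)
    h, w = len(img), len(img[0])
    best, best_pos = -1.0, (0, 0)
    for top in range(h - win_h + 1):
        for left in range(w - win_w + 1):
            total = sum(sal[y][x]
                        for y in range(top, top + win_h)
                        for x in range(left, left + win_w))
            if total > best:
                best, best_pos = total, (top, left)
    return best_pos

# A flat "sky" (value 9) with a textured patch in the lower-right corner.
image = [
    [9, 9, 9, 9, 9],
    [9, 9, 9, 9, 9],
    [9, 9, 9, 1, 7],
    [9, 9, 9, 6, 2],
]
roi = region_of_interest(image, 2, 2)
print(roi)  # (2, 3)
```

The uniform sky contributes almost nothing, so the window settles on the textured corner, which mirrors Rother's point that large sky regions matter less than prominent detail.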
Then the most visually appealing portions of the top-ranked images are combined so the regions of interest don’t overlap, a process called packing.
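The article does not say how AutoCollage solves the packing step, so the following sketch substitutes a much simpler stand-in, shelf packing, purely to show what "arranged so the regions don't overlap" means in practice. The rectangle sizes and the canvas width are hypothetical.

```python
def shelf_pack(sizes, canvas_width):
    """Place (width, height) rectangles left to right in rows ("shelves").

    Returns a list of (x, y, width, height) placements that do not overlap.
    """
    placements = []
    x = y = shelf_height = 0
    for w, h in sizes:
        if w > canvas_width:
            raise ValueError("rectangle wider than canvas")
        if x + w > canvas_width:       # row is full: start a new shelf below
            x, y = 0, y + shelf_height
            shelf_height = 0
        placements.append((x, y, w, h))
        x += w
        shelf_height = max(shelf_height, h)
    return placements

def overlaps(a, b):
    """True if two (x, y, width, height) rectangles share any interior area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

regions = [(4, 3), (3, 3), (5, 2), (2, 4)]   # hypothetical region-of-interest sizes
layout = shelf_pack(regions, canvas_width=8)
print(layout)
```

A real collage would go further, for example by letting regions overlap slightly so the later cut and blend steps have room to work, but the non-overlap constraint on the regions of interest is the same.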
“After the packing,” Rother continues, “we do a cut where we segment from each image the exact, not rectangular region. We use code used in other projects, such as GrabCut, for image segmentation.”
“The segmentation has two objectives,” he explains. “One is that it should not be a tiny fraction of the image, because you want to have each image to be as balanced as possible, each image being a relatively big portion. And you prefer sharper boundaries to high-contrast transitions, because it’s very likely that those indicate a true object.
“We don’t have any object recognition right now in the system. We have it in the sense of face detection and sky detection, but we don’t do generic object recognition to say: ‘That’s a car. That’s a road.’ But edges are likely candidates for these transitions. If there is an edge, we take it in the segmentation.”
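The preference for cutting along edges can be illustrated with a toy example. The sketch below is not GrabCut and not the method AutoCollage uses; it is a simple dynamic-programming stand-in that, given a small grid of edge strengths (hypothetical values), finds a top-to-bottom cut that passes through the strongest edges while moving at most one column per row.

```python
def edge_following_seam(gradient):
    """Return one column index per row for the cut with the highest total edge strength."""
    h, w = len(gradient), len(gradient[0])
    score = [row[:] for row in gradient]
    # Forward pass: best cumulative edge strength reaching each cell from the top.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w - 1, x + 1)
            score[y][x] += max(score[y - 1][lo:hi + 1])
    # Backward pass: start at the best bottom cell and trace the path upward.
    x = max(range(w), key=lambda c: score[h - 1][c])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(0, x - 1), min(w - 1, x + 1)
        x = max(range(lo, hi + 1), key=lambda c: score[y - 1][c])
        seam.append(x)
    return seam[::-1]

edges = [
    [0, 9, 0, 0],
    [0, 8, 1, 0],
    [0, 1, 9, 0],
    [0, 0, 9, 1],
]
seam = edge_following_seam(edges)
print(seam)  # [1, 1, 2, 2]
```

The cut tracks the strong edge as it drifts from column 1 to column 2, which is the behavior Rother describes: where there is an edge, the segmentation takes it.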
That leaves one final step, the blending of the images to produce the final collage. This uses an existing technique called Poisson blending.
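Poisson blending keeps the gradients of the pasted region while forcing its border to match the surrounding image, which hides the seam. The sketch below shows the idea on a one-dimensional strip of pixel values, solved with plain Gauss-Seidel iteration. Real images need the two-dimensional version; the function name and the sample values here are assumptions for illustration only.

```python
def poisson_blend_1d(source, left, right, iterations=2000):
    """Blend a 1-D source strip into a target whose values at both ends are fixed.

    Solves f[i-1] - 2*f[i] + f[i+1] = s[i-1] - 2*s[i] + s[i+1] for the interior,
    with f[0] = left and f[-1] = right, so the source's gradients are preserved
    while the boundary values come from the target.
    """
    n = len(source)
    f = [float(v) for v in source]
    f[0], f[-1] = float(left), float(right)
    for _ in range(iterations):
        for i in range(1, n - 1):
            laplacian = source[i - 1] - 2 * source[i] + source[i + 1]
            f[i] = (f[i - 1] + f[i + 1] - laplacian) / 2
    return f

source = [10, 12, 15, 12, 10]     # dark strip with a bump in the middle
blended = poisson_blend_1d(source, left=20, right=20)
print([round(v, 3) for v in blended])
```

Pasting this dark strip into a brighter target (boundary value 20) lifts every pixel by about 10 while preserving the bump's shape, so no visible step remains at either end.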
As a whole, it’s an integrated procedure. All these steps are invisible to the user, who simply waits a moment for the final result. AutoCollage is not computationally expensive, so if, by chance, the first collage is not satisfactory, you simply tweak the number of photos or the collection of photos and try again. If assistance is needed, users can consult a comprehensive help system included with the tool or get support in the Microsoft Research products forum.
“Seeing an AutoCollage of your own photos,” said one participant in a user study, “is a surprisingly emotive experience.”
The AutoCollage application, driven by the Microsoft Research Cambridge Incubation team, is a result of worldwide collaboration. Although much of the work was performed at Microsoft Research Cambridge—with the Computer Vision, Incubation and Tech Transfer, Computer-Mediated Living, and Constraint Reasoning groups at that lab all making contributions—Microsoft Research associates in Redmond and Beijing also played key roles.
Andrew Herbert, managing director of Microsoft Research Cambridge, is delighted with the results.
“AutoCollage is a great example of some very innovative computer-science research from our Cambridge facility and partner labs in the United States and China,” Herbert says. “We are furthering our commitment to transfer scientific innovations by providing consumers with access to technologies that are on the cutting edge of computer science.”
For Sol, the novel results provided by AutoCollage are validation enough.
“This is the first application that I can explain to my Grandma,” he smiles. “She can really use it and quickly produce an AutoCollage, click e-mail, and send it to us.
“This is what I’m really proud of: It is a small application. It doesn’t fulfill a business need. It doesn’t enhance your productivity. It enhances how happy you are with using computers.”