Microsoft Photosynth - Creating 3D Virtual Environments by Seamlessly Marrying Digital Photographs, Plus HPL's Thoughts and Analysis
On July 27th, Microsoft Live Labs unveiled Photosynth, a software program that takes a large collection of photos of a place or object, analyzes them for similarities, and displays them in a reconstructed three-dimensional space. The photos can be captured by different photographers using different cameras and then linked together over the Internet.
Microsoft Photosynth Demo/Tour
Video Overview of Microsoft Photosynth
How Does It Work? Here are some excerpts from the Microsoft Live Labs Photosynth website:
Each photo is processed by computer vision algorithms to extract hundreds of distinctive features, like the corner of a window frame or a door handle. Then, photos that share features are linked together in a web. When a feature's found in multiple images, its 3D position can be calculated. It's similar to depth perception--what your brain does to perceive the 3D positions of things in your field of view based on their images in both of your eyes. Photosynth's 3D model is just the cloud of points showing where those features are in space.
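Live Labs doesn't publish the underlying math, but the triangulation step described above can be sketched: once the same feature is matched in two photos, and each camera's center and viewing ray toward that feature are known, the feature's 3D position is approximately where the two rays pass closest to each other. A minimal sketch, assuming idealized noise-free rays (all coordinates here are invented for illustration, not Photosynth's actual algorithm):

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def triangulate(p1, d1, p2, d2):
    """Midpoint of closest approach between two rays p + t*d (d unit-length)."""
    r = [a - b for a, b in zip(p1, p2)]
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    b = dot(d1, d2)                 # cosine of the angle between the rays
    d = dot(d1, r)
    e = dot(d2, r)
    denom = 1.0 - b * b             # rays must not be parallel
    t1 = (b * e - d) / denom
    t2 = (e - b * d) / denom
    q1 = [p + t1 * x for p, x in zip(p1, d1)]     # closest point on ray 1
    q2 = [p + t2 * x for p, x in zip(p2, d2)]     # closest point on ray 2
    return [(a + c) / 2 for a, c in zip(q1, q2)]  # feature position estimate

# Two cameras both looking at the same window-corner feature at (1, 1, 3):
cam1, cam2 = [0.0, 0.0, 0.0], [2.0, 0.0, 0.0]
ray1 = unit([1.0, 1.0, 3.0])    # direction from cam1 toward the feature
ray2 = unit([-1.0, 1.0, 3.0])   # direction from cam2 toward the feature
print(triangulate(cam1, ray1, cam2, ray2))  # ≈ [1.0, 1.0, 3.0]
```

With many photos, each matched feature gets one such estimate per camera pair, and averaging over pairs yields the point cloud the excerpt describes.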
Your brain knows that your eyes are about two inches apart. But when Photosynth does its magic, it doesn't know where the cameras were, or which way they were pointing. Fortunately, when there are many cameras, and many features in common, the algorithms behind Photosynth can figure out not only where the features are in 3D, but where all of the cameras would have to have been, and which way they were aimed, consistent with the features they "saw".
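The inverse problem -- recovering where each camera must have been -- is structure-from-motion, and the full solver is well beyond a blog post. But the core idea can be sketched in 2D: if a camera of known orientation sees two known landmarks along known world-frame directions, its position is pinned down by intersecting the two back-rays. A deliberately simplified sketch (real Photosynth solves jointly for positions, orientations, and point locations; all numbers here are invented):

```python
import math

def unit(v):
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def locate_camera(l1, b1, l2, b2):
    """2D resection with known orientation: camera position p satisfies
    l_i = p + t_i * b_i, where l_i are known landmarks and b_i are unit
    bearing directions in the world frame. Solve the 2x2 linear system
    t1*b1 - t2*b2 = l1 - l2 by Cramer's rule, then p = l1 - t1*b1."""
    rx, ry = l1[0] - l2[0], l1[1] - l2[1]
    det = b1[0] * (-b2[1]) - (-b2[0]) * b1[1]
    t1 = (rx * (-b2[1]) - (-b2[0]) * ry) / det
    return (l1[0] - t1 * b1[0], l1[1] - t1 * b1[1])

# A camera actually at (1, 2) sees landmarks at (0, 0) and (4, 0):
b1 = unit((0 - 1, 0 - 2))   # observed bearing toward the first landmark
b2 = unit((4 - 1, 0 - 2))   # observed bearing toward the second landmark
print(locate_camera((0.0, 0.0), b1, (4.0, 0.0), b2))  # ≈ (1.0, 2.0)
```

The real system has no known landmarks to start from, which is why it needs many cameras with many shared features: the redundancy lets it solve for everything at once.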
The Photosynth client shows you the 3D point cloud, but more importantly, it also shows you the original pictures overlaid on the model. Imagine a slide projector placed at each camera position, aimed where the camera was aiming, and projecting the picture that camera took. A screen is placed in the 3D environment at an appropriate distance from the projector. As you move around in the Photosynth environment, projectors turn on and off, giving you a changing perspective on a world built entirely out of the original photos.
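The projector metaphor suggests a simple rendering rule: as the viewer moves, switch on the photo whose camera best matches the current viewpoint. A toy selector that scores each camera by how closely its aim direction agrees with the viewer's (Photosynth's actual blending heuristics aren't published; names and data are illustrative):

```python
import math

def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def pick_projector(view_dir, cameras):
    """Return the name of the camera whose aim direction is most aligned
    with the viewer's direction (largest dot product = smallest angle)."""
    view_dir = unit(view_dir)
    score = lambda cam: sum(a * b for a, b in zip(view_dir, unit(cam[1])))
    return max(cameras, key=score)[0]

# Three photos of a facade, each taken facing a different way:
cameras = [
    ("photo_north", [0.0, 1.0, 0.0]),
    ("photo_east",  [1.0, 0.0, 0.0]),
    ("photo_up",    [0.1, 0.1, 1.0]),
]
print(pick_projector([0.9, 0.1, 0.0], cameras))  # -> photo_east
```

A real viewer would also weigh camera position and cross-fade between the top candidates rather than hard-switching, which is what produces the smooth transitions in the demo.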
After years of being not that impressed with Microsoft (bloatware, security flaws, NSAKEY, etc.) I seem to be getting more and more impressed every day. It was almost a month ago that I published on the Microsoft RoundTable/RingCam 360-degree webcam and meeting recorder, and then on the upcoming Microsoft Unified Communications offering. This morning HPL Board of Advisors member Chris Van Waters sent me a link to the Microsoft Photosynth demo and I was blown away yet again. The ability to take the world's existing inventory of digital photography and create active 3D visualizations is fairly amazing. In 2003 the University of California at Berkeley's How Much Information? project estimated that the world had produced over 900 billion photographs that, if all digitized at 5MB per photo, would equal a library of 4.5 exabytes of data. The project also estimated that the world's stock of photographs had grown by 150 billion in the two years between its original estimates in 2000 and the 2002 data analyzed in the 2003 study. The world's rate of photo acquisition has surely accelerated since then, as more photography has gone digital and the average storage capacity of digital cameras has increased while their cost has decreased.
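The Berkeley arithmetic checks out: 900 billion photos at 5 MB apiece is 4.5 × 10^18 bytes, i.e. 4.5 exabytes (using the decimal convention, MB = 10^6 bytes and EB = 10^18 bytes):

```python
photos = 900e9            # 900 billion photographs (Berkeley estimate)
mb = 10**6                # 1 MB, decimal convention
eb = 10**18               # 1 EB, decimal convention
total_bytes = photos * 5 * mb
print(total_bytes / eb)   # 4.5 exabytes
```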
The point? The professional and amateur photographers of the world have already captured, and continue to capture, much of the civilized world. While these photographs now reside on hundreds of millions of individual hard drives, in photo albums, and in shoe boxes, it doesn't take much imagination to envision applications that could harvest, catalog, and assemble these existing (and future) photographs very rapidly into a complete 3D visualization of most of the cities on the planet (think Google Earth on steroids). Microsoft hasn't released specifics on what Photosynth requires to accurately match an image to its database and incorporate it into the right 3D visualization (lots of Gothic cathedrals share the same architectural elements, for example), but I would assume it would be fairly easy to harvest those photo databases that already carry metadata on location, time, date, etc. (Corbis, AP, UPI, commercial stock photography collections, PhotoDisk, etc.).
This should allow Microsoft to very rapidly create a 3D visualization of most of the civilized world simply by working through the existing stock of available photography. This should be especially true for highly photographed tourist destinations like Times Square, the US Capitol, and, as demonstrated in the Photosynth video, St. Peter's Basilica. These "photorealistic" virtual environments could then be ported to video games, websites, virtual reality caves, mapping tools, and other applications.
The big winner: Microsoft's EMC sales rep (this project is going to require an amazing amount of storage).
Other Photosynth Resources
An interview with SeaDragon founder and Photosynth architect Blaise Aguera y Arcas can be found here.
An interview with Adam Shepard, Group Manager of Microsoft Live Labs, can be found here.