
Recovering 'lost dimensions' of images and video

October 17, 2019 | Artificial Intelligence
MIT researchers have developed a model that recovers valuable data lost from images and video that have been "collapsed" into lower dimensions.

The model could be used to recreate video from motion-blurred images, or from new types of cameras that capture a person's movement around corners but only as vague one-dimensional lines. While more testing is needed, the researchers think this approach could someday be used to convert 2D medical images into more informative -- but more expensive -- 3D body scans, which could benefit medical imaging in poorer nations.

"In all these cases, the visual data has one dimension -- in time or space -- that's completely lost," says Guha Balakrishnan, a postdoc in Computer Science and Artificial Intelligence Laboratory (CSAIL) and first author on a paper describing the model, which is being presented at next week's International Conference on Computer Vision. "If we recover that lost dimension, it can have a lot of important applications."

Captured visual data often collapses the multiple dimensions of time and space into one or two dimensions, called "projections." X-rays, for example, collapse three-dimensional data about anatomical structures into a flat image. Or, consider a long-exposure shot of stars moving across the sky: The stars, whose positions change over time, appear as blurred streaks in the still shot.
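To make the idea of a projection concrete, here is a minimal sketch (not from the paper) of how collapsing a dimension works on toy arrays. The array shapes and the specific collapse operations are illustrative assumptions, not the researchers' data.

```python
# Minimal sketch (illustrative, not the authors' code): how a "projection"
# collapses a dimension of time or space. Toy array sizes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

video = rng.random((16, 64, 64))    # 16 frames of a 64x64 scene: (T, H, W)
volume = rng.random((32, 64, 64))   # a 3D volume: (depth, H, W)

# Long-exposure / motion-blur projection: average over the time axis.
# Moving objects smear into streaks; the time dimension is lost.
long_exposure = video.mean(axis=0)  # shape (64, 64)

# X-ray-like projection: integrate the volume along the depth axis.
# Structures at different depths overlap in one flat image.
flat_image = volume.sum(axis=0)     # shape (64, 64)

# Corner-camera-like projection: collapse one spatial axis of each frame,
# leaving a 1D line per time step.
line_per_frame = video.mean(axis=2)  # shape (16, 64)

print(long_exposure.shape, flat_image.shape, line_per_frame.shape)
```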

Likewise, "corner cameras," recently invented at MIT, detect moving people around corners. These could be useful for, say, firefighters finding people in burning buildings. But the cameras aren't exactly user-friendly. Currently they only produce projections that resemble blurry, squiggly lines, corresponding to a person's trajectory and speed.

The researchers invented a "visual deprojection" model that uses a neural network to "learn" patterns that match low-dimensional projections to their original high-dimensional images and videos. Given new projections, the model uses what it's learned to recreate all the original data from a projection.

In experiments, the model synthesized accurate video frames showing people walking, by extracting information from single, one-dimensional lines similar to those produced by corner cameras. The model also recovered video frames from single, motion-blurred projections of digits moving around a screen, from the popular Moving MNIST dataset.

Joining Balakrishnan on the paper are: Amy Zhao, a graduate student in the Department of Electrical Engineering and Computer Science (EECS) and CSAIL; EECS professors John Guttag, Fredo Durand, and William T. Freeman; and Adrian Dalca, a faculty member in radiology at Harvard Medical School.
