New research on flow-edge guided video completion, which uses machine learning and artificial intelligence, can help video editors perform a number of tasks more easily and efficiently.

Jia-Bin Huang, an assistant professor in the Bradley Department of Electrical and Computer Engineering and a faculty member at the Discovery Analytics Center, and Ph.D. student Chen Gao collaborated with Facebook researchers Ayush Saraf and Johannes Kopf on this project, which was included in the proceedings of the recent European Conference on Computer Vision (ECCV) 2020.

“Prior methods propagated colors among local flow connections between adjacent frames,” said Huang. “But not all missing regions in a video can be reached in this way, because the motion boundaries form barriers that cannot be permeated. Our method alleviates this problem by introducing nonlocal flow connections to temporally distant frames, giving video editors the ability to propagate video content over motion boundaries.”
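
To make the idea concrete, here is a minimal sketch of flow-guided color propagation. This is a simplified NumPy illustration, not the authors' implementation; the function name, data layout, and nearest-pixel rounding are all assumptions. Each missing pixel follows the estimated flow into a neighboring frame and copies the color it lands on, and chaining these steps lets content travel in from frames far away in time.

```python
# Hedged sketch of flow-guided color propagation (a simplification,
# not the authors' implementation; names here are hypothetical).
import numpy as np

def propagate_colors(frames, flows, masks):
    """Fill missing pixels (mask == True) by tracing forward flow.

    frames: list of H x W x 3 float arrays (video frames)
    flows:  list of H x W x 2 forward flow fields; flows[t] maps t -> t+1
    masks:  list of H x W boolean arrays, True where content is missing
    """
    H, W = masks[0].shape
    ys, xs = np.mgrid[0:H, 0:W]
    filled = [f.copy() for f in frames]
    holes = [m.copy() for m in masks]
    # Walk backward so frame t+1 is already (partially) filled when
    # frame t borrows from it; this chains flow across many frames.
    for t in reversed(range(len(frames) - 1)):
        # Round each pixel's flow endpoint to the nearest pixel in
        # frame t+1, clamped to the image bounds.
        tx = np.clip(np.rint(xs + flows[t][..., 0]).astype(int), 0, W - 1)
        ty = np.clip(np.rint(ys + flows[t][..., 1]).astype(int), 0, H - 1)
        # Copy colors only where the source pixel in frame t+1 is known.
        ok = holes[t] & ~holes[t + 1][ty, tx]
        filled[t][ok] = filled[t + 1][ty[ok], tx[ok]]
        holes[t] &= ~ok  # these pixels are no longer missing
    return filled, holes
```

In this toy version, a pixel whose flow path is blocked at a motion boundary simply stays missing; the nonlocal flow connections described above are what let the full method reach such pixels from temporally distant frames.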

Video: https://youtu.be/v4wXQK5OeDE

The method offers advantages for a number of video editing applications. In visual effects (VFX), the process of creating or manipulating imagery outside the context of a live-action shot in filmmaking, it can remove unwanted objects as well as wires or rigs. It can also help an editor stabilize a video without cropping; remove a particular object (a chair, for example) or other obstruction in the scene; remove watermarks or trademarks; and restore vintage videos by eliminating scratches.

“For many professionals who have had to spend a lot of time working on object removal, having a robust approach for video completion will be a game-changer for them,” Huang said.

The key to achieving good results with the flow-based approach is accurate flow completion, in particular, synthesizing sharp flow edges along the object boundaries, said Huang. 
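
One way to picture what a flow edge is, sketched below under my own simplifying assumptions (the paper completes edges with a learned model, not a threshold): an edge in the flow field is a location where the motion changes abruptly, which can be approximated by thresholding the gradient magnitude of the flow.

```python
# Illustrative only: locate flow edges by thresholding the gradient
# magnitude of a flow field. The threshold value and function name
# are assumptions, not the paper's learned edge completion.
import numpy as np

def flow_edges(flow, thresh=1.0):
    """Return an H x W boolean edge map for an H x W x 2 flow field."""
    duy, dux = np.gradient(flow[..., 0])  # derivatives of horizontal flow
    dvy, dvx = np.gradient(flow[..., 1])  # derivatives of vertical flow
    mag = np.sqrt(dux**2 + duy**2 + dvx**2 + dvy**2)
    return mag > thresh
```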

The research team was able to alleviate the limitations of existing flow-based video completion algorithms by explicitly completing flow edges to obtain piecewise-smooth flow completion; by leveraging nonlocal flow to handle regions that cannot be reached through transitive flow (e.g., periodic motion, such as walking); and by operating in the gradient domain to avoid visible seams in the results.
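
The gradient-domain point can be illustrated with a standard Poisson-style fill. The sketch below is a generic textbook construction under my own assumptions, not the authors' solver: rather than pasting propagated colors directly, one matches the propagated gradients inside the hole and solves for the pixel values, so intensities blend smoothly at the hole boundary instead of forming a seam.

```python
# Minimal Poisson-style fill (a generic sketch, not the paper's solver):
# solve Laplacian(u) = div(g) inside the hole with Jacobi iterations,
# keeping the known pixels outside the hole fixed.
import numpy as np

def gradient_domain_fill(image, mask, gx, gy, iters=2000):
    """image: H x W floats, known outside the mask; mask: True in the hole;
    gx, gy: target x/y gradients in the hole (e.g., propagated ones)."""
    u = image.copy()
    # Discrete divergence of the target gradient field.
    div = gx - np.roll(gx, 1, axis=1) + gy - np.roll(gy, 1, axis=0)
    for _ in range(iters):
        # Jacobi update: average of the four neighbors minus divergence.
        nbrs = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
                np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[mask] = ((nbrs - div) / 4.0)[mask]  # update hole pixels only
    return u
```

Because only gradients are copied, small brightness differences between source frames are absorbed into a smooth offset rather than showing up as a visible seam.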

Their approach handles videos at up to 4K resolution, which earlier methods could not do because of their excessive memory requirements.

“Our results show clear improvement over prior methods in both quantitative evaluation and the quality of visual results,” said Huang.

This collaborative research began in the summer of 2019, when Huang was a visiting research scientist at Facebook and Gao was a research intern. They continued to collaborate throughout the year on the paper presented at ECCV.

Read the full paper here.

Written by Barbara L. Micale
