Visual media in the form of television programs, online video, and cinematic films have the power to engage people with dynamic presentations of ideas. Expert storytellers design how such media unfolds over time to help audiences make sense of complex concepts, appreciate cultural or societal differences and imagine living in entirely different worlds. Technological advances have made it cheaper and easier to capture audio-visual media using the video cameras that are readily available in our mobile and desktop devices. Yet the most-viewed videos are not simply raw recordings thrown onto the Web. The best material is carefully composed, filtered and edited to ensure that the resulting media is clear and engaging.

Nevertheless, today’s tools for authoring and viewing video treat the media as a “baked” stream of audio samples, pixels, and frames – the very lowest-level representation possible. They have no understanding of the higher-level semantic structure of the audio-visual content. Researchers have developed a variety of techniques for extracting such higher-level structure from video and shown how to use this structure to significantly facilitate analysis, browsing, editing and manipulation of the raw material.

The goal of this graduate seminar (advanced undergraduates also welcome) is to survey recent work on computational video analysis and manipulation techniques. We will learn how to acquire, represent, edit and remix video. Several popular video manipulation algorithms will be presented, with an emphasis on using these techniques to build practical systems. Students will have the opportunity to acquire their own video and develop the processing tools needed to computationally analyze and manipulate it.

There are no official prerequisites for the course, but we will expect familiarity with the basic concepts of Computer Graphics and/or Computer Vision at the level of CS 148/248 and/or CS 131. Contact me (Maneesh) via email if you are worried about whether you have the background for the course.

Schedule

Week 1

M Mar 30: Introduction/Feature Detection

Slides

Assigned: Assignment 1 (due Apr 8 by 1:30pm)

Optional readings
Chapter 4.1: Feature Detection and Matching: Points and Patches. Szeliski. 2010. (pdf)

W Apr 1: Warping/RANSAC/Morphing

Reading Prompt | Slides

Required readings
Feature-Based Image Metamorphosis. Beier and Neely. SIGGRAPH 1992. (pdf)
Michael Jackson's Black or White video, morphing sequence. (YouTube)

Optional readings
Chapter 2.1: Image Formation: Geometric primitives and transformations. Szeliski. 2010. (pdf)
Chapter 6.1: Feature-Based Alignment: 2D and 3D Feature-Based Alignment. Szeliski. 2010. (pdf)

Week 2

M Apr 6: Feature Tracking and Video Texture

Reading Prompt | Slides

Required readings
Video Textures. Schodl et al. SIGGRAPH 2000. (pdf)

Optional readings
Panoramic Video Textures. Agarwala et al. SIGGRAPH 2005. (pdf)

W Apr 8: Graph-Cut Texture

Reading Prompt | Slides

Due (by 1:30pm): Assignment 1

Assigned: Assignment 2 (due Apr 22 by 1:30pm)

Required readings
Graphcut Textures. Kwatra et al. SIGGRAPH 2003. (pdf)

Week 3

M Apr 13: Looping in Space and Time

Reading Prompt | Slides

Required readings
Automated Video Looping with Progressive Dynamism. Liao et al. SIGGRAPH 2013. (pdf)

Optional readings
Fast Computation of Seamless Video Loops. Liao et al. SIGGRAPH 2015. (pdf)
Gigapixel Panorama Video Loops. He et al. SIGGRAPH 2018. (pdf)

W Apr 15: Stabilization and Segmentation

Reading Prompt | Slides

Required readings

Week 4

M Apr 20: De-Animation and Cinemagraphs

Reading Prompt | Slides

Required readings
Selectively De-Animating Video. Bai et al. SIGGRAPH 2012. (pdf)

Optional readings
Automatic Cinemagraph Portraits. Bai et al. EGSR 2013. (pdf)

W Apr 22: Structure from Motion

Reading Prompt | Slides

Due (by 1:30pm): Assignment 2

Required readings
Photo Tourism: Exploring Photo Collections in 3D. Snavely et al. SIGGRAPH 2006. (pdf)

Optional readings
Modeling the World from Internet Photo Collections. Snavely et al. IJCV 2007. (pdf)

Week 5

M Apr 27: Scene Building

Reading Prompt | Slides

Assigned: Final Project: Proposal (due May 4 by 1:30pm)

Assigned: Final Project: Presentation (due Jun 3)

Assigned: Final Project: Slides, Code and Paper (due Jun 8 by 11:59pm)

Required readings
Sampling Based Scene Space Video Processing. Klose et al. SIGGRAPH 2015. (pdf)

W Apr 29: Action Classification I

Reading Prompt | Slides

Required readings

Optional readings

Week 6

M May 4: Action Classification II

Reading Prompt | Slides

Due (by 1:30pm): Final Project: Proposal

Required readings

Optional readings

W May 6: Processing Faces

Reading Prompt | Slides

Required readings
Deformable Model Fitting by Regularized Landmark Mean-Shifts. Saragih et al. IJCV 2011. (pdf)

Optional readings
Detecting face landmarks using deep learning, Blendshape models, Pose estimation.

Week 7

M May 11: Bringing Portraits to Life

Reading Prompt | Slides

Required readings
Bringing Portraits to Life. Averbuch-Elor et al. SIGGRAPH Asia 2017. (pdf)

W May 13: Transcript-Based Manipulation and Browsing

Reading Prompt | Slides

Required readings
Tools for Placing Cuts and Transitions in Interview Video. Berthouzoz et al. SIGGRAPH 2012. (pdf)

Week 8

M May 18: Automated Film Editing

Reading Prompt | Slides

Required readings
Computational Video Editing for Dialogue-Driven Scenes. Leake et al. SIGGRAPH 2017. (pdf)

Optional readings
QuickCut: An Interactive Tool for Editing Narrated Video. Truong et al. UIST 2016. (pdf)

W May 20: Generative Adversarial Networks

Reading Prompt | Slides

Required readings
Image-to-Image Translation with Conditional Adversarial Networks. Isola et al. CVPR 2017. (web)

Optional readings
Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Zhu et al. ICCV 2017. (web)
Video-to-Video Synthesis. Wang et al. NeurIPS 2018. (web)

Week 9

M May 25: Memorial Day Holiday (No Class)

W May 27: Learning Slow Motion

Reading Prompt | Slides

Required readings
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation. Jiang et al. CVPR 2018. (web)

Optional readings
Video Frame Synthesis using Deep Voxel Flow. Liu et al. ICCV 2017. (arXiv)

Week 10

M Jun 1: GANs for Faces and Pose

Reading Prompt | Slides

Required readings
Deep Video Portraits. Kim et al. SIGGRAPH 2018. (web)

Optional readings
Everybody Dance Now. Chan et al. 2018. (web)

W Jun 3: Final Project Presentations (10 minute presentation from each group)

Due: Final Project: Presentation

M Jun 8: No class, but final project due

Due (by 11:59pm): Final Project: Slides, Code and Paper

Teaching Staff

Instructor: Maneesh Agrawala
Office Hours: TBD
Instructor: Ohad Fried
Office Hours: TBD
Instructor: Juan Carlos Niebles
Office Hours: TBD

To contact us please use Piazza. This is the fastest way to get a response.

Assignments and Requirements

Class Participation (15%)
Paper Presentation (15%)
Assignment 1: Manual Manipulation (5%)
Assignment 2: Video Morphing (15%)
Final Project (50%)

Attendance Requirement: This course relies on you reading the assigned papers and participating in the discussions. Therefore, attendance is mandatory.

Plagiarism Policy: Assignments should consist primarily of your original work. Building off of others’ work, including 3rd party libraries, public source code examples, and design ideas, is acceptable and in most cases encouraged. However, failure to cite such sources will result in score deductions proportional to the severity of the oversight.

COMPUTATIONAL VIDEO MANIPULATION
