Video Playlist  

Visual media in the form of television programs, online video, and cinematic films have the power to engage people with dynamic presentations of ideas. Expert storytellers design how such media unfolds over time to help audiences make sense of complex concepts, appreciate cultural or societal differences, and imagine living in entirely different worlds. Technological advances have made it cheaper and easier to capture audio-visual media using the video cameras readily available in our mobile and desktop devices. Yet the most viewed videos are not simply raw recordings thrown onto the Web. The best material is carefully composed, filtered, and edited to ensure that the resulting media is clear and engaging.

Nevertheless, today’s tools for authoring and viewing video treat the media as a “baked” stream of audio samples, pixels, and frames – the very lowest-level representation possible. They have no understanding of the higher-level semantic structure of the audio-visual content. Researchers have developed a variety of techniques for extracting such higher-level structure from video and have shown how to use this structure to significantly facilitate analysis, browsing, editing, and manipulation of the raw material.

The goal of this graduate seminar (advanced undergraduates also welcome) is to survey recent work on computational video analysis and manipulation techniques. We will learn how to acquire, represent, edit and remix video. Several popular video manipulation algorithms will be presented, with an emphasis on using these techniques to build practical systems. Students will have the opportunity to acquire their own video and develop the processing tools needed to computationally analyze and manipulate it.

There are no official prerequisites for the course, but we will expect familiarity with the basic concepts of Computer Graphics and/or Computer Vision at the level of CS 148/248 and/or CS 131. Contact me (Maneesh) via email if you are worried about whether you have the background for the course.

Schedule


Week 1
M Mar 30: Introduction/Feature Detection
    Slides
   Assigned: Assignment 1 (due Apr 8 by 1:30pm)
   Optional readings
        Chapter 4.1: Feature Detection and Matching: Points and Patches. Szeliski. 2010. (pdf)
 
W Apr 1: Warping/RANSAC/Morphing
    Reading Prompt | Slides
   Required readings
        Feature-Based Image Metamorphosis. Beier and Neely. SIGGRAPH 1992. (pdf)
        Michael Jackson's Black or White video, morphing sequence. (YouTube)
   Optional readings
        Chapter 2.1: Image Formation: Geometric primitives and transformations. Szeliski. 2010. (pdf)
        Chapter 6.1: Feature-Based Alignment: 2D and 3D Feature-Based Alignment. Szeliski. 2010. (pdf)
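As a preview of the RANSAC topic in this session, here is a minimal sketch of robust model fitting on a toy 2D line-fitting problem. This is an illustrative example only, not code from the course materials; the function name and parameters are our own.

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.05, rng=None):
    """Robustly fit a 2D line y = a*x + b with RANSAC.

    Repeatedly samples a minimal set (two points), fits a candidate
    line, and counts inliers; returns the model with the most inliers.
    """
    rng = rng or np.random.default_rng(0)
    best_inliers, best_model = 0, None
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if x1 == x2:  # degenerate sample; skip
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = np.sum(residuals < thresh)
        if inliers > best_inliers:
            best_inliers, best_model = inliers, (a, b)
    return best_model
```

The same sample-fit-score loop generalizes to homography estimation from feature matches, which is the typical use in image alignment and warping.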
 
Week 2
M Apr 6: Feature Tracking and Video Texture
    Reading Prompt | Slides
   Required readings
        Video Textures. Schodl et al. SIGGRAPH 2000. (pdf)
   Optional readings
        Panoramic Video Textures. Agarwala et al. SIGGRAPH 2005. (pdf)
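The core of the Video Textures paper is a frame-to-frame similarity matrix that identifies visually seamless jump points in a clip. Here is a minimal sketch of that first step (the function name is ours; the paper additionally filters and anticipates future costs):

```python
import numpy as np

def transition_costs(frames):
    """Pairwise L2 distance between all frames of a clip.

    A small entry D[i, j] means frames i and j look alike, so the
    playback can jump between them with little visible discontinuity.
    """
    F = np.stack([np.asarray(f).ravel().astype(float) for f in frames])
    # ||fi - fj||^2 = |fi|^2 + |fj|^2 - 2 fi . fj
    sq = np.sum(F * F, axis=1)
    D2 = sq[:, None] + sq[None, :] - 2.0 * F @ F.T
    return np.sqrt(np.maximum(D2, 0.0))  # clamp tiny negative round-off
```

Low-cost off-diagonal pairs become candidate loop transitions for synthesizing an endless, non-repeating texture.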
 
W Apr 8: Graph-Cut Texture
    Reading Prompt | Slides
   Due (by 1:30pm): Assignment 1
   Assigned: Assignment 2 (due Apr 22 by 1:30pm)
   Required readings
        Graphcut Textures. Kwatra et al. SIGGRAPH 2003. (pdf)
 
Week 3
M Apr 13: Looping in Space and Time
    Reading Prompt | Slides
   Required readings
        Automated Video Looping with Progressive Dynamism. Liao et al. SIGGRAPH 2013. (pdf)
   Optional readings
        Fast Computation of Seamless Video Loops. Liao et al. SIGGRAPH 2015. (pdf)
        Gigapixel Panorama Video Loops. He et al. SIGGRAPH 2018. (pdf)
 
W Apr 15: Stabilization and Segmentation
    Reading Prompt | Slides
   Required readings
       
 
Week 4
M Apr 20: De-Animation and Cinemagraphs
    Reading Prompt | Slides
   Required readings
        Selectively De-Animating Video. Bai et al. SIGGRAPH 2012. (pdf)
   Optional readings
        Automatic Cinemagraph Portraits. Bai et al. EGSR 2013. (pdf)
 
W Apr 22: Structure from Motion
    Reading Prompt | Slides
   Due (by 1:30pm): Assignment 2
   Required readings
        Photo Tourism: Exploring Photo Collections in 3D. Snavely et al. SIGGRAPH 2006. (pdf)
   Optional readings
        Modeling the World from Internet Photo Collections. Snavely et al. IJCV 2007. (pdf)
 
Week 5
M Apr 27: Scene Building
    Reading Prompt | Slides
   Assigned: Final Project: Proposal (due May 4 by 1:30pm)
   Assigned: Final Project: Presentation (due Jun 3)
   Assigned: Final Project: Slides, Code and Paper (due Jun 8 by 11:59pm)
   Required readings
        Sampling Based Scene Space Video Processing. Klose et al. SIGGRAPH 2015. (pdf)
 
W Apr 29: Action Classification I
    Reading Prompt | Slides
   Required readings
       
   Optional readings
       
 
Week 6
M May 4: Action Classification II
    Reading Prompt | Slides
   Due (by 1:30pm): Final Project: Proposal
   Required readings
       
   Optional readings
       
 
W May 6: Processing Faces
    Reading Prompt | Slides
   Required readings
        Deformable Model Fitting by Regularized Landmark Mean-Shifts. Saragih et al. IJCV 2011. (pdf)
   Optional readings
        Detecting face landmarks using deep learning, Blendshape models, Pose estimation.
 
Week 7
M May 11: Bringing Portraits to Life
    Reading Prompt | Slides
   Required readings
        Bringing Portraits to Life. Averbuch-Elor et al. SIGGRAPH Asia 2017. (pdf)
 
W May 13: Transcript-Based Manipulation and Browsing
    Reading Prompt | Slides
   Required readings
        Tools for Placing Cuts and Transitions in Interview Video. Berthouzoz et al. SIGGRAPH 2012. (pdf)
 
Week 8
M May 18: Automated Film Editing
    Reading Prompt | Slides
   Required readings
        Computational Video Editing for Dialogue-Driven Scenes. Leake et al. SIGGRAPH 2017. (pdf)
   Optional readings
        QuickCut: An Interactive Tool for Editing Narrated Video. Truong et al. UIST 2016. (pdf)
 
W May 20: Generative Adversarial Networks
    Reading Prompt | Slides
   Required readings
        Image-to-Image Translation with Conditional Adversarial Nets. Isola et al. CVPR 2017. (web)
   Optional readings
        Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. Zhu et al. ICCV 2017. (web)
        Video-to-Video Synthesis. Wang et al. NeurIPS 2018. (web)
 
Week 9
M May 25: Memorial Day Holiday (No Class)
 
W May 27: Learning Slow Motion
    Reading Prompt | Slides
   Required readings
        Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation. Jiang et al. CVPR 2018. (web)
   Optional readings
        Video Frame Synthesis using Deep Voxel Flow. Liu et al. ICCV 2017. (arXiv)
 
Week 10
M Jun 1: GANs for Faces and Pose
    Reading Prompt | Slides
   Required readings
        Deep Video Portraits. Kim et al. SIGGRAPH 2018. (web)
   Optional readings
        Everybody Dance Now. Chan et al. 2018. (web)
 
W Jun 3: Final Project Presentations (10 minute presentation from each group)
   Due: Final Project: Presentation
 
M Jun 8: No class, but final project due
   Due (by 11:59pm): Final Project: Slides, Code and Paper
 


Teaching Staff


Instructor: Maneesh Agrawala
    Office Hours: TBD
Instructor: Ohad Fried
    Office Hours: TBD
Instructor: Juan Carlos Niebles
    Office Hours: TBD

To contact us please use Piazza. This is the fastest way to get a response.


Assignments and Requirements


Class Participation (15%)
Paper Presentation (15%)
Assignment 1: Manual Manipulation (5%)
Assignment 2: Video Morphing (15%)
Final Project (50%)

Attendance Requirement: This course relies on you reading the assigned papers and participating in the discussions. Therefore, attendance is mandatory.

Plagiarism Policy: Assignments should consist primarily of your original work. Building on others' work (including third-party libraries, public source code examples, and design ideas) is acceptable and in most cases encouraged. However, failure to cite such sources will result in score deductions proportional to the severity of the oversight.