Ski Montage - Interval Research

Ski Montage
Short Film, Interval Research Corporation, 1998

Ski Montage addresses the question of how to make an interesting and compelling short film out of a large quantity of source material in a straightforward algorithmic fashion. The subject matter and style of the material are influenced by Interval's MarkerCam technology, and the video segment selection, editing and assembly are influenced by Kinetica's Adaptive Media Template concepts, currently embodied in our MediaFlow tool set.

Source Footage

The footage for the Ski Montage was recorded on a skiing expedition in the wilderness of the Sierra Nevada mountains in February 1998. The group snowshoed up one side of a mountain and then skied or snowboarded down the other, recording the whole journey using lightweight video packs and cameras mounted on the foreheads of two of the skiers in the party. The result was two tapes, each documenting the first-person point of view of one member of the party. Each tape was a single, continuous, two-hour shot.

Because of the character of the footage, the style of the Montage is distinct from other ski videos, emphasizing the first-person experience and the flow of the activity. I like the sense of flow that tends to be present in Vertov footage: the first-person POV, and the singular focus on the activity and the environment. It's very meditative and compelling to watch.

Template Development

Part of my motivation for working with this footage was to use it for the development of Media Lego applications in MediaFlow. This work is based partly on the Adaptive Media Templates from 1997. The general concept of a Template is that it can select bits of footage based on certain criteria, match them against fitness parameters specified by the template author, and automatically generate a short film.

For the Ski Montage Template, the premise is that the raw footage tells a story, and that it is the story of a journey. There may in fact be several interesting stories in the footage, and many different criteria could be used to select the footage to be used. The major challenge here is to sift the interesting bits out of hours of raw footage and present them in a compelling way. This is an important problem in dealing with all manner of video material, including footage collected from Vertov cams as well as ordinary home videos. One goal of this work is to develop a generalized template or set of templates.

In this case the journey is basically up one side of the mountain and down the other, and I have characterized it as having three main phases. Phase one is the trip up the mountain, cross-country skiing. Phase two is the time spent at the top of the mountain. Phase three is the trip down the mountain, downhill skiing. These phases constitute the top-level structure of the Template, so the abstract structure of the Template corresponds to the actual experience documented in the footage. Within each section, further constraints can be added to specify the shots that provide the flow of the action.
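
The three-phase structure described above could be written down as a small data structure. The following is only a sketch, not MediaFlow's actual template format: the Section class, the weight field, the section names, and the 2:1:2 weighting are all hypothetical, and the motion labels borrow the forward and panoramic flow types discussed under Analysis Phase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section:
    name: str     # hypothetical label for one phase of the journey
    motion: str   # kind of image flow to look for: "forward" or "panoramic"
    weight: int   # hypothetical relative share of the running time (0 omits the section)


# Assumed 2:1:2 weighting; the source does not give the real proportions.
SKI_TEMPLATE = [
    Section("ascent", "forward", 2),
    Section("summit", "panoramic", 1),
    Section("descent", "forward", 2),
]


def section_durations(template, total_seconds):
    """Split a target running time across sections in proportion to their weights."""
    total_weight = sum(s.weight for s in template)
    if total_weight <= 0:
        raise ValueError("at least one section needs a positive weight")
    return {s.name: total_seconds * s.weight / total_weight for s in template}
```

With these assumed weights, section_durations(SKI_TEMPLATE, 60) allots 24 seconds to the ascent, 12 to the summit, and 24 to the descent. Setting a weight to zero drops that section from the montage.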

An important goal of this experiment is to use MediaFlow to automatically select the footage to be included in the montage. After examining several possibilities, including using footage from multiple cameras to alternate between multiple skiers and points of view, I decided to concentrate on the footage from one camera, exclusively in the first person, and use the sections of the video where the skier was moving in particular ways that might be detected by our video motion analysis algorithms.

Analysis Phase

Before the video can be synthesized, the input footage must be analyzed. This was done through a combination of Vertov marks recorded in the field and a process similar to traditional video logging. In the future, GPS data could be used to determine the skier's velocity and correlate the phases of the journey to the footage.

The selected segments of footage are characterized by their image flow as the skier moves through space. Two major kinds of image flow are isolated. One is forward flow, which is the typical state when the skier is moving forward and looking ahead. The other is panoramic, which is the typical state when the skier is stopped and is looking around at his surroundings. Changes in the rate of motion flow may also provide useful clues to the content of the action. For example, a sudden drop in the motion level may indicate a crash. The algorithm ranked segments of footage according to their fit with "idealized" motion flow characteristics, and then filled up the bin until a global threshold for the total time had been reached.
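
The rank-and-fill step described above might look like the following. This is a minimal sketch under stated assumptions, not the algorithm actually used: the Segment fields, the squared-distance fit score, and the choice to stop as soon as the running total reaches the threshold (which can overshoot by part of one segment) are all illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float     # seconds from the start of the source tape
    duration: float  # seconds
    forward: float   # hypothetical forward-flow measure in [0, 1]
    pan: float       # hypothetical panoramic-flow measure in [0, 1]


def fit(segment, ideal_forward, ideal_pan):
    """Score a segment against an idealized flow profile; higher is better.

    Assumed metric: negative squared distance to the ideal values.
    """
    return -((segment.forward - ideal_forward) ** 2
             + (segment.pan - ideal_pan) ** 2)


def fill_bin(segments, ideal_forward, ideal_pan, max_total):
    """Rank segments by fit, then take them best-first until the
    accumulated duration reaches max_total seconds."""
    ranked = sorted(segments,
                    key=lambda s: fit(s, ideal_forward, ideal_pan),
                    reverse=True)
    chosen, total = [], 0.0
    for seg in ranked:
        if total >= max_total:
            break
        chosen.append(seg)
        total += seg.duration
    return chosen
```

For a forward-flow section one would call fill_bin with an ideal profile such as (1.0, 0.0), and for a panoramic section (0.0, 1.0). The result is in rank order, so it still needs to be put back into tape order before assembly.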

Synthesis Phase

The synthesis algorithm of the Ski Montage is relatively simple. Once the shots are selected, they are sequenced together in the order in which they appeared on the source tape. The first section is a sequence of forward-looking shots ascending the mountain, the second section is panning panoramas of the vistas at the mountain top, and the third is forward-looking shots skiing down the mountain. Instead of cuts, cross-dissolves are used between shots to imply a temporal ellipsis as well as to impart smoothness. The system tries to match motion continuity across dissolves.
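
As a rough illustration of the sequencing step, the sketch below orders shots by their position on the source tape and overlaps each adjacent pair by the length of a cross-dissolve. The tuple layout, the function name sequence_shots, and the one-second default dissolve are assumptions; motion matching across dissolves is not modeled.

```python
def sequence_shots(shots, dissolve=1.0):
    """Lay shots out on an output timeline in source-tape order.

    shots: iterable of (source_start, duration) pairs, in seconds.
    Returns (timeline, total), where timeline is a list of
    (output_start, source_start, duration) and total is the running
    time of the finished montage.
    """
    ordered = sorted(shots)  # tape order: sort by source_start
    timeline, cursor = [], 0.0
    for source_start, duration in ordered:
        if duration <= dissolve:
            raise ValueError("each shot must be longer than the dissolve")
        timeline.append((cursor, source_start, duration))
        # The next shot begins while this one is still fading out.
        cursor += duration - dissolve
    total = cursor + dissolve if ordered else 0.0
    return timeline, total
```

Because adjacent shots overlap, the running time is the sum of the shot durations minus one dissolve length per transition.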

The duration of each segment is set by the template and is partially dependent on the number and duration of shots that the system determines to meet its motion-flow criteria. The Template gives the montage a lot of flexibility with respect to parameters such as overall duration, length and number of clips, and pacing of the cuts.

Finally, a musical soundtrack is mixed with the ambient sound that accompanies the video. Note that the Ski Montage presented here consists only of the third, downhill section.

Future Directions

Within the basic framework described above are a number of variables that can make a large difference in the resulting viewing experience and the overall impression of the content. These include the overall duration of the montage, and the relative duration (or presence) of the three sections, as well as the pacing and number of shots within a segment. The tempo and energy level of the music can also alter the experience.

It would be interesting to make the Ski Montage experience into an interactive presentation, with realtime rather than scripted control over the composition parameters. The pacing of the cutting could be faster or slower, or matched to the music, and the montage could interleave different ratios of forward flow and panoramas, depending on which aspects of the experience the viewer wishes to highlight.