This automatic music scoring application is authored in MediaCalc/MediaFlow. It combines clips of orchestrated movie music from an annotated database, adapting them to the movie story context, and to their neighbor clips in that context, using simple matching and continuity rules.
In contrast to 'from scratch' algorithmic composition, this more resembles a professional moviemaking practice whereby, before an original score for the new movie has been written, the composer/editor spots a movie edit draft using elements of a functionally similar score. It also resembles the collage-like reuse and rearrangement of stock phrases that Carl Stalling employed for the Looney Tunes cartoon scores. A phrase in the database is annotated with its typical story function and related parameters like playback tempo range (so, for instance, a 'maudlin' music clip won't be spoiled by being played too fast), along with representations like the underlying harmony and tempo that let the system enforce continuity rules by default.
Figure 1 shows a MIDI music score being annotated in MediaCalc for later use as clips in movie scores. Three streams are shown: a stored MIDI music stream and two Annotation streams that refer to it, as indicated by the flow from the spigot to the annotation operations. Wire connections going off screen indicate the presence of more annotation streams below, dependent on the music stream.
The free-form annotation streams let a user define properties of various temporal extents within the music. Each annotation stream defines a clip with a main body, an optional head and tail, a beat interval defining the tempo, a harmonic root and chord quality, and a story function such as 'opening' (for introducing a scene) or 'charge' (for part of a battle sequence).
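As a concrete illustration, the fields above could be gathered into a record like the following. This is only a sketch: the field names and types are illustrative, not MediaCalc's actual annotation schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class ClipAnnotation:
    """Illustrative clip annotation record (hypothetical names)."""
    body: Tuple[float, float]                      # main body extent (start, end) in seconds
    head: Optional[Tuple[float, float]] = None     # optional lead-in extent
    tail: Optional[Tuple[float, float]] = None     # optional ring-out extent
    beat_interval: float = 0.5                     # seconds per beat; defines the native tempo
    tempo_range: Tuple[float, float] = (0.8, 1.2)  # allowed playback tempo scaling
    harmonic_root: str = "C"                       # root of the underlying harmony
    chord_quality: str = "major"                   # major, minor, diminished, augmented...
    story_function: str = "opening"                # e.g. "opening", "charge", "maudlin"

    @property
    def bpm(self) -> float:
        # native tempo implied by the annotated beat interval
        return 60.0 / self.beat_interval

# a slow 'maudlin' clip whose narrow tempo_range keeps it from being rushed
maudlin = ClipAnnotation(body=(0.0, 24.0), beat_interval=1.0,
                         tempo_range=(0.9, 1.05),
                         harmonic_root="A", chord_quality="minor",
                         story_function="maudlin")
print(maudlin.bpm)  # 60.0
```

The narrow `tempo_range` on the example clip encodes the rule mentioned above: a maudlin clip annotated this way can never be scheduled at a tempo that would spoil it.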
Figure 2 shows the clip database being used to automatically score a movie scene from a Story Action stream. The top two streams show stored audio and video tracks for an AutoBuddy action movie scene, automatically edited from actual game play using algorithms developed by my team in 1996. Below these is a manually constructed Story Action stream describing the action.
Such streams can be computed automatically too. For example, in AutoBuddy we know when we are firing or being fired upon, accelerating, etc., and can use this information to compute story actions like Charge, Firefight, Retreat, or a Pastoral moment during a lull in fire. A StoryToMusic matching function then finds clips with appropriate story actions and extents, adjusting tempo or looping as needed to fit the context, and transposing key if needed to fit with an adjacent clip.
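The matching step described above might be sketched as follows. This is a hypothetical reading of StoryToMusic, not the actual MediaCalc function: the clip fields, the looping/stretching policy, and the nearest-key transposition are all assumptions for illustration.

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def story_to_music(clips, action, duration, prev_root=None):
    """Pick and adapt one clip for a story action of the given duration.

    clips: list of dicts with 'function', 'duration' (seconds),
    'tempo_range' (lo, hi scaling), and 'root' (key). Illustrative schema.
    """
    candidates = [c for c in clips if c["function"] == action]
    if not candidates:
        return None
    # prefer the clip whose native length needs the least adaptation
    clip = min(candidates, key=lambda c: abs(duration - c["duration"]))
    # loop whole repetitions, then scale tempo to absorb the remainder,
    # clamped to the clip's annotated playable range
    repeats = max(1, round(duration / clip["duration"]))
    lo, hi = clip["tempo_range"]
    tempo = min(hi, max(lo, repeats * clip["duration"] / duration))
    # transpose toward the previous clip's key by the shorter direction
    semitones = 0
    if prev_root is not None:
        delta = (NOTES.index(prev_root) - NOTES.index(clip["root"])) % 12
        semitones = delta - 12 if delta > 6 else delta
    return {"clip": clip, "repeats": repeats, "tempo": tempo,
            "transpose": semitones}

clips = [{"function": "charge", "duration": 8.0,
          "tempo_range": (0.8, 1.25), "root": "D"}]
plan = story_to_music(clips, "charge", 15.0, prev_root="C")
print(plan["repeats"], round(plan["tempo"], 3), plan["transpose"])  # 2 1.067 -2
```

Here a 15-second Charge is covered by two loops of an 8-second clip played slightly fast, transposed down two semitones to meet the preceding clip's key.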
Underlying this, a MusicFromAnnotation function dependent on the clip stream renders the chosen music clips into a music stream that will accompany the video and audio streams it scores, shown below the MIDI view.
A deeper treatment of this technology should explore qualitative aspects like different kinds of harmonic motion. Ascent by semitones adds suspense. Descent by semitones can release that tension, or grow ominous if it continues. Motion up by fourths is not jarring, but provides natural harmonic motion, in contrast to the current static 'harmonic continuity' defaults. Users, and automatic scoring systems, need comprehensible ways to manipulate these parameters.
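The qualitative observations above could be tabulated as motion rules that a scoring system consults. A minimal sketch, with the effect labels taken from the descriptions in this section and the function name invented for illustration:

```python
def motion_effect(semitones_up):
    """Qualitative effect of moving the harmonic root up by some semitones
    (illustrative labels; descent by one semitone is +11 mod 12)."""
    interval = semitones_up % 12
    if interval == 0:
        return "static (the current 'harmonic continuity' default)"
    if interval == 1:
        return "ascent by semitone: adds suspense"
    if interval == 11:   # equivalent to descending one semitone
        return "descent by semitone: releases tension, ominous if continued"
    if interval == 5:    # up a fourth
        return "up a fourth: natural, unjarring harmonic motion"
    return "other motion: effect depends on context"

print(motion_effect(5))   # up a fourth: natural, unjarring harmonic motion
print(motion_effect(-1))  # descent by semitone: releases tension, ominous if continued
```

A rule table like this gives both users and an automatic scorer the kind of comprehensible handle on harmonic motion that the text calls for.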
Emotional components of elements like harmony must be built from a larger framework. Musical streams like MIDI scores can be analyzed into descriptive layers representing traditional music analysis parameters. Chords are heard as stable or unstable, and harmonic motion conveys widely shared emotional meanings. The chart below presents the classic first-order emotional model of triadic chord structures that we subconsciously share:
Facial Expression | Basic Emotion | Chord Quality
(image)           | Happy         | Major
(image)           | Sad           | Minor
(image)           | Concerned     | Diminished
(image)           | Surprised     | Augmented
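The chart above can be expressed as a lookup from a triad's interval structure to its chord quality and basic emotion. The triad spellings (in semitones above the root) are standard; the emotion labels come from the chart.

```python
# quality -> (semitones above root, basic emotion from the chart)
TRIADS = {
    "major":      ((0, 4, 7), "happy"),
    "minor":      ((0, 3, 7), "sad"),
    "diminished": ((0, 3, 6), "concerned"),
    "augmented":  ((0, 4, 8), "surprised"),
}

def classify_triad(intervals):
    """Return (chord quality, basic emotion) for a triad shape, else None."""
    for quality, (shape, emotion) in TRIADS.items():
        if tuple(intervals) == shape:
            return quality, emotion
    return None

print(classify_triad((0, 3, 6)))  # ('diminished', 'concerned')
```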
This approach can be taken with other continuity parameters, and each can be treated more deeply. In the world of rhythm, one can apply a swing or a waltz feel to a melody by oscillating its tempo, and thus modulate its emotional effect or subtly provide continuity with a movie story's era. One theme may go through many such transformations in the course of a movie.
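Swinging a melody by oscillating tempo might look like the following minimal sketch: straight eighth-note onsets (in beats) are warped so each beat's second half arrives late, giving the familiar 2:1 triplet feel at a ratio of 2/3. The function name and representation are assumptions for illustration.

```python
def swing(onsets, ratio=2/3):
    """Warp straight eighth-note onsets (in beats) into swung onsets.

    Offbeat eighths (at x.5) are pushed to x + ratio; other onsets pass
    through unchanged. ratio=2/3 gives the classic triplet swing feel.
    """
    warped = []
    for t in onsets:
        beat, frac = divmod(t, 1.0)
        if abs(frac - 0.5) < 1e-9:   # an offbeat eighth note
            frac = ratio             # delay it within its beat
        warped.append(beat + frac)
    return warped

straight = [0.0, 0.5, 1.0, 1.5]
print(swing(straight))  # offbeats land at 2/3 of each beat
```

Because the warp is a pure onset-time oscillation, the same melody data can pass through it, a waltz warp, or no warp at all, which is what lets one theme take many rhythmic guises over a movie.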
We aim to discover scoring genres that, while rooted in the great widely-known movie scoring traditions, exploit things that work reliably and which our software finds easy to do. We are especially intrigued by melodic and orchestral continuity rules: when the last note of one melody is also the first note of another one, and the instrument is the same, you can change melodies wildly and still seamlessly get away with it. This is the kind of discovery Carl Stalling made in the Looney Tunes, and with it he helped create a more exciting genre of movie scoring than we had before.
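The Stalling-style splice rule above reduces to a simple check that software can apply before chaining two melodies. A sketch with a hypothetical melody representation (MIDI pitch lists plus an instrument name):

```python
def can_splice(a, b):
    """True when melody a can segue seamlessly into melody b:
    a ends on the note b begins with, on the same instrument.
    (Illustrative representation: dicts of MIDI pitches + instrument.)"""
    return (a["notes"][-1] == b["notes"][0]
            and a["instrument"] == b["instrument"])

fanfare = {"notes": [60, 64, 67], "instrument": "trumpet"}
gallop  = {"notes": [67, 65, 64, 62], "instrument": "trumpet"}
waltz   = {"notes": [62, 66, 69], "instrument": "violin"}

print(can_splice(fanfare, gallop))  # True: shared pitch 67, same trumpet
print(can_splice(gallop, waltz))    # False: different instrument
```

A matcher could use this predicate to wander freely through wildly different melodies while the ear hears one continuous line.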