MediaFlow Functions
Media Application Programming Interface, Interval Research Corporation, 1996 - 1999
  • There are two kinds of functions in MediaFlow
    • C++ functions called in the usual C++ way
    • MediaFlow function objects (IFunctions)
  • A MediaFlow app can use either or both kinds
  • MediaFlow function objects
    • are intended to support scripting or a visual programming interface
    • can be created automatically from C++ functions
    • let you specify the inputs to a function call one-by-one
    • perform type-checking at run-time, not compile-time
    • let you retrieve the type metadata for each input
    • serve as the function nodes in dependency graphs
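
The function-object behavior listed above can be sketched in standard C++17. This is a minimal illustration, not MediaFlow's IFunction interface: the names FunctionObject, setInput, inputType, call, scale, and makeScaleFunction are all hypothetical, and the wrapper is written by hand rather than generated automatically.

```cpp
#include <any>
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function object; not the real IFunction API.
class FunctionObject {
public:
    using Body = std::function<std::any(const std::vector<std::any>&)>;

    FunctionObject(std::vector<std::type_index> inputTypes, Body body)
        : types_(std::move(inputTypes)), args_(types_.size()), body_(std::move(body)) {}

    std::size_t inputCount() const { return types_.size(); }

    // Type metadata for one input, available at run time.
    std::type_index inputType(std::size_t i) const { return types_.at(i); }

    // Supply a single input; its type is checked at run time, not compile time.
    void setInput(std::size_t i, std::any value) {
        if (std::type_index(value.type()) != types_.at(i))
            throw std::invalid_argument("input has the wrong type");
        args_[i] = std::move(value);
    }

    // Invoke the wrapped function once every input has been supplied.
    std::any call() const {
        for (const std::any& a : args_)
            if (!a.has_value()) throw std::logic_error("an input is unset");
        return body_(args_);
    }

private:
    std::vector<std::type_index> types_;
    std::vector<std::any> args_;
    Body body_;
};

// An ordinary C++ function, callable in the usual C++ way.
inline double scale(double value, int factor) { return value * factor; }

// Hand-written wrapper that exposes scale() as a function object.
inline FunctionObject makeScaleFunction() {
    return FunctionObject(
        {typeid(double), typeid(int)},
        [](const std::vector<std::any>& a) -> std::any {
            return scale(std::any_cast<double>(a[0]), std::any_cast<int>(a[1]));
        });
}
```

A caller would create the object with makeScaleFunction(), set input 0 to a double and input 1 to an int in any order, and then invoke call(). Passing a double for input 1 throws at run time, which is the trade-off a scripting front end accepts in exchange for late binding.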

Streams
  • All time-varying data are represented as streams.
  • A stream is a collection of homogeneous elements, each with an associated time span [start-time, stop-time]
  • Streams for different media types have different types of elements
    • video streams have images as elements
    • audio streams have audio chunks as elements
  • You can make a stream with any element type
  • Stream iterators allow you to position based on time
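
The stream model above can be sketched as a small template. The Element and Stream types and the seek method are hypothetical names chosen for illustration, not MediaFlow classes. The sketch also assumes that elements are appended in time order and treats each span as half-open, [start, stop), so that adjacent elements do not both claim the boundary time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical element type: a value plus its time span in seconds.
template <typename T>
struct Element {
    double start;
    double stop;
    T value;
};

// Hypothetical stream: homogeneous elements kept in time order.
template <typename T>
class Stream {
public:
    // Append an element; spans must be non-empty and must not overlap.
    void append(double start, double stop, T value) {
        assert(start < stop);
        assert(elements_.empty() || elements_.back().stop <= start);
        elements_.push_back({start, stop, std::move(value)});
    }

    std::size_t size() const { return elements_.size(); }
    const Element<T>& at(std::size_t i) const { return elements_.at(i); }

    // Position by time: index of the element whose span contains t,
    // or size() if no element covers t.
    std::size_t seek(double t) const {
        // First element whose stop time lies strictly after t.
        auto it = std::upper_bound(
            elements_.begin(), elements_.end(), t,
            [](double time, const Element<T>& e) { return time < e.stop; });
        if (it == elements_.end() || t < it->start) return size();
        return static_cast<std::size_t>(it - elements_.begin());
    }

private:
    std::vector<Element<T>> elements_;
};
```

Because the elements are ordered, positioning is a binary search rather than a scan. Any element type works: Stream<int> for testing, or a stream of images or audio chunks.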

Live Streams vs Stored Streams
  • MediaFlow has two kinds of streams: live & stored
  • Stored streams have elements that can be retrieved at any time during execution of an app
  • Streams that represent media files are stored streams
  • Live streams have elements that depend on live information (information received during app execution)
  • Streams that represent input devices such as microphones and video cameras are live streams

Images and Video
  • MediaFlow provides an abstract Image type and a set of image processing functions.
  • The Image type is fairly general and fast
  • Video stream = stream whose elements are images
  • Any image function can also be applied to video
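
One way to read "any image function can also be applied to video" is that an image function is mapped over the frames of a stream while the time spans are left alone. The sketch below assumes that reading. Image, Frame, VideoStream, changeBrightness, and applyToVideo are hypothetical names, and the 8-bit grayscale image is a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical minimal types: an 8-bit grayscale image and a timed frame.
struct Image {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};
struct Frame {
    double start;
    double stop;
    Image image;
};
using VideoStream = std::vector<Frame>;

// An image function: add delta to every pixel, clamped to [0, 255].
inline Image changeBrightness(const Image& in, int delta) {
    Image out = in;
    for (std::uint8_t& p : out.pixels)
        p = static_cast<std::uint8_t>(std::clamp(static_cast<int>(p) + delta, 0, 255));
    return out;
}

// Lift an image function to video: apply it to each frame's image
// and keep each frame's time span unchanged.
inline VideoStream applyToVideo(const VideoStream& video,
                                const std::function<Image(const Image&)>& fn) {
    VideoStream out;
    out.reserve(video.size());
    for (const Frame& f : video) out.push_back({f.start, f.stop, fn(f.image)});
    return out;
}
```

For example, applyToVideo(video, [](const Image& im) { return changeBrightness(im, 20); }) brightens every frame by 20 levels.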

Video/Image Transformation Functions
  • Zoom, Crop, Scale, Rotate
  • Change brightness
  • Add white noise
  • Clear to a constant
  • Multiply by a constant
  • Apply a median filter

Image Combiners
  • Average weighted images together
  • Composite images
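
The two combiners can be illustrated on 8-bit grayscale buffers. The slides do not say how MediaFlow defines either operation, so this sketch makes two assumptions: averaging is a normalized weighted sum with non-negative weights, and compositing is the common "over" blend driven by a per-pixel alpha mask. Gray, weightedAverage, and compositeOver are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical image representation: row-major 8-bit grayscale pixels.
using Gray = std::vector<std::uint8_t>;

// Weighted average: out = sum(w_i * img_i) / sum(w_i), rounded to nearest.
inline Gray weightedAverage(const std::vector<Gray>& images,
                            const std::vector<double>& weights) {
    assert(!images.empty() && images.size() == weights.size());
    double total = 0.0;
    for (double w : weights) {
        assert(w >= 0.0);
        total += w;
    }
    assert(total > 0.0);
    Gray out(images[0].size());
    for (std::size_t p = 0; p < out.size(); ++p) {
        double acc = 0.0;
        for (std::size_t i = 0; i < images.size(); ++i) {
            assert(images[i].size() == out.size());
            acc += weights[i] * images[i][p];
        }
        out[p] = static_cast<std::uint8_t>(std::lround(acc / total));
    }
    return out;
}

// "Over" composite: alpha 255 shows the foreground, alpha 0 the background.
inline Gray compositeOver(const Gray& fg, const Gray& bg, const Gray& alpha) {
    assert(fg.size() == bg.size() && fg.size() == alpha.size());
    Gray out(fg.size());
    for (std::size_t p = 0; p < out.size(); ++p) {
        const int a = alpha[p];
        // Integer blend with rounding; the result always fits in 8 bits.
        out[p] = static_cast<std::uint8_t>((fg[p] * a + bg[p] * (255 - a) + 127) / 255);
    }
    return out;
}
```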

Image Calculations and Alpha-Channel-Specific Functions
  • Test for equality
  • Compute the average luminance
  • Count the frequency of pixels of a given color
  • Compute the motion energy between two images
  • Blur, dilate, or erode, in alpha, the edges of the foreground
  • Add or remove alpha channels
  • Set the color of pixels with a given alpha
  • Set the alpha of pixels of a given color
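
A few of these calculations can be sketched on a simple RGBA buffer. The definitions are assumptions, since the slides give none: luminance uses the Rec. 601 luma weights, and motion energy is taken to be the mean absolute luminance difference between two same-sized images. All type and function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pixel and image types, for illustration only.
struct Pixel {
    std::uint8_t r, g, b, a;
};
using Image = std::vector<Pixel>;

// Colors match when their RGB parts match; alpha is ignored.
inline bool sameColor(const Pixel& x, const Pixel& y) {
    return x.r == y.r && x.g == y.g && x.b == y.b;
}

// Luma from RGB with the Rec. 601 weights (an assumed definition).
inline double luminance(const Pixel& p) {
    return 0.299 * p.r + 0.587 * p.g + 0.114 * p.b;
}

inline double averageLuminance(const Image& im) {
    assert(!im.empty());
    double sum = 0.0;
    for (const Pixel& p : im) sum += luminance(p);
    return sum / static_cast<double>(im.size());
}

// Count the pixels that have the given color.
inline std::size_t countColor(const Image& im, const Pixel& color) {
    std::size_t n = 0;
    for (const Pixel& p : im)
        if (sameColor(p, color)) ++n;
    return n;
}

// Assumed definition: mean absolute luminance difference per pixel.
inline double motionEnergy(const Image& a, const Image& b) {
    assert(a.size() == b.size() && !a.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += std::fabs(luminance(a[i]) - luminance(b[i]));
    return sum / static_cast<double>(a.size());
}

// Set the alpha of every pixel that has the given color.
inline void setAlphaForColor(Image& im, const Pixel& color, std::uint8_t alpha) {
    for (Pixel& p : im)
        if (sameColor(p, color)) p.a = alpha;
}
```

Setting the alpha of a key color to zero is the basic step behind a chroma-key style matte, which the edge blur, dilate, and erode operations would then refine.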

Video Functions
  • Resample a video stream to achieve constant frame rate (good cleanup for time-warped video)
  • "Shake" a video stream based on a numeric stream
  • Segment video via background subtraction
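
Resampling to a constant frame rate can be sketched as sample-and-hold: each output frame takes the value of the latest input element that has started by the output frame's start time. This is one plausible strategy, not necessarily the one MediaFlow uses. Element and resampleConstantRate are hypothetical names, and the input is assumed to be sorted by start time.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stream element: a value plus its time span in seconds.
template <typename T>
struct Element {
    double start;
    double stop;
    T value;
};

// Produce evenly spaced elements of length 1/fps that cover the input's
// overall time range, holding the most recent input value in each slot.
template <typename T>
std::vector<Element<T>> resampleConstantRate(const std::vector<Element<T>>& in,
                                             double fps) {
    assert(fps > 0.0);
    std::vector<Element<T>> out;
    if (in.empty()) return out;
    const double t0 = in.front().start;
    const double duration = in.back().stop - t0;
    // The small epsilon guards against losing the last frame to rounding.
    const std::size_t count =
        static_cast<std::size_t>(std::floor(duration * fps + 1e-9));
    std::size_t src = 0;
    for (std::size_t k = 0; k < count; ++k) {
        const double t = t0 + static_cast<double>(k) / fps;
        // Advance to the last input element that has started by time t.
        while (src + 1 < in.size() && in[src + 1].start <= t) ++src;
        out.push_back({t, t0 + static_cast<double>(k + 1) / fps, in[src].value});
    }
    return out;
}
```

With input spans of 0.1 s, 0.5 s, and 0.4 s, as time-warped video might produce, resampling at 4 frames per second yields four equal 0.25 s frames, with the long middle element repeated.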

Audio
  • MediaFlow provides audio data types and a library of audio-processing functions
  • An audio stream is a stream whose elements are audio chunks
  • An audio chunk is an array of samples, roughly analogous to a frame of a video
  • Typical chunk length ranges from 1/30 to 1/2 second
  • MediaFlow provides easy, consistent access to the platform's audio input and output capabilities
  • MediaFlow supports both live and stored audio with the same core set of functions
  • MediaFlow provides a library of functions that analyze and process audio

Why Audio Chunks?
  • Some functions do not make sense for a single amplitude sample
  • Real-world digital audio I/O is often buffered
  • In many applications, time granularity on the order of 1/30th of a second is fine enough

Audio Capture/Recording
  • MediaFlow hides as many details as possible
  • Basic capture as a live audio stream is very easy
  • To record, simply copy audio chunks from a live stream to a stored stream
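
Recording as "copy chunks from a live stream to a stored stream" can be sketched as a short loop. Everything here is hypothetical: LiveSource stands in for a live audio stream, FakeMicrophone simulates an input device by producing silent chunks, and StoredStream is just a vector. The real MediaFlow capture API is not shown in these slides.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical audio chunk: an array of samples plus its time span.
struct AudioChunk {
    double start;
    double stop;
    std::vector<float> samples;
};

// Hypothetical live stream: each chunk is only available as it arrives.
class LiveSource {
public:
    virtual ~LiveSource() = default;
    // Returns the next chunk, or std::nullopt once the device has stopped.
    virtual std::optional<AudioChunk> read() = 0;
};

// Simulated input device that produces a fixed number of silent chunks.
class FakeMicrophone : public LiveSource {
public:
    FakeMicrophone(int sampleRate, int chunkSamples, int chunkCount)
        : sampleRate_(sampleRate), chunkSamples_(chunkSamples), chunkCount_(chunkCount) {
        assert(sampleRate > 0 && chunkSamples > 0 && chunkCount >= 0);
    }

    std::optional<AudioChunk> read() override {
        if (produced_ == chunkCount_) return std::nullopt;
        AudioChunk c;
        c.start = static_cast<double>(produced_) * chunkSamples_ / sampleRate_;
        c.stop = static_cast<double>(produced_ + 1) * chunkSamples_ / sampleRate_;
        c.samples.assign(static_cast<std::size_t>(chunkSamples_), 0.0f);
        ++produced_;
        return c;
    }

private:
    int sampleRate_;
    int chunkSamples_;
    int chunkCount_;
    int produced_ = 0;
};

// Hypothetical stored stream: chunks can be retrieved again at any time.
using StoredStream = std::vector<AudioChunk>;

// Record by copying every chunk from the live stream into the stored one.
inline void record(LiveSource& live, StoredStream& stored) {
    while (auto chunk = live.read()) stored.push_back(std::move(*chunk));
}
```

After record() returns, the stored stream holds every chunk with its original time span, so the recording can be replayed or processed repeatedly, which the live source alone would not allow.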

Summary of Audio Functions
  • Mixing and Routing
  • Amplitude / Dynamics
    • AudioAmplify, AudioGate, AudioLimitPeaks
    • AudioNormalizeLevels (simple version), AudioNormalizeLevels2 (more control)
    • AudioRampLevels (apply a simple envelope)
  • Simple Analysis
    • DetectSounds, ChunkedVolumeStream, SmoothVolumeStream, AudioStreamGetBias
    • AudioMinVolRMS, AudioMaxVolRMS, AudioMaxVolPeak
  • Spectral processing
  • Temporal manipulations
  • Feature extraction
    • speech properties (phoneme, pitch, etc.), event recognition
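
To make the amplitude and analysis categories concrete, here is a sketch of the kind of per-chunk computation that functions such as AudioAmplify, AudioLimitPeaks, and AudioMaxVolRMS might perform. The slides do not give their signatures or exact semantics, so the functions below use different, hypothetical names and assume floating-point samples in a plain vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical chunk representation: floating-point samples.
using Chunk = std::vector<float>;

// Root-mean-square volume of one chunk (0 for an empty chunk).
inline double rmsVolume(const Chunk& c) {
    if (c.empty()) return 0.0;
    double sum = 0.0;
    for (float s : c) sum += static_cast<double>(s) * s;
    return std::sqrt(sum / static_cast<double>(c.size()));
}

// Peak volume: the largest absolute sample value in the chunk.
inline double peakVolume(const Chunk& c) {
    double peak = 0.0;
    for (float s : c) peak = std::max(peak, std::fabs(static_cast<double>(s)));
    return peak;
}

// Multiply every sample by a constant gain.
inline Chunk amplify(const Chunk& c, float gain) {
    Chunk out(c);
    for (float& s : out) s *= gain;
    return out;
}

// Hard-limit every sample to the range [-ceiling, ceiling].
inline Chunk limitPeaks(const Chunk& c, float ceiling) {
    assert(ceiling >= 0.0f);
    Chunk out(c);
    for (float& s : out) s = std::clamp(s, -ceiling, ceiling);
    return out;
}

// Largest RMS volume over all chunks of a stream.
inline double maxRmsVolume(const std::vector<Chunk>& stream) {
    double best = 0.0;
    for (const Chunk& c : stream) best = std::max(best, rmsVolume(c));
    return best;
}
```

RMS and peak are both defined over a window of samples, which is one reason chunk-level granularity suits these functions better than single amplitude samples.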