jeremy's neglected site

ffmpeg recipes

2014.02.16 13:46:06

The following are ffmpeg commands that I found useful for video editing (mainly for recorded screencasts and YouTube videos). Each recipe gives a rough description of the command, then my explanation of the command-line options, then the command itself. I assume you will modify the commands to suit your needs.

All commands are based on the Mar 22 2013 08:56:38 build of ffmpeg, which is newer than the version used in a lot of the ffmpeg help material I found on the web. Most of these commands were tested on Windows builds of ffmpeg, even though I often use Unix line continuations (a trailing backslash) to break up long commands.

Concatenate (join) a selection of videos (in different formats and containers) into one final video

This command concatenates and performs a lossy transcoding on the videos. For my purposes this is okay, but please know you will be degrading the quality of your videos by doing this.

The original template for this command came from the ffmpeg wiki page How to concatenate (join, merge) media files.

#    -i 2.mp4 -i 3.mp4 -i 4.mp4 -i 5.mp4 -i 6.mp4
#        The input videos. I have 5 videos in this command.
#        Input order is important when ffmpeg needs to reference
#        the video by index.
#        2.mp4 will be referenced by index 0 because it is first.
#        5.mp4, being the fourth video, is referenced by index 3.
#        ffmpeg uses 0-based indexing (as most nerd things do for
#        historical reasons).
#    -filter_complex "[0:0] [0:1] [1:0] [1:1] [2:0] [2:1] [3:0] [3:1] [4:0] [4:1] concat=n=5:v=1:a=1 [v] [a]"
#        Perform a complex filtering of input to output.
#        Since we know 2.mp4 maps to index of zero, I could describe the
#        above as:
#
#        [0:0] [0:1]
#            Takes the video stream of 2.mp4, denoted as [0:0], and the audio
#            stream of 2.mp4, denoted as [0:1], and includes them in the concat
#            process. These values were true for my videos, although it is
#            possible you will be concatenating different streams than me.
#
#        And with that mapping, we are stating "Take the audio and
#        video streams of all of my input videos and..."
#
#        concat=n=5:v=1:a=1 [v] [a]
#            n=5 will need to be changed to correspond to the number of videos
#            you are concatenating (you have to be explicit). I had 5, so I
#            put 5. I'm telling ffmpeg that there is 1 video stream with v=1
#            and there is 1 audio stream with a=1. These are named options,
#            so their order does not matter; what does matter is the order
#            of the stream labels: video before audio for each input, and
#            [v] before [a] for the outputs.
#            [v] is essentially a variable reference to the video stream
#            that is formed by the concatenation.
#            [a] is essentially a variable reference to the audio stream
#            that is formed by the concatenation.
#    -map "[v]" -map "[a]"
#        Use the concatenated streams for the output, not the original input
#        file streams as is usually assumed by ffmpeg. First stream will be
#        the video, second will be the audio.
#    -c:v libx264
#        Use the codec for h264 video encoding.
#    -crf 23
#        From the ffmpeg x264 encoding wiki:
#        crf stands for constant rate factor. The range of the
#        quantizer scale is 0-51; where 0 is lossless, 23 is
#        default, and 51 is worst possible.
#        Consider 18 to be visually lossless.
#    -preset slow
#        From the ffmpeg x264 encoding wiki:
#        A preset is a collection of options that will provide
#        a certain encoding speed to compression ratio.
#    -c:a aac
#        Use the codec for aac encoding of the audio.
#    -strict experimental
#        The aac encoder is, at the time of writing, considered experimental
#        and is not available by default unless we include the
#        -strict experimental flag.
#    -ac 2
#        Output the audio with 2 channels (stereo).
#    -ar 44100
#        Audio sampling frequency of 44100 Hz.
#    -ab 128k
#        Encode the audio at a bitrate of 128 kbit/s.
#    -threads 0
#        Let ffmpeg decide how many threads to use
#        for encoding (if it even chooses to use more than one).
#    output.mkv
#        Output into the Matroska container format
#        (detected by the file extension).

# Unix friendly command.
ffmpeg -i 2.mp4 -i 3.mp4 -i 4.mp4 -i 5.mp4 -i 6.mp4 \
-filter_complex '[0:0] [0:1] [1:0] [1:1] [2:0] [2:1] [3:0] [3:1] [4:0] [4:1] concat=n=5:v=1:a=1 [v] [a]' \
-map '[v]' -map '[a]' \
-c:v libx264 -crf 23 -preset slow \
-c:a aac -strict experimental -ac 2 -ar 44100 -ab 128k \
-threads 0 output.mkv

# DOS friendly command (because this command was picky about quotes.)
ffmpeg -i 2.mp4 -i 3.mp4 -i 4.mp4 -i 5.mp4 -i 6.mp4 ^
-filter_complex "[0:0] [0:1] [1:0] [1:1] [2:0] [2:1] [3:0] [3:1] [4:0] [4:1] concat=n=5:v=1:a=1 [v] [a]" ^
-map "[v]" -map "[a]" ^
-c:v libx264 -crf 23 -preset slow ^
-c:a aac -strict experimental -ac 2 -ar 44100 -ab 128k ^
-threads 0 output.mkv
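
Typing out the stream labels by hand gets tedious as the number of inputs grows. The following is a minimal sketch of generating the -filter_complex argument for any number of inputs. The function name build_concat_filter is made up for illustration, and it assumes every input has its video on stream 0 and its audio on stream 1, as my videos did.

```shell
#!/bin/sh
# build_concat_filter N
# Print the -filter_complex argument for concatenating N inputs.
# Assumption: each input has video on stream 0 and audio on stream 1.
build_concat_filter() {
    n=$1
    pairs=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # Append "[i:0] [i:1] " for input number i (0-based).
        pairs="${pairs}[${i}:0] [${i}:1] "
        i=$((i + 1))
    done
    printf '%s\n' "${pairs}concat=n=${n}:v=1:a=1 [v] [a]"
}

# Print the filter for the five inputs used in this recipe.
build_concat_filter 5
```

The result can be passed straight to ffmpeg as -filter_complex "$(build_concat_filter 5)".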

Crop a section of a video

Sometimes I want to crop just a rectangle of the original video and turn that into a whole other video.

#    -i capture.avi
#        The input video.
#    -filter:v crop=1280:960:0:32
#        Apply a filter to the video streams.
#        Take a 1280 pixel wide by 960 pixel tall
#        video window from the input, and offset this
#        1280x960 rectangle 0 pixels from the left side
#        (aka. the cropping begins from the original left side)
#        and 32 pixels down from the top.
#    -c:v libx264
#        Use the libx264 video codec (h.264 encoder).
#    -preset slow
#        From the ffmpeg x264 encoding wiki:
#        A preset is a collection of options that will provide
#        a certain encoding speed to compression ratio.
#    -crf 18
#        From the ffmpeg x264 encoding wiki:
#        crf stands for constant rate factor. The range of the
#        quantizer scale is 0-51; where 0 is lossless, 23 is
#        default, and 51 is worst possible.
#        Consider 18 to be visually lossless.
#    -b:a 128k
#        Set audio bitrate to 128k.
#    -c:a libmp3lame
#        Use the lame library for MP3 encoding.
#    -threads 0
#        Let ffmpeg decide how many threads to use
#        for encoding (if it even chooses to use more than one).
#    -pix_fmt yuv420p
#        Pixel format / color scheme.
#        yuv420p seems to be a very well accepted color scheme,
#        and is also accepted by YouTube (I've been focusing on YouTube
#        friendly videos). After some encoding/decoding problems,
#        I've been sticking this in most of my reencodings just to
#        make sure.
#    output.mkv
#        Output into the Matroska container format
#        (detected by the file extension).

ffmpeg -i capture.avi -filter:v crop=1280:960:0:32 \
    -c:v libx264 -preset slow -crf 18 \
    -b:a 128k -c:a libmp3lame \
    -threads 0 -pix_fmt yuv420p output.mkv
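
The crop filter takes its values in the order width:height:x:y, and it is easy to ask for a rectangle that hangs off the edge of the frame. The following is a rough sketch of a helper that builds the filter string and refuses a rectangle that does not fit. The name crop_filter and the 1280x1024 source size are assumptions for illustration; this recipe does not record the real capture size.

```shell
#!/bin/sh
# crop_filter SRC_W SRC_H OUT_W OUT_H X Y
# Print a crop filter string, or return 1 if the rectangle would
# extend past the right or bottom edge of the source frame.
crop_filter() {
    src_w=$1 src_h=$2 out_w=$3 out_h=$4 x=$5 y=$6
    if [ $((x + out_w)) -gt "$src_w" ] || [ $((y + out_h)) -gt "$src_h" ]; then
        echo "crop rectangle does not fit inside ${src_w}x${src_h}" >&2
        return 1
    fi
    printf 'crop=%s:%s:%s:%s\n' "$out_w" "$out_h" "$x" "$y"
}

# Hypothetical 1280x1024 source, cropped as in the recipe above.
crop_filter 1280 1024 1280 960 0 32
```

The printed string is what goes after -filter:v in the command above.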

DOSBox + Fraps conversions

Tools and setup

  • DOSBox version 0.74
    • The DOS emulator that allows me to play old DOS games.
    • Change the frameskip of a game to 2 (or maybe 1). This will cut down on lag and sync problems and keep the game from grinding the system to a halt.
    • Change the number of cycles devoted to the game (test out to see what works). For example, for my Warlords recording I gave the game 3000 fixed cycles. This seemed to work, somewhat.
    • Use the ddraw renderer.
    • If the game is old, try different scalings. I've used the hq2x and hq3x to great effect.
  • D-Fend Reloaded version 1.3.3
    • DOSBox frontend. Not required, but makes my life easier.
  • ffmpeg version Jan 20 2013 23:39:19
    • Commandline video file editor and converter.
  • Fraps version 3.5.9
    • I just couldn't get DOSBox video to sync with voice-over audio recorded in Audacity, so I resorted to Fraps for in-game commentary.
  • Microphone
    • In the Windows Input Devices setup, I set the microphone properties to 2 channel, 16 bit, 44100 Hz.

Post processing with ffmpeg

Test: 45 minute example of Master of Magic (10000 cycles, 2 frameskip, ddraw renderer, hq3x scaling) produced an in sync audio+video recording at 13.6 GB in size.

The conversion below decreased the size of the video .5 GB and at the same time doubled the video scale, while also removing the black lines generated by the scaling of the hq3x filter in my monitor.

#    -i capture.avi
#        input file
#    -filter:v crop=1280:960:0:32
#        Take the video stream and crop it within a
#        box of dimensions 1280x960, and position
#        the box 0 pixels from the left and 32 pixels
#        from the top.
#    -sws_flags lanczos+full_chroma_inp
#        using lanczos with full chroma input flag
#    -s 2560x1920
#        Size to scale to
#        (which is double the cropped, scaled
#        video size of 1280x960).
#    -c:v libx264 -preset slow -crf 18
#        Use the x264 encoder and some settings
#        that I can't quite explain.
#    -b:a 128k
#        Set audio bitrate to 128k.
#    -c:a libmp3lame
#        Use the lame library for MP3 encoding.
#    -threads 0
#        Let ffmpeg decide how many threads to use
#        for encoding (if more than one).
#    -pix_fmt yuv420p
#        Pixel formatting for the video.
#    upload.mkv
#        The output file.

ffmpeg -i capture.avi -filter:v crop=1280:960:0:32 \
    -sws_flags lanczos+full_chroma_inp -s 2560x1920 \
    -c:v libx264 -preset slow -crf 18 \
    -b:a 128k -c:a libmp3lame \
    -threads 0 -pix_fmt yuv420p upload.mkv
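
The value given to -s is just the cropped size multiplied by the scale factor. A small sketch of that arithmetic, using a hypothetical helper called scaled_size:

```shell
#!/bin/sh
# scaled_size WIDTH HEIGHT FACTOR
# Print WIDTHxHEIGHT multiplied by an integer FACTOR, in the form
# that ffmpeg's -s option expects.
scaled_size() {
    printf '%sx%s\n' "$(( $1 * $3 ))" "$(( $2 * $3 ))"
}

# Double the 1280x960 cropped frame from the recipe above.
scaled_size 1280 960 2
```

For the crop used here this prints 2560x1920, which matches the -s value in the command.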

Extract the audio as a wav file from a video

#    -i capture.avi
#        The original video.
#    -ac 2
#        Output the audio with 2 channels (stereo).
#    -ar 44100
#        Audio sampling frequency of 44100 Hz.
#    -vn
#        Don't copy over any video.
#    capture_audio.wav
#        Save as a wav format audio encoding (determined by file extension).

ffmpeg -i capture.avi -ac 2 -ar 44100 -vn capture_audio.wav

Extract just the video without the audio

#    -i capture.avi
#        The original video.
#    -an
#        No audio is copied to the output.
#    -c:v copy
#        Copy the video streams without reencoding.
#    capture_video.avi
#        Output into a new file.

ffmpeg -i capture.avi -an -c:v copy capture_video.avi
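
The two extraction recipes above are usually run as a pair on the same capture. The following sketch derives both output names from the input name and prints the two commands instead of running them, so they can be checked first. The function name split_commands is made up for illustration.

```shell
#!/bin/sh
# split_commands INPUT
# Print (not run) the audio and video extraction commands for INPUT.
split_commands() {
    src=$1
    base=${src%.*}    # strip the extension: capture.avi -> capture
    ext=${src##*.}    # keep the extension:  capture.avi -> avi
    printf 'ffmpeg -i "%s" -ac 2 -ar 44100 -vn "%s_audio.wav"\n' "$src" "$base"
    printf 'ffmpeg -i "%s" -an -c:v copy "%s_video.%s"\n' "$src" "$base" "$ext"
}

split_commands capture.avi
```

Piping the output to sh would run both commands, assuming the file names contain no double quotes.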

Iterate through a bunch of videos and convert them

# Windows Command Prompt (cmd.exe, not PowerShell)
# For every MP4 file in the directory
# convert to h264
# copy the audio
# color scheme in yuv420p
# and output file with the same basename
# except switch container (and file extension) to Matroska.
# The quotes keep file names with spaces intact.
# Inside a .bat file, write %%F and %%~nF instead of %F and %~nF.

for %F in (*.mp4) do (ffmpeg -i "%F" -c:v libx264 -preset slow -crf 23 -c:a copy -threads 0 -pix_fmt yuv420p "%~nF.mkv")

# And unix
for f in *.mp4; do
    ffmpeg -i "$f" -c:v libx264 -preset slow -crf 23 -c:a copy -threads 0 -pix_fmt yuv420p "$(basename "$f" .mp4).mkv"
done
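
A long batch run is annoying to restart from scratch if it gets interrupted. The following is a sketch of a more defensive Unix loop that skips files whose output already exists. It only prints each command (note the leading echo), and the helper name mkv_name is hypothetical.

```shell
#!/bin/sh
# mkv_name FILE
# Print FILE with its .mp4 extension replaced by .mkv.
mkv_name() {
    printf '%s\n' "${1%.mp4}.mkv"
}

for f in *.mp4; do
    # If there are no .mp4 files the pattern stays literal; skip it.
    [ -e "$f" ] || continue
    out=$(mkv_name "$f")
    if [ -e "$out" ]; then
        echo "skipping $f: $out already exists"
        continue
    fi
    # Dry run: remove the leading echo to actually convert.
    echo ffmpeg -i "$f" -c:v libx264 -preset slow -crf 23 \
        -c:a copy -threads 0 -pix_fmt yuv420p "$out"
done
```

Once the printed commands look right, dropping the echo turns the dry run into the real conversion.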

Mix a separate audio and video file together into one output

#    -i capture_video.avi
#        The first input file.
#    -i remixed_audio.mp3
#        The second input file.
#    -map 0
#        Map all of the streams from the first
#        input into the output video, in order.
#    -map 1
#        Map all of the streams from the second
#        input into the output video, in order.
#    -c copy
#        Do not reencode any streams, use the same
#        encoding.
#    intermediate2.mkv
#        Output into the Matroska container format
#        (detected by the file extension).

ffmpeg -i capture_video.avi -i remixed_audio.mp3 \
    -map 0 -map 1 -c copy intermediate2.mkv
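
Because -map 0 and -map 1 take every stream from each input, a video file that still carries its own audio track would end up with two audio streams in the output. A variation, sketched here under that assumption, narrows each -map with a stream specifier: 0:v for the video of the first input and 1:a for the audio of the second. The function name mux_command is made up, and it prints the command rather than running it.

```shell
#!/bin/sh
# mux_command VIDEO AUDIO OUTPUT
# Print (not run) a command that takes only the video streams of VIDEO
# and only the audio streams of AUDIO, copying both without reencoding.
mux_command() {
    printf 'ffmpeg -i "%s" -i "%s" -map 0:v -map 1:a -c copy "%s"\n' "$1" "$2" "$3"
}

mux_command capture_video.avi remixed_audio.mp3 intermediate2.mkv
```

With a video-only first input and an audio-only second input, this should select the same streams as the plain -map 0 -map 1 form.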

Upscale a video (make the pixel dimensions bigger in the output)

This particular formula worked for me when I needed to upscale a video with a lot of text in it, and the text was still readable afterwards. I'm not an expert on video editing, but after some reading, and given that I was concerned with quality, it appears that the Lanczos filter is probably the best that ffmpeg has to offer for detailed things like text. The downside is that it is pretty slow.

#    -i capture.avi
#        The input video.
#    -sws_flags lanczos+full_chroma_inp
#        Use the lanczos filter with full chroma input
#        during the resize process.
#    -s 2560x1840
#        Width x Height in pixels to scale
#        the output video to.
#    -c:v libx264
#        Use the libx264 video codec (h.264 encoder).
#    -preset slow
#        From the ffmpeg x264 encoding wiki:
#        A preset is a collection of options that will provide
#        a certain encoding speed to compression ratio.
#    -crf 18
#        From the ffmpeg x264 encoding wiki:
#        crf stands for constant rate factor. The range of the
#        quantizer scale is 0-51; where 0 is lossless, 23 is
#        default, and 51 is worst possible.
#        Consider 18 to be visually lossless.
#    -b:a 128k
#        Set audio bitrate to 128k.
#    -c:a libmp3lame
#        Use the lame library for MP3 encoding.
#    -threads 0
#        Let ffmpeg decide how many threads to use
#        for encoding (if it even chooses to use more than one).
#    -pix_fmt yuv420p
#        Pixel format / color scheme.
#        yuv420p seems to be a very well accepted color scheme,
#        and is also accepted by YouTube (I've been focusing on YouTube
#        friendly videos). After some encoding/decoding problems,
#        I've been sticking this in most of my reencodings just to
#        make sure.
#    output.mkv
#        Output into the Matroska container format
#        (detected by the file extension).

ffmpeg -i capture.avi -sws_flags lanczos+full_chroma_inp \
    -s 2560x1840 -c:v libx264 -preset slow -crf 18 -b:a 128k \
    -c:a libmp3lame -threads 0 -pix_fmt yuv420p output.mkv
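
When picking the -s value, the height should keep the source aspect ratio, and libx264 with yuv420p needs both dimensions to be even. The following sketch works out the height for a chosen output width. The helper name fit_width and the 1280x920 source size are assumptions for illustration; this recipe does not record the real input size.

```shell
#!/bin/sh
# fit_width SRC_W SRC_H TARGET_W
# Print TARGET_WxHEIGHT, where HEIGHT keeps the source aspect ratio
# (integer division) and is rounded down to an even number.
fit_width() {
    h=$(( $2 * $3 / $1 ))
    h=$(( h - h % 2 ))
    printf '%sx%s\n' "$3" "$h"
}

# Hypothetical 1280x920 source scaled up to 2560 pixels wide.
fit_width 1280 920 2560
```

For that hypothetical source this prints 2560x1840, the size used in the command above. TARGET_W itself is passed through unchanged, so it should be chosen even.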

References

The following links were the ones that helped me the most with figuring out how to use ffmpeg.