Capturing screenshots and recordings with exwm

I've been using exwm as my window manager for a while now. I like the approach but there are a few rough edges. One of these was that I needed to implement something to take simple screenshots and screen recordings. Here's how I got it working.

1. The result

This image shows me pressing super-shift-4 to capture a screenshot by running a function md/screenshot-image-selection. After capture, the image is automatically opened in Firefox.

The video itself was taken by pressing super-shift-5 to trigger a function md/screenshot-video-selection-start, and then another key to run md/screenshot-video-stop to stop recording.

2. The exwm bindings

I wanted something similar to the macOS cmd-shift-4/5 bindings, so I added them to exwm-input-global-keys:

(setq exwm-input-global-keys
      `(;; Various other keys...

        ;; Prompt for a selection and take a screenshot
        (,(kbd "s-$") . md/screenshot-image-selection)
        ;; Prompt for a selection and start a video
        (,(kbd "s-%") . md/screenshot-video-selection-start)
        ;; Stop the video
        (,(kbd "s-^") . md/screenshot-video-stop)))

3. The elisp functions

The functions themselves just call out to a script I wrote named ,screenshot:

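(defun md/screenshot-image-selection ()
  "Prompt for a screen selection and save it as an image."
  (interactive)
  (shell-command ",screenshot --image-selection"))

(defun md/screenshot-video-selection-start ()
  "Prompt for a screen selection and start recording it as a video."
  (interactive)
  (shell-command ",screenshot --video-selection-start"))

(defun md/screenshot-video-stop ()
  "Stop the video recording that is currently in progress."
  (interactive)
  (shell-command ",screenshot --video-stop"))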

4. The script

The ,screenshot script implements shell equivalents of all three elisp functions. This could easily be moved inline to the elisp code rather than in a separate script, but I wanted to be able to use it elsewhere:

#!/bin/bash
#
# Features for capturing the screen as image or video.

_THIS_DATE="$(date --iso-8601=second)"
_IMAGE_OUTPUT="/f/inbox/screenshots/${_THIS_DATE}.png"
_VIDEO_OUTPUT="/f/inbox/screenshots/${_THIS_DATE}.mp4"

function image-selection () {
    maim -s >"$_IMAGE_OUTPUT" && firefox "$_IMAGE_OUTPUT"
}

function video-selection-start () {
    # Use slop to grab screen area
    slop=$(slop -f "%x %y %w %h %g %i") || exit 1
    read -r X Y W H G ID < <(echo "$slop")

    # make the width + height divisible by 2 so ffmpeg doesn't error
    if ! [ $((W%2)) -eq 0 ]; then W=$((W+1)); fi
    if ! [ $((H%2)) -eq 0 ]; then H=$((H+1)); fi

    # start capturing video.
    # We use yuv420p here otherwise it can't be played by Firefox.
    # See https://bugzilla.mozilla.org/show_bug.cgi?id=1368063
    ffmpeg -f x11grab -s "$W"x"$H" -r 60 -i :0.0+"$X","$Y" -vcodec h264 -crf 18 -pix_fmt yuv420p -y "$_VIDEO_OUTPUT" >> /tmp/ffmpeg-record.log 2>&1 &

    # store pid
    echo $! >/tmp/ffmpeg-record.pid
    echo "$_VIDEO_OUTPUT" >/tmp/ffmpeg-record.filename
}

function video-stop() {
   pkill --signal INT --pidfile /tmp/ffmpeg-record.pid && firefox "$(cat /tmp/ffmpeg-record.filename)"
}

case "$1" in
    --image-selection) image-selection;;
    --video-selection-start) video-selection-start;;
    --video-stop) video-stop;;
    *) echo "argument invalid or not provided, exiting." >&2; exit 1;;
esac

How screenshots work

The image-selection function calls out to a simple terminal tool called maim, which prompts you for a selection/window and takes the screenshot.
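
One thing to be aware of with the redirect in image-selection: the shell creates the output file before maim runs, so a cancelled selection can leave an empty .png behind (assuming maim exits non-zero when the selection is cancelled). A minimal sketch of one way around this is to write to a temporary file and only move it into place on success. The helper name capture_to is made up for this example and is not part of the script above.

```shell
#!/bin/bash
# Sketch: run a command that writes its result to stdout, and only keep
# the output file if the command succeeded.

# capture_to <output-path> <command> [args...]   (hypothetical helper)
capture_to () {
    local out=$1 tmp
    shift
    # Create the target directory if it is missing.
    mkdir -p "$(dirname "$out")" || return 1
    # Write next to the final path so the later mv stays on one filesystem.
    tmp=$(mktemp "${out}.XXXXXX") || return 1
    if "$@" >"$tmp"; then
        mv "$tmp" "$out"
    else
        # The command failed or was cancelled: leave nothing behind.
        rm -f "$tmp"
        return 1
    fi
}

# Intended use (needs maim and a running X session, so it is commented out):
# capture_to "/f/inbox/screenshots/$(date --iso-8601=seconds).png" maim -s
```

Because the command is passed as arguments, the same helper can be tried out with any command that prints to stdout.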

How videos work

The video-selection-start function uses ffmpeg with x11grab as the input source. ffmpeg doesn't provide an easy way to choose what part of the screen to record - you instead have to pass coordinates as arguments. So we use slop to make a selection on the screen and extract the coordinates, and then pass those to ffmpeg.
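
As a rough illustration of the slop step, here is a sketch that splits one line of slop output (in the same "%x %y %w %h %g %i" format used above) into separate values. The function name parse_selection, the SEL_* variable names and the sample line are made up for this example, and it assumes all four numbers are non-negative integers.

```shell
#!/bin/bash
# Sketch: split one line of slop output into named fields.

# parse_selection "<x> <y> <w> <h> <geometry> <window-id>"
# Sets the globals SEL_X, SEL_Y, SEL_W and SEL_H (hypothetical names).
parse_selection () {
    local geometry id v
    read -r SEL_X SEL_Y SEL_W SEL_H geometry id <<<"$1"
    # Refuse anything that is not four non-negative integers.
    for v in "$SEL_X" "$SEL_Y" "$SEL_W" "$SEL_H"; do
        [[ $v =~ ^[0-9]+$ ]] || return 1
    done
}

# The sample line below is invented; real input would come from slop.
if parse_selection "100 200 641 480 641x480+100+200 8388614"; then
    echo "x11grab input: :0.0+${SEL_X},${SEL_Y} size ${SEL_W}x${SEL_H}"
fi
```

Validating the fields up front means a malformed line stops the script early instead of producing a confusing ffmpeg error.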

Unlike screenshots, videos also need a stop instruction. ffmpeg expects to receive a SIGINT to stop recording. So we write two files to /tmp:

  1. A pid file. This allows us to stop the video by doing pkill --signal INT --pidfile /tmp/ffmpeg-record.pid
  2. The output path of the video. This allows us to automatically open the video with firefox.
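
A pid file can go stale, for example if ffmpeg exits on its own or the machine reboots mid-recording. The following is a sketch of a check that could run before starting or stopping a recording. The function name recording_active is an assumption for this example and does not appear in the script above.

```shell
#!/bin/bash
# Sketch: check whether a pid file points at a live process.

# recording_active <pidfile>   (hypothetical helper)
# Succeeds only if the file is readable, holds a number, and that
# process can currently be signalled.
recording_active () {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(<"$pidfile")
    [[ $pid =~ ^[0-9]+$ ]] || return 1
    # Signal 0 delivers nothing; it only tests that the process exists.
    # Caveat: a recycled pid belonging to an unrelated process also passes.
    kill -0 "$pid" 2>/dev/null
}

if recording_active /tmp/ffmpeg-record.pid; then
    echo "recording in progress"
else
    echo "no recording in progress"
fi
```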

5. FFmpeg gotchas

I ran into a few issues configuring ffmpeg:

Firefox doesn't recognise the video format

The default ffmpeg video output was failing to play in Firefox - "Video can't be played because the file is corrupt".

The problem turned out not to be that the file was corrupt, but that Firefox doesn't support ffmpeg's default YUV444 chroma subsampling setting for H.264. A workaround is to specify -pix_fmt yuv420p. (I'm not sure whether this affects codecs other than H.264.)
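
To confirm which pixel format a recording actually ended up with, ffprobe (which ships with ffmpeg) can print it. This is a small sketch; the wrapper name video_pix_fmt and the example path are made up.

```shell
#!/bin/bash
# Sketch: print the pixel format of a video file's first video stream.

# video_pix_fmt <file>   (hypothetical helper)
# Prints a value such as yuv420p or yuv444p.
video_pix_fmt () {
    [ -r "$1" ] || return 1
    ffprobe -v error -select_streams v:0 \
        -show_entries stream=pix_fmt \
        -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Example (needs ffprobe and an existing recording, so it is commented out):
# video_pix_fmt /tmp/example.mp4
```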

Dimensions must be divisible by 2

Sometimes slop would produce an odd number for the input height, which is invalid for H.264 and causes ffmpeg to throw a "height not divisible by 2" error. There are a few solutions suggested on this thread, but for now I'm just adding 1 to my width/height if they're not even because I don't need the values to be exact.
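
The rounding can be pulled out into a small helper. This sketch shows the round-up approach used in the script alongside a round-down variant, which keeps the capture area inside the original selection and so may be the safer choice when the selection touches the edge of the screen. Both function names are made up for this example.

```shell
#!/bin/bash
# Sketch: force a dimension to an even number, as H.264 requires.

# round_up_even <n>: odd values grow by one, even values are unchanged.
round_up_even () {
    echo $(( $1 + ($1 % 2) ))
}

# round_down_even <n>: odd values shrink by one, even values are unchanged.
round_down_even () {
    echo $(( $1 - ($1 % 2) ))
}

W=641
H=480
echo "up:   $(round_up_even "$W")x$(round_up_even "$H")"
echo "down: $(round_down_even "$W")x$(round_down_even "$H")"
```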

Screen tearing

Finally, I had an issue where I was seeing screen tearing in the video during playback. This wasn't specific to exwm or ffmpeg - it also occurred when I was using i3 and other video recording tools.

I was using picom as an X compositor, and was able to isolate the issue: it only occurred when picom was running with the glx backend and vsync=true. There seem to be a few options for fixing it, e.g. setting vsync=false in picom, switching to the xrender backend instead of glx, or disabling picom entirely. Right now I'm disabling it entirely, as I don't notice much difference when it's on.

With this fixed everything is working well. You can find the code in my dotfiles.

2021-Jun-18