Build your own text editor

I recently followed snaptoken's "Build Your Own Text Editor" booklet, which talks you through building a basic text editor in about 1000 lines of C (the kilo editor, written by antirez). It was fun, and I'd recommend it to anybody who either (1) is interested in how graphical terminal programs work, or (2) wants to play a bit with C.

the kilo editor

1. What was in the chapters?

Roughly in order, the steps were:

  • Write a main loop that uses read() to respond to input from stdin.
  • Put the terminal into "raw" mode - disable echoing, read one keypress at a time, etc. Save and restore the terminal configuration on program exit.
  • Add cursor movement.
  • Add file I/O and the ability to view files.
  • Add scrolling for when the file is bigger than the screen size.
  • Add a "rendering" translation layer which can be used to, e.g., display \t as a fixed number of spaces.
  • Add a status bar that shows the filename, current line etc. Also add a message area that can display user messages.
  • Add the ability to insert and delete text, with a "dirty" flag that tells the user whether the buffer has been modified since the last save.
  • Add a generic prompt, and then use it to implement incremental search.
  • Add basic syntax highlighting, which is triggered by filetype detection. This only supports C files, but can be extended.
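
To make the "raw" mode step concrete, here is a minimal sketch using the POSIX termios calls. The function names (make_raw, enable_raw_mode, disable_raw_mode) and the 100 ms read timeout are my own assumptions for illustration, not necessarily what the booklet uses.

```c
#include <termios.h>
#include <unistd.h>

/* Turn a copy of the current terminal settings into "raw" settings.
 * Hypothetical helper; the flag set is a common raw-mode choice. */
void make_raw(struct termios *t) {
    /* No break-to-SIGINT, CR->NL translation, parity check, bit stripping,
     * or Ctrl-S/Ctrl-Q flow control on input. */
    t->c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON);
    /* No output post-processing (e.g. "\n" -> "\r\n"). */
    t->c_oflag &= ~(OPOST);
    t->c_cflag |= CS8;
    /* No echo, no line buffering, no Ctrl-V, no Ctrl-C/Ctrl-Z signals. */
    t->c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG);
    t->c_cc[VMIN] = 0;  /* read() may return with no bytes... */
    t->c_cc[VTIME] = 1; /* ...after an assumed 100 ms timeout */
}

/* Save the current settings in *saved and switch fd to raw mode.
 * Returns 0 on success, -1 on failure (e.g. fd is not a terminal). */
int enable_raw_mode(int fd, struct termios *saved) {
    if (tcgetattr(fd, saved) == -1) return -1;
    struct termios raw = *saved;
    make_raw(&raw);
    return tcsetattr(fd, TCSAFLUSH, &raw);
}

/* Restore the settings captured by enable_raw_mode(). */
int disable_raw_mode(int fd, const struct termios *saved) {
    return tcsetattr(fd, TCSAFLUSH, saved);
}
```

Typical use would be to call enable_raw_mode(STDIN_FILENO, &saved) at startup and disable_raw_mode() on the way out, for example from an atexit handler, so the shell gets its terminal back intact.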

2. The program structure

It was quite simple. There were two main data structures: the global editor, and the row. The editor kept an array of pointers to rows, each row representing a single line of text (plus some metadata, e.g. the row size). The editor also kept track of the cursor position, the file offset for scrolling, the current status message, etc.
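
As a rough sketch of that layout (the struct and field names here are hypothetical, not kilo's actual definitions), the two structures and a function that appends a row could look like this. Note that this sketch passes the editor in as a pointer rather than using a global.

```c
#include <stdlib.h>
#include <string.h>

/* One line of text plus its metadata. */
struct row {
    size_t size;  /* number of bytes in chars, excluding the NUL */
    char *chars;  /* the line's contents, NUL-terminated */
};

/* Whole-editor state. */
struct editor {
    int cx, cy;         /* cursor position */
    int rowoff;         /* first file row shown on screen (scrolling) */
    size_t numrows;     /* number of entries in rows */
    struct row **rows;  /* array of pointers to rows */
    char statusmsg[80]; /* current status message */
    int dirty;          /* non-zero if modified since last save */
};

/* Append a copy of the first len bytes of s as a new row.
 * Returns 0 on success, -1 if an allocation fails. */
int editor_append_row(struct editor *e, const char *s, size_t len) {
    struct row *r = malloc(sizeof *r);
    if (r == NULL) return -1;
    r->chars = malloc(len + 1);
    if (r->chars == NULL) { free(r); return -1; }
    memcpy(r->chars, s, len);
    r->chars[len] = '\0';
    r->size = len;

    /* Grow the pointer array by one slot. */
    struct row **grown = realloc(e->rows, (e->numrows + 1) * sizeof *grown);
    if (grown == NULL) { free(r->chars); free(r); return -1; }
    e->rows = grown;
    e->rows[e->numrows++] = r;
    e->dirty = 1;
    return 0;
}
```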

The program roughly just did:

main loop:
  read keypress;
  in response to keypress:
    update global editor state and row state;
    maybe quit if "q" was pressed;
  refresh screen with latest state;
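
In C, the loop above might look roughly like the following. This is a stripped-down sketch: editor_state, process_key, refresh_screen and editor_run are names I made up, the state is reduced to a cursor column and a key counter, and it assumes read() blocks until a key arrives.

```c
#include <unistd.h>

/* Minimal stand-in for the editor and row state. */
struct editor_state {
    int cx;        /* cursor column */
    int keys_seen; /* how many keypresses have been handled */
};

/* Update state for one keypress. Returns 1 if the editor should quit. */
int process_key(struct editor_state *e, char key) {
    e->keys_seen++;
    switch (key) {
    case 'q': return 1;                   /* quit */
    case 'l': e->cx++; break;             /* move right */
    case 'h': if (e->cx > 0) e->cx--; break; /* move left */
    default: break;
    }
    return 0;
}

/* Redraw from the latest state. Here it only clears the screen and
 * homes the cursor; a real editor would draw the rows as well. */
void refresh_screen(int out_fd, const struct editor_state *e) {
    (void)e;
    const char clear[] = "\x1b[2J\x1b[H";
    if (write(out_fd, clear, sizeof clear - 1) < 0) return;
}

/* Main loop: read a keypress, update state, maybe quit, refresh. */
void editor_run(int in_fd, int out_fd, struct editor_state *e) {
    char c;
    for (;;) {
        ssize_t n = read(in_fd, &c, 1);
        if (n <= 0) break;            /* EOF or error ends the loop */
        if (process_key(e, c)) break; /* "q" was pressed */
        refresh_screen(out_fd, e);
    }
}
```

Taking the file descriptors as arguments is a choice made for this sketch; it means the loop can be driven from a pipe as easily as from a terminal.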

The terminal interaction and cursor movement are all done using VT100 terminal escape sequences. I'm not sure how portable this is, but in practice it works in the few terminal emulators that I use.
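
For a flavour of what those sequences look like, here is a small sketch. The macro names and the cursor_move_seq helper are hypothetical; the sequences themselves are the standard "erase display", "cursor home", "erase to end of line" and "cursor position" controls.

```c
#include <stdio.h>

#define ESC "\x1b"
#define CLEAR_SCREEN ESC "[2J" /* erase the whole display */
#define CURSOR_HOME  ESC "[H"  /* move cursor to row 1, column 1 */
#define ERASE_LINE   ESC "[K"  /* erase from cursor to end of line */

/* Format the sequence that moves the cursor to 1-based (row, col).
 * Returns the number of bytes the sequence needs, as snprintf does. */
int cursor_move_seq(char *buf, size_t cap, int row, int col) {
    return snprintf(buf, cap, ESC "[%d;%dH", row, col);
}
```

Writing the resulting bytes to stdout is all it takes to move the cursor; no terminal library is involved.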

The syntax highlighting has one feature that can tokenize across multiple lines: it recognises when comments begin and end. It is otherwise pretty naive, just matching keywords, strings and numbers. Having said that, at a glance it looks pretty similar to the highlighting I see in Vim.
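
The cross-line part boils down to carrying one bit of state from each row to the next: did the previous row end inside an unclosed comment? A minimal sketch of that idea follows. It is deliberately simplified (it ignores strings and // comments), and highlight_comments and the HL_ names are hypothetical.

```c
#include <stddef.h>

enum hl { HL_NORMAL = 0, HL_COMMENT = 1 };

/* Fill hl[0..len-1] with HL_COMMENT for bytes inside a C block comment
 * and HL_NORMAL otherwise. in_comment says whether the previous line
 * ended inside an unclosed comment. Returns 1 if this line ends inside
 * an unclosed comment too, so the caller can pass it to the next line. */
int highlight_comments(const char *line, size_t len, int in_comment,
                       unsigned char *hl) {
    size_t i = 0;
    while (i < len) {
        if (in_comment) {
            hl[i] = HL_COMMENT;
            if (i + 1 < len && line[i] == '*' && line[i + 1] == '/') {
                hl[i + 1] = HL_COMMENT; /* the closing marker is comment too */
                i += 2;
                in_comment = 0;
            } else {
                i++;
            }
        } else if (i + 1 < len && line[i] == '/' && line[i + 1] == '*') {
            hl[i] = hl[i + 1] = HL_COMMENT; /* the opening marker */
            i += 2;
            in_comment = 1;
        } else {
            hl[i++] = HL_NORMAL;
        }
    }
    return in_comment;
}
```

When a row's return value changes after an edit, the following rows need re-highlighting as well, since their starting state may have changed.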

Many of the functions operate on the global editor state. If I were going to work on this project seriously, I'd want to rewrite some of them to accept the editor as an argument rather than all mutating a single variable.

3. It was easy to extend the project

I added a few features that I use in Vim and Emacs (see GitHub):

  • Splitting user input into normal and insert modes.
  • Word-based cursor movement with w/W/b/B.
  • A new prompt to simulate :wq and :q!.
  • Standard cursor movement with hjkl, ^/$, C-f/C-b, gg and G.
  • Using dd to remove lines, and J to join lines.
  • Adding the jj and jk bindings that I use in insert mode to exit to normal mode (which means waiting for a follow-up key to j, and inserting it into the row if it doesn't come after a set timeout).
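
The jj/jk binding is the only one of these that needs a timeout, so here is a sketch of one way to do it with poll(). This is not necessarily how my implementation does it; the 300 ms value and the names read_key_timeout and resolve_j are assumptions for illustration.

```c
#include <poll.h>
#include <unistd.h>

#define ESCAPE_TIMEOUT_MS 300 /* assumed timeout; tune to taste */

/* Wait up to timeout_ms for one byte on fd.
 * Returns 1 and stores the byte in *out if one arrived,
 * 0 on timeout, -1 on error or end of input. */
int read_key_timeout(int fd, int timeout_ms, char *out) {
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0) return -1;
    if (ready == 0) return 0;
    return read(fd, out, 1) == 1 ? 1 : -1;
}

enum j_result {
    J_EXIT_INSERT,      /* saw jj or jk: leave insert mode */
    J_INSERT_J,         /* no follow-up in time: insert the 'j' */
    J_INSERT_J_THEN_KEY /* some other key followed: insert 'j', then it */
};

/* Call after 'j' was typed in insert mode. On J_INSERT_J_THEN_KEY,
 * *follow_up holds the key that still needs handling. */
enum j_result resolve_j(int fd, char *follow_up) {
    int got = read_key_timeout(fd, ESCAPE_TIMEOUT_MS, follow_up);
    if (got == 1 && (*follow_up == 'j' || *follow_up == 'k'))
        return J_EXIT_INSERT;
    if (got == 1) return J_INSERT_J_THEN_KEY;
    return J_INSERT_J; /* timeout or error: treat 'j' as ordinary text */
}
```

The cost of this design is that a lone j only appears on screen after the timeout expires, which is the usual trade-off with multi-key bindings in insert mode.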

I was surprised at how much it felt like my usual environment for file browsing and basic editing. Although my implementation wasn't extensible or composable, most of the time I just rely on the same few bindings.

Given that my personal Emacs config has grown to about 5x the size of this program, it almost seems worth just writing the features I want from scratch!

4. I did it as literate programming with org-mode

I wanted to try writing notes alongside the code as I progressed, using org-mode. I compiled kilo from my README.org file, which can be done in a couple of lines of lisp:

;; Don't ask me to confirm each time I evaluate the file
(setq-local org-confirm-babel-evaluate nil)
;; concatenate all embedded C snippets to kilo-org.c
(org-babel-tangle nil "kilo-org.c" "c")
;; run make
(compile "make")

In the end I didn't find it very useful - it hid the actual source code too much, which made it harder to refactor and jump between sections of code quickly. I wonder if the org-mode approach is more useful for detailing one-off scripts and troubleshooting.

5. What similar projects exist?

openemacs is a small fork of kilo that implements some Emacs navigation features, which is worth reading if you're interested in modifying kilo.

The other good "build something from scratch" project that I've followed was The Elements of Computing Systems, where you build a (virtual) computer from first principles.

The Destroy All Software "From Scratch" screencasts follow the same idea, and are pretty enjoyable.

I'm interested to know what else is out there, although not interested enough to have ever searched for it. If you know of anything good, let me know.

2020-Jan-22