Building a Text Editor in C: A Deep Dive into Terminal Control
Welcome back to our C-Language Series! Today, we're embarking on an exciting journey that takes us beyond typical command-line applications and into the realm of interactive terminal programs. We're going to build the foundational pieces of our very own text editor, right from scratch, in C.
Creating a text editor is a classic rite of passage for many programmers. It forces you to confront low-level system interactions, understand how terminals truly work, and manage intricate data structures. While we won't build a feature-complete editor like Vim or Emacs in a single post, we'll lay down the essential groundwork for:
- Enabling raw terminal mode for direct key input.
- Manipulating the cursor and screen using ANSI escape codes.
- Reading individual key presses, including special keys.
- Displaying a basic interactive interface.
This project will not only solidify your C programming skills but also give you a profound appreciation for the underlying mechanisms that modern applications abstract away.
The Challenge: Interacting Directly with the Terminal
Typically, when you run a C program in your terminal, the terminal itself handles a lot of the input and output processing. This is known as "cooked mode" or "canonical mode." In cooked mode:
- Input is buffered until you press Enter.
- Characters are automatically echoed back to the screen.
- Special key combinations like Ctrl+C (SIGINT) are handled by the terminal driver.
For a text editor, this behavior is unacceptable. We need to:
- Read each character as it's typed.
- Prevent automatic echoing.
- Handle special key combinations ourselves (e.g., arrow keys, Ctrl+S for saving).
- Position the cursor and clear parts of the screen at will.
To achieve this, we must switch the terminal into "raw mode."
Step 1: Enabling Raw Mode
Changing terminal attributes involves using functions from the <termios.h> header. This header provides the termios structure and functions like tcgetattr() (get terminal attributes) and tcsetattr() (set terminal attributes).
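To make the link between "cooked mode" and the termios structure concrete, here is a minimal sketch. The helper name isCookedMode is our own invention for illustration, not part of any library. It only inspects the two local flags that matter most for an editor.

```c
#include <termios.h>

/* Hypothetical helper (not part of the editor): reports whether the given
 * attributes describe "cooked" input, meaning line buffering (ICANON) and
 * automatic echo (ECHO) are both enabled in the local flags. */
int isCookedMode(const struct termios *t) {
    return (t->c_lflag & ICANON) != 0 && (t->c_lflag & ECHO) != 0;
}
```

After a successful tcgetattr(STDIN_FILENO, &t) in an ordinary interactive shell, isCookedMode(&t) would typically return 1.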
First, let's include the necessary headers and define a small error-handling function for convenience:
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <unistd.h> // For STDIN_FILENO
// Error handling: print the failing call's message (from errno) and exit
void die(const char *s) {
perror(s);
exit(1);
}
Now, let's create functions to enable and disable raw mode. It's crucial to restore the original terminal attributes when our editor exits; otherwise the user's shell inherits raw mode, with no echo and no line editing.
struct termios orig_termios; // Global variable to store original terminal attributes
void disableRawMode() {
if (tcsetattr(STDIN_FILENO, TCSAFLUSH, &orig_termios) == -1)
die("tcsetattr");
}
void enableRawMode() {
if (tcgetattr(STDIN_FILENO, &orig_termios) == -1)
die("tcgetattr");
atexit(disableRawMode); // Ensure disableRawMode is called on program exit
struct termios raw = orig_termios;
// Input flags:
// BRKINT: Stops a break condition from sending SIGINT.
// ICRNL: Stops carriage return (Ctrl-M, 13) from being translated to newline (10).
// INPCK: Disables input parity checking.
// ISTRIP: Stops the 8th bit of each input byte from being stripped.
// IXON: Disables software flow control (Ctrl-S pauses output, Ctrl-Q resumes it).
raw.c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON);
// Output flags:
// OPOST: Disables output processing (e.g., \n to \r\n conversion).
raw.c_oflag &= ~(OPOST);
// Local flags:
// ECHO: Disable character echoing.
// ICANON: Disable canonical mode (read input byte-by-byte).
// IEXTEN: Disable Ctrl-V and Ctrl-O.
// ISIG: Disable signal keys such as Ctrl-C (SIGINT) and Ctrl-Z (SIGTSTP).
raw.c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG);
// Control characters:
// VMIN: Minimum number of bytes for read() to return. Set to 0.
// VTIME: Maximum time to wait for input (in tenths of a second). Set to 1.
// This makes read() return as soon as there is input, or after 0.1 seconds.
raw.c_cc[VMIN] = 0;
raw.c_cc[VTIME] = 1;
if (tcsetattr(STDIN_FILENO, TCSAFLUSH, &raw) == -1)
die("tcsetattr");
}
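Because enableRawMode() talks to a real terminal, it is awkward to check in isolation. One option, sketched below under our own naming (makeRawAttrs is hypothetical), is to separate the flag arithmetic from the tcsetattr() call, so the transformation can be tested on a plain struct.

```c
#include <termios.h>

/* Hypothetical helper: returns a copy of the given attributes with the
 * same raw-mode changes the article applies. It does not touch the
 * terminal; the caller would still pass the result to tcsetattr(). */
struct termios makeRawAttrs(struct termios t) {
    t.c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON);
    t.c_oflag &= ~(OPOST);
    t.c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG);
    t.c_cc[VMIN] = 0;   /* read() may return with zero bytes... */
    t.c_cc[VTIME] = 1;  /* ...after waiting at most 0.1 seconds */
    return t;
}
```

Passing the struct by value keeps the caller's saved copy intact, which is what disableRawMode() later relies on.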
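A quick test in your main function. It also needs <ctype.h> for iscntrl() and <errno.h> for errno:
#include <ctype.h> // For iscntrl
#include <errno.h> // For errno and EAGAIN
int main() {
enableRawMode();
// Loop to read characters, pressing 'q' to quit
while (1) {
char c = '\0';
ssize_t nread = read(STDIN_FILENO, &c, 1);
if (nread == -1 && errno != EAGAIN) die("read");
if (nread != 1) continue; // read() timed out with no input; try again
if (iscntrl((unsigned char)c)) { // Check if it's a control character
printf("%d\r\n", c); // Print its ASCII value
} else {
printf("%d ('%c')\r\n", c, c); // Print ASCII and character
}
if (c == 'q') break;
}
return 0;
}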
Compile and run this. Each keypress is reported immediately, without waiting for Enter, as its numeric value (plus the character itself if it is printable). Nothing is echoed automatically, and pressing 'q' exits the program.
Step 2: Basic Screen Manipulation with ANSI Escape Codes
ANSI escape codes are sequences of characters that, when printed to the terminal, don't display as text but instead control the terminal's behavior. We'll use these to clear the screen and position the cursor.
- \x1b[2J: Clears the entire screen.
- \x1b[H: Positions the cursor at row 1, column 1 (top-left corner).
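Counting the bytes of each escape sequence by hand is easy to get wrong. As a rough sketch, the sequences can be given names and their lengths derived with sizeof. The macro and function names here (ESC_CLEAR_SCREEN, ESC_CURSOR_HOME, LIT_LEN, WRITE_LIT, clearScreen) are our own and purely illustrative.

```c
#include <unistd.h>

/* Illustrative names, not part of the article's code. */
#define ESC_CLEAR_SCREEN "\x1b[2J"
#define ESC_CURSOR_HOME  "\x1b[H"

/* Length of a string literal without its terminating NUL. */
#define LIT_LEN(s) (sizeof(s) - 1)

/* Writes a string literal to a file descriptor. Only valid for literals:
 * sizeof on a char pointer would yield the pointer size instead. */
#define WRITE_LIT(fd, s) write((fd), (s), LIT_LEN(s))

/* Clears the screen and homes the cursor.
 * Returns 0 on success, -1 if either write failed or was short. */
int clearScreen(int fd) {
    if (WRITE_LIT(fd, ESC_CLEAR_SCREEN) != (ssize_t)LIT_LEN(ESC_CLEAR_SCREEN))
        return -1;
    if (WRITE_LIT(fd, ESC_CURSOR_HOME) != (ssize_t)LIT_LEN(ESC_CURSOR_HOME))
        return -1;
    return 0;
}
```

A call such as clearScreen(STDOUT_FILENO) would then replace the two hand-counted write() calls.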
Let's create a function to refresh the screen and draw some placeholder content:
void editorRefreshScreen() {
// Clear the screen
write(STDOUT_FILENO, "\x1b[2J", 4);
// Position cursor at top-left
write(STDOUT_FILENO, "\x1b[H", 3);
int y;
for (y = 0; y < 24; y++) { // Assuming a 24-line terminal for now
write(STDOUT_FILENO, "~\r\n", 3); // Draw a tilde, then carriage return and newline
}
// Position cursor at top-left again after drawing rows
write(STDOUT_FILENO, "\x1b[H", 3);
}
Modify your main loop to call editorRefreshScreen():
#include <errno.h> // For errno and EAGAIN
// ... (enableRawMode, disableRawMode, die, editorRefreshScreen functions) ...
int main() {
enableRawMode();
while (1) {
editorRefreshScreen(); // Refresh screen at the start of each loop iteration
char c = '\0'; // Initialized so a timed-out read() never leaves c indeterminate
if (read(STDIN_FILENO, &c, 1) == -1 && errno != EAGAIN) die("read");
if (c == 'q') break;
}
return 0;
}
Now, when you run the program, the screen will be cleared, you'll see tilde characters, and 'q' still exits.
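Each refresh currently issues many small write() calls, which can show up as flicker. A common remedy is to collect the whole frame in memory and write it once. The sketch below is one possible shape for such a buffer. The names abuf, abAppend, and abFree are assumptions chosen for illustration, and the editor code in this post does not use them.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer used to batch screen output. */
struct abuf {
    char *b;     /* heap storage, NULL while empty */
    size_t len;  /* number of bytes currently stored */
};

#define ABUF_INIT {NULL, 0}

/* Appends len bytes from s. Returns 0 on success, or -1 if realloc
 * fails, in which case the buffer is left unchanged. */
int abAppend(struct abuf *ab, const char *s, size_t len) {
    if (len == 0) return 0;
    char *grown = realloc(ab->b, ab->len + len);
    if (grown == NULL) return -1;
    memcpy(grown + ab->len, s, len);
    ab->b = grown;
    ab->len += len;
    return 0;
}

/* Releases the storage and resets the buffer to its empty state. */
void abFree(struct abuf *ab) {
    free(ab->b);
    ab->b = NULL;
    ab->len = 0;
}
```

A refresh function would append every escape sequence and row to the buffer, finish with a single write(STDOUT_FILENO, ab.b, ab.len), and then call abFree().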
Step 3: Reading Keypresses and Handling Special Keys
Reading a single character is straightforward with read(). However, special keys like arrow keys, Home, End, etc., are often sent as "escape sequences" – multiple characters starting with the escape character (ASCII 27, \x1b).
We'll create an editorReadKey() function to abstract this:
enum editorKey {
ARROW_LEFT = 1000, // Assign high values to avoid conflict with actual characters
ARROW_RIGHT,
ARROW_UP,
ARROW_DOWN,
PAGE_UP,
PAGE_DOWN,
HOME_KEY,
END_KEY,
DEL_KEY
};
int editorReadKey() {
int nread;
char c;
while ((nread = read(STDIN_FILENO, &c, 1)) != 1) {
if (nread == -1 && errno != EAGAIN) die("read");
}
if (c == '\x1b') { // If it's an escape character
char seq[3]; // Max 3 chars for common escape sequences
// Try to read two more bytes
if (read(STDIN_FILENO, &seq[0], 1) != 1) return '\x1b';
if (read(STDIN_FILENO, &seq[1], 1) != 1) return '\x1b';
// Check for specific escape sequences
if (seq[0] == '[') {
if (seq[1] >= '0' && seq[1] <= '9') { // Page Up/Down, Home/End, Del
if (read(STDIN_FILENO, &seq[2], 1) != 1) return '\x1b';
if (seq[2] == '~') {
switch (seq[1]) {
case '1': return HOME_KEY;
case '3': return DEL_KEY;
case '4': return END_KEY;
case '5': return PAGE_UP;
case '6': return PAGE_DOWN;
case '7': return HOME_KEY; // Fallback for some terminals
case '8': return END_KEY; // Fallback for some terminals
}
}
} else { // Arrow keys
switch (seq[1]) {
case 'A': return ARROW_UP;
case 'B': return ARROW_DOWN;
case 'C': return ARROW_RIGHT;
case 'D': return ARROW_LEFT;
case 'H': return HOME_KEY; // Some terminals use H for Home
case 'F': return END_KEY; // Some terminals use F for End
}
}
} else if (seq[0] == 'O') { // Older/different terminals for Home/End
switch (seq[1]) {
case 'H': return HOME_KEY;
case 'F': return END_KEY;
}
}
return '\x1b'; // If unrecognized escape sequence, return escape char
} else {
return c; // Regular character
}
}
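The mapping inside editorReadKey() is hard to exercise without a terminal, because it is interleaved with read() calls. As a sketch, the decoding step can be pulled out into a pure function that takes the bytes following the escape character. The function name decodeEscapeSeq and the SK_ prefixed constants are hypothetical. They mirror the article's enum, but are renamed so this block stands on its own.

```c
#include <stddef.h>

/* Key codes for this sketch only; values mirror the article's enum. */
enum sketchKey {
    SK_ESC = 27,
    SK_ARROW_LEFT = 1000,
    SK_ARROW_RIGHT,
    SK_ARROW_UP,
    SK_ARROW_DOWN,
    SK_PAGE_UP,
    SK_PAGE_DOWN,
    SK_HOME,
    SK_END,
    SK_DEL
};

/* Decodes the bytes that followed an escape character (the escape itself
 * is not included in seq). Returns SK_ESC for anything unrecognized. */
int decodeEscapeSeq(const char *seq, size_t len) {
    if (len == 3 && seq[0] == '[' && seq[2] == '~') {
        switch (seq[1]) {
            case '1': case '7': return SK_HOME;
            case '3': return SK_DEL;
            case '4': case '8': return SK_END;
            case '5': return SK_PAGE_UP;
            case '6': return SK_PAGE_DOWN;
        }
    } else if (len == 2 && seq[0] == '[') {
        switch (seq[1]) {
            case 'A': return SK_ARROW_UP;
            case 'B': return SK_ARROW_DOWN;
            case 'C': return SK_ARROW_RIGHT;
            case 'D': return SK_ARROW_LEFT;
            case 'H': return SK_HOME;
            case 'F': return SK_END;
        }
    } else if (len == 2 && seq[0] == 'O') {
        switch (seq[1]) {
            case 'H': return SK_HOME;
            case 'F': return SK_END;
        }
    }
    return SK_ESC;
}
```

For example, decodeEscapeSeq("[A", 2) yields SK_ARROW_UP and decodeEscapeSeq("[3~", 3) yields SK_DEL, matching the table in editorReadKey().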
Step 4: Editor Structure and Processing Keypresses
Let's define a global structure to hold our editor's state and a function to initialize it. For this post, the only new state is the cursor's X and Y coordinates (cx, cy); the struct also takes over the saved terminal attributes that previously lived in a standalone global.
struct editorConfig {
int cx, cy; // Cursor x and y position
// Add other state variables later (e.g., screen rows/cols, text buffer)
struct termios orig_termios; // Original terminal attributes (replaces the earlier global)
};
struct editorConfig E; // Global editor state
void initEditor() {
E.cx = 0;
E.cy = 0;
// Potentially get actual window size here later
}
Now, a function to process incoming keypresses:
void editorMoveCursor(int key) {
switch (key) {
case ARROW_LEFT:
if (E.cx != 0) E.cx--;
break;
case ARROW_RIGHT:
E.cx++; // We'll add bounds checking later
break;
case ARROW_UP:
if (E.cy != 0) E.cy--;
break;
case ARROW_DOWN:
E.cy++; // We'll add bounds checking later
break;
}
}
void editorProcessKeypress() {
int c = editorReadKey();
switch (c) {
case 'q':
// Later, we'll ask user to save changes before quitting
exit(0);
break;
case ARROW_UP:
case ARROW_DOWN:
case ARROW_LEFT:
case ARROW_RIGHT:
editorMoveCursor(c);
break;
}
}
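The movement code above lets the cursor run off the right and bottom edges. Here is a minimal sketch of what the deferred bounds checking might look like, assuming the screen size is passed in as rows and cols. The names moveKey, cursorPos, and moveCursorClamped are hypothetical and kept separate from the editor's own globals.

```c
/* Key codes for this sketch only; values mirror the article's arrow keys. */
enum moveKey {
    MOVE_LEFT = 1000,
    MOVE_RIGHT,
    MOVE_UP,
    MOVE_DOWN
};

struct cursorPos {
    int cx, cy; /* 0-indexed column and row */
};

/* Moves the cursor one cell in the requested direction, keeping it inside
 * a screen of the given size. Unknown keys leave the position unchanged. */
void moveCursorClamped(struct cursorPos *p, int key, int rows, int cols) {
    switch (key) {
        case MOVE_LEFT:
            if (p->cx > 0) p->cx--;
            break;
        case MOVE_RIGHT:
            if (p->cx < cols - 1) p->cx++;
            break;
        case MOVE_UP:
            if (p->cy > 0) p->cy--;
            break;
        case MOVE_DOWN:
            if (p->cy < rows - 1) p->cy++;
            break;
    }
}
```

Taking the position and screen size as parameters, rather than reading a global, keeps the function easy to test. The editor could just as well apply the same comparisons to E.cx and E.cy directly.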
Step 5: Updating the Screen with Cursor Position
Our editorRefreshScreen() needs to be updated to use the stored cursor position and place the cursor accordingly. The ANSI escape code for positioning the cursor is \x1b[{ROW};{COL}H. Remember that ANSI rows/columns are 1-indexed, while our cx/cy are 0-indexed.
#include <string.h> // For strlen (snprintf comes from <stdio.h>)
void editorRefreshScreen() {
write(STDOUT_FILENO, "\x1b[?25l", 6); // Hide cursor (optional, but good for drawing)
write(STDOUT_FILENO, "\x1b[H", 3); // Position cursor at top-left
int y;
for (y = 0; y < 24; y++) {
write(STDOUT_FILENO, "~\r\n", 3);
}
// Position cursor at E.cx, E.cy
char buf[32];
snprintf(buf, sizeof(buf), "\x1b[%d;%dH", E.cy + 1, E.cx + 1);
write(STDOUT_FILENO, buf, strlen(buf));
write(STDOUT_FILENO, "\x1b[?25h", 6); // Show cursor
}
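The off-by-one between our 0-indexed coordinates and the terminal's 1-indexed rows and columns is a frequent source of bugs, so it can help to isolate the conversion. The helper below is a sketch. formatCursorPos is a name we made up, and its error convention (returning -1 on truncation) is our own choice.

```c
#include <stdio.h>

/* Hypothetical helper: formats the cursor-position escape sequence for a
 * 0-indexed (cx, cy) into buf. Returns the number of bytes to write, or
 * -1 if the buffer is too small or formatting failed. */
int formatCursorPos(char *buf, size_t size, int cx, int cy) {
    /* The terminal expects row first, then column, both 1-indexed. */
    int n = snprintf(buf, size, "\x1b[%d;%dH", cy + 1, cx + 1);
    if (n < 0 || (size_t)n >= size) return -1;
    return n;
}
```

Because the function returns the length, the caller can pass it straight to write() without a separate strlen() call.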
Putting It All Together: The Main Loop
Finally, we integrate all these pieces into our main function.
#include <stdio.h>
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>
#include <errno.h>
#include <string.h> // For strlen
// --- Defines and Global Variables ---
#define KILO_VERSION "0.0.1" // Editor version (or your editor's name)
void die(const char *s) {
perror(s);
exit(1);
}
struct editorConfig {
int cx, cy; // Cursor x and y position
struct termios orig_termios; // Original terminal attributes
};
struct editorConfig E;
enum editorKey {
ARROW_LEFT = 1000,
ARROW_RIGHT,
ARROW_UP,
ARROW_DOWN,
PAGE_UP,
PAGE_DOWN,
HOME_KEY,
END_KEY,
DEL_KEY
};
// --- Terminal Input/Output ---
void disableRawMode() {
if (tcsetattr(STDIN_FILENO, TCSAFLUSH, &E.orig_termios) == -1)
die("tcsetattr");
}
void enableRawMode() {
if (tcgetattr(STDIN_FILENO, &E.orig_termios) == -1)
die("tcgetattr");
atexit(disableRawMode);
struct termios raw = E.orig_termios;
raw.c_iflag &= ~(BRKINT | ICRNL | INPCK | ISTRIP | IXON);
raw.c_oflag &= ~(OPOST);
raw.c_lflag &= ~(ECHO | ICANON | IEXTEN | ISIG);
raw.c_cc[VMIN] = 0;
raw.c_cc[VTIME] = 1;
if (tcsetattr(STDIN_FILENO, TCSAFLUSH, &raw) == -1)
die("tcsetattr");
}
int editorReadKey() {
int nread;
char c;
while ((nread = read(STDIN_FILENO, &c, 1)) != 1) {
if (nread == -1 && errno != EAGAIN) die("read");
}
if (c == '\x1b') {
char seq[3];
if (read(STDIN_FILENO, &seq[0], 1) != 1) return '\x1b';
if (read(STDIN_FILENO, &seq[1], 1) != 1) return '\x1b';
if (seq[0] == '[') {
if (seq[1] >= '0' && seq[1] <= '9') {
if (read(STDIN_FILENO, &seq[2], 1) != 1) return '\x1b';
if (seq[2] == '~') {
switch (seq[1]) {
case '1': return HOME_KEY;
case '3': return DEL_KEY;
case '4': return END_KEY;
case '5': return PAGE_UP;
case '6': return PAGE_DOWN;
case '7': return HOME_KEY;
case '8': return END_KEY;
}
}
} else {
switch (seq[1]) {
case 'A': return ARROW_UP;
case 'B': return ARROW_DOWN;
case 'C': return ARROW_RIGHT;
case 'D': return ARROW_LEFT;
case 'H': return HOME_KEY;
case 'F': return END_KEY;
}
}
} else if (seq[0] == 'O') {
switch (seq[1]) {
case 'H': return HOME_KEY;
case 'F': return END_KEY;
}
}
return '\x1b';
} else {
return c;
}
}
// --- Editor Operations ---
void editorMoveCursor(int key) {
switch (key) {
case ARROW_LEFT:
if (E.cx != 0) E.cx--;
break;
case ARROW_RIGHT:
E.cx++; // Simplified, will add bounds checking
break;
case ARROW_UP:
if (E.cy != 0) E.cy--;
break;
case ARROW_DOWN:
E.cy++; // Simplified, will add bounds checking
break;
}
}
// --- Editor Drawing ---
void editorDrawRows() {
int y;
for (y = 0; y < 24; y++) { // Hardcoded 24 lines for now
if (y == 24 / 3) { // Simple "welcome message"
char welcome[80];
int welcomelen = snprintf(welcome, sizeof(welcome),
"Kilo editor -- version %s", KILO_VERSION);
if (welcomelen > 79) welcomelen = 79; // snprintf stores at most 79 characters plus the NUL
int padding = (80 - welcomelen) / 2;
if (padding) {
write(STDOUT_FILENO, "~", 1);
padding--;
}
while (padding--) write(STDOUT_FILENO, " ", 1);
write(STDOUT_FILENO, welcome, welcomelen);
} else {
write(STDOUT_FILENO, "~", 1);
}
write(STDOUT_FILENO, "\x1b[K", 3); // Clear line from cursor to end
if (y < 24 - 1) write(STDOUT_FILENO, "\r\n", 2); // No newline after the last row, or a 24-line screen scrolls
}
}
void editorRefreshScreen() {
write(STDOUT_FILENO, "\x1b[?25l", 6); // Hide cursor
write(STDOUT_FILENO, "\x1b[H", 3); // Position cursor at top-left
editorDrawRows();
char buf[32];
snprintf(buf, sizeof(buf), "\x1b[%d;%dH", E.cy + 1, E.cx + 1);
write(STDOUT_FILENO, buf, strlen(buf));
write(STDOUT_FILENO, "\x1b[?25h", 6); // Show cursor
}
// --- Input Processing ---
void editorProcessKeypress() {
int c = editorReadKey();
switch (c) {
case 'q':
case CTRL_KEY('q'): // Using Ctrl+Q to quit
exit(0);
break;
case ARROW_UP:
case ARROW_DOWN:
case ARROW_LEFT:
case ARROW_RIGHT:
editorMoveCursor(c);
break;
}
}
// --- Init ---
void initEditor() {
E.cx = 0;
E.cy = 0;
// We'll get actual screen dimensions later
}
// --- Main ---
int main() {
enableRawMode();
initEditor();
while (1) {
editorRefreshScreen();
editorProcessKeypress();
}
return 0;
}
To compile and run:
gcc -Wall -Wextra -pedantic editor.c -o editor
./editor
You should now have a basic terminal application that clears the screen, displays tildes, shows a welcome message, and allows you to move a blinking cursor around with the arrow keys. Press 'q' or 'Ctrl+Q' to quit.
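Note: The listing above uses CTRL_KEY('q') without defining the macro, so it will not compile as shown. Add #define CTRL_KEY(k) ((k) & 0x1f) near the top, for example next to KILO_VERSION. The macro clears the upper three bits of the character, which mirrors what holding Ctrl does to a key in the terminal, so CTRL_KEY('q') evaluates to 17.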
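To see what the macro computes, here is a small standalone sketch. The helper isQuitKey is hypothetical and simply restates the two quit cases from editorProcessKeypress().

```c
/* Keeps only the low five bits, which is how the terminal maps
 * Ctrl+letter to a control code (Ctrl+A is 1, Ctrl+Z is 26). */
#define CTRL_KEY(k) ((k) & 0x1f)

/* Hypothetical helper: returns 1 if the key should quit the editor. */
int isQuitKey(int key) {
    return key == 'q' || key == CTRL_KEY('q');
}
```

Since the mask discards the bit that distinguishes upper and lower case, CTRL_KEY('q') and CTRL_KEY('Q') give the same value.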
Beyond the Basics: What's Next?
This post provides a foundational understanding of how to interact with the terminal at a low level. A real text editor, however, requires much more:
- Terminal Size Detection: Dynamically getting the actual rows and columns of the terminal (using ioctl and TIOCGWINSZ).
- Text Buffer: An efficient data structure to store and manage the actual text content (e.g., an array of strings, a linked list of lines).
- Scrolling: Handling text that extends beyond the screen's visible area.
- File I/O: Loading content from files and saving changes back.
- Editing Operations: Inserting, deleting, and modifying characters and lines.
- Status Bar and Message Bar: Displaying information to the user.
- Syntax Highlighting: Making code more readable.
- Search and Replace: Functionality to find and modify text.
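As a taste of the first item, terminal size detection could look roughly like the sketch below. It assumes a system that provides TIOCGWINSZ through <sys/ioctl.h>, as Linux and macOS do. The function name getWindowSize and its return convention are our own choices.

```c
#include <sys/ioctl.h>

/* Queries the terminal attached to fd for its size. Returns 0 and fills
 * in rows and cols on success. Returns -1 and leaves them untouched if
 * the ioctl fails (for example, fd is not a terminal) or reports zero
 * columns. */
int getWindowSize(int fd, int *rows, int *cols) {
    struct winsize ws;
    if (ioctl(fd, TIOCGWINSZ, &ws) == -1 || ws.ws_col == 0) return -1;
    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```

Leaving the outputs untouched on failure lets the caller preload defaults such as 24 and 80, which matches the hardcoded size used throughout this post.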
Building a text editor is a truly rewarding project that teaches you a lot about operating systems, data structures, and user interfaces. This is just the beginning of what you can achieve by leveraging the power of C and direct terminal control.