Debugging C Programs with GDB – Part 1

When you write C code, you’re playing with power! You’re bound to let this power go to your head and shoot yourself in the foot here and there. At some point(s) your program is going to do something that just doesn’t quite make sense.

The bad news is that your program doesn’t make any sense because you’ve written flaws into it. That’s fine: everyone has either written janky C programs or not written any C. The good news is that GDB is here to help us learn from our mistakes!

Through the next few posts I’ll share some tips on basic GDB usage, explore a bit of history and dig more into how the C programs on my machine are actually working.

Building for Debugging

To kick things off, I’m going to just slap together a quick C program and a Makefile to assist in building it and running my debugger.

// test.c
#include <stdio.h>

char *done = "Done!";

int main(int argc, char *argv[]) {
  int i;

  for (i = 0; i < 10; i++) {
    printf("Iteration %d\n", i);
  }
  printf("%s\n", done);

  return 0;
}

This program has a simple for loop and a few print statements and I’ll use GDB to inspect what it’s doing a bit more. To provide more information to the debugger about this program I’ll use the -g flag when building it.

# Makefile
CC=gcc -g -o $@ -Wall $<

all: test

test: test.c
	$(CC)

debug: test
	gdb -q ./test

For maximum laziness, I added a debug target to my Makefile here so that I can use make debug to jump right in. I gave gdb the -q option to quiet it down since it normally has a lot to say on startup.

That’s about all I need to get my program ready for debugging!

Basic Commands

Now we get to the hard part. GDB has a bajillion features so getting started can be daunting. Probably one of the best commands to learn first is the run command. So far GDB has loaded the program and looked it over a little bit, but the program isn’t actually running yet.

You can also provide arguments to the program by providing arguments to run. This program doesn’t care about arguments, but don’t let that stop you from giving it some anyway!

The excitement of just running a program in GDB is very short lived. I want to be able to stop the program somewhere and poke around a bit. The list command can spit out a listing of the program.

Initially gdb will show the first 10 lines of the source. You could run list again to see the next 10 lines but GDB has a friendly feature where hitting enter will automatically rerun your last command, so I used that to continue reading the full source.

Looking at this listing, I think a good place to pause and look around would be at the printf() call within my for loop. To have GDB stop here I’ll use the break command and I’ll give it the argument 10 to indicate I’d like to set a breakpoint at line 10.

Now when I give it a run, it’ll stop the program when it hits that line.

To resume the program, until the next breakpoint is hit, you can use the continue command. Another little time-saver trick with gdb is that many commands have shortcuts, such as c for continue.

Peeking Into The Code

The ability to set breakpoints and resume execution is a good start, but even better is getting a look around at this point in time to glean more about what the program is doing. It’s time to start looking beyond the C code and see what the program is actually doing in assembly, the state of the CPU in the context of our program and what’s going on in memory.

First let’s look at the assembly version of the main function. I’ll use the disassemble command for that, and I’ll tell it that main is what I’m interested in disassembling.

Oh noes! Assembly!

Assembly code gets a bad rap, but it’s not as bad as people think it is. You might not want to write a large application in assembly, and that’s reasonable, but if you want to be a strong C programmer you need to know enough assembly to figure out what your program is up to.

x86_64 assembly has two different syntaxes to choose from, AT&T syntax and Intel syntax. They both work just fine but GDB defaults to AT&T syntax and I prefer the Intel syntax so I’ll use the command set disassembly-flavor intel to get it to my liking.

That looks better! Now let’s briefly look at a few things. Looks like my main function is 21 instructions long, alright… a smidge more than half of the operations are mov (move) instructions and I see a few branching operations, jmp (jump), call (call a subroutine), jle (jump if less than or equal to) and ret (return from subroutine).

One thing I find interesting is the instruction at offset <+64>, call 0x400430 <puts@plt>. I did not use the puts() function in my code! The compiler noticed that my last printf() call only prints a string followed by a newline, so no real format processing is needed, and it swapped in the simpler puts() call instead.

Let’s get back to inspecting what this program is up to. I’m currently still in the middle of my paused program, at the very start of one of my loop iterations. In this disassembly output I can see I’m at offset <+24>, as indicated by the little => arrow. This is the next instruction the program will run.

The mov instruction moves a value from one place to another, similar to the assignment operator = in most programming languages. In this case the full instruction is mov eax,DWORD PTR [rbp-0x4] which is basically eax = DWORD PTR [rbp - 0x4]. Ignoring the right side of that for now, we’re assigning a value to something called eax. This eax thing is a CPU register, which is basically a variable in the hardware of the CPU. We can look at all the registers with the info command by saying info registers.

Okay so there are a bunch of registers, and eax is not one of them… GREAT! This is because the x86 architecture has been through a lot. Way back in the day (early 70s) Intel released their 8008 CPU, which had some 8-bit registers with names like A (for Accumulator).

When Intel got to the 8086 in the late 70s they made the A register twice the size (16 bits) and started calling it the AX register. To help with software compatibility with older systems, the AX register could also be used as two 8-bit registers, with AH representing the higher 8 bits and AL the lower 8 bits.

Then the mid-80s showed up and Intel was like MOAR BITS and released their 80386 with 32-bit registers. Now they refer to the A register as EAX (there’s our guy!), again preserving backward compatibility by allowing the 16 and 8 bit registers to remain the same. Nowadays 64-bit processors are king, so we have the 64-bit register RAX, but can still use EAX, AX, AH, and AL.

All that history lesson to give full context on why mov eax, <stuff> is going to modify our rax register!

Now, to run just that one instruction, I’ll use the nexti command. I’ll then check the registers again with the shorthand version of info registers and just look at the eax register: i r eax

If I continue my program, I’ll notice that this number correlates with something in my program.

The eax register is getting set to the i value I’m setting during my for loop!

In the next post I’ll continue digging into this program and discover more about the disassembled version of my C program and show off some more GDB commands along the way!

Barebones Linux ISO

My post on Building a Barebones Linux System has been the most popular post on my blog so far. Last week a reader left a comment asking how the system described in that post could be loaded onto an ISO file along with GRUB to make it into a bootable image. I think that’s a pretty good follow-up so today I will share the process on how to do just that!

Continuing the Barebones Journey

It’s been a bit since I posted the original Barebones Linux post. In Getting Busybox With It I added BusyBox to the system and structured my build into a Makefile to preserve that effort (available on my GitHub). That is what I’ll use as the starting point for this post.

Back then I used Linux kernel version 4.13.3 and BusyBox version 1.27.2. While I’m pulling this up again I’ll change a few of my Makefile variables to use the latest versions as of today, Linux 4.15 and BusyBox 1.28.0. After updating my versions I let the Makefile run and realized I needed to update my BusyBox config. I opted for the changes that the BusyBox build system suggested and gave it another test in QEMU to make sure it was all good to go.

Now to build an ISO for it! I looked to see what resources I could find on how to get that done. I checked what osdev.org had to offer and found a lot of good info there on how grub-mkrescue works. I took a peek at the grub2 manual and it looks easier than I expected it would be.

Creating a GRUB ISO

As an initial test I wanted to verify that a normal grub2 rescue image would boot in QEMU. I used grub-mkrescue -o test.iso to build this image.

Then to test it I ran qemu-system-x86_64 -m 2048 -cdrom test.iso -boot d.

Easy enough. There is no grub.cfg so it gives you this basic rescue shell, but I’m glad it works. The grub documentation and examples I found show that you can give grub-mkrescue a directory that it will include in the built ISO, and I decided to give that a shot.

I returned to my barebones repo directory and created a directory iso to store my files that I’d like to go in the ISO, and the boot/grub directory within that to store my grub config.

I copied my built vmlinuz and initramfs files to iso/boot/ and wrote the smallest grub.cfg to test it.

menuentry 'barebones' {
  linux /boot/vmlinuz
  initrd /boot/initramfs
}

With all the files where they should be I run grub-mkrescue again to make the build.

Then I’ll give my barebones.iso a test in QEMU and there’s my menu entry:

I selected my barebones OS and let it do its thing:

Huzzah! IT’S ALIVE!

I thought there was going to be a bit more to this but I was pleasantly surprised!

Making with Make

I enjoy writing in a variety of low level and compiled languages. One of the tools I use almost every time regardless of the language, or sometimes mix of languages, is make. In this article, I want to share some of the ways that make can be used and some of the tips and tricks I employ the most when using make.

I’ll be writing and running these examples on my Ubuntu 16.04.3 laptop with GNU Make 4.1.

Make Without Makefiles

I almost always use make with a Makefile, as it’s significantly more customizable that way, but I think it’s important to know that make does have some usage you can employ even without a Makefile.

Without a Makefile, make can still follow some of its implicit rules for building some files. Let’s say we have the following test.cpp file ready to build:

#include <stdio.h>

int main() {
  printf("Hello from test.cpp!\n");
  return 0;
}

You could build the test program with a simple call to make test.

Or if you’d like to build a test.o object file, you can do make test.o and if that’s present when you do make test it’ll build the binary using the already built object file.

There is a pretty good smattering of things you can build this way; check the documentation if you’re interested in more detail on the rules and supported targets.



Looking at Makefiles

Most projects will have a depth beyond what make is able to determine with all its smarty-pantsness, and in those cases the beloved Makefile is there to make life easy.

If you’ve pulled down and built a common open source project and looked at the Makefile, or generated one with a tool like autotools or cmake, you may have looked at it with much confusion.

As an example I’ll look at libuv, my favorite cross platform asynchronous I/O library. After cloning the repo down, running ./autogen.sh to generate the build configuration script, then running the ./configure script I get a nearly 5000 line Makefile. To me it looks like mostly gibberish and some tests. In all fairness there are a lot of good things happening in there but it’s not good for learning how to write a Makefile.

A Makefile doesn’t always need to be that cray-cray. For my own projects I try to keep it pretty simple, though over time it generally becomes more complex. An example of mine, from my post on building a barebones Linux system, is on my GitHub. I’m not the only person crazy enough to stick with a handwritten Makefile; the Redis database also uses a handwritten Makefile, and Redis is production quality and awesome AF.

Let’s start looking at the basics of making your own Makefile!

Makefile Basics

Most of the time when you use make, it will be looking for a file named Makefile to find your targets. If you run make without a Makefile, you’ll be greeted with this lovely message:

make: *** No targets specified and no makefile found. Stop.

If you had, say, an empty Makefile, you’d see something along the lines of:

make: *** No targets. Stop.

The first thing you should be aware of regarding Makefile syntax is that tabs are part of the syntax! I’ve seen a few developers start building a Makefile and be like “WTF!” when nothing works because their text editor is configured to insert spaces when they hit tab.

For the record, I use both 😀

The general format of a Makefile is a list of targets with optional dependencies and commands:

<target>: [dependency] [dependency]
<tab>[command]
<tab>[command]

As an example, I’ll define a target test that will not have dependencies and that target will run some echo commands.

test:
	echo test!
	echo IT WERKS!!!!

With this in my Makefile, if I run make with no arguments it’ll run my first target test. The common convention is for a Makefile to start with the target all.

If I add a second target moartest and I want to run that one, I’ll need to specify it during my command as make moartest.

test:
	echo test!
	echo IT WERKS!!!!

moartest:
	echo woah now, so fancy

Makefile Dependencies

One of my favorite things about make is the way it handles dependencies. If you’re using it for building a project you can organize the steps however you’d like and structure a hierarchy where one step runs before another.

I’ll extend my previous example to add my moartest target as a dependency of the test target.

test: moartest
	echo test!
	echo IT WERKS!!!!

moartest:
	echo woah now, so fancy

Now when I run make, test will be inspected since it is the first target. Since test has moartest in its list of dependencies, make will first run that target, and if its commands execute successfully the commands for test will be run as well.

If for some reason the dependency commands should fail, make will error out at that point. To simulate this I will add an exit 1 command to my moartest target.

test: moartest
	echo test!
	echo IT WERKS!!!!

moartest:
	echo woah now, so fancy
	exit 1

If the target name is a file, that file already exists, and none of its dependencies are newer than it, the target will be skipped. Here’s an example where my randomcrap target generates a file that’s a dependency of my test target.

test: randomcrap
	echo we have random!

randomcrap:
	dd if=/dev/urandom of=randomcrap bs=1024 count=1

A dependency doesn’t need to be another target; in many cases it’s useful for a dependency to be some source file. make will look to see if that source file has been updated and will re-run the target only when it seems necessary.

Consider this example:

test: copiedfile
	echo we have the latest copy!

copiedfile: originalfile
	cp originalfile copiedfile

And observe how make responds to the absence of the source file, how it skips the copy when the copy is already up to date with the original (make compares modification times, not contents), and how updates to the original are noticed during the subsequent run.



Variables in Makefiles

It is often useful to have some variables in your Makefile. Variables can be set with the NAME=VALUE syntax. In my first example here I’ll compile the following hello.c program:

#include <stdio.h>

int main(int argc, char *argv[]) {
  printf("Well hello you proverbial world you.\n");
  return 0;
}

To make my compilation of the program a bit more flexible I’ll make a COMPILER variable to set up which compiler I’d like it to use.

COMPILER = gcc

all: hello-world

hello-world: hello.c
	$(COMPILER) -o hello-world hello.c

Now if I wanted to switch my various build targets to use clang instead, I can just modify my COMPILER variable.

I could even move my program name and source file to their own variables, and reference that variable as my target and its dependencies.

COMPILER = clang
PROGRAM = hello-world
SOURCE = hello.c

all: $(PROGRAM)

$(PROGRAM): $(SOURCE)
	$(COMPILER) -o $(PROGRAM) $(SOURCE)

Outside of the variables you define yourself, there are also some automatic variables that can be pretty handy.

The three I use the most are $@ which if used in a command will be the name of the target, $< which will be the first dependency for that target and $^ which will be all of the dependencies for the target.

all: automagic

# Assumes files named automation and magic already exist in this directory.
automagic: automation magic
	echo "target: $@"
	echo "first dependency: $<"
	echo "all dependencies: $^"

These variables can be combined in interesting and useful ways. The automatic variables can even be embedded in your normal variables. Let’s say we have some C program that has a header and a code file. You’d want to rebuild the program if the header was changed but not include the header as an argument to the compiler. You could make your own compiler rule that includes most of the settings you want and define a pattern where only the first dependency is included in the commands for the target.

PROGRAM = myprogram
COMPILE_PROGRAM = gcc -Wall -o $@ $<

all: $(PROGRAM)

$(PROGRAM): main.c main.h
	$(COMPILE_PROGRAM)

Additional Command-fu

There are a few other things I think are useful to know when writing the commands for the targets.

So far we’ve seen every command echoed before it runs, which is normally quite handy for debugging. If you feel like making your output a little prettier you can start a command with @ to stop make from echoing it, so only the command’s own output is shown.

all:
	@echo "one moment"
	@sleep 5
	@echo "okay i'm back"
	@sleep 2

Another good thing to be aware of is that each command line is run in its own shell, starting from your current working directory, so a cd on one line does not carry over to the next. If you want to do something like make a directory, jump into it and do more work inside of it, you’ll have to run multiple commands in a single go.

all:
	@mkdir subdirectory
	@cd subdirectory
	@pwd
	@cd subdirectory; pwd

If you find yourself with really long lines in your Makefile you can always add a backslash (\) at the end of a line to continue it on the next one. make treats the pieces as a single line, so the break is purely there to make things pretty.

And that will wrap up my post on Makefiles! I hope you find this useful and I’d love to receive your questions and feedback in the comments. Keep Tinkering!