This chapter begins by addressing many of the basics of a Mac OS X system. This includes the general architecture and the tools necessary to deal with the architecture. It then addresses some of the security improvements that come with version 10.5 "Leopard", the most recent version of Mac OS X. Many of these security topics will be discussed in great detail throughout this book.
Basics
Before we dive into the tools, techniques, and security of Mac OS X, we need to start by discussing how it is put together. To understand the details of Leopard, you need first to understand how it is built, from the ground up. As depicted in Figure 1-1, Mac OS X is built as a series of layers, including the XNU kernel and the Darwin operating system at the bottom, and the Aqua interface and graphical applications on the top. The important components will be discussed in the following sections.
XNU
The heart of Mac OS X is the XNU kernel. XNU is basically composed of a Mach core (covered in the next section) with supplementary features provided by Berkeley Software Distribution (BSD). Additionally, XNU is responsible for providing an environment for kernel drivers called the I/O Kit. We'll talk about each of these in more detail in upcoming sections. XNU is a Darwin package, so all of the source code is freely available. Therefore, it is completely possible to install the same kernel used by Mac OS X on any machine with supported hardware; however, as Figure 1-1 illustrates, there is much more to the user experience than just the kernel.
From a security researcher's perspective, Mac OS X feels just like a FreeBSD box with a pretty windowing system and a large number of custom applications. For the most part, applications written for BSD will compile and run without modification on Mac OS X. All the tools you are accustomed to using in BSD are available in Mac OS X. Nevertheless, the fact that the XNU kernel contains all the Mach code means that some day, when you have to dig deeper, you'll find many differences that may cause you problems and some you may be able to leverage for your own purposes. We'll discuss some of these important differences briefly; for more detailed coverage of these topics, see Mac OS X Internals: A Systems Approach (Addison-Wesley, 2006).
Mach
Mach, developed at Carnegie Mellon University by Rick Rashid and Avie Tevanian, originated as a UNIX-compatible operating system back in 1984. One of its primary design goals was to be a microkernel; that is, to minimize the amount of code running in the kernel and allow many typical kernel functions, such as file system, networking, and I/O, to run as user-level Mach tasks. In earlier Mach-based UNIX systems, the UNIX layer ran as a server in a separate task. However, in Mac OS X, Mach and the BSD code run in the same address space.
In XNU, Mach is responsible for many of the low-level operations you expect from a kernel, such as processor scheduling, multitasking, and virtual-memory management.
BSD
The kernel also includes a large chunk of code derived from the FreeBSD code base. As mentioned earlier, this code runs as part of the kernel along with Mach and uses the same address space. The FreeBSD code within XNU may differ significantly from the original FreeBSD code, as changes had to be made for it to coexist with Mach. FreeBSD provides many of the remaining operations the kernel needs, including:
* Processes
* Signals
* Basic security, such as users and groups
* System call infrastructure
* TCP/IP stack and sockets
* Firewall and packet filtering
To get an idea of just how complicated the interaction between these two sets of code can be, consider the idea of the fundamental executing unit. In BSD the fundamental unit is the process. In Mach it is a Mach thread. The disparity is settled by associating each BSD-style process with a Mach task that contains one or more Mach threads; a freshly forked process starts with exactly one. When the BSD fork() system call is made, the BSD code in the kernel uses Mach calls to create a task and thread structure. It is also important to note that the Mach and BSD layers have different security models. The Mach security model is based on port rights, and the BSD model is based on process ownership. Disparities between these two models have resulted in a number of local privilege-escalation vulnerabilities. Additionally, besides typical system calls, there are Mach traps that allow user-space programs to communicate with the kernel.
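To make the BSD side of this concrete, here is a minimal Python sketch that uses os.fork(), a thin wrapper over the fork() system call. The function name fork_and_wait and the exit code 7 are illustrative choices, not something from the text. The Mach task and thread that the kernel builds for the child are not visible at this level; the sketch only shows the process abstraction that the BSD layer presents to user space.

```python
import os


def fork_and_wait(child_exit_code=7):
    """Fork a child that exits at once; return (child_pid, exit_code) in the parent.

    Hypothetical helper for illustration only.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: leave immediately without running parent cleanup handlers.
        os._exit(child_exit_code)
    # Parent process: block until the child terminates, then decode its status.
    _, status = os.waitpid(pid, 0)
    return pid, os.WEXITSTATUS(status)


if __name__ == "__main__":
    child_pid, code = fork_and_wait()
    print("child", child_pid, "exited with code", code)
```

From the caller's point of view only a new process ID appears; whatever task and thread bookkeeping happens underneath stays inside the kernel.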
I/O Kit
I/O Kit is the open-source, object-oriented, device-driver framework in the XNU kernel and is responsible for the addition and management of dynamically loaded device drivers. These drivers allow modular code to be added to the kernel dynamically, for example to support different hardware. The available drivers are usually stored in the /System/Library/Extensions/ directory or a subdirectory. The command kextstat will list all the currently loaded drivers:
$ kextstat
Index Refs Address Size Wired Name (Version)
Many of the entries in this list say they are loaded at address zero. This just
means they are part of the kernel proper and aren't really device drivers; that is,
they cannot be unloaded. The first actual driver is number 17.
Besides kextstat, there are other commands you'll need to know for loading
and unloading these drivers. Suppose you wanted to find and load the driver
associated with the MS-DOS file system. First you can use the kextfind tool to
find the correct driver.
$ kextfind -bundle-id -substring 'msdos'
/System/Library/Extensions/msdosfs.kext
Now that you know the name of the kext bundle to load, you can load it into
the running kernel.
$ sudo kextload /System/Library/Extensions/msdosfs.kext
kextload: /System/Library/Extensions/msdosfs.kext loaded successfully
It seemed to load properly. You can verify this and see where it was loaded.
$ kextstat | grep msdos
126 0 0x346d5000 0xc000 0xb000 com.apple.filesystems.msdosfs (1.5.2) <7 6 5 2>
It is the 126th driver currently loaded. There are zero references to it (not surprising,
since it wasn't loaded before we loaded it). It has been loaded at address
0x346d5000 and has size 0xc000. This driver occupies 0xb000 wired bytes of
kernel memory. Next it lists the driver's name and version. It also lists the indices
of the other kernel extensions that this driver refers to. In this case, looking at the
full listing of kextstat, we see it refers to the "unsupported" mach, libkern, and
bsd drivers. Finally, we can unload the driver.
$ sudo kextunload com.apple.filesystems.msdosfs
kextunload: unload kext /System/Library/Extensions/msdosfs.kext succeeded
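The column layout of a kextstat entry described above can be made concrete with a short Python sketch. It assumes each entry sits on a single line in exactly the layout of the msdosfs sample (index, refs, address, size, wired, name, version, optional dependency list); the function name parse_kextstat_line and the dictionary keys are hypothetical, chosen only for illustration.

```python
import re

# Assumed layout: index refs address size wired name (version) <deps>
KEXTSTAT_LINE = re.compile(
    r"^\s*(?P<index>\d+)\s+(?P<refs>\d+)\s+(?P<address>0x[0-9a-fA-F]+)\s+"
    r"(?P<size>0x[0-9a-fA-F]+)\s+(?P<wired>0x[0-9a-fA-F]+)\s+"
    r"(?P<name>\S+)\s+\((?P<version>[^)]+)\)(?:\s+<(?P<deps>[\d\s]+)>)?\s*$"
)


def parse_kextstat_line(line):
    """Return a dict describing one kextstat entry, or None if the line does not match."""
    match = KEXTSTAT_LINE.match(line)
    if match is None:
        return None
    deps = match.group("deps")
    return {
        "index": int(match.group("index")),
        "refs": int(match.group("refs")),
        "address": int(match.group("address"), 16),
        "size": int(match.group("size"), 16),
        "wired": int(match.group("wired"), 16),
        "name": match.group("name"),
        "version": match.group("version"),
        # Indices of the kernel extensions this one refers to.
        "linked_against": [int(d) for d in deps.split()] if deps else [],
    }


if __name__ == "__main__":
    sample = ("126 0 0x346d5000 0xc000 0xb000 "
              "com.apple.filesystems.msdosfs (1.5.2) <7 6 5 2>")
    print(parse_kextstat_line(sample))
```

The header row is rejected because it does not start with a number, so the function can be mapped over the full listing and the None results dropped.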
Darwin and Friends
A kernel without applications isn't very useful. That is where Darwin comes
in. Darwin is the non-Aqua, open-source core of Mac OS X. Basically it is all
the parts of Mac OS X for which the source code is available. The code is made
available in the form of a package that is easy to install. There are hundreds of
available Darwin packages, such as X11, GCC, and other GNU tools. Darwin
provides many of the applications you may already use in BSD or Linux for
Mac OS X. Apple has spent significant time integrating these packages into
their operating system so that everything behaves nicely and has a consistent
look and feel when possible.
On the other hand, many familiar pieces of Mac OS X are not open source.
The main missing piece to someone running just the Darwin code will be Aqua,
the Mac OS X windowing and graphical-interface environment. Additionally,
most of the common high-level applications, such as Safari, Mail, QuickTime,
iChat, etc., are not open source (although some of their components are open
source). Interestingly, these closed-source applications often rely on open-source
software; for example, Safari relies on the WebKit project for HTML
and JavaScript rendering. Perhaps for this reason, you also typically have
many more symbols in these applications when debugging than you would
in a Windows environment.
Tools of the Trade
Many of the standard Linux/BSD tools work on Mac OS X, but not all of them. If
you haven't already, it is important to install the Xcode package, which contains
the system compiler (gcc) as well as many other tools, like the GNU debugger
gdb. One of the most powerful tools that comes on Mac OS X is the object file
displaying tool (otool). This tool fills the role of ldd, nm, objdump, and similar
tools from Linux. For example, the -L option lists the dynamically linked
libraries needed by a binary.
$ otool -L /bin/ls
/bin/ls:
/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
/usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.0.0)
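When a library list like this needs to be processed by a script, the text can be split into fields. The Python sketch below is one way to do that, assuming each library appears on a single unwrapped line in the "path (compatibility version X, current version Y)" form shown in the listing. The name parse_otool_libraries and the SAMPLE string are hypothetical; SAMPLE simply repeats the /bin/ls listing.

```python
import re

# Assumed form: <path> (compatibility version X, current version Y)
LIBRARY_LINE = re.compile(
    r"^\s*(?P<path>.+?) \(compatibility version (?P<compat>[\d.]+), "
    r"current version (?P<current>[\d.]+)\)\s*$"
)


def parse_otool_libraries(output):
    """Return (path, compatibility_version, current_version) tuples from otool -L text."""
    libraries = []
    for line in output.splitlines():
        match = LIBRARY_LINE.match(line)
        if match:
            libraries.append(
                (match.group("path"), match.group("compat"), match.group("current"))
            )
    return libraries


SAMPLE = """/bin/ls:
\t/usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
\t/usr/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.0.0)
"""

if __name__ == "__main__":
    for path, compat, current in parse_otool_libraries(SAMPLE):
        print(path, compat, current)
```

The first line of the listing, which names the binary itself, is skipped because it carries no version information.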
To get a disassembly listing, you can use the -tv option.
$ otool -tv /bin/ps
/bin/ps:
(__TEXT,__text) section
00001bd0 pushl $0x00
00001bd2 movl %esp,%ebp
00001bd4 andl $0xf0,%esp
00001bd7 subl $0x10,%esp
...
You'll see many references to other uses for otool throughout this book.
Ktrace/DTrace
You must be able to trace execution flow for processes. Before Leopard, this
was the job of the ktrace command-line application. ktrace allows kernel trace
logging for the specified process or command. For example, tracing the system
calls of the ls command can be accomplished with
$ ktrace -tc ls
This will create a file called ktrace.out. To read this file, run the kdump
command.
$ kdump
918 ktrace RET ktrace 0
918 ktrace CALL execve(0xbffff73c,0xbffffd14,0xbffffd1c)
918 ls RET execve 0
918 ls CALL issetugid
918 ls RET issetugid 0
918 ls CALL __sysctl(0xbffff7cc,0x2,0xbffff7d4,0xbffff7c8,0x8fe45a90,0xa)
918 ls RET __sysctl 0
918 ls CALL __sysctl(0xbffff7d4,0x2,0x8fe599bc,0xbffff878,0,0)
918 ls RET __sysctl 0
918 ls CALL __sysctl(0xbffff7cc,0x2,0xbffff7d4,0xbffff7c8,0x8fe45abc,0xd)
918 ls RET __sysctl 0
918 ls CALL __sysctl(0xbffff7d4,0x2,0x8fe599b8,0xbffff878,0,0)
918 ls RET __sysctl 0
...
For more information, see the man page for ktrace.
In Leopard, ktrace is replaced by DTrace. DTrace is a kernel-level tracing
mechanism. Throughout the kernel (and in some frameworks and applications)
are special DTrace probes that can be activated. Instead of being an application
with some command-line arguments, DTrace has an entire language, called
D, to control its actions. DTrace is covered in detail in Chapter 4, "Tracing and
Debugging," but we present a quick example here as an appetizer.
$ sudo dtrace -n 'syscall:::entry {@[execname] = count()}'
dtrace: description 'syscall:::entry ' matched 427 probes
^C
fseventsd 3
socketfilterfw 3
mysqld 6
httpd 8
pvsnatd 8
configd 11
DirectoryServic 14
Terminal 17
ntpd 21
WindowServer 27
mds 33
dtrace 38
llipd 60
SystemUIServer 69
launchd 182
nmblookup 288
smbclient 386
Finder 5232
Mail 5352
Here, this one line of D within the DTrace command keeps track of the number
of system calls made by processes until the user hits Ctrl+C. The entire
functionality of ktrace can be replicated with DTrace in just a few lines of D.
Being able to peer inside processes can be very useful when bug hunting or
reverse-engineering, but there will be more on those topics later in the book.
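For readers new to D, the aggregation @[execname] = count() in the one-liner above behaves much like a counter keyed by process name. The following Python sketch models only that bookkeeping; it is an analogy, not how DTrace is implemented. The function name count_by_execname and the sample event list are hypothetical, and breaking ties by name is a choice made here for deterministic output.

```python
from collections import Counter


def count_by_execname(events):
    """Tally one count per event, keyed by process name.

    `events` is an iterable of process names, one per observed system call.
    The result is sorted by ascending count, matching the order of the
    listing above; ties are broken by name (an assumption of this sketch).
    """
    counts = Counter()
    for execname in events:
        counts[execname] += 1
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    observed = ["Mail", "Finder", "Mail", "ntpd", "Mail", "Finder"]
    for name, total in count_by_execname(observed):
        print(name, total)
```

In the real tool the counting happens inside the kernel as each probe fires, and the table is printed only when tracing stops.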
Objective-C
Objective-C is the programming language and runtime for the Cocoa API used
extensively by most applications within Mac OS X. It is a superset of the C
programming language, meaning that any C program will compile with an
Objective-C compiler. The use of Objective-C has implications when applications
are being reverse-engineered and exploited. More time will be spent on
these topics in the corresponding chapters.
One of the most distinctive features of Objective-C is the way object-oriented
programming is handled. Unlike in standard C++, in Objective-C an object's methods
are not called directly. Rather, the object is sent a message. This architecture
allows for dynamic binding; that is, the method implementation is selected at
runtime, not at compile time. When a message is sent, a runtime function looks
at the receiver and the method name in the message. It identifies the receiver's
implementation of the method by the name and executes that method.
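The lookup-by-name step can be illustrated outside Objective-C. The Python sketch below is an analogy only: it is not the Objective-C runtime, and the names send, Integer, and set_integer are hypothetical. It shows a receiver being handed a method name as data, with the implementation found at run time rather than fixed at compile time.

```python
class Integer:
    """Toy receiver that stores one integer value."""

    def __init__(self):
        self._integer = 0

    def integer(self):
        return self._integer

    def set_integer(self, value):
        self._integer = value
        return self


def send(receiver, selector, *args):
    """Look up `selector` on `receiver` by name at run time and invoke it."""
    method = getattr(receiver, selector, None)
    if not callable(method):
        raise AttributeError(
            "%s does not respond to %r" % (type(receiver).__name__, selector)
        )
    return method(*args)


if __name__ == "__main__":
    number = Integer()
    send(number, "set_integer", 5)
    print(send(number, "integer"))
```

Because the selector is just a string here, the caller can choose the method while the program is running, which is the property that matters later when such applications are reverse-engineered.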
The following small example shows the syntactic differences between C++
and Objective-C from a source-code perspective.
#include
Here an interface is defined for the class Integer. An interface serves the role
of a declaration. The hyphen character marks instance methods.
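#import "Integer.h"
@implementation Integer
// Getter: returns the stored value.
- (int) integer
{
    return integer;
}
// Setter: stores the new value and returns the receiver.
- (id) integer: (int) _integer
{
    integer = _integer;
    return self;
}
@end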
Excerpted from The Mac Hacker's Handbook
by Charles Miller
Copyright © 2009 by Charles Miller.