
Chronicling writing a simple NES emulator to learn Swift, macOS APIs, and retro architecture.


Now that there are registers and memory, it'd be nice if they did something, so next up is actually implementing the CPU functionality. The 6502's functionality is pretty basic, with only 56 different operations. Each can have several different addressing modes, but there are only 151 8-bit opcodes defined. Each opcode is then followed by 0-2 extra bytes for operands, depending on the addressing mode. Let's get some definitions for the various addressing modes, using Swift's enum.

    enum AddressingMode: String {
        // zero bytes
        case implied
        case implied_a // same, but uses the accumulator
        
        // one byte - $nn
        case immediate // #$nn
        case zeropage // $00nn
        case zeropage_x // $00nn + x, no carry
        case zeropage_y // $00nn + y, no carry, LDX/STX only
        case indirect_pre_x // ($00xx + x), no carry
        case indirect_post_y // ($00xx) + y
        case relative // PC + signed $nn
        
        // two bytes - $ll $hh
        case absolute // $hhll
        case absolute_x // $hhll + X
        case absolute_y // $hhll + Y
        case indirect_absolute // ($hhll), JMP only
    }

These group the standard addressing modes by the number of additional bytes they take: zero for the ones where the operand is completely encoded into the opcode (such as CLC for clear carry); one for immediates (like LDA #30, loading 30 into the accumulator), zero page accesses (LDA $30, loading the data from memory location 0x0030), the various zero page indexed and indirect modes, and relative branches; and two for full 16-bit addresses (LDA $2000).

These aren't all strictly standard nomenclature, but the official terms "indexed indirect" and "indirect indexed" are not as clear as I think they should be, so I've called them "indirect pre x" and "indirect post y" instead.
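The byte grouping can also be written down as code. This is only a sketch: the `operandLength` property is a hypothetical helper that does not appear in the emulator itself, and the enum is repeated here just so the snippet stands alone.

```swift
enum AddressingMode: String {
    case implied, implied_a
    case immediate, zeropage, zeropage_x, zeropage_y
    case indirect_pre_x, indirect_post_y, relative
    case absolute, absolute_x, absolute_y, indirect_absolute
}

extension AddressingMode {
    // Hypothetical helper: number of operand bytes that follow the opcode byte.
    var operandLength: Int {
        switch self {
        case .implied, .implied_a:
            return 0
        case .immediate, .zeropage, .zeropage_x, .zeropage_y,
             .indirect_pre_x, .indirect_post_y, .relative:
            return 1
        case .absolute, .absolute_x, .absolute_y, .indirect_absolute:
            return 2
        }
    }
}
```

Something like a disassembler or trace logger could use this to decide how many bytes to print after the opcode. Because the switch has no default case, adding a new mode without giving it a length would be a compile error.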

With that, we're ready to implement some opcodes. Let's gather that information together in a nice struct:

struct Opcode {
    typealias Handler = (Environment, Opcode) -> Void
    typealias Map = [UInt8: AddressingMode]
    
    let name: Name
    let code: UInt8
    let mode: AddressingMode
    let handler: Handler
   
    enum Name: String {
        // Load and store
        case LDA
        case LDX
        case LDY
        case STA
        case STX
        case STY
        ...
    }
}

Here, I'm trying to make the type system my friend: restrict the list of names using an enum, and define the type of a handler function. I tried to use some fancy features, like Swift's result builders, but in the end the simplest way to dispatch my opcodes seemed to be to just make a big table mapping bytes to functions (in this case, via their Opcode structs). Using the helper type Opcode.Map, I defined some metadata and a handler for each operation, and built a giant table out of them. Some sample opcodes for loading the registers:

        // MARK: Load and Store
        static let lda_map: Opcode.Map = [
            0xA9: .immediate,
            0xA5: .zeropage,
            0xB5: .zeropage_x,
            0xAD: .absolute,
            0xBD: .absolute_x,
            0xB9: .absolute_y,
            0xA1: .indirect_pre_x,
            0xB1: .indirect_post_y
        ]
        static func lda_handler(env: Environment, opcode: Opcode) {
            env.reg.a = loadValue(env: env, mode: opcode.mode)
            env.reg.p.setNZ(fromValue: env.reg.a)
        }
        
        static let ldx_map: Opcode.Map = [
            0xA2: .immediate,
            0xA6: .zeropage,
            0xB6: .zeropage_y,
            0xAE: .absolute,
            0xBE: .absolute_y
        ]
        static func ldx_handler(env: Environment, opcode: Opcode) {
            env.reg.x = loadValue(env: env, mode: opcode.mode)
            env.reg.p.setNZ(fromValue: env.reg.x)
        }
        
        static let ldy_map: Opcode.Map = [
            0xA0: .immediate,
            0xA4: .zeropage,
            0xB4: .zeropage_x,
            0xAC: .absolute,
            0xBC: .absolute_x
        ]
        static func ldy_handler(env: Environment, opcode: Opcode) {
            env.reg.y = loadValue(env: env, mode: opcode.mode)
            env.reg.p.setNZ(fromValue: env.reg.y)
        }
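How the maps and handlers get folded into the one big table isn't shown, so here is one way it could be done. Treat it as a sketch under assumptions: `register` and `makeTable` are hypothetical names, the handler body is a placeholder, and the types are cut-down stand-ins so the snippet compiles on its own.

```swift
// Simplified stand-ins for the types in this post.
enum AddressingMode { case immediate, zeropage, absolute }
final class Environment { var a: UInt8 = 0 }

struct Opcode {
    typealias Handler = (Environment, Opcode) -> Void
    typealias Map = [UInt8: AddressingMode]
    enum Name: String { case LDA, LDX }

    let name: Name
    let code: UInt8
    let mode: AddressingMode
    let handler: Handler
}

// Hypothetical helper: expand one (name, map, handler) triple into table entries.
func register(_ name: Opcode.Name, _ map: Opcode.Map,
              _ handler: @escaping Opcode.Handler,
              into table: inout [UInt8: Opcode]) {
    for (code, mode) in map {
        // Catch copy/paste mistakes where two operations claim the same byte.
        precondition(table[code] == nil, "Duplicate opcode byte")
        table[code] = Opcode(name: name, code: code, mode: mode, handler: handler)
    }
}

// Hypothetical: build the dispatch table once at startup.
func makeTable() -> [UInt8: Opcode] {
    var table: [UInt8: Opcode] = [:]
    register(.LDA, [0xA9: .immediate, 0xA5: .zeropage, 0xAD: .absolute],
             { env, _ in env.a = 0x42 },   // placeholder handler body
             into: &table)
    return table
}
```

Dispatch is then a dictionary lookup on the fetched byte followed by a call to `opcode.handler(env, opcode)`. A lookup that returns nil means the byte is not one of the defined opcodes.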

Using some helper functions to abstract away the common features of most opcodes, most of the handlers are very short. For example, loadValue will, based on the addressing mode, read the required number of bytes to find the relevant value (either an immediate, or calculate the address and fetch the value from there), and setNZ sets the negative and zero flags based on the value fetched.

        private static func loadValue(env: Environment, mode: AddressingMode) -> UInt8 {
            return
                if mode == .immediate { env.loadAndAdvancePC() }
                else { env.load(calculateAddress(env: env, mode: mode)) }
        }

        private static func calculateAddress(env: Environment, mode: AddressingMode) -> Address {
            let low = env.loadAndAdvancePC()
            
            // May read two bytes from environment, advancing PC to calculate the address
            switch(mode) {
            case .implied, .implied_a, .immediate:
                fatalError("Cannot generate an address for \(mode)")
                
            // Zero page address calculations should wrap without carry, so use &+ on the UInt8
            case .zeropage:
                return Address(low)
            case .zeropage_x:
                return Address(low &+ env.reg.x)
            case .zeropage_y:
                return Address(low &+ env.reg.y)
            ...
       }
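The other helper, setNZ, isn't shown here. A minimal sketch of what it might look like, assuming the flags are plain booleans; this FlagSet is trimmed down to the two flags involved:

```swift
struct FlagSet {
    var z = false // zero
    var n = false // negative

    // Z is set when the value is zero; N mirrors bit 7 of the value.
    mutating func setNZ(fromValue value: UInt8) {
        z = (value == 0)
        n = (value & 0x80) != 0
    }
}
```

Loading $00 sets Z and clears N, loading $80 does the opposite, and loading $7F clears both.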

Swift's enums make it easy to make sure all cases are handled. So far, this runs a simplified model of the 6502: it does not count cycles, and it doesn't account for all the false/superfluous memory accesses the real 6502 does. The real chip's design requires it to make a memory access every cycle, even if unneeded, and address calculations sometimes take multiple cycles due to carries. So when doing indexed addressing (like LDA $10FF,X with X = 1), the real 6502 can read from the wrong page ($1000) before the carry propagates to the high byte, and then read again from the correct address ($1100). While this makes little difference for standard memory, on memory mapped devices it can cause side effects! A high-accuracy emulator would probably need to account for that.
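To make the address arithmetic concrete, here is a small sketch of the two behaviours: zero page indexing that wraps in 8 bits, and the extra read on a page crossing. Both function names are hypothetical, and the second one only describes read instructions such as LDA.

```swift
typealias Address = UInt16

// Zero page indexed: the add happens in 8 bits, so it wraps within page zero.
func zeroPageIndexed(_ low: UInt8, _ index: UInt8) -> Address {
    Address(low &+ index)
}

// Hypothetical helper: the addresses an absolute indexed *read* would put on
// the bus, in order. Only a page crossing produces the extra, wrong-page read.
func indexedReadAddresses(base: Address, index: UInt8) -> [Address] {
    let correct = base &+ Address(index)
    // First attempt: low byte already added, high byte not yet fixed up.
    let uncorrected = (base & 0xFF00) | (correct & 0x00FF)
    return uncorrected == correct ? [correct] : [uncorrected, correct]
}
```

With the example from the text, `indexedReadAddresses(base: 0x10FF, index: 1)` gives `[0x1000, 0x1100]`, while a base of $1000 gives just `[0x1001]`. For zero page, `zeroPageIndexed(0xFF, 0x02)` is $0001, not $0101.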

With all 56 operations and all 151 bytes mapped, and a quick dispatcher written to read from memory, look up an opcode, and run the handlers, I could run through the CPU test mentioned previously. It found a couple of small bugs in my implementation (notably, I forgot to change which flag CLV (clear overflow) cleared when copy/pasting), but shortly after I was able to pass everything but the decimal mode tests, which I have not implemented yet since the NES does not need them. Most opcodes are only 2 or 3 lines of code after factoring out the commonalities, with only ADC and BRK really taking much effort at all.
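As an illustration of why ADC takes more effort, here is a sketch of the binary-mode arithmetic only, with decimal mode left out. `adcBinary` is a hypothetical free function; the emulator's own handler works on the Environment instead and may be structured quite differently.

```swift
// Binary-mode add with carry: returns the new accumulator value plus the
// carry and signed-overflow flags. Decimal mode is deliberately ignored.
func adcBinary(a: UInt8, operand: UInt8, carryIn: Bool)
    -> (result: UInt8, carry: Bool, overflow: Bool) {
    // Widen to 16 bits so the ninth bit (the carry out) is not lost.
    let sum = UInt16(a) + UInt16(operand) + (carryIn ? 1 : 0)
    let result = UInt8(sum & 0xFF)
    let carry = sum > 0xFF
    // Signed overflow: both inputs share a sign and the result's sign differs.
    let overflow = (~(a ^ operand) & (a ^ result) & 0x80) != 0
    return (result, carry, overflow)
}
```

For example, $50 + $50 gives $A0 with overflow set and carry clear (two positive numbers producing a negative one), while $FF + $01 gives $00 with carry set and overflow clear.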

I will have to work on the simulator fidelity soon I imagine, but with that, it may be possible to start putting in the skeleton for the NES's PPU.



First, we need some memory to work with. At first, I'm just using 64 kB of read-write memory, but later on, we'll need to stick some memory mapped devices in there, so let's abstract that a little.

First, a type to represent addresses, which are just 16-bit integers on the 6502:

typealias Address = UInt16

And an interface (called a protocol in Swift) for doing accesses with these addresses.

protocol DataBus {
    // Non-destructive, for previews etc.
    subscript(address: Address) -> UInt8 { get set }
    // Destructive
    func load(_ address: Address) -> UInt8
    mutating func store(_ address: Address, _ value: UInt8)
}

Because in the future there may be memory mapped registers that do something when being read, I've split the interface into destructive and non-destructive operations. Non-destructive reads, for example, should just return the value without triggering any side effects. [1]
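To show what that split buys, here is a sketch of a made-up status register whose flag clears when it is read for real. Everything in it is hypothetical: `PeekableBus` is a cut-down stand-in for DataBus (with load marked mutating so it can change state), and `StatusRegister` does not model any actual NES hardware.

```swift
typealias Address = UInt16

// Cut-down stand-in for DataBus, with a mutating destructive load.
protocol PeekableBus {
    subscript(address: Address) -> UInt8 { get }       // non-destructive peek
    mutating func load(_ address: Address) -> UInt8    // destructive read
}

// Hypothetical device: bit 7 reports a pending event and clears on a real read.
struct StatusRegister: PeekableBus {
    var pending = true

    subscript(address: Address) -> UInt8 {
        pending ? 0x80 : 0x00
    }

    mutating func load(_ address: Address) -> UInt8 {
        let value = self[address]
        pending = false // side effect: reading acknowledges the event
        return value
    }
}
```

A debugger view can peek through the subscript as often as it likes without disturbing the device, while the CPU's own read through load() sees $80 once and $00 afterwards.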

For my first go, I've used an extension to let the Swift Data type, which mostly just acts as an array of bytes, be subscripted with an Address so it can conform to DataBus. A second extension gives DataBus a default implementation of load() and store() that just forwards to the subscript, with some optional trace logging:

// Allow a Data to conform to the DataBus
extension Data: Hardware.DataBus {
    subscript(address: Address) -> UInt8 {
        get { self[Int(address)] }
        set { self[Int(address)] = newValue }
    }
}

// Default implementation for destructive load and store
extension Hardware.DataBus {
    func load(_ address: Address) -> UInt8 {
        let ret = self[address]
//        logger.trace("\(address, format: .hex(minDigits: 4)) R \(ret, format: .hex(minDigits: 2))")
        return ret
    }
    mutating func store(_ address: Address, _ value: UInt8) {
//        logger.trace("\(address, format: .hex(minDigits: 4)) W \(value, format: .hex(minDigits: 2))")
        self[address] = value
    }
}

With Data now conforming, I could load a memory image from a file, asset catalog, the web, anywhere, and use it as a memory image on the system's DataBus. Later, I can make more complex implementations that model RAM, memory mapped registers, ROM, bank switching, etc.
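As a taste of what one of those later implementations might look like, here is a sketch of a read-only bus. It is purely illustrative: `ROM` is a hypothetical type, the protocol is trimmed down to the subscript, and returning $FF for out-of-range reads is an assumption rather than a claim about real hardware.

```swift
typealias Address = UInt16

// Trimmed-down stand-in for the DataBus protocol.
protocol DataBus {
    subscript(address: Address) -> UInt8 { get set }
}

// Hypothetical read-only memory: reads return the stored bytes, writes are dropped.
struct ROM: DataBus {
    let bytes: [UInt8]

    subscript(address: Address) -> UInt8 {
        get {
            let index = Int(address)
            // Assumption: addresses past the end of the image read as $FF.
            return index < bytes.count ? bytes[index] : 0xFF
        }
        set {
            // Writes to ROM are silently ignored.
        }
    }
}
```

Because the conformance goes through the same subscript, the CPU code would not need to know whether it is talking to RAM or ROM.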

Now my 6502 model looks like this:

@Observable
class Environment {
    var reg = RegisterSet()
    private var bus: DataBus = Data(repeating: 0, count: 65536)
}

Now it has the registers, and a memory space to work on. In retrospect, Environment is not a great name for my hardware environment, as SwiftUI has its own concept of the Environment that views share, which mildly conflicts. It is also marked @Observable, which allows SwiftUI to get notified when things change and update the Views. This turned out to have implications I will have to deal with later.

Lastly, to actually test my implementation, I found a lovely memory image by Klaus Dormann, on his GitHub, which purports to test all the opcodes from within. I could take the precompiled binary, drop it into my asset catalog, and easily load it and use it as my DataBus.

func loadTest() {
    if let dataAsset = NSDataAsset(name: "6502-test") {
        if dataAsset.data.count == 65536 {
            bus = dataAsset.data
            logger.log("Test successfully loaded")
        } else {
            logger.log("Data was wrong size")
        }
    } else {
        logger.log("Couldn't load data")
    }
    reg.pc = 0x400
}

With this test suite, and several opcode references, I was ready to implement the functionality of the CPU.


  1. A friend pointed out after I posted this that destructive load should also be marked mutating, since it may change state in the future.



Turns out if I want to actually play with my 6502 simulator, I need an interface for it, so I spent some time learning a little bit of SwiftUI.

It's neat, if a bit opaque. I love the declarative model, where you just write out what you want and SwiftUI figures out when to display or update.

struct RegisterView : View {
    let name: String
    let value: Int
    
    var body: some View {
        HStack() {
            Text(name)
            Spacer()
            Text(String(format: "$%02X", value))
                .fontDesign(.monospaced)
        }
    }
}

Put my virtual machine into the environment, stick a few of those views into my window, and a similar thing for the flag register, and ta-dah! I can now monitor my VM.

struct EnvironmentView: View {
    @Environment(Hardware.Environment.self) var env
    
    var body: some View {
        VStack {
            RegisterView(name: "A", value: Int(env.reg.a))
            RegisterView(name: "X", value: Int(env.reg.x))
            RegisterView(name: "Y", value: Int(env.reg.y))
            ...

The downside is that I feel SwiftUI is both large and extremely underdocumented. It is really hard to get a complete picture of all the widgets, modifiers, and such that are available. It's also extremely magic. Everything is a closure, but it's not actually plain Swift, per se. It's a variant of Swift (the result builder DSL support, I guess), so some things are subtly different, like using ForEach rather than just a for loop. And it changes quickly. Even in that simple example, I'm using some features that are only available in macOS 14 (Sonoma) or iOS 17. Luckily, there are lots of tutorials and example code supplied by Apple and others which at least help give an intuitive understanding.

With this and a few more buttons, I should be able to implement and test some simple opcodes.



The NES has three major parts: the CPU, the Picture Processing Unit for generating graphics, and the Audio Processing Unit for sound. Let's start by modelling the processor, which is a slightly modified 6502 CPU.

Its cost-reduced design makes it extremely simple to model in Swift.

First, it has some registers. Only 5, plus flags:

struct RegisterSet {
    var a : UInt8 = 0 // Accumulator
    var x : UInt8 = 0 // Index X
    var y : UInt8 = 0 // Index Y
    
    var pc : UInt16 = 0 // Program Counter
    var sp : UInt8 = 0 // Stack Pointer
    var p : FlagSet = FlagSet() // Processor status / flags
}

And it has six flags, plus one non-physical flag that only appears when the flags are pushed on the stack:

struct FlagSet {
    // Represent the flags of the processor
    var c = false // carry
    var z = false // zero
    var i = false // interrupt disable
    var d = false // decimal mode
    var b = true  // break, virtual flag, normally on, except during IRQ processing
    var v = false // signed overflow
    var n = false // negative
}

I've decided to model the flags as individual booleans for clarity of code. However, the processor occasionally needs to pack them as a byte when it stacks them for interrupt processing, or when the program uses the PHP or PLP opcodes to push or pull the flag register to/from the stack. Since the Swift standard library includes a type OptionSet for bitmasks, I'll try using that rather than writing the C-style bitmasking myself. Let's make it a private nested type:

    // Values used for converting to and from a byte (ie, when putting on the stack)
    private struct FlagValue : OptionSet {
        // NV1B_DIZC
        let rawValue : UInt8
        
        static let c = FlagValue(rawValue: 0x01)
        static let z = FlagValue(rawValue: 0x02)
        static let i = FlagValue(rawValue: 0x04)
        static let d = FlagValue(rawValue: 0x08)
        static let b = FlagValue(rawValue: 0x10)
        static let always = FlagValue(rawValue: 0x20)
        static let v = FlagValue(rawValue: 0x40)
        static let n = FlagValue(rawValue: 0x80)
    }

The processor has a quirk that the unused bit of the bitmask is always pushed as a 1.
A quick function to serialize it to a byte:

    func toByte() -> UInt8 {
        var values = FlagValue.always
        if c { values.insert(.c) }
        if z { values.insert(.z) }
        if i { values.insert(.i) }
        if d { values.insert(.d) }
        if b { values.insert(.b) }
        if v { values.insert(.v) }
        if n { values.insert(.n) }
        return values.rawValue
    }

And the reverse, creating a new FlagSet from a byte:

    init(fromByte : UInt8) {
        let values = FlagValue(rawValue:fromByte)
        c = values.contains(.c)
        z = values.contains(.z)
        i = values.contains(.i)
        d = values.contains(.d)
        b = values.contains(.b)
        v = values.contains(.v)
        n = values.contains(.n)
    }
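For comparison, the same packing can be driven from a single table of key paths, which keeps the flag-to-bit pairing in one place. This is a hypothetical alternative sketch, not the OptionSet approach used above; the `bits` table is an invented name.

```swift
struct FlagSet {
    var c = false, z = false, i = false, d = false
    var b = true
    var v = false, n = false

    // Hypothetical alternative to the OptionSet: pair each flag with its bit.
    // Layout is NV1B_DIZC; bit 5 (0x20) has no flag of its own.
    private static var bits: [(WritableKeyPath<FlagSet, Bool>, UInt8)] {
        [(\.c, 0x01), (\.z, 0x02), (\.i, 0x04), (\.d, 0x08),
         (\.b, 0x10), (\.v, 0x40), (\.n, 0x80)]
    }

    init() {}

    init(fromByte byte: UInt8) {
        for (flag, bit) in Self.bits {
            self[keyPath: flag] = (byte & bit) != 0
        }
    }

    func toByte() -> UInt8 {
        // Start from 0x20 because the unused bit is always pushed as 1.
        Self.bits.reduce(UInt8(0x20)) { byte, entry in
            self[keyPath: entry.0] ? byte | entry.1 : byte
        }
    }
}
```

As a worked example of the layout: with carry and negative set and the break flag left at its default of true, the byte is 0x80 + 0x20 + 0x10 + 0x01 = $B1, and feeding $B1 back through init(fromByte:) restores the same flags.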

I'm trying not to worry too much about efficiency right now, but I was curious and did check that these contains and insert function calls of the OptionSet bitset can be inlined, though the assembly seemed overly complex. (Did you know Godbolt has Swift now?)

In future installments: modelling the memory space and opcodes.