System Architecture

Last Updated: 2026-05-27
Status: Draft

1. High-Level Overview

NekoBoy is built around a clear division of responsibilities between a microcontroller (CPU) and an FPGA. This hybrid architecture allows us to combine the flexibility of software with the deterministic, high-performance capabilities of custom hardware.

The system is designed so that the same high-level game code can run on both a PC emulator and the real hardware with minimal changes. This is achieved through a Core Abstraction Layer.

Core Components

| Component | Role | Technology |
|---|---|---|
| STM32U575VIT6 | Main CPU: game logic, AI, physics, high-level control | Microcontroller |
| XC7S50-2CSGA324I | PPU (Graphics) + APU (Audio) | FPGA |
| Core Abstraction Layer | Unified interface for PPU and APU | C library |
| Emulator | Software Reference PPU + APU (for development) | PC software |
| NekoTracker | Music creation tool | Runs on PC and hardware |

2. Component Responsibilities

STM32U575VIT6 (Main CPU)

The microcontroller handles everything that benefits from flexible software logic:

  • Game logic, AI, and physics
  • High-level game state management
  • Asset streaming and coordination
  • Communication with the FPGA via FMC
  • Power management and system control
  • Cartridge interface and flashing
  • I2C communication with power, audio codec, and sensors

Design Principle: The CPU should stay focused on what should happen, while the FPGA handles how it is rendered and played.

XC7S50 FPGA (PPU + APU)

The FPGA is responsible for all real-time, performance-critical tasks:

  • PPU (Pixel Processing Unit): Tile layers, sprites, palettes, priority, effects, and display output
  • APU (Audio Processing Unit): FM synthesis, PCM playback, mixing, and lo-fi post-processing
  • Direct RGB output to the display
  • I2S audio output to the codec

The FPGA provides deterministic timing and high parallelism, which is ideal for graphics and audio.

Core Abstraction Layer

This is the most important software component in the project.

It defines a clean, consistent interface that games and tools use to interact with graphics and sound. The same interface is implemented in two ways:

  • Software Reference → Used by the Emulator on PC
  • Hardware Driver → Used on the real NekoBoy console

Because of this layer, most game code does not need to know whether it is running on an emulator or real hardware.
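
One way to picture "one interface, two implementations" is a table of function pointers that the game-facing calls dispatch through. The sketch below is illustrative only: `nb_ppu_backend`, `nb_ppu_set_backend`, `ppu_clear`, `ppu_put_pixel`, and the `ref_` functions are hypothetical names, not taken from the NekoBoy sources. Only a tiny software reference backend is shown. A hardware driver would populate the same table with functions that talk to the FPGA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend table: one function pointer per PPU operation. */
typedef struct {
    void (*clear)(uint16_t color);
    void (*put_pixel)(int x, int y, uint16_t color);
} nb_ppu_backend;

/* Software reference backend (PC side). The 8x8 size is illustrative only. */
enum { REF_W = 8, REF_H = 8 };
static uint16_t ref_fb[REF_W * REF_H];

static void ref_clear(uint16_t color) {
    for (int i = 0; i < REF_W * REF_H; i++) {
        ref_fb[i] = color;
    }
}

static void ref_put_pixel(int x, int y, uint16_t color) {
    if (x < 0 || y < 0 || x >= REF_W || y >= REF_H) {
        return; /* silently clip out-of-range writes */
    }
    ref_fb[y * REF_W + x] = color;
}

/* The table a PC build would install. */
const nb_ppu_backend nb_ref_backend = { ref_clear, ref_put_pixel };

/* Helper for inspecting the reference framebuffer (hypothetical). */
uint16_t nb_ref_read_pixel(int x, int y) {
    assert(x >= 0 && y >= 0 && x < REF_W && y < REF_H);
    return ref_fb[y * REF_W + x];
}

/* Backend currently in use; game code never touches this directly. */
static const nb_ppu_backend *active_backend;

void nb_ppu_set_backend(const nb_ppu_backend *backend) {
    assert(backend != NULL);
    active_backend = backend;
}

/* Game-facing calls: identical on emulator and hardware. */
void ppu_clear(uint16_t color) {
    assert(active_backend != NULL);
    active_backend->clear(color);
}

void ppu_put_pixel(int x, int y, uint16_t color) {
    assert(active_backend != NULL);
    active_backend->put_pixel(x, y, color);
}
```

With this shape, start-up code would call `nb_ppu_set_backend(&nb_ref_backend)` on PC and pass a hardware table on the console, while game code only ever calls the `ppu_` functions. Selecting the backend at compile time instead would avoid the indirect call at the cost of separate builds.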

3. Data Flow

Game Code / NekoTracker
        │
        ▼
Core Abstraction Layer (PPU + APU commands)
        │
        ├──────────────────────┐
        ▼                      ▼
   Emulator (PC)          STM32U575 (Hardware)
        │                      │
        ▼                      ▼
Software Reference PPU/APU   FPGA (Real PPU + APU)

Typical Flow Example (Rendering a Sprite)

  1. Game calls ppu_draw_sprite(...) through the abstraction layer.
  2. On PC → The call goes to the Software Reference PPU.
  3. On Hardware → The hardware driver translates the call into PPU commands sent to the FPGA over the FMC bus.
  4. The FPGA renders the sprite using its hardware PPU logic.

The same pattern applies to audio playback.
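
The hardware branch of this flow can be sketched as a handful of register writes into a memory-mapped window. This document does not specify the argument list of `ppu_draw_sprite` or the FPGA's register map, so everything below is an assumption: the `nb_hw_ppu` type, the `nb_hw_draw_sprite` function, the four 16-bit registers, and the "commit" write are hypothetical. The base pointer is passed in, so the same code can target a plain array in a PC test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sprite command port inside the FPGA's FMC-mapped window.
 * Offsets are in 16-bit halfwords; the real register map may differ. */
enum {
    NB_REG_SPR_X      = 0,
    NB_REG_SPR_Y      = 1,
    NB_REG_SPR_TILE   = 2,
    NB_REG_SPR_COMMIT = 3  /* assumed: a write here latches the sprite */
};

typedef struct {
    /* Base of the FMC window on hardware, or a test buffer on PC.
     * volatile keeps the compiler from reordering or dropping writes. */
    volatile uint16_t *regs;
} nb_hw_ppu;

/* Hypothetical hardware-side implementation of a sprite draw call. */
void nb_hw_draw_sprite(nb_hw_ppu *ppu, uint16_t tile, int16_t x, int16_t y) {
    assert(ppu != NULL && ppu->regs != NULL);
    ppu->regs[NB_REG_SPR_X]      = (uint16_t)x;  /* two's complement for negatives */
    ppu->regs[NB_REG_SPR_Y]      = (uint16_t)y;
    ppu->regs[NB_REG_SPR_TILE]   = tile;
    ppu->regs[NB_REG_SPR_COMMIT] = 1u;           /* written last, after the parameters */
}
```

On the console, `regs` would point at whatever address range the FMC bank wired to the FPGA occupies. On PC, pointing it at a four-element array lets the command encoding be checked without hardware.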

4. Memory Architecture

NekoBoy uses a split memory design to avoid contention and maximize performance:

| Memory | Connected To | Purpose | Size |
|---|---|---|---|
| HyperRAM (CPU) | STM32U575 | Game state, decompressed assets, buffers | 32 MB |
| HyperRAM (FPGA) | XC7S50 | Framebuffers, tilemaps, sprite data, audio | 32 MB |
| Fast SRAM | XC7S50 | High-speed texture/sprite cache | 2 MB |

This separation ensures that the FPGA has fast, dedicated access to the memory it needs for rendering and audio without competing with the CPU.

5. Communication Interfaces

CPU ↔ FPGA Communication

  • Primary Interface: FMC (Flexible Memory Controller)

    • High-bandwidth parallel bus
    • Low latency for commands and data transfer
    • Used for PPU commands, sprite data, tilemap updates, etc.
  • Secondary: GPIO / IRQ lines for synchronization (e.g., VBlank interrupts)
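
The VBlank synchronization mentioned above can be sketched as a counter that the interrupt handler bumps and the game loop compares against. The names `nb_on_vblank_irq`, `nb_vblank_counter`, and `nb_vblank_elapsed` are hypothetical, and the sketch assumes the FPGA raises one interrupt per frame on a GPIO line. How that line is routed to an interrupt handler is not covered here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Incremented once per VBlank. volatile because it changes in interrupt
 * context while the main loop reads it. */
static volatile uint32_t nb_vblank_count;

/* Hypothetical hook: on hardware this would be called from the interrupt
 * handler attached to the FPGA's VBlank line. */
void nb_on_vblank_irq(void) {
    nb_vblank_count++;
}

/* Current frame counter, for initializing a caller's bookmark. */
uint32_t nb_vblank_counter(void) {
    return nb_vblank_count;
}

/* Returns true if at least one VBlank has occurred since *last_seen,
 * and advances *last_seen so the next call waits for a new frame. */
bool nb_vblank_elapsed(uint32_t *last_seen) {
    assert(last_seen != NULL);
    uint32_t now = nb_vblank_count;  /* single aligned 32-bit read */
    if (now == *last_seen) {
        return false;
    }
    *last_seen = now;
    return true;
}
```

A game loop would typically finish its update, then poll `nb_vblank_elapsed` (or sleep until the next interrupt) before pushing the frame's PPU commands. Using a counter rather than a single flag also lets the loop notice when it has missed frames.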

Other Important Interfaces

| Interface | Connected Devices | Purpose |
|---|---|---|
| I2C | Power ICs, Audio Codec, Sensors | Configuration and control |
| QSPI | Cartridge NOR Flash | Game ROM and assets |
| SPI | FRAM (saves) | Non-volatile save data |
| I2S | Audio Codec (from FPGA) | Digital audio output |
| RGB TTL | Display (from FPGA) | Direct video output |

6. Development vs Hardware Execution

One of the core goals of NekoBoy is to make development as smooth as possible.

| Environment | PPU Implementation | APU Implementation | How Code Runs |
|---|---|---|---|
| Desktop | Software Reference PPU | Software Reference APU | Directly against abstraction layer |
| Emulator | Software Reference PPU | Software Reference APU | Same as desktop |
| Real Hardware | FPGA PPU | FPGA APU | Same abstraction calls via driver |

Because both the emulator and hardware use the same Core Abstraction Layer, developers can write and test most of their code on PC before moving to real hardware.

7. Block Diagram

graph TD
    A[Game Code / NekoTracker] --> B[Core Abstraction Layer]
    
    B --> C[Emulator on PC]
    B --> D[STM32U575VIT6]
    
    C --> E[Software Reference PPU]
    C --> F[Software Reference APU]
    
    D --> G[XC7S50 FPGA]
    G --> H[Hardware PPU]
    G --> I[Hardware APU]
    
    D --> J[HyperRAM - CPU]
    G --> K[HyperRAM + SRAM - FPGA]
    
    G --> L[Display - RGB TTL]
    G --> M[Audio Codec - I2S]

8. Key Design Decisions & Rationale

| Decision | Rationale |
|---|---|
| FPGA handles PPU + APU | Provides deterministic timing and high performance for graphics and audio |
| CPU handles game logic | Offers flexibility and easier development for AI, physics, and game systems |
| Split HyperRAM architecture | Avoids memory contention and gives each processor fast access to the memory it needs most |
| Core Abstraction Layer | Enables "write once, run anywhere" (emulator ↔ hardware) development |
| FMC as main CPU-FPGA bus | High bandwidth and low latency for transferring graphics commands and data |

Taken together, these decisions keep timing-critical graphics and audio work on the FPGA and flexible game logic on the CPU. All major components interact through well-defined interfaces, with the Core Abstraction Layer serving as the central contract between game code and whichever PPU/APU implementation sits underneath.