# System Architecture

**Last Updated**: 2026-05-27

**Status**: Draft

## 1. High-Level Overview

NekoBoy is built around a clear division of responsibilities between a microcontroller (CPU) and an FPGA. This hybrid architecture allows us to combine the flexibility of software with the deterministic, high-performance capabilities of custom hardware.

The system is designed so that the same high-level game code can run on both a PC emulator and the real hardware with minimal changes. This is achieved through a **Core Abstraction Layer**.

### Core Components

| Component                  | Role                                                   | Technology              |
|----------------------------|--------------------------------------------------------|-------------------------|
| **STM32U575VIT6**          | Main CPU – Game logic, AI, physics, high-level control | Microcontroller         |
| **XC7S50-2CSGA324I**       | PPU (Graphics) + APU (Audio)                           | FPGA                    |
| **Core Abstraction Layer** | Unified interface for PPU and APU                      | C library               |
| **Emulator**               | Software Reference PPU + APU (for development)         | PC software             |
| **NekoTracker**            | Music creation tool                                    | Runs on PC and Hardware |

## 2. Component Responsibilities

### STM32U575VIT6 (Main CPU)

The microcontroller handles everything that benefits from flexible software logic:

- Game logic, AI, and physics
- High-level game state management
- Asset streaming and coordination
- Communication with the FPGA via FMC
- Power management and system control
- Cartridge interface and flashing
- I2C communication with power, audio codec, and sensors

**Design Principle**: The CPU should stay focused on *what* should happen, while the FPGA handles *how* it is rendered and played.

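To make the design principle concrete, here is a minimal sketch of how the CPU side might describe a sprite as a small command and hand it to the FPGA. The opcode value, the `sprite_cmd_t` layout, and the `fmc_window` array (a PC-side stand-in for a memory-mapped FMC region) are all assumptions made for illustration; this document does not define the actual command protocol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode: the real PPU command set is not specified here. */
#define CMD_DRAW_SPRITE  0x0001u
#define FMC_WINDOW_WORDS 8u

/* Stand-in for a memory-mapped FMC window. On hardware this would be a
 * volatile pointer into the FMC address range; a plain array keeps the
 * sketch runnable on a PC. */
static uint16_t fmc_window[FMC_WINDOW_WORDS];
static size_t fmc_write_count;

/* Append one 16-bit word to the simulated FMC window. */
static void fmc_write16(uint16_t word)
{
    assert(fmc_write_count < FMC_WINDOW_WORDS);
    fmc_window[fmc_write_count++] = word;
}

/* The "what": which sprite to show and where. No rendering details. */
typedef struct {
    uint16_t sprite_id;
    int16_t  x;
    int16_t  y;
} sprite_cmd_t;

/* Serialize the command as an opcode followed by three operands. The "how"
 * (tile fetch, priority, compositing) is left entirely to the FPGA side. */
static void send_sprite_cmd(const sprite_cmd_t *cmd)
{
    fmc_write16(CMD_DRAW_SPRITE);
    fmc_write16(cmd->sprite_id);
    fmc_write16((uint16_t)cmd->x);
    fmc_write16((uint16_t)cmd->y);
}
```

A caller would fill in a `sprite_cmd_t` (for example sprite 3 at x = 10, y = -2) and pass it to `send_sprite_cmd`. Negative coordinates are sent as their 16-bit two's-complement bit pattern, so the receiving side would need to interpret those words as signed values.
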
### XC7S50 FPGA (PPU + APU)

The FPGA is responsible for all real-time, performance-critical tasks:

- **PPU (Pixel Processing Unit)**: Tile layers, sprites, palettes, priority, effects, and display output
- **APU (Audio Processing Unit)**: FM synthesis, PCM playback, mixing, and lo-fi post-processing
- Direct RGB output to the display
- I2S audio output to the codec

The FPGA provides deterministic timing and high parallelism, which is ideal for graphics and audio.

### Core Abstraction Layer

This is the most important software component in the project. It defines a clean, consistent interface that games and tools use to interact with graphics and sound.

The same interface is implemented in two ways:

- **Software Reference** → Used by the Emulator on PC
- **Hardware Driver** → Used on the real NekoBoy console

Because of this layer, most game code does not need to know whether it is running on an emulator or real hardware.

## 3. Data Flow

```
Game Code / NekoTracker
        │
        ▼
Core Abstraction Layer (PPU + APU commands)
        │
        ├──────────────────────┐
        ▼                      ▼
Emulator (PC)         STM32U575 (Hardware)
        │                      │
        ▼                      ▼
Software Reference PPU/APU  FPGA (Real PPU + APU)
```

### Typical Flow Example (Rendering a Sprite)

1. Game calls `ppu_draw_sprite(...)` through the abstraction layer.
2. On PC → The call goes to the Software Reference PPU.
3. On Hardware → The call is translated into FMC commands sent to the FPGA.
4. The FPGA renders the sprite using its hardware PPU logic.

The same pattern applies to audio playback.

## 4. Memory Architecture

NekoBoy uses a **split memory design** to avoid contention and maximize performance:

| Memory              | Connected To | Purpose                                    | Size  |
|---------------------|--------------|--------------------------------------------|-------|
| **HyperRAM (CPU)**  | STM32U575    | Game state, decompressed assets, buffers   | 32 MB |
| **HyperRAM (FPGA)** | XC7S50       | Framebuffers, tilemaps, sprite data, audio | 32 MB |
| **Fast SRAM**       | XC7S50       | High-speed texture/sprite cache            | 2 MB  |

This separation ensures that the FPGA has fast, dedicated access to the memory it needs for rendering and audio without competing with the CPU.

## 5. Communication Interfaces

### CPU ↔ FPGA Communication

- **Primary Interface**: FMC (Flexible Memory Controller)
  - High-bandwidth parallel bus
  - Low latency for commands and data transfer
  - Used for PPU commands, sprite data, tilemap updates, etc.
- **Secondary**: GPIO / IRQ lines for synchronization (e.g., VBlank interrupts)

### Other Important Interfaces

| Interface   | Connected Devices               | Purpose                   |
|-------------|---------------------------------|---------------------------|
| **I2C**     | Power ICs, Audio Codec, Sensors | Configuration and control |
| **QSPI**    | Cartridge NOR Flash             | Game ROM and assets       |
| **SPI**     | FRAM (saves)                    | Non-volatile save data    |
| **I2S**     | Audio Codec (from FPGA)         | Digital audio output      |
| **RGB TTL** | Display (from FPGA)             | Direct video output       |

## 6. Development vs Hardware Execution

One of the core goals of NekoBoy is to make development as smooth as possible.

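As a rough illustration of how one code path can serve both environments, the sketch below routes a single `ppu_draw_sprite` call through a backend table. The `ppu_backend_t` struct, the `ppu_use_backend` helper, the `NEKOBOY_TARGET_HW` macro, and the parameter list given to `ppu_draw_sprite` are assumptions made for this example; the actual Core Abstraction Layer may be organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend table: one function pointer per abstraction call. */
typedef struct {
    const char *name;
    void (*draw_sprite)(uint16_t sprite_id, int16_t x, int16_t y);
} ppu_backend_t;

/* Call counters stand in for real rendering so the sketch stays small. */
static int ref_calls;
static int hw_calls;

/* PC path: the Software Reference PPU would rasterize the sprite here. */
static void ref_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y)
{
    (void)sprite_id; (void)x; (void)y;
    ref_calls++;
}

/* Hardware path: the driver would forward the request to the FPGA here. */
static void hw_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y)
{
    (void)sprite_id; (void)x; (void)y;
    hw_calls++;
}

const ppu_backend_t ppu_reference = { "software-reference", ref_draw_sprite };
const ppu_backend_t ppu_hardware  = { "fpga-driver", hw_draw_sprite };

/* Default backend chosen at build time (assumed macro name). */
#ifdef NEKOBOY_TARGET_HW
static const ppu_backend_t *active_ppu = &ppu_hardware;
#else
static const ppu_backend_t *active_ppu = &ppu_reference;
#endif

/* Allow tools and tests to switch backends explicitly. */
void ppu_use_backend(const ppu_backend_t *backend)
{
    assert(backend != NULL && backend->draw_sprite != NULL);
    active_ppu = backend;
}

/* The call game code sees; it never names a concrete backend. */
void ppu_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y)
{
    active_ppu->draw_sprite(sprite_id, x, y);
}

/* Game code is identical in both environments. */
void game_frame(void)
{
    ppu_draw_sprite(3, 10, 20);
}
```

In this sketch `game_frame` is the only code a game author would write. Swapping `ppu_reference` for `ppu_hardware` changes where the call ends up without touching the game logic, which is the property summarized in the table that follows.
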
| Environment       | PPU Implementation     | APU Implementation     | How Code Runs                      |
|-------------------|------------------------|------------------------|------------------------------------|
| **Desktop**       | Software Reference PPU | Software Reference APU | Directly against abstraction layer |
| **Emulator**      | Software Reference PPU | Software Reference APU | Same as desktop                    |
| **Real Hardware** | FPGA PPU               | FPGA APU               | Same abstraction calls via driver  |

Because both the emulator and hardware use the same **Core Abstraction Layer**, developers can write and test most of their code on PC before moving to real hardware.

## 7. Block Diagram

```mermaid
graph TD
    A[Game Code / NekoTracker] --> B[Core Abstraction Layer]
    B --> C[Emulator on PC]
    B --> D[STM32U575VIT6]
    C --> E[Software Reference PPU]
    C --> F[Software Reference APU]
    D --> G[XC7S50 FPGA]
    G --> H[Hardware PPU]
    G --> I[Hardware APU]
    D --> J[HyperRAM - CPU]
    G --> K[HyperRAM + SRAM - FPGA]
    G --> L[Display - RGB TTL]
    G --> M[Audio Codec - I2S]
```

## 8. Key Design Decisions & Rationale

| Decision                        | Rationale                                                                                 |
|---------------------------------|-------------------------------------------------------------------------------------------|
| **FPGA handles PPU + APU**      | Provides deterministic timing and high performance for graphics and audio                 |
| **CPU handles game logic**      | Offers flexibility and easier development for AI, physics, and game systems               |
| **Split HyperRAM architecture** | Avoids memory contention and gives each processor fast access to the memory it needs most |
| **Core Abstraction Layer**      | Enables "write once, run anywhere" (emulator ↔ hardware) development                      |
| **FMC as main CPU-FPGA bus**    | High bandwidth and low latency for transferring graphics commands and data                |

---

This architecture is designed to be **coherent**, **developer-friendly**, and **balanced** between performance and ease of use. All major components work together through well-defined interfaces, with the Core Abstraction Layer serving as the central contract.