# System Architecture
**Last Updated**: 2026-05-27

**Status**: Draft
## 1. High-Level Overview
NekoBoy is built around a clear division of responsibilities between a microcontroller (CPU) and an FPGA. This hybrid architecture allows us to combine the flexibility of software with the deterministic, high-performance capabilities of custom hardware.
The system is designed so that the same high-level game code can run on both a PC emulator and the real hardware with minimal changes. This is achieved through a **Core Abstraction Layer**.
### Core Components
| Component | Role | Technology |
|---------------------------|---------------------------------------------------|---------------------|
| **STM32U575VIT6** | Main CPU: game logic, AI, physics, high-level control | Microcontroller |
| **XC7S50-2CSGA324I** | PPU (Graphics) + APU (Audio) | FPGA |
| **Core Abstraction Layer**| Unified interface for PPU and APU | C library |
| **Emulator** | Software Reference PPU + APU (for development) | PC software |
| **NekoTracker** | Music creation tool | Runs on PC and Hardware |
## 2. Component Responsibilities
### STM32U575VIT6 (Main CPU)
The microcontroller handles everything that benefits from flexible software logic:
- Game logic, AI, and physics
- High-level game state management
- Asset streaming and coordination
- Communication with the FPGA via FMC
- Power management and system control
- Cartridge interface and flashing
- I2C communication with power, audio codec, and sensors
**Design Principle**: The CPU should stay focused on *what* should happen, while the FPGA handles *how* it is rendered and played.
### XC7S50 FPGA (PPU + APU)
The FPGA is responsible for all real-time, performance-critical tasks:
- **PPU (Pixel Processing Unit)**: Tile layers, sprites, palettes, priority, effects, and display output
- **APU (Audio Processing Unit)**: FM synthesis, PCM playback, mixing, and lo-fi post-processing
- Direct RGB output to the display
- I2S audio output to the codec
The FPGA provides deterministic timing and high parallelism, which is ideal for graphics and audio.
### Core Abstraction Layer
This is the most important software component in the project.
It defines a clean, consistent interface that games and tools use to interact with graphics and sound. The same interface is implemented in two ways:
- **Software Reference** → Used by the Emulator on PC
- **Hardware Driver** → Used on the real NekoBoy console
Because of this layer, most game code does not need to know whether it is running on an emulator or real hardware.
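
One way to picture this "one interface, two implementations" idea is a small table of function pointers that the public calls dispatch through. The following is a minimal sketch only: the names `cal_backend_t`, `cal_set_backend`, `apu_play_note`, and the argument list of `ppu_draw_sprite` are assumptions made for illustration, not the project's actual API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical backend table: one entry per operation the layer exposes. */
typedef struct {
    const char *name;
    void (*draw_sprite)(uint16_t sprite_id, int16_t x, int16_t y);
    void (*play_note)(uint8_t channel, uint8_t note);
} cal_backend_t;

/* Software Reference backend (the role the emulator plays on PC).
 * Counters stand in for real rendering and mixing state. */
static unsigned ref_sprites_drawn = 0;
static unsigned ref_notes_played = 0;

static void ref_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y) {
    (void)sprite_id; (void)x; (void)y;
    ref_sprites_drawn++;
}

static void ref_play_note(uint8_t channel, uint8_t note) {
    (void)channel; (void)note;
    ref_notes_played++;
}

static const cal_backend_t cal_software_reference = {
    "software-reference", ref_draw_sprite, ref_play_note
};

/* Currently selected backend; a hardware driver would supply its own table. */
static const cal_backend_t *cal_active = &cal_software_reference;

/* Swap the implementation; a NULL argument leaves the current one in place. */
void cal_set_backend(const cal_backend_t *backend) {
    if (backend != NULL) {
        cal_active = backend;
    }
}

/* Public calls: game code only ever sees these two functions. */
void ppu_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y) {
    cal_active->draw_sprite(sprite_id, x, y);
}

void apu_play_note(uint8_t channel, uint8_t note) {
    cal_active->play_note(channel, note);
}
```

Game code would call `ppu_draw_sprite(1, 10, 20)` the same way in both environments; only the table passed to `cal_set_backend` differs. Selecting the backend at link time instead of at run time would work equally well and avoids the indirect call.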
## 3. Data Flow
```
             Game Code / NekoTracker
                        │
                        ▼
   Core Abstraction Layer (PPU + APU commands)
                        │
            ┌───────────┴───────────┐
            ▼                       ▼
      Emulator (PC)       STM32U575 (Hardware)
            │                       │
            ▼                       ▼
   Software Reference     FPGA (Real PPU + APU)
        PPU/APU
```
### Typical Flow Example (Rendering a Sprite)
1. Game calls `ppu_draw_sprite(...)` through the abstraction layer.
2. On PC → The call goes to the Software Reference PPU.
3. On Hardware → The call is translated into FMC commands sent to the FPGA.
4. The FPGA renders the sprite using its hardware PPU logic.
The same pattern applies to audio playback.
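
Step 3 above can be sketched as a handful of memory-mapped register writes. Everything specific here is hypothetical: the register layout, the `PPU_CMD_DRAW_SPRITE` opcode, and the `hw_ppu_*` names are invented for illustration, and a plain array stands in for the FMC window so the sketch also runs on a PC.

```c
#include <stdint.h>

/* Hypothetical layout of the FPGA command window (16-bit registers). */
enum {
    PPU_REG_SPRITE_ID = 0,
    PPU_REG_X         = 1,
    PPU_REG_Y         = 2,
    PPU_REG_COMMAND   = 3,
    PPU_REG_COUNT     = 4
};

#define PPU_CMD_DRAW_SPRITE 0x0001u /* assumed opcode, not from the spec */

/* Stand-in for the FMC-mapped region; on hardware the base pointer would
 * instead point at the address range the FMC bank is mapped to. */
static volatile uint16_t fake_fmc_window[PPU_REG_COUNT];
static volatile uint16_t *ppu_regs = fake_fmc_window;

/* Point the driver at a different register window (e.g. the real one). */
void hw_ppu_set_base(volatile uint16_t *base) {
    ppu_regs = base;
}

/* Hardware-side implementation of the sprite call: write the arguments,
 * then the command word, so the command arrives after its operands. */
void hw_ppu_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y) {
    ppu_regs[PPU_REG_SPRITE_ID] = sprite_id;
    ppu_regs[PPU_REG_X]         = (uint16_t)x; /* two's-complement bits */
    ppu_regs[PPU_REG_Y]         = (uint16_t)y;
    ppu_regs[PPU_REG_COMMAND]   = PPU_CMD_DRAW_SPRITE;
}
```

Writing the command register last is a common convention for this kind of interface; a real driver may additionally need memory barriers or an uncached mapping to guarantee that ordering on the bus.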
## 4. Memory Architecture
NekoBoy uses a **split memory design** to avoid contention and maximize performance:
| Memory | Connected To | Purpose | Size |
|-------------------------|------------------|---------------------------------------------------|---------|
| **HyperRAM (CPU)** | STM32U575 | Game state, decompressed assets, buffers | 32 MB |
| **HyperRAM (FPGA)** | XC7S50 | Framebuffers, tilemaps, sprite data, audio | 32 MB |
| **Fast SRAM** | XC7S50 | High-speed texture/sprite cache | 2 MB |
This separation ensures that the FPGA has fast, dedicated access to the memory it needs for rendering and audio without competing with the CPU.
## 5. Communication Interfaces
### CPU ↔ FPGA Communication
- **Primary Interface**: FMC (Flexible Memory Controller)
- High-bandwidth parallel bus
- Low latency for commands and data transfer
- Used for PPU commands, sprite data, tilemap updates, etc.
- **Secondary**: GPIO / IRQ lines for synchronization (e.g., VBlank interrupts)
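
The VBlank synchronization mentioned above could look roughly like the following on the CPU side. This is a sketch under assumptions: the `nb_*` names are hypothetical, and the interrupt wiring (which pin, which handler) is left out because it is not specified here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between the interrupt handler and the main loop. */
static volatile bool vblank_pending = false;
static volatile uint32_t frame_count = 0;

/* Would be called from the GPIO/IRQ handler when the FPGA signals VBlank. */
void nb_on_vblank_irq(void) {
    frame_count++;
    vblank_pending = true;
}

/* Non-blocking check: returns true at most once per signalled VBlank. */
bool nb_vblank_poll(void) {
    if (!vblank_pending) {
        return false;
    }
    vblank_pending = false;
    return true;
}

/* Number of VBlank interrupts seen so far. */
uint32_t nb_frame_count(void) {
    return frame_count;
}
```

A main loop would typically run game logic, wait until `nb_vblank_poll()` returns true, and then push that frame's PPU updates over FMC. If the main loop falls behind, this flag-based scheme coalesces missed VBlanks into one; `nb_frame_count()` can be used to detect that.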
### Other Important Interfaces
| Interface | Connected Devices | Purpose |
|---------------|--------------------------------------------|--------------------------------------|
| **I2C** | Power ICs, Audio Codec, Sensors | Configuration and control |
| **QSPI** | Cartridge NOR Flash | Game ROM and assets |
| **SPI** | FRAM (saves) | Non-volatile save data |
| **I2S** | Audio Codec (from FPGA) | Digital audio output |
| **RGB TTL** | Display (from FPGA) | Direct video output |
## 6. Development vs Hardware Execution
One of the core goals of NekoBoy is to make development as smooth as possible.
| Environment | PPU Implementation | APU Implementation | How Code Runs |
|-------------------|-----------------------------|-----------------------------|----------------------------------------|
| **Desktop** | Software Reference PPU | Software Reference APU | Directly against abstraction layer |
| **Emulator** | Software Reference PPU | Software Reference APU | Same as desktop |
| **Real Hardware** | FPGA PPU | FPGA APU | Same abstraction calls via driver |
Because both the emulator and hardware use the same **Core Abstraction Layer**, developers can write and test most of their code on PC before moving to real hardware.
## 7. Block Diagram
```mermaid
graph TD
A[Game Code / NekoTracker] --> B[Core Abstraction Layer]
B --> C[Emulator on PC]
B --> D[STM32U575VIT6]
C --> E[Software Reference PPU]
C --> F[Software Reference APU]
D --> G[XC7S50 FPGA]
G --> H[Hardware PPU]
G --> I[Hardware APU]
D --> J[HyperRAM - CPU]
G --> K[HyperRAM + SRAM - FPGA]
G --> L[Display - RGB TTL]
G --> M[Audio Codec - I2S]
```
## 8. Key Design Decisions & Rationale
| Decision | Rationale |
|------------------------------------|-----------|
| **FPGA handles PPU + APU** | Provides deterministic timing and high performance for graphics and audio |
| **CPU handles game logic** | Offers flexibility and easier development for AI, physics, and game systems |
| **Split HyperRAM architecture** | Avoids memory contention and gives each processor fast access to the memory it needs most |
| **Core Abstraction Layer** | Enables "write once, run anywhere" (emulator ↔ hardware) development |
| **FMC as main CPU-FPGA bus** | High bandwidth and low latency for transferring graphics commands and data |
---
This architecture is designed to be **coherent**, **developer-friendly**, and **balanced** between performance and ease of use. All major components work together through well-defined interfaces, with the Core Abstraction Layer serving as the central contract.