# System Architecture

**Last Updated**: 2026-05-27
**Status**: Draft

## 1. High-Level Overview

NekoBoy divides responsibilities between a microcontroller (CPU) and an FPGA. This hybrid architecture combines the flexibility of software with the deterministic, high-performance behavior of custom hardware.

The system is designed so that the same high-level game code can run on both a PC emulator and the real hardware with minimal changes. This is achieved through a **Core Abstraction Layer**.

### Core Components

| Component                  | Role                                                   | Technology              |
|----------------------------|--------------------------------------------------------|-------------------------|
| **STM32U575VIT6**          | Main CPU: game logic, AI, physics, high-level control  | Microcontroller         |
| **XC7S50-2CSGA324I**       | PPU (Graphics) + APU (Audio)                           | FPGA                    |
| **Core Abstraction Layer** | Unified interface for PPU and APU                      | C library               |
| **Emulator**               | Software Reference PPU + APU (for development)         | PC software             |
| **NekoTracker**            | Music creation tool                                    | Runs on PC and hardware |

## 2. Component Responsibilities

### STM32U575VIT6 (Main CPU)

The microcontroller handles everything that benefits from flexible software logic:

- Game logic, AI, and physics
- High-level game state management
- Asset streaming and coordination
- Communication with the FPGA via FMC
- Power management and system control
- Cartridge interface and flashing
- I2C communication with power, audio codec, and sensors

**Design Principle**: The CPU should stay focused on *what* should happen, while the FPGA handles *how* it is rendered and played.

### XC7S50 FPGA (PPU + APU)

The FPGA is responsible for all real-time, performance-critical tasks:

- **PPU (Pixel Processing Unit)**: Tile layers, sprites, palettes, priority, effects, and display output
- **APU (Audio Processing Unit)**: FM synthesis, PCM playback, mixing, and lo-fi post-processing
- Direct RGB output to the display
- I2S audio output to the codec

The FPGA provides deterministic timing and high parallelism, which is ideal for graphics and audio.

### Core Abstraction Layer

This is the most important software component in the project.

It defines a single, consistent interface that games and tools use to interact with graphics and sound. The same interface is implemented in two ways:

- **Software Reference** → Used by the Emulator on PC
- **Hardware Driver** → Used on the real NekoBoy console

Because of this layer, most game code does not need to know whether it is running on an emulator or real hardware.

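One way to picture the two implementations behind one interface is a small table of function pointers. The sketch below is only an illustration under assumptions: the type `neko_backend`, the function `neko_select_backend`, the `apu_play_note` call, and the argument list of `ppu_draw_sprite` are hypothetical names chosen for this example, and both backends are stubs that merely count calls.

```c
#include <stdint.h>

/* Hypothetical contract between game code and a PPU/APU implementation. */
typedef struct {
    const char *name;
    void (*draw_sprite)(uint16_t sprite_id, int16_t x, int16_t y);
    void (*play_note)(uint8_t channel, uint8_t note);
} neko_backend;

/* Software Reference backend (PC). This stub only counts calls. */
static unsigned ref_sprites, ref_notes;
static void ref_draw_sprite(uint16_t id, int16_t x, int16_t y) {
    (void)id; (void)x; (void)y;
    ref_sprites++;
}
static void ref_play_note(uint8_t channel, uint8_t note) {
    (void)channel; (void)note;
    ref_notes++;
}
static const neko_backend reference_backend = {
    "software-reference", ref_draw_sprite, ref_play_note
};

/* Hardware Driver backend. A real driver would send commands to the FPGA;
 * this stub only counts how many commands it would have sent. */
static unsigned hw_commands;
static void hw_draw_sprite(uint16_t id, int16_t x, int16_t y) {
    (void)id; (void)x; (void)y;
    hw_commands++;
}
static void hw_play_note(uint8_t channel, uint8_t note) {
    (void)channel; (void)note;
    hw_commands++;
}
static const neko_backend hardware_backend = {
    "hardware-driver", hw_draw_sprite, hw_play_note
};

/* The active implementation, defaulting to the software reference. */
static const neko_backend *active_backend = &reference_backend;

void neko_select_backend(const neko_backend *backend) {
    active_backend = backend;
}

/* Public calls that game code uses; they never name a backend. */
void ppu_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y) {
    active_backend->draw_sprite(sprite_id, x, y);
}
void apu_play_note(uint8_t channel, uint8_t note) {
    active_backend->play_note(channel, note);
}

/* Example game code: identical on emulator and hardware. */
void game_frame(void) {
    ppu_draw_sprite(1, 32, 48);
    apu_play_note(0, 60);
}
```

In this sketch `game_frame` is unchanged when the backend is swapped, which is the property the layer is meant to provide. A real build might instead pick the backend at compile or link time.
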
## 3. Data Flow

```
Game Code / NekoTracker
            │
            ▼
Core Abstraction Layer (PPU + APU commands)
            │
            ├──────────────────────┐
            ▼                      ▼
      Emulator (PC)       STM32U575 (Hardware)
            │                      │
            ▼                      ▼
Software Reference PPU/APU   FPGA (Real PPU + APU)
```

### Typical Flow Example (Rendering a Sprite)

1. Game calls `ppu_draw_sprite(...)` through the abstraction layer.
2. On PC → The call goes to the Software Reference PPU.
3. On Hardware → The call is translated into FMC commands sent to the FPGA.
4. The FPGA renders the sprite using its hardware PPU logic.

The same pattern applies to audio playback.

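Step 3 could look roughly like the following on the hardware side, assuming the FPGA exposes a small block of 16-bit registers through the FMC address window. The register layout, the command code, and the names `ppu_hw_attach` and `ppu_hw_draw_sprite` are all hypothetical; the actual command format is not specified in this document.

```c
#include <stdint.h>

/* Hypothetical register layout inside the FMC-mapped window. */
enum {
    PPU_REG_CMD   = 0,  /* writing here is assumed to trigger the command */
    PPU_REG_ARG0  = 1,  /* sprite id */
    PPU_REG_ARG1  = 2,  /* x position */
    PPU_REG_ARG2  = 3,  /* y position */
    PPU_REG_COUNT = 4
};

/* Hypothetical command code. */
enum { PPU_CMD_DRAW_SPRITE = 0x0001 };

/* Base of the register block. On hardware this would point into the
 * FMC address range; in a test it can point at an ordinary array. */
static volatile uint16_t *ppu_regs;

void ppu_hw_attach(volatile uint16_t *base) {
    ppu_regs = base;
}

/* Translate one draw call into register writes over the bus. */
void ppu_hw_draw_sprite(uint16_t sprite_id, int16_t x, int16_t y) {
    ppu_regs[PPU_REG_ARG0] = sprite_id;
    ppu_regs[PPU_REG_ARG1] = (uint16_t)x;  /* two's-complement bit pattern */
    ppu_regs[PPU_REG_ARG2] = (uint16_t)y;
    /* Arguments first, command last, so the FPGA sees a complete command. */
    ppu_regs[PPU_REG_CMD] = PPU_CMD_DRAW_SPRITE;
}
```

Passing the base pointer in, rather than hard-coding an address, lets the same translation code be exercised on a PC against a plain array.
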
## 4. Memory Architecture

NekoBoy uses a **split memory design** to avoid contention and maximize performance:

| Memory              | Connected To | Purpose                                    | Size  |
|---------------------|--------------|--------------------------------------------|-------|
| **HyperRAM (CPU)**  | STM32U575    | Game state, decompressed assets, buffers   | 32 MB |
| **HyperRAM (FPGA)** | XC7S50       | Framebuffers, tilemaps, sprite data, audio | 32 MB |
| **Fast SRAM**       | XC7S50       | High-speed texture/sprite cache            | 2 MB  |

This separation ensures that the FPGA has fast, dedicated access to the memory it needs for rendering and audio without competing with the CPU.

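A simple way to keep a memory plan honest is to list the regions and check that they fit the device. The sketch below does this for the FPGA-side HyperRAM. The region sizes are placeholders invented for illustration, not project figures, and "32 MB" is assumed to mean 32 MiB.

```c
#include <stddef.h>
#include <stdint.h>

#define MIB (1024u * 1024u)
#define FPGA_HYPERRAM_BYTES (32u * MIB)  /* assumes "32 MB" means 32 MiB */

typedef struct {
    const char *name;
    uint32_t bytes;
} mem_region;

/* Placeholder sizes for illustration only; the real split is not
 * defined in this document. */
static const mem_region fpga_regions[] = {
    { "framebuffers", 4u * MIB },
    { "tilemaps",     2u * MIB },
    { "sprite data",  8u * MIB },
    { "audio",        8u * MIB },
};
#define FPGA_REGION_COUNT (sizeof fpga_regions / sizeof fpga_regions[0])

/* Sum the sizes of all planned regions (64-bit to avoid overflow). */
uint64_t mem_plan_total(const mem_region *regions, size_t count) {
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++) {
        total += regions[i].bytes;
    }
    return total;
}

/* Returns 1 if the plan fits in `capacity` bytes, 0 otherwise. */
int mem_plan_fits(const mem_region *regions, size_t count, uint64_t capacity) {
    return mem_plan_total(regions, count) <= capacity;
}
```

A check like this could run in a unit test or at startup, so that a growing asset budget is caught before it fails on hardware.
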
## 5. Communication Interfaces

### CPU ↔ FPGA Communication

- **Primary Interface**: FMC (Flexible Memory Controller)
  - High-bandwidth parallel bus
  - Low latency for commands and data transfer
  - Used for PPU commands, sprite data, tilemap updates, etc.

- **Secondary**: GPIO / IRQ lines for synchronization (e.g., VBlank interrupts)

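The VBlank synchronization mentioned above is commonly handled with a flag set in the interrupt handler and consumed by the main loop. The following is a minimal sketch of that pattern, not the project's driver: `neko_on_vblank_irq`, `neko_take_vblank`, and `neko_frame_count` are hypothetical names, and on hardware the first function would be called from whichever GPIO interrupt handler the VBlank line is wired to.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared between interrupt context and the main loop, hence volatile. */
static volatile bool vblank_pending;
static volatile uint32_t frame_counter;

/* Would be called from the interrupt handler for the VBlank line. */
void neko_on_vblank_irq(void) {
    frame_counter++;
    vblank_pending = true;
}

/* Returns true at most once per VBlank and clears the pending flag. */
bool neko_take_vblank(void) {
    if (!vblank_pending) {
        return false;
    }
    vblank_pending = false;
    return true;
}

/* Number of VBlank interrupts seen so far. */
uint32_t neko_frame_count(void) {
    return frame_counter;
}
```

A main loop would poll `neko_take_vblank()` (or sleep until the next interrupt) and then push the frame's PPU updates over FMC while the display is in its blanking period.
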
### Other Important Interfaces

| Interface   | Connected Devices               | Purpose                   |
|-------------|---------------------------------|---------------------------|
| **I2C**     | Power ICs, Audio Codec, Sensors | Configuration and control |
| **QSPI**    | Cartridge NOR Flash             | Game ROM and assets       |
| **SPI**     | FRAM (saves)                    | Non-volatile save data    |
| **I2S**     | Audio Codec (from FPGA)         | Digital audio output      |
| **RGB TTL** | Display (from FPGA)             | Direct video output       |

## 6. Development vs Hardware Execution

One of the core goals of NekoBoy is to make development as smooth as possible.

| Environment       | PPU Implementation     | APU Implementation     | How Code Runs                      |
|-------------------|------------------------|------------------------|------------------------------------|
| **Desktop**       | Software Reference PPU | Software Reference APU | Directly against abstraction layer |
| **Emulator**      | Software Reference PPU | Software Reference APU | Same as desktop                    |
| **Real Hardware** | FPGA PPU               | FPGA APU               | Same abstraction calls via driver  |

Because both the emulator and hardware use the same **Core Abstraction Layer**, developers can write and test most of their code on PC before moving to real hardware.

## 7. Block Diagram

```mermaid
graph TD
    A[Game Code / NekoTracker] --> B[Core Abstraction Layer]

    B --> C[Emulator on PC]
    B --> D[STM32U575VIT6]

    C --> E[Software Reference PPU]
    C --> F[Software Reference APU]

    D --> G[XC7S50 FPGA]
    G --> H[Hardware PPU]
    G --> I[Hardware APU]

    D --> J[HyperRAM - CPU]
    G --> K[HyperRAM + SRAM - FPGA]

    G --> L[Display - RGB TTL]
    G --> M[Audio Codec - I2S]
```

## 8. Key Design Decisions & Rationale

| Decision                        | Rationale                                                                                 |
|---------------------------------|-------------------------------------------------------------------------------------------|
| **FPGA handles PPU + APU**      | Provides deterministic timing and high performance for graphics and audio                 |
| **CPU handles game logic**      | Offers flexibility and easier development for AI, physics, and game systems               |
| **Split HyperRAM architecture** | Avoids memory contention and gives each processor fast access to the memory it needs most |
| **Core Abstraction Layer**      | Enables "write once, run anywhere" (emulator ↔ hardware) development                      |
| **FMC as main CPU-FPGA bus**    | High bandwidth and low latency for transferring graphics commands and data                |

---

This architecture is designed to be **coherent**, **developer-friendly**, and **balanced** between performance and ease of use. All major components work together through well-defined interfaces, with the Core Abstraction Layer serving as the central contract.