# Keraunos System Architecture ## PCIe Tile Integration in Keraunos-E100 Chiplet Ecosystem **Version:** 2.0 **Date:** March 26, 2026 **Author:** System Architecture Team --- ## Executive Summary This document describes the system-level architecture of the Keraunos-E100 chiplet ecosystem and details how the Keraunos PCIe Tile integrates into the larger Grendel multi-chiplet architecture. The Keraunos PCIe Tile serves as a critical I/O interface, enabling host connectivity and system management while interfacing with the on-chip Network-on-Chip (NOC) infrastructure. --- ## Table of Contents 1. [System Overview](#1-system-overview) 2. [Keraunos-E100 Chiplet Architecture](#2-keraunos-e100-chiplet-architecture) 3. [PCIe Tile Position in the System](#3-pcie-tile-position-in-the-system) - [3.4 Model Integration: Host–RC–EP–PCIe Tile Connection Diagram](#34-model-integration-hostrceppcie-tile-connection-diagram) - [3.5 VDK Integration: PCIe Tile and Synopsys PCIe Controller in the Virtualizer](#35-vdk-integration-pcie-tile-and-synopsys-pcie-controller-in-the-virtualizer) - [3.5.6 DesignWare EP/RC Interfaces: Connect vs Stub](#356-designware-pcie-ep-and-rc-interfaces-connect-to-tile--system-vs-stub) 4. [Connectivity Architecture](#4-connectivity-architecture) 5. [Data Flow Paths](#5-data-flow-paths) 6. [Address Space Integration](#6-address-space-integration) 7. [System Use Cases](#7-system-use-cases) 8. [Final VDK Platform: Linux-Booting PCIe Tile Integration](#8-final-vdk-platform-linux-booting-pcie-tile-integration) 9. [Appendices](#9-appendices) --- ## 1. System Overview ### 1.1 Grendel Chiplet Ecosystem The Grendel chiplet ecosystem is a multi-chiplet heterogeneous computing platform designed for high-performance AI/ML workloads. 
The ecosystem consists of: - **Quasar Chiplets:** Compute chiplets containing AI/ML processing cores - **Mimir Chiplets:** Memory chiplets with GDDR interfaces - **Athena Chiplets:** Specialized compute chiplets - **Keraunos-E100 Chiplets:** I/O interface chiplets for high-speed connectivity ### 1.2 Keraunos-E100 Role Keraunos-E100 is the I/O interface chiplet family in the Grendel ecosystem, providing: - **Glueless scale-out** connectivity via 400G/800G Ethernet (Quasar-to-Quasar across packages) - **Host connectivity** via PCIe Gen5 (x16) - **Die-to-die (D2D)** connectivity within the package using BoW (Bunch of Wires) technology - **System management** capabilities via the integrated SMC (System Management Controller) ```{mermaid} graph TB subgraph pkg["Grendel Package"] QUASAR[Quasar] MIMIR[Mimir] KERAUNOS[Keraunos-E100] QUASAR --- KERAUNOS MIMIR --- QUASAR end HOST[Host] <-->|PCIe x16| KERAUNOS REMOTE[Remote Package] <-->|Ethernet| KERAUNOS style KERAUNOS fill:#e1f5ff style QUASAR fill:#ffe1e1 style MIMIR fill:#e1ffe1 style HOST fill:#fff4e1 style REMOTE fill:#f0e1ff ``` --- ## 2. Keraunos-E100 Chiplet Architecture ### 2.1 High-Level Block Diagram The Keraunos-E100 chiplet contains the following major subsystems. The diagram is split into two parts for reliable rendering.
**Part A — Internal subsystems and NOC:** ```{mermaid} graph TB subgraph harness["Chiplet Harness"] SMC[SMC] SEP[SEP] SMU[SMU] end subgraph pcie["PCIe Subsystem"] PCIE_TILE[PCIe Tile] PCIE_SERDES[PCIe SerDes] end subgraph hsio["HSIO Tiles (2)"] CCE0[CCE 0] CCE1[CCE 1] ETH0[TT Eth Ctrl 0] ETH1[TT Eth Ctrl 1] MAC0[MAC PCS 0] MAC1[MAC PCS 1] SRAM[HSIO SRAM] FABRIC[HSIO Fabric] end subgraph noc["NOC Infrastructure"] SMN[SMN] QNP[QNP Mesh] D2D[D2D Tiles] end PCIE_TILE --- SMN PCIE_TILE --- QNP PCIE_TILE --- PCIE_SERDES CCE0 --- FABRIC CCE1 --- FABRIC ETH0 --- FABRIC ETH1 --- FABRIC FABRIC --- SRAM FABRIC --- SMN FABRIC --- QNP ETH0 --- MAC0 ETH1 --- MAC1 SMN --- SMC SMN --- SEP SMN --- D2D QNP --- D2D style PCIE_TILE fill:#ffcccc,stroke:#c00 style FABRIC fill:#d9f7d9 style SMN fill:#e6ccff style QNP fill:#e6ccff ``` **Part B — External interfaces:** ```{mermaid} graph LR subgraph chip["Keraunos-E100 Chiplet"] PCIE[PCIe Tile] MAC_A[MAC PCS 0] MAC_B[MAC PCS 1] D2D_T[D2D Tiles] end HOST[PCIe Host] ETH_A[Ethernet 0] ETH_B[Ethernet 1] QUASAR[Quasar Mimir] HOST <-->|PCIe x16| PCIE MAC_A -->|800G| ETH_A MAC_B -->|800G| ETH_B D2D_T -->|BoW| QUASAR style PCIE fill:#ffcccc,stroke:#c00 ``` ### 2.2 Key Subsystems #### 2.2.1 Chiplet Harness - **SMC (System Management Controller):** 4-core RISC-V processor (Rocket core) running at 800 MHz - **SEP (Security Engine Processor):** Handles secure boot, attestation, and access filtering - **SMU (System Management Unit):** Clock generation (CGM PLLs), power management, reset sequencing #### 2.2.2 PCIe Subsystem - **PCIe Tile:** Contains TLB translation engines, configuration registers, and internal fabric; it interfaces to the PCIe Controller. The tile does not implement the link layer; it receives/sends TLPs via the controller. - **PCIe Controller:** On the Keraunos chip this is the **Synopsys PCIe Controller IP** (DesignWare), configured as an **Endpoint** (EP). The host system uses a **Root Complex** (RC). 
The link is therefore RC (host) ↔ EP (Keraunos). - **PCIe SerDes:** Physical layer (PHY) for PCIe Gen5/Gen6 connectivity #### 2.2.3 HSIO (High-Speed I/O) Tiles - **CCE (Keraunos Compute Engine):** DMA engines, DMRISC cores, data forwarding logic - **TT Ethernet Controller:** TX/RX queue controllers, packet processing, flow control - **MAC/PCS:** 800G Ethernet MAC and Physical Coding Sublayer (OmegaCore IP from AlphaWave) - **SRAM:** 8MB high-speed SRAM for packet buffering and data staging - **HSIO Fabric:** AXI-based crossbar interconnect #### 2.2.4 NOC Infrastructure - **SMN (System Management Network):** Carries control, configuration, and low-bandwidth traffic - **QNP Mesh (NOC-N):** High-bandwidth data fabric for payload transfer (1.5 GHz @ TT corner) - **D2D (Die-to-Die):** 5 BoW interfaces @ 2 GHz for chiplet-to-chiplet connectivity --- ## 3. PCIe Tile Position in the System ### 3.1 PCIe Tile Overview The **Keraunos PCIe Tile** developed in this project is a SystemC/TLM-2.0 model representing the PCIe subsystem of the Keraunos-E100 chiplet. It provides: 1. **Host Interface:** PCIe Gen5 x16 connectivity to the host CPU 2. **Internal Routing:** Bidirectional routing between PCIe, NOC-N (QNP), and SMN 3. **Address Translation:** TLB-based address mapping between PCIe address space and system address space 4. 
**Configuration Interface:** SMN-accessible configuration registers for TLBs, MSI relay, and error handling ### 3.2 Architectural Position ```{mermaid} graph TB subgraph host["Host System"] CPU[Host CPU] DRAM[Host DRAM] end subgraph tile["PCIe Tile"] PCIE_CTRL[PCIe Controller] TLB_IN[Inbound TLB] TLB_OUT[Outbound TLB] SWITCH[PCIe-SMN-IO Switch] MSI_RELAY[MSI Relay] end subgraph chip["Keraunos-E100"] SMN_NET[SMN] QNP_NET[QNP NOC-N] HSIO_TILES[HSIO Tiles] D2D_LINKS[D2D Links] end CPU --- PCIE_CTRL PCIE_CTRL --- TLB_IN PCIE_CTRL --- TLB_OUT TLB_IN --- SWITCH SWITCH --- QNP_NET SWITCH --- SMN_NET TLB_OUT --- QNP_NET TLB_OUT --- SMN_NET SMN_NET --- PCIE_CTRL SMN_NET --- HSIO_TILES QNP_NET --- HSIO_TILES QNP_NET --- D2D_LINKS MSI_RELAY --- PCIE_CTRL SMN_NET --- MSI_RELAY style PCIE_CTRL fill:#ffcccc,stroke:#c00 style TLB_IN fill:#ffe6cc style TLB_OUT fill:#ffe6cc style SWITCH fill:#d9f7d9 style SMN_NET fill:#e6ccff style QNP_NET fill:#cce5ff ``` ### 3.3 Key Interfaces | Interface | Protocol | Width | Purpose | |-----------|----------|-------|---------| | `pcie_controller_target` | TLM-2.0 Target | 64-bit | Receives PCIe Memory Read/Write from host | | `noc_n_initiator` | TLM-2.0 Initiator | 64-bit | Forwards inbound PCIe traffic to NOC after TLB translation | | `smn_n_initiator` | TLM-2.0 Initiator | 64-bit | Forwards bypass/system traffic to SMN | | `noc_n_target` | TLM-2.0 Target | 64-bit | Receives outbound NOC traffic destined for PCIe | | `smn_n_target` | TLM-2.0 Target | 64-bit | Receives outbound SMN traffic destined for PCIe | | `pcie_controller_initiator` | TLM-2.0 Initiator | 64-bit | Sends outbound transactions to PCIe controller | | `smn_config` | TLM-2.0 Target | 64-bit | SMN access to PCIe Tile configuration registers | ### 3.4 Model Integration: Host–RC–EP–PCIe Tile Connection Diagram This section documents how the **inband** (TLM) and **sideband** (sc_in/sc_out) ports of the PCIe Tile connect across Host → Root Complex → Endpoint → PCIe Tile for **model
integration** (e.g. connecting the tile to a Synopsys PCIe Controller model or test harness). **End-to-end connection path:** ```{mermaid} graph LR subgraph host["Host System"] CPU[Host CPU] DRAM[Host DRAM] end subgraph rc["Root Complex"] RC_CORE[RC Core] RC_SIDEBAND[Reset / Power / Sideband] end subgraph link["PCIe Link"] TLP[TLPs] end subgraph ep["Endpoint - Synopsys PCIe Controller"] EP_CORE[EP Core] EP_AXI[AXI / Data IF] EP_SIDEBAND[Sideband Signals] end subgraph tile["PCIe Tile DUT"] TLM_IN[TLM Target] TLM_OUT[TLM Initiator] SC_IN[sc_in] SC_OUT[sc_out] end CPU --- RC_CORE RC_CORE --- TLP TLP --- EP_CORE EP_CORE --- EP_AXI EP_CORE --- EP_SIDEBAND EP_AXI --- TLM_IN EP_AXI --- TLM_OUT EP_SIDEBAND --- SC_IN EP_SIDEBAND --- SC_OUT RC_SIDEBAND -.->|optional| EP_SIDEBAND style TLP fill:#e3f2fd style EP_AXI fill:#fff3e0 style EP_SIDEBAND fill:#f3e5f5 style TLM_IN fill:#e8f5e9 style TLM_OUT fill:#e8f5e9 style SC_IN fill:#fce4ec style SC_OUT fill:#fce4ec ``` **Inband (TLM) connections:** | Tile port | Direction | Width | Connected to | Description | |-----------|-----------|-------|--------------|-------------| | `pcie_controller_target` | Target (in) | 64-bit | **EP BusMaster** (TLM master) | Inbound TLPs from host: EP pushes Memory Read/Write to tile | | `pcie_controller_initiator` | Initiator (out) | 64-bit | **EP AXI_Slave** (TLM slave) | Outbound TLPs to host: tile pushes Memory Read/Write/Completion to EP | | `noc_n_target` | Target (in) | 64-bit | NOC fabric | Outbound path: NOC traffic destined for PCIe | | `noc_n_initiator` | Initiator (out) | 64-bit | NOC fabric | Inbound path: tile forwards translated traffic to NOC | | `smn_n_target` | Target (in) | 64-bit | SMN fabric | Outbound path: SMN traffic destined for PCIe | | `smn_n_initiator` | Initiator (out) | 64-bit | SMN fabric | Inbound path: tile forwards system traffic to SMN | **Sideband signals — EP → PCIe Tile (sc_in to tile):** These are driven by the **Synopsys PCIe
Controller (EP)** or by the **system**; the tile receives them as `sc_in`. | Tile port (sc_in) | Type | Source | Description | |-------------------|------|--------|-------------| | `pcie_core_clk` | bool | EP | PCIe core clock from controller | | `pcie_controller_reset_n` | bool | EP | Controller reset (active low) | | `pcie_cii_hv` | bool | EP | CII header valid (SII / config info) | | `pcie_cii_hdr_type` | sc_bv<5> | EP | CII header type [4:0] | | `pcie_cii_hdr_addr` | sc_bv<12> | EP | CII header address [11:0] | | `pcie_flr_request` | bool | EP | Function Level Reset request | | `pcie_hot_reset` | bool | EP | Hot reset from link | | `pcie_ras_error` | bool | EP | RAS error indication | | `pcie_dma_completion` | bool | EP | DMA completion notification | | `pcie_misc_int` | bool | EP | Miscellaneous interrupt from controller | | `cold_reset_n` | bool | System (SMC) | SoC cold reset (active low) | | `warm_reset_n` | bool | System (SMC) | SoC warm reset (active low) | | `isolate_req` | bool | System | Isolation request | | `axi_clk` | bool | System | AXI clock | **Sideband signals — PCIe Tile → EP (sc_out from tile):** The tile drives these; the **EP** or **system** receives them. 
| Tile port (sc_out) | Type | Sink | Description | |--------------------|------|------|-------------| | `pcie_app_bus_num` | uint8_t | EP | PCIe bus number for app | | `pcie_app_dev_num` | uint8_t | EP | PCIe device number for app | | `pcie_device_type` | bool | EP | Device type indicator | | `pcie_sys_int` | bool | EP | System interrupt to controller | | `function_level_reset` | bool | EP | FLR completion / request to EP | | `hot_reset_requested` | bool | EP | Hot reset requested | | `config_update` | bool | EP | Configuration update indicator | | `ras_error` | bool | EP | RAS error to controller | | `dma_completion` | bool | EP | DMA completion to controller | | `controller_misc_int` | bool | EP | Controller miscellaneous interrupt | | `noc_timeout` | sc_bv<3> | EP / system | NOC timeout status | **Summary diagram — sideband and inband to PCIe Tile:** ```{mermaid} graph TB subgraph host_rc["Host / Root Complex"] H[Host] RC[RC] H --- RC end subgraph pcie_link["PCIe Link - Inband TLPs"] L[TLP] end subgraph ep_block["Synopsys EP - PCIe Controller"] EP[EP] EP_CLK[clk, reset_n] EP_FLR[flr_request, hot_reset] EP_ERR[ras_error, dma_completion, misc_int] EP_CII[cii_hv, cii_hdr_type, cii_hdr_addr] EP --- EP_CLK EP --- EP_FLR EP --- EP_ERR EP --- EP_CII end subgraph tile["PCIe Tile"] TLM_T[pcie_controller_target] TLM_I[pcie_controller_initiator] IN[sc_in ports] OUT[sc_out ports] end RC --- L L --- EP EP ---|AXI/TLM| TLM_T EP ---|AXI/TLM| TLM_I EP_CLK --- IN EP_FLR --- IN EP_ERR --- IN EP_CII --- IN OUT --- EP IN -.->|cold_reset_n, warm_reset_n, isolate_req, axi_clk| SYS[System SMC] style L fill:#e3f2fd style TLM_T fill:#c8e6c9 style TLM_I fill:#c8e6c9 style IN fill:#f8bbd0 style OUT fill:#f8bbd0 ``` **Integration notes:** - **Inband:** Connect the EP’s **BusMaster** (TLM master) to the tile’s `pcie_controller_target` — the EP delivers inbound TLPs from the host to the tile. 
Connect the tile’s `pcie_controller_initiator` to the EP’s **AXI_Slave** (TLM slave) — the tile sends outbound TLPs to the EP, which forwards them over the link to the RC. - **Sideband:** Drive all tile `sc_in` ports from the EP model or system (clocks, resets, CII, FLR, hot_reset, ras_error, dma_completion, pcie_misc_int; plus cold_reset_n, warm_reset_n, isolate_req, axi_clk from system). Connect all tile `sc_out` ports to the EP or system as required by the EP datasheet and platform design. ### 3.5 VDK Integration: PCIe Tile and Synopsys PCIe Controller in the Virtualizer This section describes how the **Keraunos PCIe Tile** and the **Synopsys PCIe Controller** (DesignWare, as RC and EP) are connected in the **Synopsys Virtualizer VDK** so that the virtual platform aligns with the Keraunos system architecture. The final validated VDK uses a **direct RC–EP link** between two chiplet groups: **Host_Chiplet** (Root Complex side) and **Keraunos_PCIE_Chiplet** (Endpoint side). #### 3.5.1 VDK Topology The VDK instantiates a **Host_Chiplet** (with PCIE_RC, RISC-V CPU running Linux, DRAM, UART, PLIC) and a **Keraunos_PCIE_Chiplet** (with PCIe_EP, PCIE_TILE, Target_Memory, and a second RISC-V CPU running bare-metal firmware). **PCIe model in VDK:** Synopsys **DESIGNWARE_PCIE / PCIe_2_0** is used for both the Root Complex (PCIE_RC on Host_Chiplet) and the Endpoint (PCIe_EP on Keraunos_PCIE_Chiplet). The PCIe link uses a **direct peer-to-peer** binding: - **RC PCIMem** (master) ↔ **EP PCIMem_Slave** (slave) — TLPs from RC to EP - **RC PCIMem_Slave** (slave) ↔ **EP PCIMem** (master) — TLPs from EP to RC #### 3.5.2 Alignment with Keraunos: Where the PCIe Tile Fits In the **Keraunos-E100** architecture, the host uses a **Root Complex** and the Keraunos chip uses a **Synopsys PCIe Controller as Endpoint**. The **PCIe Tile** sits behind the EP and provides TLB translation and routing to NOC/SMN. In the VDK: - **RC:** The **PCIE_RC** on Host_Chiplet models the host side. 
- **EP:** On the **Keraunos_PCIE_Chiplet**, the **Synopsys PCIe EP** is the PCIe controller; the **Keraunos PCIe Tile** (PCIE_TILE) is inserted **between** this EP and the rest of the chip (NOC/SMN/Target_Memory). The topology is: **Host ↔ RC ↔ [direct PCIe link] ↔ EP ↔ PCIe Tile ↔ NOC/SMN**. The tile does not replace the EP; it connects to the EP’s application-side (AXI/TLM) and sideband interfaces as in Section 3.4. #### 3.5.3 Interface-Level Connection Diagram (VDK) The following diagram shows the VDK topology and where the PCIe Tile and Synopsys RC/EP connect: ```{mermaid} graph TB subgraph vdk_root["VDK: Keraunos_PCIE_Tile"] subgraph host["Host_Chiplet"] RST_H[RST_GEN] SMM_P[SharedMemoryMap] SMC_H[SMC] RC[PCIE_RC - Synopsys RC] DRAM_H[DRAM] SMC_H --- RC end subgraph device["Keraunos_PCIE_Chiplet"] RST_D[RST_GEN] SMM_S[SharedMemoryMap] SMC_D[SMC_Configure] EP[PCIe_EP - Synopsys EP] TILE[PCIE_TILE - Keraunos PCIe Tile] MEM[Target_Memory] EP ---|BusMaster / AXI_Slave| TILE TILE ---|noc_n / smn_n| SMM_S SMM_S --- MEM end end RC ---|PCIMem / PCIMem_Slave direct| EP RC ---|AXI_Slave, AXI_DBI, BusMaster| SMM_P EP -.->|sideband| TILE style RC fill:#e3f2fd style EP fill:#fff3e0 style TILE fill:#c8e6c9 style MEM fill:#ffe6cc ``` **Inband (TLM) connections:** - **PCIE_RC:** AXI_Slave, AXI_DBI, BusMaster bound to Host_Chiplet **SharedMemoryMap** (config and memory space). - **PCIe_EP:** AXI_DBI bound to Keraunos_PCIE_Chiplet **SharedMemoryMap**; **PCIMem** / **PCIMem_Slave** connected directly to **PCIE_RC** (TLP traffic). **BusMaster** connected to **PCIE_TILE.pcie_controller_target** (inbound TLPs to tile). **PCIe Tile connections:** - EP **BusMaster** to the tile’s **pcie_controller_target** (inbound TLPs from host). The tile’s **pcie_controller_initiator** to EP **AXI_Slave** (outbound TLPs to host). EP **PCIMem** / **PCIMem_Slave** are connected directly to the RC for the PCIe link. 
- The tile’s **noc_n_target** / **noc_n_initiator** and **smn_n_target** / **smn_n_initiator** connect to the chiplet’s SharedMemoryMap, which decodes to Target_Memory and tile register windows. #### 3.5.4 Signal- and Interface-Level Mapping (EP ↔ PCIe Tile) The Synopsys DesignWare PCIe model (PCIe_2_0) exposes the following interface groups. The mapping to the Keraunos PCIe Tile ports enables a drop-in style integration when the tile is added to the VDK. **TLM (inband) — DesignWare EP ↔ PCIe Tile:** | DesignWare EP interface (VDK) | Direction | PCIe Tile port | Description | |-------------------------------|-----------|----------------|-------------| | AXI_Slave | Slave (in) | **pcie_controller_initiator** | Outbound TLPs: tile sends Memory Read/Write/Completion to EP; EP receives on AXI_Slave and sends over link to RC. | | AXI_DBI | Slave (in) | — | DBI/config; may remain to SharedMemoryMap or be routed per platform. | | BusMaster | Master (out)| **pcie_controller_target** | Inbound TLPs: EP delivers host Memory Read/Write to tile (EP BusMaster → tile target). | | PCIMem | Master (out)| — | EP as master toward link (direct to RC). Not connected to tile. | | PCIMem_Slave | Slave (in) | — | Inbound TLPs from link (RC → EP); connects directly to RC, not to tile. | So: **BusMaster (EP)** → **pcie_controller_target (Tile)** for inbound TLPs; **pcie_controller_initiator (Tile)** → **AXI_Slave (EP)** for outbound TLPs. PCIMem/PCIMem_Slave stay on the link side (EP ↔ RC direct). AXI_DBI can remain to SharedMemoryMap. **Sideband — DesignWare EP ↔ PCIe Tile (sc_in / sc_out):** DesignWare PCIe_2_0 exposes a number of reset, clock, and sideband pins. Map them to the tile’s `sc_in` and `sc_out` as follows so that the VDK integration matches Section 3.4. 
| Tile sc_in (receive) | Source (EP or system) | DesignWare EP / system signal (typical name) | |----------------------|------------------------|---------------------------------------------| | pcie_core_clk | EP | cc_core_clk or equivalent core clock | | pcie_controller_reset_n | EP | pcie_axi_ares or combined reset_n | | pcie_cii_hv | EP | CII header valid | | pcie_cii_hdr_type | EP | CII header type [4:0] | | pcie_cii_hdr_addr | EP | CII header address [11:0] | | pcie_flr_request | EP | FLR request | | pcie_hot_reset | EP | Hot reset | | pcie_ras_error | EP | RAS error | | pcie_dma_completion | EP | DMA completion | | pcie_misc_int | EP | Miscellaneous interrupt | | cold_reset_n | System (e.g. CustomResetController) | SoC cold reset | | warm_reset_n | System | SoC warm reset | | isolate_req | System | Isolation request | | axi_clk | System | AXI clock | | Tile sc_out (drive) | Sink (EP or system) | DesignWare EP / system signal (typical name) | |----------------------|------------------------|---------------------------------------------| | pcie_app_bus_num | EP | App bus number | | pcie_app_dev_num | EP | App device number | | pcie_device_type | EP | Device type | | pcie_sys_int | EP | System interrupt to controller | | function_level_reset | EP | FLR completion | | hot_reset_requested | EP | Hot reset requested | | config_update | EP | Config update | | ras_error | EP | RAS error to controller | | dma_completion | EP | DMA completion to controller | | controller_misc_int | EP | Controller misc interrupt | | noc_timeout | EP / system | NOC timeout [2:0] | (Exact DesignWare signal names may vary by IP version; use the EP model’s documentation or RTL interface list to align names.) #### 3.5.5 Connection Diagram for Easy Integration A single diagram that ties VDK instances to tile ports and EP ports is below. Use it as a checklist when wiring the PCIe Tile into the VDK behind the Synopsys EP. 
```{mermaid} graph LR subgraph host_side["Host / RC (VDK Primary_Chiplet)"] RC[PCIe_RC] end subgraph keraunos_chiplet["Keraunos Chiplet (e.g. Secondary_Chiplet_1)"] subgraph ep_block["Synopsys EP"] EP[PCIe_2_0] EP_PCIMem[PCIMem] EP_PCIMemSlave[PCIMem_Slave] EP_BusM[BusMaster] EP_AXI_S[AXI_Slave] EP_AXI_DBI[AXI_DBI] EP_RST[resets] EP_CLK[clocks] EP_SB[sideband] end subgraph tile_block["Keraunos PCIe Tile"] T_tgt[pcie_controller_target] T_init[pcie_controller_initiator] T_noc_t[noc_n_target] T_noc_i[noc_n_initiator] T_smn_t[smn_n_target] T_smn_i[smn_n_initiator] T_sc_in[sc_in] T_sc_out[sc_out] end end RC -->|PCIMem direct| EP_PCIMemSlave EP_PCIMem -->|PCIMem direct| RC EP_BusM -->|TLM inbound| T_tgt T_init -->|TLM outbound| EP_AXI_S EP_RST --> T_sc_in EP_CLK --> T_sc_in EP_SB --> T_sc_in T_sc_out --> EP_SB T_noc_i --> NOC[NOC] T_smn_i --> SMN[SMN] NOC --> T_noc_t SMN --> T_smn_t style T_tgt fill:#c8e6c9 style T_init fill:#c8e6c9 style T_sc_in fill:#f8bbd0 style T_sc_out fill:#f8bbd0 ``` **Integration checklist:** 1. **Inband:** Bind EP **BusMaster** (inbound TLPs to device) to the tile’s **pcie_controller_target**. Bind the tile’s **pcie_controller_initiator** (outbound TLPs to host) to EP **AXI_Slave**. Keep EP **PCIMem** / **PCIMem_Slave** connected directly to the RC. 2. **Sideband:** Connect all EP and system reset/clock/sideband outputs to the tile’s **sc_in**; connect all tile **sc_out** to the EP (and system) inputs as in the table above. 3. **NOC/SMN:** Connect tile **noc_n_target** / **noc_n_initiator** and **smn_n_target** / **smn_n_initiator** to the chiplet’s SharedMemoryMap, which decodes to Target_Memory and tile register windows. 4. **RC–EP link:** Use a **direct** link: bind **RC PCIMem** to **EP PCIMem_Slave** and **RC PCIMem_Slave** to **EP PCIMem**. This ensures the VDK integration of the PCIe Tile and Synopsys PCIe Controller (RC and EP) matches the Keraunos system architecture and Section 3.4, and can be integrated with minimal rework. 
#### 3.5.6 DesignWare PCIe EP and RC Interfaces: Connect to Tile / System vs Stub The Ascalon chiplet vdksys **Peripherals** section instantiates the Synopsys DesignWare **PCIe_2_0** model for both **PCIe_RC** (Primary_Chiplet) and **PCIe_EP** (Secondary_Chiplet_1/2/3). The model exposes a large set of TLM, RESET, CLOCK, and Default (sideband) interfaces. This subsection first gives direct **signal/interface correspondence tables** (PCIe Tile ↔ EP, and Host/DRAM ↔ RC), then lists disposition (connect vs stub) for each interface group. **Table 1 — PCIe Tile ↔ DesignWare PCIe EP: signal and interface correspondence** Each row shows which PCIe Tile port connects to which DesignWare PCIe EP port. Connect the **Tile** column to the **DesignWare EP** column as indicated. | PCIe Tile (signal / interface) | DesignWare PCIe EP (signal / interface) | |--------------------------------|----------------------------------------| | **TLM** | | | `pcie_controller_target` (TLM target) | `BusMaster` (EP TLM master) — EP delivers inbound TLPs from host to tile | | `pcie_controller_initiator` (TLM initiator) | `AXI_Slave` (EP TLM slave) — tile sends outbound TLPs to EP | | **Clocks (tile sc_in)** | | | `pcie_core_clk` | `cc_core_clk` | | `axi_clk` | `cc_dbi_aclk` (or system SYSCLK) | | **Resets (tile sc_in)** | | | `pcie_controller_reset_n` | `pcie_axi_ares` (invert for active-low), or combined from `cc_dbi_ares`, `cc_core_ares`, `cc_pwr_ares`, `cc_phy_ares` | | `cold_reset_n` | From system (e.g. 
CustomResetController), not EP | | `warm_reset_n` | From system, not EP | | **CII — EP to tile (tile sc_in)** | | | `pcie_cii_hv` | `lbc_cii_hv` | | `pcie_cii_hdr_type` | `lbc_cii_hdr_type` | | `pcie_cii_hdr_addr` | `lbc_cii_hdr_addr` | | **FLR / hot reset — EP to tile (tile sc_in)** | | | `pcie_flr_request` | `cfg_flr_pf_active_x` (EP drives FLR request to tile) | | `pcie_hot_reset` | `link_req_rst_not` or `training_rst_n` / `smlh_req_rst_not` (link/controller hot reset) | | **Error / DMA / misc — EP to tile (tile sc_in)** | | | `pcie_ras_error` | `pcie_parc_int` or `app_err_*` / `cfg_aer_*` (RAS/error from EP) | | `pcie_dma_completion` | `dma_wdxfer_done_togg[]` / `dma_rdxfer_done_togg[]` or `edma_int` (DMA completion) | | `pcie_misc_int` | `edma_int_rd_chan[]` / `edma_int_wr_chan[]` or other controller misc interrupt | | **System to tile (tile sc_in)** | | | `isolate_req` | From system (isolation), not EP | | **Tile to EP (tile sc_out)** | | | `pcie_app_bus_num` | `app_bus_num` | | `pcie_app_dev_num` | `app_dev_num` | | `pcie_device_type` | `device_type` | | `pcie_sys_int` | `sys_int` | | `function_level_reset` | `app_flr_pf_done_x` (tile signals FLR done to EP) | | `hot_reset_requested` | To EP hot-reset input (e.g. app_init_rst or link side as applicable) | | `config_update` | To EP config-update input if present; otherwise stub | | `ras_error` | To EP RAS/error input (e.g. app_err_* side) | | `dma_completion` | To EP DMA completion input (e.g. dma_*xfer_go_togg or equivalent) | | `controller_misc_int` | To EP misc interrupt input | | `noc_timeout` | To EP or system (NOC timeout status) | *Note:* EP **AXI_Slave** is connected to the tile’s **pcie_controller_initiator** (outbound path). EP **AXI_DBI**, **PCIMem**, **ELBIMaster** are not connected to the tile: AXI_DBI can go to SharedMemoryMap; PCIMem/PCIMem_Slave connect directly to the RC for the PCIe link. 
Tile ports **noc_n_target**, **noc_n_initiator**, **smn_n_target**, **smn_n_initiator** connect to the chiplet SharedMemoryMap, not to the EP. --- **Table 2 — Host / DRAM ↔ DesignWare PCIe RC: signal and interface connection** Each row shows which Host- or system-side element connects to which DesignWare PCIe RC port. The RC has no PCIe Tile behind it; it connects to host resources and to the PCIe link (directly to EP). | Host / DRAM (or system element) | DesignWare PCIe RC (signal / interface) | |---------------------------------|----------------------------------------| | **Host config / MMIO (config space, DBI)** | | | SharedMemoryMap (config space region) | `AXI_Slave` | | SharedMemoryMap (DBI region) | `AXI_DBI` | | **Host memory (RC as master — downstream TLPs)** | | | SharedMemoryMap (memory region for host-initiated TLPs) | `BusMaster` | | **PCIe link (TLPs to/from EP)** | | | EP `PCIMem_Slave` (direct link) | `PCIMem` — RC sends downstream TLPs | | EP `PCIMem` (direct link) | `PCIMem_Slave` — RC receives upstream TLPs from EP | | **Clocks** | | | SYSCLK (e.g. Primary_Chiplet SYSCLK) | `cc_core_clk` | | SYSCLK | `cc_dbi_aclk` | | **Resets** | | | Peripherals Reset / RST_GEN | `pcie_axi_ares`, `cc_dbi_ares`, `cc_core_ares`, `cc_pwr_ares`, `cc_phy_ares` | | **Interrupts (host side)** | | | TT_APLIC_TLM2 (e.g. irqS[3] in vdksys) | `msi_ctrl_int` — RC MSI to host interrupt controller | | **Not connected (stub)** | | | — | `ELBIMaster` — stub | | — | `cc_pipe_clk`, `cc_aux_clk`, `refclk`, `cc_aclkSlv`, `cc_aclkMstr` — stub | | — | All optional sideband (sys_int, device_type, link_up, app_ltssm_en, power/L1/L2, etc.) — stub | *Note:* In the vdksys, Host/DRAM is represented by **SharedMemoryMap** and **SYSCLK** on Host_Chiplet. Host CPU traffic is modeled via the RC’s **AXI_Slave** (config), **AXI_DBI** (DBI), and **BusMaster** (memory TLPs) bound to SharedMemoryMap. The **PCIMem** / **PCIMem_Slave** connect the RC directly to the EP for the PCIe link. --- **A. 
PCIe Endpoint (EP) — interfaces and disposition** | Category | DesignWare EP interface (vdksys) | Connect to PCIe Tile | Connect to system / other | Stub | Notes | |----------|----------------------------------|----------------------|---------------------------|------|--------| | **TLM** | AXI_Slave | **pcie_controller_initiator** | — | — | Outbound TLPs: tile → EP (tile initiates to EP AXI_Slave). | | **TLM** | AXI_DBI | — | SharedMemoryMap (DBI region) | — | DBI/config. | | **TLM** | BusMaster | **pcie_controller_target** | — | — | Inbound TLPs: EP delivers to tile (EP BusMaster → tile target). | | **TLM** | ELBIMaster | — | — | Yes (vdksys: auto stub) | Optional ELBI; not used for tile. | | **TLM** | PCIMem | — | Link (direct to RC) | — | EP as master toward link. | | **TLM** | PCIMem_Slave | — | Link (direct to RC) | — | EP receives from link; not connected to tile. | | **RESET** | pcie_axi_ares, cc_dbi_ares, cc_core_ares, cc_pwr_ares, cc_phy_ares | **pcie_controller_reset_n** (or combine) | Peripherals Reset / RST_GEN | — | Drive tile reset from same source as EP. | | **CLOCK** | cc_core_clk | **pcie_core_clk** (tile sc_in) | — | — | EP core clock to tile. | | **CLOCK** | cc_dbi_aclk | — | SYSCLK (bound in vdksys) | — | DBI clock; also usable as axi_clk for tile. | | **CLOCK** | cc_pipe_clk, cc_aclkSlv, cc_aclkMstr, cc_aux_clk, refclk | — | — | Yes (vdksys: cc_pipe_clk stubbed) | Internal/PHY clocks; stub if not driving tile. | | **Sideband (CII)** | lbc_cii_hv, lbc_cii_dv, lbc_cii_hdr_type, lbc_cii_hdr_addr, lbc_cii_hdr_* | **pcie_cii_hv**, **pcie_cii_hdr_type**, **pcie_cii_hdr_addr** (tile sc_in) | — | Rest of CII if tile does not use | CII = Configuration Interface Info; map key signals to tile. | | **Sideband (FLR)** | cfg_flr_pf_active_x, app_flr_pf_done_x | **pcie_flr_request** (in), **function_level_reset** (out) | — | — | FLR handshake between EP and tile. 
| | **Sideband (hot reset, etc.)** | link_req_rst_not, training_rst_n, smlh_req_rst_not | **pcie_hot_reset** (tile sc_in), **hot_reset_requested** (tile sc_out) | — | — | As needed for tile behavior. | | **Sideband (bus/dev)** | app_bus_num, app_dev_num | **pcie_app_bus_num**, **pcie_app_dev_num** (tile sc_out) | — | — | Tile drives EP with assigned BDF. | | **Sideband (device type)** | device_type (slave on EP) | **pcie_device_type** (tile sc_out) | — | Yes if not using | vdksys stubs; connect from tile when integrated. | | **Sideband (interrupt)** | sys_int (slave on EP) | **pcie_sys_int** (tile sc_out) | — | Yes in vdksys | Connect tile pcie_sys_int to EP sys_int when integrated. | | **Sideband (DMA)** | dma_wdxfer_done_togg[], dma_rdxfer_done_togg[], edma_int_rd_chan[], edma_int_wr_chan[], edma_int | **pcie_dma_completion**, **controller_misc_int** (tile sc_in/sc_out) | — | Optional | Map DMA completion / misc int to tile as needed. | | **Sideband (RAS/error)** | pcie_parc_int, app_err_*, cfg_aer_* | **pcie_ras_error** (tile sc_in), **ras_error** (tile sc_out) | — | Optional | Connect if tile implements RAS/error reporting. | | **MSI** | msi_ctrl_int, msi_ctrl_int_vec_*[], msi_gen, ven_msi_*, msix_addr, msix_data, cfg_msix_* | — | APLIC / interrupt controller (e.g. TT_APLIC_TLM2) | Stub msi_gen in vdksys | RC binds msi_ctrl_int to APLIC; EP same for device MSI. | | **Power / L1/L2** | apps_pm_xmt_turnoff, app_req_entr_l1, app_req_exit_l1, pme_en, pme_stat, clk_req, clk_req_in, pm_linkst_*, pm_dstate, radm_pm_* | — | — | Yes (vdksys stubs many) | Power management; stub for minimal tile integration. | | **Other sideband** | ready_entr_l23, app_ltssm_en, link_up, sys_pre_det_state, app_unlock_msg, app_ltr_*, app_init_rst, bridge_flush_not, hp_int, hp_msi, RADM_inta/b/c/d, cfg_pme_*, radm_pm_to_ack, slv_*misc_info, mstr_*misc_info, app_hdr_log, app_tlp_prfx_log, app_err_*, ven_msi_tc, ven_msi_vector, cfg_msi_*, CxlRegAccess, ptm_*, etc. 
| — | — | Yes | Optional or debug; stub to simplify integration. | **B. PCIe Root Complex (RC) — interfaces and disposition** The RC has the same DesignWare PCIe_2_0 interface set. There is **no PCIe Tile** behind the RC (the tile is behind the EP on the Keraunos chiplet). So RC interfaces either connect to the **link** (directly to EP), to the **system** (SharedMemoryMap, SYSCLK, APLIC), or are **stubbed**. | Category | DesignWare RC interface (vdksys) | Connect to link (EP / Switch) | Connect to system | Stub | Notes | |----------|----------------------------------|-------------------------------|-------------------|------|--------| | **TLM** | AXI_Slave, AXI_DBI | — | SharedMemoryMap (config, DBI) | — | Host config space; bound in vdksys. | | **TLM** | BusMaster | — | SharedMemoryMap (memory) | — | Host-initiated TLPs; bound in vdksys. | | **TLM** | ELBIMaster | — | — | Yes (vdksys: auto stub) | Optional. | | **TLM** | PCIMem | EP PCIMem_Slave (direct) | — | — | Downstream TLPs. | | **TLM** | PCIMem_Slave | EP PCIMem (direct) | — | — | Upstream TLPs from EP. | | **RESET** | pcie_axi_ares, cc_*_ares | — | Peripherals Reset / RST_GEN | — | Same as EP. | | **CLOCK** | cc_core_clk, cc_dbi_aclk | — | SYSCLK | — | Bound in vdksys. | | **CLOCK** | cc_pipe_clk, cc_aux_clk, refclk, cc_aclkSlv, cc_aclkMstr | — | — | Yes | Stub if not used. | | **MSI** | msi_ctrl_int | — | TT_APLIC_TLM2 (irqS[3] in vdksys) | — | Host RC MSI to APLIC. | | **Sideband** | sys_int, device_type, ready_entr_l23, app_ltssm_en, link_up, sys_pre_det_state, apps_pm_xmt_turnoff, clk_req_in, app_req_entr_l1, app_req_exit_l1, app_flr_pf_done_x, app_ltr_*, msi_gen, and all other Default/sideband | — | — | Yes (vdksys stubs many) | No tile; stub optional RC sideband. | **C. Summary** - **EP:** Connect to **PCIe Tile**: EP **BusMaster** → tile **pcie_controller_target** (inbound); tile **pcie_controller_initiator** → EP **AXI_Slave** (outbound). 
Resets and cc_core_clk (and optionally cc_dbi_aclk) to tile sc_in; CII, FLR, hot reset, app_bus_num, app_dev_num, device_type, sys_int, dma_completion, ras_error to/from tile sc_in/sc_out as in Section 3.4. Connect to **system**: AXI_DBI to SharedMemoryMap; cc_dbi_aclk to SYSCLK; msi_ctrl_int to APLIC. **Stub**: ELBIMaster; optional/PHY clocks (cc_pipe_clk, etc.); power/L1/L2 and other optional sideband. - **RC:** Connect to **link**: PCIMem, PCIMem_Slave directly to EP. Connect to **system**: AXI_Slave, AXI_DBI, BusMaster to SharedMemoryMap; clocks to SYSCLK; msi_ctrl_int to APLIC. **Stub**: All optional sideband and PHY clocks as in vdksys. --- ## 4. Connectivity Architecture ### 4.1 Inbound Data Path (Host → Chip) **Use Case:** Host CPU writes data to Quasar compute cores or Mimir memory. The host communicates with the PCIe Tile via a **PCIe Controller**. On the host side the controller is a **Root Complex** (RC); on the Keraunos chip side the controller is the **Synopsys PCIe Controller IP** (DesignWare), configured as an **Endpoint** (EP). So the link is Host (RC) ↔ Keraunos (EP); the PCIe Tile sits behind the endpoint and receives TLPs from it over the internal interface. ```{mermaid} sequenceDiagram participant Host as Host CPU participant Ctrl as PCIe Controller participant Tile as PCIe Tile participant TLB as Inbound TLB participant Switch as PCIe-SMN-IO Switch participant NOC as NOC-N participant SMN as SMN participant Quasar as Quasar Host->>Ctrl: Memory Write Ctrl->>Tile: Memory Write TLP Tile->>TLB: Lookup Translation TLB-->>Tile: System Address Tile->>Switch: Forward Switch->>Switch: Route Decision alt NOC-bound Switch->>NOC: Forward NOC->>Quasar: D2D to Quasar Quasar-->>NOC: Response NOC-->>Switch: Response else SMN-bound Switch->>SMN: Forward SMN-->>Switch: Response end Switch-->>Tile: Completion Tile-->>Ctrl: Completion TLP Ctrl-->>Host: Completion ``` **Key Steps:** 1. 
Host initiates a PCIe Memory Write targeting a Keraunos BAR (Base Address Register); the request is sent via the **PCIe Controller** over the PCIe link. 2. The PCIe Controller delivers the Memory Write TLP to the PCIe Tile; the tile receives the transaction on its `pcie_controller_target` socket (bound to the EP BusMaster). 3. The Inbound TLB translates the host address to the system address space. 4. The PCIe-SMN-IO Switch routes based on address: - **0x0000\_0000\_0000 - 0x0000\_FFFF\_FFFF:** NOC-bound (via `noc_n_initiator`) - **0x1000\_0000\_0000 - 0x1FFF\_FFFF\_FFFF:** SMN-bound (via `smn_n_initiator`) 5. The transaction is forwarded to NOC-N or SMN. 6. NOC-N routes via D2D links to the destination Quasar/Mimir chiplet. 7. The response traverses back through the same path; the PCIe Tile sends a Completion TLP to the PCIe Controller, which delivers it to the Host. **Sideband signal flow (inbound use case):** During inbound (Host → Chip), the EP and system drive sideband inputs to the PCIe Tile so the tile can accept and process TLPs; the tile drives sideband outputs back to the EP.
The flow is: ```{mermaid} graph LR subgraph host_rc["Host / RC"] H[Host] end subgraph ep["EP - Synopsys Controller"] EP_IN[Sideband Out] EP_OUT[Sideband In] end subgraph tile["PCIe Tile"] SC_IN[sc_in] SC_OUT[sc_out] end subgraph sys["System SMC"] SYS[Reset / Isolate] end H -->|TLP| EP_IN EP_IN -->|clk, reset_n, CII, FLR, hot_reset, ras, dma_cpl, misc_int| SC_IN SYS -->|cold_reset_n, warm_reset_n, isolate_req, axi_clk| SC_IN SC_OUT -->|FLR out, hot_reset_req, config_update, ras, dma_cpl, controller_misc_int, noc_timeout, bus/dev, device_type, sys_int| EP_OUT EP_OUT -->|Optional to RC| H style SC_IN fill:#fce4ec style SC_OUT fill:#e8f5e9 ``` | Direction | Signals | Role in inbound use case | |-----------|---------|--------------------------| | EP → Tile (sc_in) | `pcie_core_clk`, `pcie_controller_reset_n`, `pcie_cii_*`, `pcie_flr_request`, `pcie_hot_reset`, `pcie_ras_error`, `pcie_dma_completion`, `pcie_misc_int` | Clock and reset so tile is ready; CII for config info; EP may assert FLR/hot_reset/errors during or after inbound TLPs. | | System → Tile (sc_in) | `cold_reset_n`, `warm_reset_n`, `isolate_req`, `axi_clk` | SoC reset and isolation; AXI clock for config/MMIO. | | Tile → EP (sc_out) | `function_level_reset`, `hot_reset_requested`, `config_update`, `ras_error`, `dma_completion`, `controller_misc_int`, `noc_timeout`, `pcie_app_bus_num`, `pcie_app_dev_num`, `pcie_device_type`, `pcie_sys_int` | FLR/hot_reset handshake; config and error reporting; NOC timeout; SII bus/dev and interrupt to EP. | ### 4.2 Outbound Data Path (Chip → Host) **Use Case:** Quasar compute cores send results back to host DRAM or trigger MSI interrupts. 
```{mermaid} sequenceDiagram participant Quasar as Quasar participant NOC as NOC-N participant PCIe as PCIe Tile participant TLB as Outbound TLB participant Ctrl as PCIe Controller participant Host as Host Quasar->>NOC: Write NOC->>PCIe: Forward PCIe->>TLB: TLB Lookup TLB-->>PCIe: Host Address PCIe->>Ctrl: Forward Ctrl->>Host: Memory Write TLP Host-->>Ctrl: Completion Ctrl-->>PCIe: Response PCIe-->>NOC: Response NOC-->>Quasar: Completion ``` **Key Steps:** 1. Quasar initiates a write targeting the PCIe address range (typically host DRAM). 2. NOC-N routes to the Keraunos PCIe Tile via the tile's `noc_n_target` socket. 3. The Outbound TLB translates the system address back to a host physical address. 4. The PCIe Tile forwards the request to the PCIe Controller (EP) through its `pcie_controller_initiator` socket; the controller generates the PCIe Memory Write TLP. 5. The transaction is sent over the PCIe link to the host. 6. Host DRAM responds with a completion. 7. The response propagates back through PCIe Tile → NOC → Quasar. **Sideband signal flow (outbound use case):** During outbound (Chip → Host), the same sideband links carry status and handshake: the EP may drive reset/FLR; the tile uses sideband outputs to signal completion and errors to the EP so the EP can complete TLPs toward the host.
```{mermaid} graph LR subgraph quasar_noc["Quasar / NOC"] Q[Quasar] end subgraph tile["PCIe Tile"] SC_IN[sc_in] SC_OUT[sc_out] end subgraph ep["EP - Synopsys Controller"] EP_IN[Sideband In] EP_OUT[Sideband Out] end subgraph host["Host"] H[Host DRAM] end Q -->|TLP| tile EP_OUT -->|clk, reset_n, CII, FLR, hot_reset, ras, dma_cpl, misc_int| SC_IN SC_OUT -->|dma_completion, controller_misc_int, config_update, ras_error, noc_timeout, FLR out, hot_reset_req| EP_IN EP_IN -->|TLP to host| H H -->|Completion| EP_IN EP_IN -.->|Completion sideband if any| EP_OUT style SC_IN fill:#fce4ec style SC_OUT fill:#e8f5e9 ``` | Direction | Signals | Role in outbound use case | |-----------|---------|---------------------------| | EP → Tile (sc_in) | Same as inbound | Clock and reset; CII; EP can assert FLR/hot_reset or error sideband during outbound. | | Tile → EP (sc_out) | `dma_completion`, `controller_misc_int`, `config_update`, `ras_error`, `noc_timeout`, `function_level_reset`, `hot_reset_requested` | Tell EP when tile has completed work (e.g. outbound DMA) or hit errors; FLR/hot_reset handshake; NOC timeout so EP can report or retry. | ### 4.3 Configuration Path (SMN → PCIe Tile Registers) **Use Case:** SMC programs PCIe Tile TLBs, enables MSI relay, or reads error status. ```{mermaid} sequenceDiagram participant SMC as SMC participant SMN as SMN participant PCIe as PCIe Config SMC->>SMN: SMN Write SMN->>PCIe: Forward PCIe->>PCIe: Update TLB or MSI PCIe-->>SMN: Response SMN-->>SMC: Complete ``` **Addressable Registers (via SMN):** - **0x1804\_0000 - 0x1804\_07FF:** Inbound TLB configurations (8 entries) - **0x1804\_0800 - 0x1804\_0FFF:** Outbound TLB configurations (8 entries) - **0x1800\_0000 - 0x1800\_0FFF:** MSI Relay registers - **0x1802\_0000 - 0x1802\_0FFF:** PCIe error status and control ### 4.4 MSI Interrupt Path (Chip → Host) **Use Case:** Ethernet controller or Quasar triggers interrupt to host driver. 
```{mermaid} sequenceDiagram participant HSIO as HSIO Tile participant SMN as SMN participant MSI as MSI Relay participant PCIe as PCIe Controller participant Host as Host CPU HSIO->>SMN: Trigger Interrupt SMN->>MSI: Forward MSI->>MSI: Translate to MSI-X MSI->>PCIe: MSI-X TLP PCIe->>Host: MSI-X Write Host->>Host: ISR ``` --- ## 5. Data Flow Paths ### 5.1 End-to-End Data Flow Example: Host DMA to Quasar **Scenario:** Host writes 4KB of neural network weights to Quasar L1 memory. ```{mermaid} graph LR subgraph host["Host System"] A[Host DMA] end subgraph pcie_tile["PCIe Tile"] B[PCIe Inbound] C[Inbound TLB] D[Switch] E[NOC Initiator] end subgraph noc["NOC"] F[QNP Mesh] G[D2D Tile] end subgraph quasar["Quasar"] H[NOC Router] I[L1 Memory] end A -->|PCIe Write| B B --> C C -->|Translate| D D --> E E --> F F -->|QNP| G G -->|BoW| H H --> I style B fill:#ffcccc style C fill:#ffe6cc style D fill:#d9f7d9 style E fill:#cce5ff style F fill:#cce5ff style G fill:#e6ccff style I fill:#ffe6cc ``` **Address Translation:** - **Host Address:** `0x8000_0000` (PCIe BAR + offset) - **Inbound TLB Lookup:** Maps application region 0 → NOC address - **System Address:** `0x0000_0000_4000_0000` (Quasar chiplet, NOC coordinates, L1 offset) - **Physical Routing:** QNP mesh routes to D2D tile 2 → Quasar chiplet ID 1 → Tensix core (4,5) ### 5.2 Multi-Hop Data Flow: Quasar → PCIe → Host → PCIe → Quasar **Scenario:** Quasar chiplet 0 sends data to Quasar chiplet 1 in a different Grendel package via host DRAM (zero-copy). ```{mermaid} graph TB subgraph pkg0["Package 0"] Q0[Quasar 0] K0[Keraunos PCIe 0] Q0 -->|1. NOC Write| K0 end H[Host DRAM] K0 -->|2. PCIe Write| H subgraph pkg1["Package 1"] K1[Keraunos PCIe 1] Q1[Quasar 1] K1 -->|4. NOC Write| Q1 end H -->|3. PCIe Read| K1 style Q0 fill:#ffe6cc style K0 fill:#ffcccc style H fill:#fff4e1 style K1 fill:#ffcccc style Q1 fill:#ffe6cc ``` --- ## 6. 
Address Space Integration ### 6.1 System Address Map The Keraunos-E100 local address map is a subset of the broader Grendel system address map: | Address Range | Target | Description | |---------------|--------|-------------| | `0x0000_0000_0000 - 0x0000_FFFF_FFFF` | NOC-N | Quasar/Mimir chiplets via D2D | | `0x1000_0000_0000 - 0x1000_0FFF_FFFF` | SMN (SEP) | Security Engine Processor | | `0x1001_0000_0000 - 0x1001_0FFF_FFFF` | SMN (SMC) | System Management Controller | | `0x1800_0000_0000 - 0x1800_0FFF_FFFF` | SMN (MSI) | MSI Relay in PCIe Tile | | `0x1802_0000_0000 - 0x1802_0FFF_FFFF` | SMN (PCIe Err) | PCIe Tile error registers | | `0x1804_0000_0000 - 0x1804_0FFF_FFFF` | SMN (TLB) | PCIe Tile TLB configurations | | `0x2000_0000_0000 - 0x2000_00FF_FFFF` | HSIO | HSIO tile 0 (CCE, Ethernet, SRAM) | | `0x2001_0000_0000 - 0x2001_00FF_FFFF` | HSIO | HSIO tile 1 (CCE, Ethernet, SRAM) | ### 6.2 PCIe BAR (Base Address Register) Mapping The PCIe Tile exposes multiple BARs to the host: | BAR | Size | Type | Purpose | |-----|------|------|---------| | BAR0 | 256MB | Memory, 64-bit | Main data path (DMA to/from Quasar) | | BAR2 | 16MB | Memory, 64-bit | Configuration space (SMC mailboxes, TLB programming) | | BAR4 | 64KB | Memory, 64-bit | MSI-X table | **BAR0 Inbound TLB Mapping Example:** - Host writes to `BAR0 + 0x1000_0000` (256MB offset) - Inbound TLB Entry 1 (Application region 1): - **Input Range:** `0x1000_0000 - 0x1FFF_FFFF` (256MB) - **Output Base:** `0x0000_0000_4000_0000` (NOC address for Quasar chiplet 1) - Translated Address: `0x0000_0000_4000_0000` (sent to NOC-N) ### 6.3 Address Translation Stages ```{mermaid} graph LR A[Host Addr] B[PCIe TLP] C[Inbound TLB] D[System Addr] E[NOC Routing] F[D2D Translate] G[Quasar Addr] A --> B B --> C C --> D D --> E E --> F F --> G style C fill:#ffe6cc style D fill:#cce5ff style E fill:#e6ccff ``` --- ## 7. 
System Use Cases ### 7.1 Use Case 1: Model Initialization **Objective:** Load a 10GB large language model from host to distributed Quasar memory. **Flow:** 1. Host driver programs PCIe Tile Inbound TLBs (8 entries for 8 memory regions) 2. Host DMA engine streams model weights via PCIe Memory Writes 3. PCIe Tile translates addresses and routes to NOC-N 4. NOC-N distributes data across multiple Quasar chiplets via D2D links 5. Quasar chiplets store weights in local L1/DRAM **Performance:** - PCIe Gen5 x16: ~64 GB/s theoretical, ~50 GB/s effective - Load time: 10GB / 50 GB/s = **200ms** ### 7.2 Use Case 2: Inference Execution **Objective:** Run inference on Quasar chiplets, stream results back to host. **Flow:** 1. Host sends inference request descriptor via PCIe write (small payload: 256 bytes) 2. Quasar chiplets execute inference using cached model weights 3. Quasar writes results to host DRAM via outbound TLB (PCIe Memory Write) 4. Quasar triggers MSI-X interrupt via SMN → MSI Relay → PCIe 5. Host driver processes results **Latency:** - Request descriptor: ~1μs (PCIe TLP overhead) - Inference execution: Variable (model-dependent) - Result transfer (1MB): 1MB / 50 GB/s = **20μs** - MSI interrupt latency: ~2μs ### 7.3 Use Case 3: Package-to-Package Communication **Objective:** Enable Quasar chiplets in Package 0 to communicate with Package 1 over Ethernet. **Flow (Keraunos Ethernet-based):** 1. Quasar in Package 0 writes data to HSIO SRAM via NOC-N 2. CCE in HSIO tile prepares Ethernet packet 3. TT Ethernet Controller sends packet via 800G Ethernet to Package 1 4. Package 1 Ethernet Controller receives packet, writes to local HSIO SRAM 5. Local NOC-N forwards data to destination Quasar **Alternative Flow (PCIe-based, for same-host deployments):** 1. Quasar in Package 0 writes to host DRAM via PCIe Tile (outbound) 2. Package 1 PCIe Tile reads from host DRAM (inbound) 3. 
Forwarded to Package 1 Quasar via NOC-N ### 7.4 Use Case 4: System Management **Objective:** SMC monitors PCIe link status and reconfigures TLBs dynamically. **Flow:** 1. SMC reads PCIe link status registers via SMN (0x1802_0xxx) 2. Detects link degradation (Gen5 x16 → Gen5 x8) 3. SMC reprograms TLB entries to reduce traffic load 4. SMC triggers software notification via MSI-X 5. Host driver adjusts DMA batch sizes --- ## 8. Final VDK Platform: Linux-Booting PCIe Tile Integration ### 8.1 Overview The final validated VDK platform demonstrates a complete end-to-end PCIe data path with Linux running on the host. The platform: - **Boots RISC-V Linux** on the Host_Chiplet via OpenSBI (`fw_payload.elf`) - **Enumerates the PCIe Endpoint** using the Linux `snps,dw-pcie` driver - **Transfers data** from the host through the PCIe complex to memory attached to the PCIe Tile's `noc_n_initiator` port - Runs a **userspace application** (`pcie_xfer`) for interactive read/write operations through the PCIe BAR This section documents the final validated architecture as implemented in the reference workspace. ### 8.2 Dual-Chiplet VDK Topology The platform consists of two chiplet groups connected via a direct PCIe link: ```{mermaid} graph TB subgraph host["Host_Chiplet (Root Complex Side)"] HOST_CPU[TT_Rocket_LT RISC-V CPU
Runs Linux via OpenSBI] HOST_DRAM[DRAM @ 0x80000000
256 MB] HOST_UART[UART @ 0xC000A000] HOST_PLIC[PLIC @ 0xC4000000] HOST_CLINT[CLINT @ 0x02000000] HOST_RC[PCIE_RC
Synopsys DWC PCIe 2.0
Root Complex] HOST_SMM[SharedMemoryMap] HOST_RST[RST_GEN / CLK_GEN] HOST_CPU --> HOST_SMM HOST_SMM --> HOST_DRAM HOST_SMM --> HOST_UART HOST_SMM --> HOST_PLIC HOST_SMM --> HOST_CLINT HOST_SMM -->|DBI: 0x44000000| HOST_RC HOST_SMM -->|AXI: 0x70000000| HOST_RC HOST_RC --- HOST_RST end subgraph device["Keraunos_PCIE_Chiplet (Endpoint Side)"] DEV_CPU[TT_Rocket_LT RISC-V CPU
Runs pcie_bringup firmware] DEV_EP[PCIe_EP
Synopsys DWC PCIe 2.0
Endpoint] DEV_TILE[PCIE_TILE
Keraunos PCIe Tile] DEV_MEM[Target_Memory
16 MB @ 0x0] DEV_SMM[SharedMemoryMap] DEV_RST[RST_GEN / CLK_GEN] DEV_CPU --> DEV_SMM DEV_EP -->|BusMaster| DEV_TILE DEV_TILE -->|noc_n_initiator| DEV_SMM DEV_TILE -->|smn_n_initiator| DEV_SMM DEV_SMM --> DEV_MEM DEV_EP --- DEV_RST DEV_TILE --- DEV_RST end HOST_RC <-->|PCIMem / PCIMem_Slave
Direct PCIe Link| DEV_EP style HOST_CPU fill:#e3f2fd style HOST_RC fill:#ffcccc,stroke:#c00 style DEV_EP fill:#fff3e0 style DEV_TILE fill:#c8e6c9 style DEV_MEM fill:#ffe6cc style HOST_DRAM fill:#e1ffe1 ``` **Key architectural decisions in the final platform:** 1. **Direct RC–EP link** — RC's `PCIMem` binds to EP's `PCIMem_Slave` and vice versa 2. **Two independent RISC-V CPUs** — Host runs Linux; Device runs bare-metal firmware 3. **Target_Memory on noc_n_initiator path** — 16 MB memory at address 0x0 on the chiplet bus, reachable from the host through `EP → PCIE_TILE → noc_n_initiator → SharedMemoryMap → Target_Memory` 4. **MSI interrupt** — RC's `msi_ctrl_int` connected to Host SMC's `irqS[11]` for PCIe MSI-to-host notification ### 8.3 Host Memory Map The Host_Chiplet CPU sees the following address space: | Address | Size | Component | Purpose | |---------|------|-----------|---------| | `0x02000000` | 64 KB | CLINT | Timer and software interrupts | | `0x44000000` | 4 MB | PCIE_RC DBI | PCIe RC configuration (DBI registers) | | `0x44300000` | 128 KB | PCIE_RC ATU | iATU outbound/inbound windows (via DBI CS2) | | `0x70000000` | 256 MB | PCIE_RC AXI_Slave | PCIe config + memory window | | `0x70000000` | 16 MB | — Config sub-window | Type 0/1 config TLPs via iATU | | `0x71000000` | 240 MB | — MEM sub-window | Memory TLPs to EP BARs | | `0x80000000` | 256 MB | DRAM | Host main memory (Linux runs here) | | `0xC000A000` | 256 B | UART | DW APB UART (115.2 MHz clock) | | `0xC4000000` | 2 MB | PLIC | Platform-Level Interrupt Controller | ### 8.4 Device (Keraunos_PCIE_Chiplet) Memory Map The Keraunos_PCIE_Chiplet SharedMemoryMap provides the following decode for all initiators (EP BusMaster, PCIE_TILE noc_n/smn_n, SMC_Configure CPU): | Address | Size | Component | Purpose | |---------|------|-----------|---------| | `0x00000000` | 16 MB | Target_Memory | Main data memory (host-accessible via PCIe BAR) | | `0x18000000` | 8 MB | PCIE_TILE smn_n_target | SMN-side target 
window into the tile | | `0x44000000` | 4 MB | PCIe_EP AXI_DBI | EP DBI configuration registers | | `0x44400000` | 16 MB | PCIE_TILE noc_n_target | NoC-side target window into the tile | ### 8.5 End-to-End Data Path The critical data path for host-to-device memory transfers traverses: ```{mermaid} graph LR subgraph host["Host (Linux)"] APP[pcie_xfer app] CPU[RISC-V CPU] APP --> CPU end subgraph rc["Root Complex"] AXI_S[AXI_Slave
0x70000000] iATU[iATU
Address Translation] PCIMEM[PCIMem] AXI_S --> iATU iATU --> PCIMEM end subgraph link["PCIe Link"] TLP[Memory TLP] end subgraph ep["Endpoint"] PCIMEM_S[PCIMem_Slave] BUSM[BusMaster] PCIMEM_S --> BUSM end subgraph tile["PCIE_TILE"] PCT[pcie_controller_target] NOC_I[noc_n_initiator] PCT --> NOC_I end subgraph mem["Target Memory"] MEM[16 MB @ 0x0] end CPU -->|MMIO Write| AXI_S PCIMEM -->|TLP| TLP TLP --> PCIMEM_S BUSM --> PCT NOC_I --> MEM style APP fill:#e3f2fd style iATU fill:#ffe6cc style TLP fill:#e3f2fd style PCT fill:#c8e6c9 style NOC_I fill:#c8e6c9 style MEM fill:#ffe6cc ``` **Step-by-step flow:** 1. **Host application** (`pcie_xfer`) performs MMIO write to BAR0 (mapped via sysfs `resource0` or `/dev/mem`) 2. **CPU** issues a store to the PCIe MEM window (e.g. `0x71000000 + offset`) 3. **RC AXI_Slave** receives the transaction at `0x70000000` 4. **iATU** translates the address to a Memory Write TLP targeting the EP 5. **RC PCIMem** sends the TLP over the virtual PCIe link 6. **EP PCIMem_Slave** receives the TLP 7. **EP BusMaster** forwards the decoded transaction to `PCIE_TILE.pcie_controller_target` 8. **PCIE_TILE** routes the transaction through internal fabric to `noc_n_initiator` 9. **noc_n_initiator** accesses the chiplet SharedMemoryMap, which decodes to `Target_Memory` at address `0x0` 10. 
**Target_Memory** completes the write; the response propagates back through the same path ### 8.6 Linux Boot Flow ```{mermaid} sequenceDiagram participant VP as Virtualizer participant HOST as Host_Chiplet CPU participant SBI as OpenSBI (M-mode) participant LNX as Linux Kernel participant DRV as dw-pcie Driver participant EP as PCIe Endpoint VP->>HOST: Load fw_payload.elf (entry 0x80000000) VP->>EP: Load pcie_bringup.elf (device CPU) HOST->>SBI: CPU starts at 0x80000000 SBI->>SBI: Initialize M-mode, set up S-mode SBI->>LNX: Jump to kernel (S-mode) LNX->>LNX: Parse device tree (keraunos_host.dtb) LNX->>LNX: Initialize CLINT, PLIC, UART LNX->>DRV: Probe snps,dw-pcie (DBI @ 0x44000000) DRV->>DRV: Program iATU outbound windows DRV->>EP: Type 0 Config Read (bus 1, dev 0) EP-->>DRV: Return Vendor/Device ID DRV->>DRV: Enumerate EP, assign BARs DRV->>LNX: PCIe subsystem ready LNX->>LNX: Boot initramfs (rdinit=/init) LNX->>LNX: Shell prompt available ``` **Boot details:** - **OpenSBI** (`fw_payload.elf`) is loaded with ELF segment addresses (entry at `0x80000000`). The VP must **not** override the load address to `0x0`. - **No U-Boot** — OpenSBI directly boots the embedded Linux kernel. - **Device tree** (`keraunos_host.dts`) specifies the `snps,dw-pcie` compatible node with DBI at `0x44000000` and config window at `0x70000000`. - **Boot arguments**: `console=hvc0 earlycon=sbi rdinit=/init pci=realloc pci=assign-busses pci=noaer pcie_aspm=off` - The `pci=realloc` and `pci=assign-busses` flags are critical for the VP where BIOS/firmware has not pre-assigned PCI resources. - **Host CPU ISA**: `rv64imac` (no FPU) — kernel built with `CONFIG_FPU=n`, userspace uses musl `rv64imac` toolchain. 
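The end-to-end flow of Section 8.5 reduces to base-plus-offset arithmetic at each translation stage. The sketch below illustrates that arithmetic under the memory maps of Sections 8.3 and 8.4; the helper names are hypothetical (the real translation is performed in hardware by the iATU and the EP BAR logic), and the MEM sub-window is treated as identity-mapped, as in this platform:

```c
#include <stdint.h>

/* Illustrative sketch of the address math on the host-to-tile path.
 * Bases and sizes follow the host/device memory maps in Sections
 * 8.3-8.4; function names are hypothetical. */

#define MEM_WINDOW_BASE  0x71000000ULL  /* RC AXI_Slave MEM sub-window   */
#define MEM_WINDOW_SIZE  0x0F000000ULL  /* 240 MB                        */
#define EP_BAR0_BASE     0x71000000ULL  /* BAR0 as assigned by Linux     */
#define TARGET_MEM_BASE  0x00000000ULL  /* Target_Memory on chiplet bus  */

/* iATU outbound translation: a CPU address inside a region becomes
 * target_base + offset in the outgoing TLP. The MEM window here is
 * identity-mapped (0x71000000 -> 0x71000000). */
static uint64_t iatu_outbound(uint64_t cpu_addr, uint64_t region_base,
                              uint64_t target_base)
{
    return target_base + (cpu_addr - region_base);
}

/* EP inbound BAR match: a TLP hitting BAR0 is redirected to the
 * tile-side base (Target_Memory at 0x0) plus the BAR offset. */
static uint64_t ep_bar0_inbound(uint64_t tlp_addr)
{
    return TARGET_MEM_BASE + (tlp_addr - EP_BAR0_BASE);
}

/* Full chain: host store -> iATU -> TLP -> EP BAR0 -> chiplet address. */
static uint64_t host_to_chiplet(uint64_t cpu_addr)
{
    if (cpu_addr < MEM_WINDOW_BASE ||
        cpu_addr >= MEM_WINDOW_BASE + MEM_WINDOW_SIZE)
        return UINT64_MAX;  /* outside the MEM sub-window */
    uint64_t tlp = iatu_outbound(cpu_addr, MEM_WINDOW_BASE, MEM_WINDOW_BASE);
    return ep_bar0_inbound(tlp);
}
```

For example, a host store to `0x71000100` reaches Target_Memory offset `0x100`, which is exactly what the `pcie_xfer` readback checks rely on.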
### 8.7 PCIe Enumeration The Linux `dw-pcie` driver enumerates the endpoint: | Property | Value | |----------|-------| | **Bus topology** | Bus 0: RC, Bus 1: EP (direct link) | | **Config access** | iATU-programmed window at `0x70000000` (16 MB), no ECAM | | **MEM window** | `0x71000000–0x7FFFFFFF` (240 MB prefetchable) | | **EP BAR0** | 16 MB memory BAR (`BAR0_MASK = 0xFFFFFF`) | | **MSI** | RC `msi_ctrl_int` → Host PLIC IRQ 32; INTx → PLIC IRQ 33 | | **Lanes** | 4 (VP config; physical Keraunos uses x16) | The device tree PCIe node: ``` pcie@44000000 { compatible = "snps,dw-pcie"; reg = <0x0 0x44000000 0x0 0x400000>, /* DBI */ <0x0 0x70000000 0x0 0x01000000>; /* config */ bus-range = <0x0 0x1>; ranges = <0x02000000 0x0 0x71000000 0x0 0x71000000 0x0 0x0F000000>; /* 240 MB MEM */ interrupts = <32>, <33>; /* MSI, INTx */ }; ``` ### 8.8 The pcie_xfer Application `pcie_xfer` is a Linux userspace utility that demonstrates the complete host-to-tile data path. It maps EP BAR0 via sysfs and performs MMIO read/write operations. 
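A minimal sketch of the sysfs `resource0` + `mmap` technique such a tool uses is shown below. This is a hypothetical illustration, not the actual `pcie_xfer` source: the helper names and the pattern-verify logic are assumptions modeled on the command set, and the BAR size is taken from the 16 MB BAR0 of this platform.

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical sketch of a pcie_xfer-style BAR0 mapping and 32-bit
 * MMIO access. Assumes a 16 MB BAR0, as in this platform. */

#define BAR0_SIZE (16u * 1024u * 1024u)

/* Map a PCI BAR through its sysfs resource file, e.g.
 * /sys/bus/pci/devices/0000:01:00.0/resource0 (path is an example). */
static volatile uint32_t *map_bar0(const char *res_path)
{
    int fd = open(res_path, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;
    void *p = mmap(NULL, BAR0_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);  /* mapping stays valid after close */
    return (p == MAP_FAILED) ? NULL : (volatile uint32_t *)p;
}

static void mmio_write32(volatile uint32_t *bar, uint32_t off, uint32_t v)
{
    bar[off / 4] = v;
}

static uint32_t mmio_read32(volatile uint32_t *bar, uint32_t off)
{
    return bar[off / 4];
}

/* "pattern"-style check: write incrementing dwords, verify readback.
 * Returns 0 on success, -1 on the first mismatch. */
static int pattern_verify(volatile uint32_t *bar, uint32_t off,
                          uint32_t count)
{
    for (uint32_t i = 0; i < count; i++)
        mmio_write32(bar, off + 4 * i, i);
    for (uint32_t i = 0; i < count; i++)
        if (mmio_read32(bar, off + 4 * i) != i)
            return -1;
    return 0;
}
```

Because every write lands in Target_Memory through the path in Section 8.5, a successful `pattern_verify` exercises the entire RC → EP → tile → memory chain.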
**Capabilities:** | Command | Description | |---------|-------------| | `write <offset> <value>` | 32-bit MMIO write at BAR0 + offset | | `read <offset>` | 32-bit MMIO read at BAR0 + offset | | `fill <offset> <value> <count>` | Fill `count` dwords with a value | | `dump <offset> <count>` | Hex dump `count` dwords | | `pattern <offset> <count>` | Write incrementing pattern and verify readback | | `burst <offset> <count>` | Timed burst write for throughput measurement | | `verify <offset> <value> <count>` | Verify `count` dwords match expected value | | `send <file> <offset>` | Write binary file contents to BAR0 | **Data path confirmed by pcie_xfer:** ``` Host CPU → RC AXI_Slave → iATU → PCIe TLP → EP PCIMem_Slave → EP BusMaster → PCIE_TILE.pcie_controller_target → noc_n_initiator → [Target_Memory] ``` **Usage example:** ```bash # Auto-detect EP and enter interactive mode pcie_xfer # Write 0xDEADBEEF at BAR0 offset 0x100 pcie_xfer -c "write 0x100 0xDEADBEEF" # Read back and verify pcie_xfer -c "read 0x100" # Write incrementing pattern (256 dwords) and verify pcie_xfer -c "pattern 0x0 0x100" ``` ### 8.9 Device-Side Firmware (pcie_bringup) The Keraunos_PCIE_Chiplet runs `pcie_bringup.elf` on its TT_Rocket_LT CPU (SMC_Configure). This bare-metal firmware: 1. Initializes the PCIe Endpoint controller via DBI registers at `0x44000000` 2. Programs Inbound TLB entries for BAR address translation 3. Sets BAR sizes (BAR0_MASK = 0xFFFFFF for 16 MB) 4. Asserts `system_ready` to signal the EP is ready for host enumeration 5. Waits for PCIe link-up before the host attempts configuration reads The device CPU and host CPU boot independently; the VP configuration ensures both start with `active_at_start = true` on their respective `RST_GEN`.
### 8.10 VP Configuration Two VP configurations are available: | Config | Host Image | Device Image | Quantum | Use Case | |--------|------------|--------------|---------|----------| | `default/default` | `riscv64-linux/output/fw_payload.elf` | `pcie_bringup/pcie_bringup.elf` | 6000 ps | Full Linux with standard kernel | | `mini_riscv64_linux/mini_riscv64_linux` | `mini-riscv64-linux/output/fw_payload.elf` | `pcie_bringup/pcie_bringup.elf` | 1000 ps | Fast-boot mini Linux for rapid iteration | **Key VPCFG overrides (both configs):** - **PCIE_RC / PCIe_EP clocks**: `cc_pipe_clk` at 250 MHz, all AXI/DBI/aux clocks at 100 MHz - **SHARED_DBI_ENABLED**: `false` (separate DBI window) - **UART_CLK**: 115.2 MHz - **Chiplet SharedMemoryMap decode**: `0x44000000:0x00400000:s;0x18000000:0x00800000;0x44400000:0x01000000;0x0:0x1000000` ### 8.11 Sideband Connections in the Final Platform The final VDK connects the following sideband signals between the PCIe models and the PCIE_TILE: **EP → PCIE_TILE:** | Signal | Source | Destination | |--------|--------|-------------| | `edma_int` / `edma_int_*` | EP DMA interrupts | `pcie_misc_int` on PCIE_TILE | | `pcie_parc_int` | EP parity/RAS error | `pcie_ras_error` on PCIE_TILE | | `lbc_cii_hv`, `lbc_cii_hdr_type`, `lbc_cii_hdr_addr` | EP CII signals | `pcie_cii_*` on PCIE_TILE | | `cfg_flr_pf_active_x[0]` | EP FLR request | `pcie_flr_request` on PCIE_TILE | **System → PCIE_TILE:** | Signal | Source | Destination | |--------|--------|-------------| | Clock | Chiplet CLK_GEN | PCIE_TILE clock input | | Reset | Chiplet RST_GEN | PCIE_TILE reset input | **Host MSI path:** `PCIE_RC.msi_ctrl_int` → `Host_Chiplet.SMC.irqS[11]` --- ## 9. 
Appendices ### 9.1 Acronyms and Abbreviations | Term | Definition | |------|------------| | AXI | Advanced eXtensible Interface (ARM AMBA standard) | | BAR | Base Address Register (PCIe configuration space) | | BoW | Bridge-of-Wire (die-to-die interconnect technology) | | CCE | Keraunos Compute Engine (DMA and packet processing) | | D2D | Die-to-Die (chiplet interconnect interface) | | DMA | Direct Memory Access | | HSIO | High-Speed Input/Output (Ethernet subsystem in Keraunos) | | ISR | Interrupt Service Routine | | MAC | Media Access Control (Ethernet layer) | | MSI | Message Signaled Interrupt (PCIe interrupt mechanism) | | NOC | Network-on-Chip | | PCS | Physical Coding Sublayer (Ethernet layer) | | QNP | Quasar NOC Protocol (internal NOC protocol) | | RISC-V | Reduced Instruction Set Computer - Version 5 (open ISA) | | SCML2 | SystemC Modeling Library 2 (Synopsys verification library) | | SEP | Security Engine Processor | | SMC | System Management Controller | | SMN | System Management Network (control plane NOC) | | SMU | System Management Unit (clock/power/reset control) | | SRAM | Static Random-Access Memory | | TLB | Translation Lookaside Buffer (address translation cache) | | TLP | Transaction Layer Packet (PCIe protocol) | | TLM | Transaction-Level Modeling (SystemC abstraction) | ### 9.2 Reference Documents 1. **Keraunos-E100 Architecture Specification** (keraunos-e100-for-review.pdf, v0.9.14) 2. **Keraunos PCIe Tile High-Level Design** (Keraunos_PCIe_Tile_HLD.md, v2.0) 3. **Keraunos PCIe Tile SystemC Design Document** (Keraunos_PCIE_Tile_SystemC_Design_Document.md, v2.1) 4. **Keraunos PCIe Tile Test Plan** (Keraunos_PCIE_Tile_Testplan.md, v2.1) 5. **PCIe Base Specification 5.0** (PCI-SIG, 2019) 6. **AMBA AXI and ACE Protocol Specification** (ARM IHI 0022E) 7. 
**SystemC TLM-2.0 Language Reference Manual** (IEEE 1666-2011) ### 9.3 Revision History | Version | Date | Author | Description | |---------|------|--------|-------------| | 1.0 | 2026-02-10 | System Architecture Team | Initial release | | 2.0 | 2026-03-26 | System Architecture Team | Added Section 8: Final VDK Platform with Linux boot, PCIe enumeration, pcie_xfer application, dual-chiplet topology, and end-to-end data path through noc_n_initiator to Target_Memory | --- **Document Control:** - **Classification:** Internal Use Only - **Distribution:** Keraunos Project Team, Grendel Architecture Team - **Review Cycle:** Quarterly or upon major architecture changes --- **End of Document**