Xilinx QDMA Tutorial

Project Creation

Open Vivado and create a new project. Select the destination folder and tick "Do not specify sources at this time". When asked for the default part, move to the Boards pane and select your target board. For me, it’s the Alveo U280.

Block Design Creation

Once the project is open, under the Flow Navigator, click on IP Integrator -> Create Block Design. Enter pcie_bd as the BD name and continue. We’re going to use the Vivado Board Flow for this design.

Once the BD is open, move to the Board pane. Here you can see all the external connections that your board has.

Right-click on PCI Express and select Connect Board Component. Set the mode to x16 and select the Xilinx Queue DMA IP core.

A block called qdma_0 should appear, with external ports connected to it. For this design, we just want to attach a BRAM to the AXI Lite Master bus so that we can read and write this memory over the PCIe bus.

QDMA Setup

Before connecting other components, we must configure the QDMA IP core. Double-click on the block to open the IP customization window and make the following customizations:

Basic --> Mode --> Basic
Basic --> DMA Interface Options --> AXI MM
Capabilities --> Enable Mailbox among functions
PCIe: Bars --> Untick and tick again BAR2 (AXI Lite Master)

As per the QDMA IP Core Manual, Table 69 and Table 71, we must tie both tm_dsc_sts_rdy and qsts_out_rdy to 1'b1. To do so, click on the + sign and add a Constant. The constant is already initialized to 1'b1, so expand the tm_dsc_sts and qsts_out interfaces and connect the two signals mentioned above to the constant.

We now have to clock the QDMA. Add a Utility Buffer IP core and make the following customization on its only board interface:

Board --> Board Interface --> pcie_refclk

Connect the outputs of this block as:

(Utility Buffer).IBUF_OUT       <--> (QDMA).sys_clk_gt
(Utility Buffer).IBUF_DS_ODIV2  <--> (QDMA).sys_clk

Click on Run Connection Automation, which connects pcie_refclk and sets up the timing and I/O planning constraints under the hood.

Block RAM Connection

Click on the + sign and add a Block RAM Generator. Leave it as it is for now. Click on the + sign again and add an AXI BRAM Controller. Double-click on the IP and make the following customizations:

AXI Protocol --> AXI Lite
Number of BRAM Interfaces --> 1

Now make the following connections:

(AXI BRAM Controller).S_AXI         <--> (QDMA).M_AXI_LITE
(AXI BRAM Controller).s_axi_aclk    <--> (QDMA).axi_aclk
(AXI BRAM Controller).s_axi_aresetn <--> (QDMA).axi_aresetn

The generated BD should look like this:


Before closing the design, from the Diagram pane, move to the Address Editor pane. Under qdma_0/M_AXI_LITE, right-click on the entry and hit Assign. Leave the default base address for now and select 4K as the range.

Save and close the block design. Under Design Sources, you should be able to see the BD. Right-click on the BD and generate an HDL wrapper. The default option on the following pop-up is fine. Once the wrapper is shown in bold, we can move on to constraining the design.
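
To make the 4K range concrete, here is a minimal Python sketch of the arithmetic. It assumes 32-bit (4-byte) AXI Lite accesses; the names are illustrative and do not come from Vivado.

```python
# Arithmetic sketch for the 4K AXI Lite window (illustrative names only).
RANGE_BYTES = 4 * 1024   # the "4K" range selected in the Address Editor
WORD_BYTES = 4           # assumption: 32-bit AXI Lite data width

def valid_offsets(range_bytes=RANGE_BYTES, word_bytes=WORD_BYTES):
    """Return every word-aligned byte offset that falls inside the window."""
    return range(0, range_bytes, word_bytes)

offsets = valid_offsets()
print(len(offsets))       # number of 32-bit words in the window: 1024
print(hex(offsets[-1]))   # highest word-aligned offset: 0xffc
```

Under these assumptions, any offset from 0x000 to 0xFFC in steps of 4 addresses a distinct word of the BRAM.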

Constraining and Implementing the design

Now click on Run Implementation. On the following pop-ups, always click OK. When implementation is finished, select Generate Bitstream on the pop-up. Once the bitstream generation completes, select Open Hardware Manager on the pop-up.

In the Hardware Manager, click on Open target -> Auto Connect. Right-click the board name (for me, it’s xcu280_u55c_0) and select Program Device. Select the bitstream, which is in <prj-folder>/<prj-name>.runs/impl_1/pcie_bd_wrapper.bit.

PCIe Driver

Xilinx distributes an open-source QDMA driver. To download, build and load it, execute the following commands:

$ sudo apt install libaio-dev
$ git clone https://github.com/Xilinx/dma_ip_drivers
$ cd ./dma_ip_drivers/QDMA/linux-kernel
$ make
$ sudo make install-mods
$ sudo make install-apps
$ sudo modprobe qdma-pf
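
After the modprobe, it can be handy to confirm that the module is actually loaded. The following is a minimal Python sketch that scans /proc/modules; it assumes the kernel lists the module as qdma_pf (dashes normalized to underscores, matching the qdma_pf prefix in the kernel log). The helper name module_loaded is made up for this example.

```python
import os

def module_loaded(name, proc_modules_text):
    """Return True if `name` is listed in the given /proc/modules text.

    The kernel reports module names with underscores, so "qdma-pf"
    is looked up as "qdma_pf" (assumption stated in the text above).
    """
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False

# Only query the live system where /proc/modules exists (Linux).
if os.path.exists("/proc/modules"):
    with open("/proc/modules") as f:
        print(module_loaded("qdma-pf", f.read()))
```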

If running on UEFI, please add the following line on top of the Makefile in dma_ip_drivers/QDMA/linux-kernel:


Putting it all together

$ lspci | grep Xil
3b:00.0 Memory controller: Xilinx Corporation Device 903f
$ dmesg
[287268.681369] pci 0000:3b:00.0: Removing from iommu group 66
[287269.746689] pci 0000:3b:00.0: [10ee:903f] type 00 class 0x058000
[287269.746719] pci 0000:3b:00.0: reg 0x10: [mem 0x00000000-0x0003ffff 64bit]
[287269.746732] pci 0000:3b:00.0: reg 0x18: [mem 0x00000000-0x00000fff 64bit]
[287269.746757] pci 0000:3b:00.0: Max Payload Size set to 256 (was 128, max 1024)
[287269.747173] pci 0000:3b:00.0: Adding to iommu group 66
[287269.747274] pci 0000:3b:00.0: BAR 0: assigned [mem 0xb8600000-0xb863ffff 64bit]
[287269.747286] pci 0000:3b:00.0: BAR 2: assigned [mem 0xb8640000-0xb8640fff 64bit]
[287269.747420] qdma_pf:probe_one: 0000:3b:00.0: func 0x0, p/v 0/0,0x0000000000000000.
[287269.747424] qdma_pf:probe_one: Configuring '3b:00:0' as master pf
[287269.747426] qdma_pf:probe_one: Driver is loaded in auto(0) mode
[287269.747432] qdma_pf:qdma_device_open: qdma-pf, 3b:00.00, pdev 0x00000000105df4cb,
[287269.747446] qdma-pf 0000:3b:00.0: enabling device (0100 -> 0102)
[287269.747672] Device Type: Soft IP
[287269.747673] IP Type: EQDMA Soft IP
[287269.747674] Vivado Release: vivado 2020.2
[287269.747684] qdma_pf:qdma_device_attributes_get: qdma3b000-p0000:3b:00.0: num_pfs:1,
num_qs:512, flr_present:0, st_en:0, mm_en:1, mm_cmpt_en:0, mailbox_en:1,
mm_channel_max:1, qid2vec_ctx:0, cmpt_ovf_chk_dis:1, mailbox_intr:1, sw_desc_64b:1,
cmpt_desc_64b:1, dynamic_bar:1, legacy_intr:1, cmpt_trig_count_timer:1
[287269.747687] qdma_pf:qdma_device_open: Vivado version = vivado 2020.2
[287269.747690] qdma_dev_entry_create: Created the dev entry successfully
[287269.747695] qdma_pf:intr_setup: current device supports only (8) msix vectors per
function. ignoring input for (32) vectors
[287269.751489] qdma_pf:xdev_identify_bars: AXI Master Lite BAR 2.
[287269.751494] qdma_pf:qdma_device_open: 0000:3b:00.0, 3b000, pdev 0x00000000105df4cb,
xdev 0x000000002b7f5110, ch 1, q 0, vf 0.
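
The BAR sizes can be read off the "BAR n: assigned" lines in the log above. As a quick cross-check, this minimal Python sketch parses those two lines (copied from the log, with the timestamps dropped) and computes each size; the regular expression is an assumption about the kernel's message format based only on this output.

```python
import re

# Matches kernel lines such as:
#   pci 0000:3b:00.0: BAR 2: assigned [mem 0xb8640000-0xb8640fff 64bit]
BAR_RE = re.compile(r"BAR (\d+): assigned \[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)")

def bar_sizes(dmesg_text):
    """Map BAR index -> size in bytes for every 'assigned' line found."""
    sizes = {}
    for m in BAR_RE.finditer(dmesg_text):
        bar = int(m.group(1))
        start = int(m.group(2), 16)
        end = int(m.group(3), 16)
        sizes[bar] = end - start + 1   # the range is inclusive
    return sizes

sample = """\
pci 0000:3b:00.0: BAR 0: assigned [mem 0xb8600000-0xb863ffff 64bit]
pci 0000:3b:00.0: BAR 2: assigned [mem 0xb8640000-0xb8640fff 64bit]
"""
print(bar_sizes(sample))   # {0: 262144, 2: 4096}
```

BAR 2 comes out as 4096 bytes, which matches the 4K range assigned to qdma_0/M_AXI_LITE in the Address Editor.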

The driver recognizes the PCIe IP core. If it does not, reload the bitstream and then unload and reload the kernel module. Now we can use the dma-ctl utility to test our shell:

$ dma-ctl dev list
qdma3b000 0000:3b:00.0 max QP: 0, -~-
$ dma-ctl qdma3b000 reg write bar 2 0x00000000 0xdeadbeef
qdma3b000, 3b:00.00, bar#2, 0x0 = 0xdeadbeef.
$ dma-ctl qdma3b000 reg read bar 2 0x00000000
qdma3b000, 3b:00.00, bar#2, 0x0 = 0xdeadbeef.

We are reading and writing the Block RAM attached to the AXI Lite bus, which is exposed on BAR 2.
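
The single write and read above can be extended into a small sweep over several BRAM words. The following is a minimal Python sketch that wraps the same two dma-ctl commands. It assumes the device name qdma3b000 from the listing above, that dma-ctl accepts offsets without zero padding (for example 0x0), and that the read output keeps the "... = 0x<value>." form shown above. The function names are made up for this example.

```python
import re
import subprocess

DEV = "qdma3b000"   # device name as reported by `dma-ctl dev list` above
BAR = 2             # AXI Lite Master BAR

def run_dma_ctl(args):
    """Run dma-ctl with the given arguments and return its stdout."""
    result = subprocess.run(["dma-ctl", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout

def reg_write(offset, value, run=run_dma_ctl):
    """Issue: dma-ctl <dev> reg write bar <n> <offset> <value>."""
    run([DEV, "reg", "write", "bar", str(BAR), hex(offset), hex(value)])

def reg_read(offset, run=run_dma_ctl):
    """Issue: dma-ctl <dev> reg read bar <n> <offset> and parse the value."""
    out = run([DEV, "reg", "read", "bar", str(BAR), hex(offset)])
    # Assumed output form, per the transcript: "... bar#2, 0x0 = 0xdeadbeef."
    m = re.search(r"=\s*(0x[0-9a-fA-F]+)", out)
    if m is None:
        raise ValueError(f"unexpected dma-ctl output: {out!r}")
    return int(m.group(1), 16)

def sweep(offsets, run=run_dma_ctl):
    """Write a distinct pattern to each offset, read everything back,
    and return the list of offsets whose value did not match."""
    offsets = list(offsets)
    for off in offsets:
        reg_write(off, 0xA5A50000 | off, run)
    return [off for off in offsets
            if reg_read(off, run) != (0xA5A50000 | off)]
```

On the target machine, calling sweep(range(0, 0x1000, 4)) would exercise every 32-bit word of the 4K window; an empty list means every word read back as written. The run parameter exists only so that the command runner can be swapped out, for example to dry-run the logic without hardware.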