Peeking into the Alpha-Data KU3

At this point in my digital design adventure, I wanted to get my feet wet debugging on live hardware. My goal was simple: interactively write to and read from a single register in an implemented design running on the FPGA. This post documents the process so you can replicate my results with Vivado and an Alpha-Data KU3 card.

Starting fresh

I like to start learning new things with a fairly bare-bones approach. So my first step here is to create a new project in Vivado. I named my project ku3-peeking, chose RTL Project for my project type and picked the FPGA part that’s on the KU3, xcku060-ffva1156-2-e.

New Project

Adding some source

Next I’ll add a couple of SystemVerilog files: one for my top-level source and one for the module that I’ll be peeking and poking at through the debugger.

My test module is pretty simple, just a positive-edge-triggered register.

`timescale 1ns / 1ps

module testmodule (
  input clk,
  input data_in,
  output logic data_out
);

  // Register data_in on each rising clock edge
  always_ff @ (posedge clk) begin
    data_out <= data_in;
  end

endmodule

Initially, my top module will just declare a few wires and instantiate my testmodule; the wires will eventually hook everything up.

`timescale 1ns / 1ps

module top ();

  wire clk;
  wire data_in;
  wire data_out;

  testmodule my_test (
    .clk(clk),
    .data_in(data_in),
    .data_out(data_out)
  );

endmodule


By default Vivado will look for the ‘top’ module when building the project, so this is good for now.

Adding a debug core

With the basic design in place, you can pull up the IP Catalog and select from a few different debug cores. For my goal, the VIO core is nice as it’ll let me read and drive signals.

VIO in catalog

Double clicking on the VIO core will bring up a prompt of options that can be set for the debug core.

VIO Customization Wizard

All of these defaults are fine for my case of a 1 bit register, but you can see you can easily add additional input and output probes. After hitting OK here you’ll get a prompt for starting an out-of-context synthesis run, which will synthesize the VIO core while you continue to work in Vivado.

Within the IP Sources tab, you can drill down into the hierarchy and find templates to instantiate this new core in your design files.

VIO Instance Template

So I’ll modify my top file to add a new block that wires the VIO core to the clock and my testmodule instance.

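  // probe_in0 reads the register output; probe_out0 drives its input
  vio_0 my_test_vio (.clk(clk), .probe_in0(data_out), .probe_out0(data_in));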

Clocking in

For me, pulling in the clock for a legit hardware design was the most difficult part to figure out. I spelunked through documentation and did much Googling, but eventually I reached out to some more experienced folks who helped lead me in the right direction.

The first step in this process is to check the user manual for the device at hand. In this case I’m looking to use the Fabric Clocks described in section 3.2.2 of Alpha Data’s ADM-PCIE-KU3 User Manual. The relevant bits say there are two fabric clocks available, one at 200MHz and one at 250MHz. The manual tells me that these pins use the LVDS I/O standard and which pins are available for each clock. It also notes that I must add a constraint setting DIFF_TERM_ADV to TERM_100, because these clocks are required to be terminated within the FPGA. I don’t fully comprehend what that means quite yet, but the doc tells me to do it so I oblige.

To use the LVDS structures built into the FPGA, I need to instantiate a module that can take the LVDS input signals and provide a simple clock output. The module for this is called a Differential Input Buffer; it can be found in Vivado’s Language Templates window.

Differential Input Buffer Template

I’ll copy the template into my top module file and clean it up. I’ll also add clk_p and clk_n as inputs to the top module so I can route them into the IBUFDS instance.

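module top (
  input clk_p,
  input clk_n
);

  wire clk;
  wire data_in;
  wire data_out;

  // Template parameter overrides are not shown; defaults are assumed
  IBUFDS IBUFDS_inst (.O(clk), .I(clk_p), .IB(clk_n));

  // testmodule and VIO instances as before
endmodule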

Since I’m using this clock to drive multiple blocks, it’s best practice to route it through a General Clock Buffer (BUFG) so that Vivado will choose an appropriate buffer for this FPGA and minimize clock skew. This is found in the Templates as well, under Verilog->Device Primitive Instantiation->Kintex UltraScale->CLOCK->BUFFER->General Clock Buffer (BUFG).

I’ll add this to my top module as well, routing the output of my IBUFDS buffer to the BUFG. The output of BUFG will be the clock signal used by my module and by the VIO core.

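module top (
  input clk_p,
  input clk_n
);

  wire clk_int;
  wire clk;
  wire data_in;
  wire data_out;

  // Template parameter overrides are not shown; defaults are assumed
  IBUFDS IBUFDS_inst (.O(clk_int), .I(clk_p), .IB(clk_n));

  // Global clock buffer: clk feeds both testmodule and the VIO core
  BUFG BUFG_inst (.O(clk), .I(clk_int));

  // testmodule and VIO instances as before
endmodule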

At this point I can run Synthesis and inspect the design. Here’s the schematic view that shows the big picture.


Selecting the clock pins

From the schematic view I can click on the clk_p pin and it’ll select that pin from the I/O Ports view below the schematic in the IDE. From there, I can set the Site, I/O Std and DIFF_TERM_ADV as described in the KU3 User Manual. In this case I’m choosing to use the 250MHz Fabric Clock.

IO Port Planning

After making these changes I used CTRL+S to save my settings, and Vivado prompted me to save a .xdc constraints file which I named myconstraints.xdc.

The generated file has the contents below. Some of them come from the selections I made in the IDE, and some were added by Vivado for the VIO core I instantiated.

set_property PACKAGE_PIN AA24 [get_ports clk_p]
set_property IOSTANDARD LVDS [get_ports clk_p]
set_property DIFF_TERM_ADV TERM_100 [get_ports clk_p]
set_property C_CLK_INPUT_FREQ_HZ 300000000 [get_debug_cores dbg_hub]
set_property C_ENABLE_CLK_DIVIDER false [get_debug_cores dbg_hub]
set_property C_USER_SCAN_CHAIN 1 [get_debug_cores dbg_hub]
connect_debug_port dbg_hub/clk [get_nets clk]

With that in place I can re-run synthesis, run implementation and generate a bitstream for my KU3.
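
Since a missing termination constraint is easy to overlook, a small shell check along these lines could confirm the clk_p properties are in the constraints file before a long build. This is only a sketch: the function name check_clock_constraints is made up for illustration, and it tests only that each property is present, not that its value is right.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vivado): confirm an XDC file sets the
# three clk_p properties discussed above. Returns 0 when all are present.
check_clock_constraints() {
  xdc="$1"
  for prop in PACKAGE_PIN IOSTANDARD DIFF_TERM_ADV; do
    # Look for: set_property <prop> <value> [get_ports clk_p]
    if ! grep -Eq "^set_property[[:space:]]+$prop[[:space:]].*\[get_ports[[:space:]]+clk_p\]" "$xdc"; then
      echo "missing $prop constraint for clk_p in $xdc" >&2
      return 1
    fi
  done
  return 0
}
```

Calling check_clock_constraints myconstraints.xdc from the project directory prints nothing and returns success when all three lines are there.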

Deploying and testing

With the bitstream generated, I can open the Hardware Manager. After connecting to my device I can use the Program Device option, which auto-populates with my bitfile and debug probes file. Then I hit Program and wait for the magic to happen.

Program Device

Once programmed, if all works well Vivado will automatically open a VIO dashboard window. In that window you can hit the green + to add the input and output probes.

Adding Probes

With that open I can start poking at the data_in signal and watch the updates reflect in the data_out signal.

Peeking and poking signals


With all this set up, I have an easy way to interactively control designs that are running in a live FPGA. I hope other Vivado noobies can follow this guide to help in their digital design adventures. If you follow this guide and run into any issues, reach out to me and I’ll try to help you out.

I’d like to thank JT Kellington, Kevin Irick and Mark Paluszkiewicz for offering their help and experience. I ran into many issues trying to hack my way through this and their assistance was extremely helpful in getting this up and running. Thank you!

Hello AFU on Alpha-Data KU3

Picking up where the Hello AFU project left off, I recently went through the motions of building it for an actual CAPI device and testing it out. This post documents the process I followed to build and deploy it on real hardware.


To complete this process you’ll need a few things:
* A POWER8-based machine; I’m using a Barreleye server
* An Alpha-Data KU3 card
* The latest HDK archive from Alpha Data’s support site, at this time that file is named
* A licensed version of Xilinx’s Vivado

Preparing files for the build

First off, we need to extract the HDK


By default, the HDK ships with some AFU source files in adku060_capi_1_1_release/Sources/afu/. We’ll jump in there, delete them, then copy over the SystemVerilog files from the hello-afu repository.

cd adku060_capi_1_1_release/Sources/afu/
rm *
cp ~/projects/hello-afu/*.sv .
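
If you’d rather keep the stock AFU sources around instead of deleting them, a variation like the sketch below moves them aside first. The function name swap_afu_sources and the .orig backup directory are my own inventions for illustration, not anything the HDK provides.

```shell
#!/bin/sh
# Hypothetical helper: move everything in the AFU directory into a sibling
# backup directory (<afu_dir>.orig), then copy in the replacement .sv files.
# Pass the AFU directory without a trailing slash.
swap_afu_sources() {
  afu_dir="$1"
  new_src="$2"
  backup="$afu_dir.orig"
  mkdir -p "$backup" || return 1
  for f in "$afu_dir"/*; do
    [ -e "$f" ] || continue        # directory was already empty
    mv "$f" "$backup"/ || return 1
  done
  cp "$new_src"/*.sv "$afu_dir"/
}
```

From the HDK's Sources directory that would be something like swap_afu_sources afu ~/projects/hello-afu.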

Next, open the project file adku060_capi_1_1_release/Sources/prj/psl_fpga.prj in a text editor to change a few lines. Remove all of the lines that start with verilog work, then add lines to reference the source files we copied into the afu directory. Some bash-fu for that:

cd ../prj
sed -i '/^verilog work/d' psl_fpga.prj
for f in ../afu/*.sv; do echo "verilog work \"afu/$(basename "$f")\"" >> psl_fpga.prj; done

That should have us set up to build our AFU in lieu of the one that comes with the HDK!
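
Before kicking off a long build, it can be worth double-checking that the project file really references every source file. Here’s a minimal sketch of such a check; check_prj is a hypothetical name, and it assumes the project lines have exactly the form verilog work "afu/<name>" as written above.

```shell
#!/bin/sh
# Hypothetical helper: verify every .sv file in the afu directory has a
# matching 'verilog work "afu/<name>"' line in the project file.
check_prj() {
  prj="$1"
  afu_dir="$2"
  status=0
  for f in "$afu_dir"/*.sv; do
    [ -e "$f" ] || continue          # glob matched nothing
    name=$(basename "$f")
    # -F: fixed string, -x: must match the whole line, -q: quiet
    if ! grep -Fqx "verilog work \"afu/$name\"" "$prj"; then
      echo "not referenced: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

Run from the prj directory as check_prj psl_fpga.prj ../afu; it lists any file that is missing from the project and returns non-zero in that case.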

Build and flash the binfile

With our files in the right spot and our project file modified, we just need to run a couple of the Tcl scripts in the HDK through Vivado.

vivado -mode batch -source psl_fpga.tcl -notrace
vivado -mode batch -source write_bitstream.tcl -notrace

The first command does the heavy lifting of synthesis, place and route, etc. The second generates the actual binfile and bitfile that we can use to flash the device. The first command takes a significant amount of time, about 40 minutes on my i7-equipped laptop; the second completed in about 9 seconds. Maybe someday we’ll have a CAPI-based accelerator for synthesis and place & route! Now that the build is complete, I have my binfile at capi-adku060/psl_fpga_flash.bin.
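
Since the first step takes so long, it’s handy to chain the two so the second only runs if the first succeeds. A rough sketch, assuming Vivado exits non-zero when a batch script fails; build_afu and the VIVADO_BIN override are names I made up for illustration.

```shell
#!/bin/sh
# Sketch: run the two HDK Tcl scripts in order, stopping at the first failure.
# VIVADO_BIN is a hypothetical override for pointing at a specific binary.
build_afu() {
  vivado_bin="${VIVADO_BIN:-vivado}"
  for script in psl_fpga.tcl write_bitstream.tcl; do
    echo "running $script"
    "$vivado_bin" -mode batch -source "$script" -notrace || {
      echo "$script failed" >&2
      return 1
    }
  done
}
```

Calling build_afu from the directory that holds the Tcl scripts then replaces typing the two commands by hand.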

To flash this onto a card that already has a working PSL, you can use the capi-flash-script utility. If your card is factory-fresh or in a bad state, you can use a JTAG programmer and Vivado’s Hardware Manager to flash directly from your laptop, or remotely via xvcserver.

Using the AFU

After I flashed my AFU, I ensured libcxl was set up on my server. Since I’m running Ubuntu 16.04, I simply installed it via apt.

apt-get install -y libcxl-dev

Next I rebooted the machine so that everything is nice and fresh. As part of the PCIe reset, the bitfile from the KU3’s flash chip is loaded onto the FPGA. I can verify the card is in a good state because my cxl device shows up at /dev/cxl/afu0.0d.
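
That device check is easy to script if you want it in a post-boot sanity test. A minimal sketch, with afu_present being a made-up name and the default path taken from my setup (yours may differ):

```shell
#!/bin/sh
# Sketch: report whether the CAPI AFU device node exists.
# Defaults to the path seen on my machine; pass another path as $1.
afu_present() {
  dev="${1:-/dev/cxl/afu0.0d}"
  if [ -e "$dev" ]; then
    echo "found $dev"
  else
    echo "missing $dev" >&2
    return 1
  fi
}
```

Running afu_present with no arguments returns success only when the device node is there, so it slots neatly in front of the test binary.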

I run my test_afu binary from the hello-afu project and boom! The same result as I get from simulation, woo-hoo!