Finally got the biquad filter running on a KC705 board we had lying around, and I'm measuring a latency of 50 ns (5 cycles at 100 MHz) rather than the 30 ns I was getting in simulation.
Next up I'll try to baseline performance on the Artix-7 FPGA, since we're looking to design some boards around it, and of course play around with implementation details to save some cycles or increase the clock rate.
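
For anyone who wants to sanity-check the numbers, here's a rough sketch of the cycle/latency arithmetic in Python. The function names are just mine for illustration, and the conversion of the 30 ns simulation figure to cycles assumes the simulation also ran with a 100 MHz clock, which may not match the actual testbench.

```python
def cycles_to_ns(cycles: float, clock_hz: float) -> float:
    """Latency in nanoseconds for a given number of clock cycles."""
    return cycles * (1e9 / clock_hz)


def ns_to_cycles(latency_ns: float, clock_hz: float) -> float:
    """Number of clock cycles that fit in a given latency."""
    return latency_ns * clock_hz / 1e9


def required_clock_hz(cycles: float, target_ns: float) -> float:
    """Clock rate needed to fit `cycles` cycles into `target_ns` nanoseconds."""
    return cycles * 1e9 / target_ns


CLOCK_HZ = 100e6  # 100 MHz, as measured on the KC705

# Measured on hardware: 5 cycles at 100 MHz.
measured_ns = cycles_to_ns(5, CLOCK_HZ)  # 50 ns

# Simulation figure, ASSUMING the same 100 MHz clock in simulation.
simulated_cycles = ns_to_cycles(30, CLOCK_HZ)  # 3 cycles

# Clock rate that would bring a 5-cycle pipeline back down to 30 ns.
clock_for_30ns = required_clock_hz(5, 30)  # roughly 166.7 MHz

if __name__ == "__main__":
    print(f"measured latency: {measured_ns:.1f} ns")
    print(f"simulated latency: {simulated_cycles:.1f} cycles")
    print(f"clock for 5 cycles in 30 ns: {clock_for_30ns / 1e6:.1f} MHz")
```

So under that assumption, getting back to the simulated 30 ns means either shaving two cycles at 100 MHz or keeping all five and pushing the clock to about 167 MHz, or some mix of the two.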