A 4096-point radix-8 FFT processor is designed and implemented on a field-programmable gate array (FPGA). Traditional radix-2 and radix-4 FFT processors cannot satisfy the requirements of modern high-speed digital signal processing, so a radix-8 shared-memory architecture is used at the top level. The butterfly module, the twiddle factor generation module, and the input-output interface module are analyzed and optimized. A novel method for generating twiddle factors is proposed and compared with the traditional method. The pipelined design increases computing speed and reduces FPGA resource utilization. The design is verified in simulation, and the results are compared with those of a MATLAB fixed-point model. The design is then programmed into an Altera EP2S60F672I4 device and verified with the help of a digital signal processor. The results computed for various input patterns are read back into MATLAB and compared bit by bit with the fixed-point model. At a clock frequency of 100 MHz, the design takes 2.048 μs to complete a 4096-point radix-8 FFT, which meets the requirements of high-speed digital signal processing.
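
As a rough illustration of why a higher radix helps, the following minimal Python sketch counts the butterfly stages a 4096-point FFT needs at radix 2, 4, and 8, and evaluates the textbook twiddle factor W_N^k = exp(-2πjk/N). The function names `stage_count` and `twiddle` are hypothetical helpers introduced here. The sketch uses floating-point arithmetic and does not represent the fixed-point hardware or the twiddle factor generation method proposed in this work.

```python
import cmath

N = 4096  # transform length from the abstract


def stage_count(n, radix):
    """Number of radix-`radix` stages for an n-point FFT (n must be a power of radix)."""
    stages = 0
    while n > 1:
        if n % radix != 0:
            raise ValueError("n is not a power of radix")
        n //= radix
        stages += 1
    return stages


def twiddle(k, n):
    """Textbook twiddle factor W_n^k = exp(-2*pi*j*k/n), in floating point."""
    return cmath.exp(-2j * cmath.pi * k / n)


# Stage counts for the three radices mentioned in the abstract.
stages_by_radix = {r: stage_count(N, r) for r in (2, 4, 8)}

# Each radix-8 stage processes N/8 butterflies of 8 points each.
butterflies_per_stage = N // 8
total_butterflies = stages_by_radix[8] * butterflies_per_stage

if __name__ == "__main__":
    print(stages_by_radix)
    print(butterflies_per_stage, total_butterflies)
    print(twiddle(1024, N))
```

Since 4096 = 8^4, the radix-8 decomposition needs only 4 stages, compared with 6 for radix-4 and 12 for radix-2, so a shared-memory architecture makes fewer passes over the data memory.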