

# Accelerated processing for modern radio telescopes using a combination of KNL and FPGA

Sajish Chandrababu, Yogindra S. Abhyankar, Sarun O. S

Centre for Development of Advanced Computing, Pune - 411 008, India sajishc@cdac.in, yogindra@cdac.in, sarunn@cdac.in

C. R. Subrahmanya

Raman Research Institute, Bengaluru - 560 080, India crs@rri.res.in



### Introduction

Modern radio telescopes have hundreds of antenna elements with wide operational bandwidth. The signal processing involves fusing the responses of these elements in many different ways. One common fusion operation is the cross-correlation of the response from every element with itself and with every other element. Given the nature of this operation, and to maximize power efficiency, it makes sense to use the latest **Intel Knights Landing (KNL)** processor.

## **Ooty Radio Telescope (ORT)**

- Cylindrical paraboloid, 530 m long and 30 m wide, operating at 326.5 MHz with a bandwidth of 15 MHz.
- One of the most sensitive radio telescopes in the world.
- Plan to expand the antennas (1056 dipoles along the feed array) to increase
- A peta-scale supercomputer is required for software correlation of the planned system.

## Knights Landing (KNL) processor

KNL has 72 cores, each with two 512-bit vector processing units, and each core is capable of running four threads.

- Simultaneous processing for as many spectral channels as the number of available hardware threads in the KNL processor.
- Coarse-level subbands are allocated to different nodes of the KNL cluster.
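The thread arithmetic can be illustrated with a minimal sketch. With 72 cores and four threads per core, a KNL processor offers 288 hardware threads, so up to 288 spectral channels can be processed at once. The function names `assign_channels` and `assign_subbands`, and the round-robin policy they use, are assumptions made for illustration only; the text does not state how channels or subbands are actually assigned.

```python
CORES = 72             # from the text
THREADS_PER_CORE = 4   # from the text
HW_THREADS = CORES * THREADS_PER_CORE  # hardware threads per KNL processor


def assign_channels(n_channels, n_threads=HW_THREADS):
    """Hypothetical round-robin map: thread index -> list of spectral channels."""
    plan = {t: [] for t in range(n_threads)}
    for channel in range(n_channels):
        plan[channel % n_threads].append(channel)
    return plan


def assign_subbands(n_subbands, n_nodes):
    """Hypothetical round-robin map: coarse subband index -> cluster node index."""
    return {subband: subband % n_nodes for subband in range(n_subbands)}


if __name__ == "__main__":
    plan = assign_channels(HW_THREADS)
    # With exactly one channel per hardware thread, every thread gets one channel.
    print(HW_THREADS, max(len(chs) for chs in plan.values()))
    # Illustrative sizes only: 8 subbands spread over 4 nodes.
    print(assign_subbands(8, 4))
```

When the number of channels equals the number of hardware threads, each thread handles exactly one channel, which matches the one-channel-per-thread arrangement described here.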

### **Cross-Correlation**

- The broadband responses of individual elements are channelized into many subbands.
- Fusion operations are performed independently for each subband.
- Each thread within the KNL processor performs cross-correlation for the entire array for a single spectral channel.
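The per-channel work done by one thread can be sketched as follows. This is a plain-Python illustration, not the authors' vectorized KNL implementation. It assumes each element's response in one spectral channel is a list of complex samples, and it correlates every element with itself and with every other element by averaging the product of one sample with the complex conjugate of the other. The names `correlate_channel` and `n_products` are hypothetical.

```python
def n_products(n_elem):
    """Number of distinct products, counting each element with itself and each pair once."""
    return n_elem * (n_elem + 1) // 2


def correlate_channel(samples):
    """Cross-correlate all elements for ONE spectral channel.

    samples[i][t] is the complex voltage of element i at time step t.
    Returns {(i, j): time-averaged x_i * conj(x_j)} for all i <= j.
    """
    n_elem = len(samples)
    n_time = len(samples[0])
    products = {}
    for i in range(n_elem):
        for j in range(i, n_elem):
            acc = 0j
            for t in range(n_time):
                acc += samples[i][t] * samples[j][t].conjugate()
            products[(i, j)] = acc / n_time
    return products


if __name__ == "__main__":
    # Two elements, two time samples: illustrative values only.
    demo = [[1 + 1j, 2 + 0j], [0 + 1j, 1 + 0j]]
    for pair, value in correlate_channel(demo).items():
        print(pair, value)
    print(n_products(1056))
```

The number of products grows roughly as the square of the number of elements, which is why the computational load rises so quickly as the array is expanded.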

## Challenges

*Fig 1. Data flow illustration from antennas to the HPC nodes.* (Figure labels: from different antennas, Switch, HPC, Node 1, Node 2.)

- The biggest challenge in building such a software correlator on an HPC cluster is the communication bandwidth required for all-to-all connectivity.
- All-to-all connectivity is required because, at any given time, signals from all the antennas must be forwarded to a single HPC node.
- When the signals from all the antennas are forwarded to a single HPC node, the switch port connected to that node becomes the bottleneck.
- To meet the computational requirement efficiently, a KNL processor is used for software correlation; however, this in turn worsens the communication bottleneck, making the use of the KNL processor inefficient.
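The switch-port bottleneck can be illustrated with simple arithmetic. The sketch below compares the load offered to a single switch port with that port's capacity when several senders target the same node at once. The function name `port_load_ratio` and all numbers in the usage example are hypothetical; the text does not give sender counts or port speeds.

```python
def port_load_ratio(n_senders, sender_rate_gbps, port_capacity_gbps):
    """Offered load on one switch port divided by its capacity.

    A ratio above 1 means the port cannot carry the traffic without queueing or loss.
    """
    if port_capacity_gbps <= 0:
        raise ValueError("port capacity must be positive")
    return n_senders * sender_rate_gbps / port_capacity_gbps


if __name__ == "__main__":
    # Hypothetical figures: 8 senders at 5 Gbps each into one 10 Gbps port.
    print(port_load_ratio(8, 5.0, 10.0))
    # Same port with a single sender at a time.
    print(port_load_ratio(1, 5.0, 10.0))
```

Under these assumed figures, simultaneous senders oversubscribe the port several times over, while a single sender stays within its capacity.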



## **Proposed Solution**

To overcome these bottlenecks, we propose using FPGA boards with custom-designed functionality to segregate the communication between the ORT antennas and the HPC nodes. All the FPGA boards are connected to the switch, and an index-based approach is used for communication from all the FPGA boards to all the HPC nodes.





- Digitized data from the antennas is forwarded to the FPGA board using JESD204B serial interfaces, each operating at 5 Gbps.
- Data for multiple subbands is first stored in the onboard memory, and the stored data corresponding to a subband is then transferred to an HPC node.
- To avoid congestion, the FPGA boards are sequenced so that no two boards send data to the same HPC node at the same time.
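One way to picture a sequencing in which no two FPGA boards target the same HPC node at the same time is a rotating, index-based schedule. The sketch below is an assumption for illustration, not necessarily the authors' scheme: in time slot `slot`, board `f` sends to node `(f + slot) % n_nodes`. It also assumes there are no more boards than nodes, since otherwise some boards would have to wait. The names `rotation_schedule` and `is_conflict_free` are hypothetical.

```python
def rotation_schedule(n_fpgas, n_nodes):
    """Return schedule[slot][fpga] = destination node index, over n_nodes time slots."""
    if n_fpgas > n_nodes:
        raise ValueError("this sketch assumes no more FPGA boards than HPC nodes")
    return [
        [(fpga + slot) % n_nodes for fpga in range(n_fpgas)]
        for slot in range(n_nodes)
    ]


def is_conflict_free(schedule):
    """True if, in every slot, all boards target distinct nodes."""
    return all(len(set(slot)) == len(slot) for slot in schedule)


if __name__ == "__main__":
    # Illustrative sizes only: 3 FPGA boards, 4 HPC nodes.
    schedule = rotation_schedule(3, 4)
    for slot, targets in enumerate(schedule):
        print(slot, targets)
    print(is_conflict_free(schedule))
```

Over one full rotation, every board sends to every node exactly once, so each node still receives data from all boards while each switch port serves only one sender per slot.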

<sup>\*</sup> The authors thank the Ministry of Electronics and Information Technology (MeitY), Government of India, for its support.