Session 1: Keynote Talk
Chair: Kees Vissers
9:00 a.m.
Derek Chiou, Microsoft

Microsoft's Production Configurable Cloud [slides]

Microsoft has been building data centers with programmable hardware in every server, creating a novel Configurable Cloud. The reconfigurable hardware, in the form of a field programmable gate array (FPGA), is cabled between the NIC and the data center network, as well as being attached to the CPUs via PCIe. This architecture enables an FPGA-centric, rather than CPU-centric, computational model: all communication in and out of the server is first processed by the FPGA, which handles common tasks without CPU involvement and passes uncommon, complex tasks to the CPU, which acts as a "complexity" offload engine. Microsoft has deployed a diverse set of applications, including deep neural networks and software defined networking acceleration, across its Configurable Cloud. I will describe the Cloud, some of its applications, and their performance.

Bio: Derek Chiou is a Partner Architect at Microsoft, where he leads the Azure Cloud Silicon team responsible for FPGAs and ASICs for data centers, and a research professor in the Electrical and Computer Engineering Department at The University of Texas at Austin. Before Microsoft/UT, Dr. Chiou was a system architect and led the performance modeling team at Avici Systems, a manufacturer of terabit core routers.

Session 2: Lightning Talks 1
Chair: Michael Lysaght
Kazutomo Yoshii, Hal Finkel and Franck Cappello, "Benchmarking under the hood of OpenCL FPGA platforms"
Yaman Umuroglu, Nicholas James Fraser, Giulio Gambardella and Michaela Blott, "A C++ Library for Rapid Exploration of Binary Neural Networks on Reconfigurable Logic"
Jiayi Sheng, Chen Yang, Ahmed Sanaullah, Qingqing Xiong and Martin Herbordt, "Strong Scaling of MD Simulations with FPGA-Centric Clouds and Clusters"

10:00 a.m.
Poster Session 1
Xin Fang, Stratis Ioannidis and Miriam Leeser, "Garbled Circuits for Preserving Privacy in the Datacenter" [poster]
Sakil Barbhuiya, Yun Wu, Karen Murphy, Hans Vandierendonck, George Karakonstantis and Dimitrios Nikolopoulos, "Accelerating Data Center Applications with Reconfigurable DataFlow Engines" [poster]
Giulio Stramondo, Ana Lucia Varbanescu and Catalin Ciobanu, "The Case for Custom Parallel Memories: an Application-centric Analysis" [poster]
Syed Waqar Nabi and Wim Vanderbauwhede, "MP-STREAM: A Multi-Platform FPGA-Centric Memory Performance Benchmark"
Abhishek Jain, Douglas Maskell and Suhaib Fahmy, "Coarse-Grained FPGA Overlays for On-demand Acceleration of Data Center Workloads"

Session 3: Invited Talks
Chair: Franck Cappello
Gustavo Alonso, ETH Zurich

Data Processing on the Fast Lane [slides]

Data processing is changing in radical ways. On the one hand, data science and big data have brought an unprecedented growth and variety in data sizes, demanding workloads, data types, and applications. On the other hand, hardware is no longer a source of performance as it has been in the last decades. Instead, it has become a complex, fast evolving, highly specialized, and heterogeneous platform that requires considerable tuning and effort to use optimally. In this talk I will discuss the problem, arguing that there is an opportunity for specialized designs and showing the challenges to data processing resulting from modern hardware. I will illustrate the points with examples from research and recent developments from industry to argue there is a significant opportunity for hardware acceleration in data centers if one focuses on the correct problems and finds the proper architecture for the complete system.

Bio: Gustavo Alonso is a professor at the Department of Computer Science of ETH Zurich (ETHZ) in Switzerland, where he is a member of the Systems Group. Gustavo has an M.S. and a Ph.D. in Computer Science from UC Santa Barbara. Before joining ETH, he was at the IBM Almaden Research Center. His research interests encompass almost all aspects of distributed systems and databases, with an emphasis on system architecture. His current research is related to multi-core architectures, large clusters, FPGAs, and big data, mainly adapting traditional system software (OS, database, middleware) to modern hardware platforms. Gustavo is a Fellow of the ACM and of the IEEE.
Viktor K. Prasanna, Univ. of Southern California

Large Scale Graph Analytics on FPGAs [slides]

Graph analytics has recently become a key tool in many applications of Data Science. This talk explores FPGA-based parallel architectures and algorithms for graph analytics and proposes optimizations for high throughput and energy efficient accelerator designs for such problems. Our approach is based on high level abstractions of the reconfigurable platforms and efficient data structures and algorithms. We also illustrate hybrid algorithms for tightly coupled heterogeneous architectures that consist of multi-core processors, FPGAs and coherent memory. We discuss the performance improvements on such systems and demonstrate the suitability of FPGAs for these computations including basic graph primitives and deep learning.

Bio: Viktor K. Prasanna is Charles Lee Powell Chair in Engineering in the Ming Hsieh Department of Electrical Engineering and Professor of Computer Science at the University of Southern California. He is the director of the Center for Energy Informatics. He was the executive director of the USC-Infosys Center for Advanced Software Technologies (CAST) and a member of the USC-Chevron Center of Excellence for Research and Academic Training on Interactive Smart Oilfield Technologies (CiSoft). His research interests include parallel and distributed systems, including networked sensor systems, embedded systems, configurable architectures, and high performance computing. He served as the Editor-in-Chief of the IEEE Transactions on Computers during 2003-06 and is currently the Editor-in-Chief of the Journal of Parallel and Distributed Computing. Prasanna was the founding Chair of the IEEE Computer Society Technical Committee on Parallel Processing. He is the steering chair of the IEEE International Conference on High Performance Computing. He is a Fellow of the IEEE, the ACM, and the American Association for the Advancement of Science (AAAS). He is a recipient of the 2009 Outstanding Engineering Alumnus Award from the Pennsylvania State University. He received the 2015 W. Wallace McDowell Award from the IEEE Computer Society for his contributions to reconfigurable computing.
Paul Chow, Univ. of Toronto

Making FPGAs Programmable as Computers and Doing It At Scale [slides]

The benefits of using application-specific architectures to improve performance have been known for decades, and today power is just as important. FPGAs have long been shown to address both of these issues, but they are still not used in the mainstream of computing. They are hard to program, and there has been much effort towards building high-level synthesis tools to address this problem. Such tools are necessary, but not sufficient. Making FPGAs easy to use for computation requires more than developing accessible tools for creating hardware targeted for FPGAs. The software computing world has a lot of good open source infrastructure, often taken for granted and sometimes invisible, that is missing for using FPGAs as computing devices. I will present the need for some common infrastructure and abstraction layers to support the use of FPGAs for computing. It is also important to leverage existing programming models when introducing FPGAs into the computing ecosystem. Of particular relevance to the HPC community, I will describe the work we have done at the University of Toronto on incorporating FPGAs into the MPI and PGAS programming models.

Bio: Paul Chow is a Professor in the Department of Electrical and Computer Engineering at the University of Toronto where he holds the Dusan and Anne Miklas Chair in Engineering Design. Prior to joining UofT in 1988 he was at the Computer Systems Laboratory at Stanford University, Stanford, CA, as a Research Associate, where he was a major contributor to an early RISC microprocessor design called MIPS-X, one of the first microprocessors with an on-chip instruction cache and the root of many concepts used in processors today. His research interests include high performance computer architectures, reconfigurable computing, embedded and application-specific processors, and field-programmable gate array architectures and applications.

Paul was the Program Chair for the 2008 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA 2008), the premier conference for FPGAs, and General Chair for FPGA 2009. In 2011, he was the Program Chair for the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2011), the main conference for the reconfigurable computing area. He was the FCCM 2012 General Chair. In addition, Paul is on the technical program committee for the four main FPGA conferences: FPGA, FCCM, FPL, and FPT.
Christoph Hagleitner, IBM Research

Hyperscale FPGAs for HPC and Cloud [slides]

The performance, power efficiency, and reconfigurability of contemporary FPGAs are attractive features for HPC and cloud applications. The slowdown of Moore's law and the lack of viable alternatives to CMOS technology make it increasingly difficult to achieve substantial improvements from one generation to the next. Despite these attractive features, FPGAs have not (yet) been able to establish themselves as credible alternatives to the dominant two-socket server systems. In this presentation I will describe two research projects, CloudFPGA and Supervessel, that address the scalability and system integration issues that have limited progress so far.

Session 4: Lightning Talks 2
Chair: TBD
Tobias Kenter and Christian Plessl, "Microdisk Cavity FDTD Simulation on FPGA using OpenCL" [slides]
Vito Giovanni Castellana, Marco Minutoli, Antonino Tumeo, Marco Lattuada and Fabrizio Ferrandi, "Enabling Design Automation for Data Analytics"
Lukas Sommer, Julian Oppermann and Andreas Koch, "C-based Synthesis of Area-Efficient Accelerators for OpenMP Worksharing Loops" [slides]

Poster Session 2
Petr Kaštovský, Pavel Benácek and Viktor Puš, "P4-to-FPGA: High Performance Reconfigurable Networking" [poster]
Sajish Chandrababu, Yogindra Abhyankar, Sarun O. S. and Subrahmanya C. R., "Accelerated processing for modern radio telescopes using a combination of KNL and FPGA" [poster]
Kentaro Sano, Tomohiro Ueno, Daichi Tanaka and Satoru Yamamoto, "High-Performance Fluid Simulation using Multiple FPGAs with Bandwidth-Compressed Links" [poster]
Andrew Kongs, Kyle Cook, John Hale, Michael Frohlich and Peter Hawrylak, "Experimental Heterogeneous Computing with OpenCL"
Abhilash Devalapurarajagopala and Ron Sass, "The Waverun Programming model for FPGAs" [poster]