GPU Club: GPU, FPGA & Cloud
Software for GPUs inc. compiler/directives, maths libs & tools (debuggers and profilers)
10:00 (prompt) -11:30, Tues 25 Sept
Lecture Theatre 1.4, School of Computer Science, Kilburn Building
Brief update from ITS about compute/data facilities available to UoM researchers
Christos Delivorias, "Financial Modelling Case Studies on GPU, Cloud & FPGA Implementations"
Case Studies in Acceleration of the Heston Stochastic Volatility Financial Engineering Model:
GPU, Cloud and FPGA Implementations
Here we present a comparison of the performance of the Heston stochastic volatility model on different acceleration platforms. The implementation used quasi-random variates, generated with the NAG library, to reduce variance, together with the Quadratic Exponential discretisation scheme.
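For readers unfamiliar with the Quadratic Exponential (QE) scheme, the following is a minimal sketch of a single QE update of the Heston variance process, written from the commonly published form of the scheme. It is not the speaker's MATLAB code. The function names, the parameter names (`kappa` for mean-reversion speed, `theta` for long-run variance, `xi` for volatility of variance) and the switching threshold of 1.5 are assumptions made for illustration, and Python's pseudo-random generator stands in for the NAG quasi-random variates used in the project.

```python
import math
import random
from statistics import NormalDist

PSI_C = 1.5  # assumed switching threshold between the two QE branches
_STD_NORMAL = NormalDist()


def conditional_mean(v, kappa, theta, dt):
    """Exact conditional mean of the variance after a step of length dt."""
    return theta + (v - theta) * math.exp(-kappa * dt)


def qe_variance_step(v, u, kappa, theta, xi, dt):
    """Advance the variance v by dt using one uniform u in the open interval (0, 1).

    Assumes kappa > 0, theta > 0 and xi > 0.
    """
    e = math.exp(-kappa * dt)
    m = theta + (v - theta) * e                       # conditional mean
    s2 = (v * xi ** 2 * e * (1.0 - e) / kappa
          + theta * xi ** 2 * (1.0 - e) ** 2 / (2.0 * kappa))  # conditional variance
    psi = s2 / m ** 2
    if psi <= PSI_C:
        # Quadratic branch: v_next = a * (b + Z)^2 with Z standard normal.
        inv = 2.0 / psi
        b2 = inv - 1.0 + math.sqrt(inv) * math.sqrt(inv - 1.0)
        a = m / (1.0 + b2)
        z = _STD_NORMAL.inv_cdf(u)
        return a * (math.sqrt(b2) + z) ** 2
    # Exponential branch: point mass at zero plus an exponential tail.
    p = (psi - 1.0) / (psi + 1.0)
    beta = (1.0 - p) / m
    if u <= p:
        return 0.0
    return math.log((1.0 - p) / (1.0 - u)) / beta


def sample_mean(v, kappa, theta, xi, dt, n=20000, seed=1):
    """Average n QE draws; this should be close to conditional_mean(...)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # keep u strictly inside (0, 1)
            u = rng.random()
        total += qe_variance_step(v, u, kappa, theta, xi, dt)
    return total / n
```

Because each update consumes a single uniform number, the pseudo-random draws in `sample_mean` could in principle be replaced by points from a quasi-random sequence, which is the variance-reduction idea the abstract describes.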
The main implementation of the model was in MATLAB, which was then ported to the GPU and Techila platforms. The FPGA code was more native to the platform and was developed in C++/Java. The model was tested on an Intel CPU, a Techila grid server hosted on Microsoft’s Azure cloud, a GPU node hosted by Boston Ltd, and an FPGA node hosted by Maxeler Technologies Ltd. Timing data was collected and compared against the CPU baseline to quantify the acceleration achieved on each platform.
The project clearly showed that the closer the implementation is to the machine level, the better the acceleration achieved, and that for FPGAs and GPUs, the more fully the pipes/cores are utilised, the better the performance. The best results were achieved on the FPGAs with the Maxeler DataFlow platform, whereas the GPU code was ported using the Parallel Computing Toolbox in MATLAB.
The handicap of the specialised development approach, though, is the high friction and niche specialisation involved in implementing such code. Time and monetary constraints also need to be taken into account when making a final assessment.
Time limitations during this project only allowed the existing code to be ported to run on the GPU, rather than rewritten natively. This reflects the scenario in which a researcher adapts existing code to parallelisation platforms. Future work will include a CUDA implementation of the model for the GPUs.
Christos is affiliated with:
Scottish Widows Investment Partnership 60 Morrison Street, Edinburgh
University of Edinburgh, JCM Building, Kings Buildings, Edinburgh
Discussion Points From Meeting
Please read these in combination with the slides
- Christos explained that some timings were taken during his three-month MSc project, so it may be useful, for example, to normalise against a modern CPU and additionally to give information per device and the percentage of peak performance achieved;
- TCO (total cost of ownership) is a key metric for many businesses and researchers, so it would be an interesting project to investigate further how to measure it in a sensible, portable manner
Christos gave a demo using an MS Azure Large Instance; 365 cores were seen to be in use. Total run time (for 10x the example size used on CSF & GPU) was 14.7 secs, including overheads to the cloud
Running the example on a CSF NVIDIA card took 8.5 secs, of which apparently half is spent generating random numbers. This high RNG cost is a known issue for GPUs. NAG will be releasing an RNG for NAG Toolkit for MATLAB (no timescale given). Porting to Techila seemed to be a matter of swapping "for" for "cloudfor" (and "end" for "cloudend"), with a couple of "%cf:" directives controlling the number of workers etc.
- Several options for optimising the MATLAB code were noted, including compiling to MEX and then linking with the NAG RNG library
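The two demo timings quoted above were taken at different problem sizes, so they cannot be compared directly. The sketch below shows one rough way to put them on a common footing. It assumes that the Azure run used 10 times the example size of the GPU run, and that run time scales linearly with problem size. The notes confirm neither point, and the function names are illustrative only. The result should therefore be read as an order-of-magnitude indication, not a benchmark.

```python
def time_per_unit(run_time_s, size_multiple):
    """Run time divided by problem-size multiple (assumes linear scaling)."""
    return run_time_s / size_multiple


def relative_speed(baseline_s, other_s):
    """How many times faster `other` is than `baseline` on the same workload."""
    return baseline_s / other_s


# Timings quoted in the meeting notes.
azure_per_unit = time_per_unit(14.7, 10)  # Azure run, 10x example size, incl. cloud overheads
gpu_per_unit = time_per_unit(8.5, 1)      # CSF NVIDIA card, 1x example size

ratio = relative_speed(gpu_per_unit, azure_per_unit)
print(f"Azure: {azure_per_unit:.2f} s per unit of work")
print(f"GPU:   {gpu_per_unit:.2f} s per unit of work")
print(f"Azure / GPU relative speed under these assumptions: {ratio:.1f}x")
```

The same helper could be used to normalise against a modern CPU baseline, as suggested in the first discussion point, once such a timing is available.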
Mr. Christos Delivorias is a Quantitative Analyst at Scottish Widows Investment Partnership (SWIP). He has a software and optimisation background and has gained a BSc in Computer Science from the University of Strathclyde, and an MSc in Operational Research with Risk Management from the University of Edinburgh. Christos currently works with the Analytics and Improvement Department at SWIP, designing and implementing derivatives models.
GPU Club Meetings
25th Nov 2014: 1.30-3.30pm, 2.220 University Place
Tues 26 Nov 2013: 2-3pm, B8 George Begg. Christian Obrecht on GPU implementations of fluid dynamics simulations on regular meshes: some recent advances
Weds 13 Nov: 2pm, Univ Place. John Michalakes (NOAA) and Craig Davies (Maxeler Dataflow)
Weds 30 Oct: Intermediate CUDA training run by NVIDIA
Tues 29 Oct: 2pm, Univ Place, NVIDIA and Stephen Longshaw.
Weds 2 Oct 2013 - Large Scale Optimization and High Performance Computing for Asset Management, Daniel Egloff (QuantAlea)
Tuesday 23 July MathWorks (GPUs for MATLAB) and NVIDIA (GPUs & CUDA)
Thur 2 May 2013 Lessons from GTC and on using the Intel Xeon Phi
Mon 10 Dec 2012 Dataflow and MultiGPU SPH
Tues 25 Sept Seminar on implementing financial models on GPUs, FPGAs and in the Cloud
Mon 15 Oct: OpenCL training from UoM IT Services
Thurs 25 Oct: Hands-on "OpenACC" workshop run by Cray UK Ltd.
17 May 2012 Speakers on healthcare policy simulation in OpenCL, MHD algorithms in CUDA, Tridiagonal Solvers in CUDA
20 April 2012 Francois Bodin, CAPS: "Programming Heterogeneous Many-Cores using Directives" using HMPP
23 March 2012 Roko Grubisic, ARM: "Embedded Computer Graphics and ARM Mali GPUs"
02 March 2012 Speakers on profiling, sparse matrix algebra and atmospheric chemistry
09 Dec 2011 MPI and GPUs, directives-based programming, FPGA and GPU comparison, ideas for 2012
30 Sept 2011 GPU programming in FORTRAN, multiple GPUs, image reconstruction
15 July 2011 Jack Dongarra key note on Emerging Technologies
18 Mar 2011 OpenCL, debugging and profiling tools, porting C to CUDA, real time analysis
26 Nov 2010 biological MD, smoothed particle hydrodynamics, Monte Carlo financial models, Markov models