AQUAgpusph

C++ ♥ OpenCL ♥ Python

Jose Luis Cercos Pita
Antonio Souto Iglesias
Leo Miguel Gonzalez

SPH

Meshless method
Lagrangian

A lot of neighbours!

3D example

$$\frac{h}{\Delta r} = 4$$

$$n_{neighs} \simeq \mathrm{int}\left(\frac{V}{\Delta r^3}\right) \simeq \mathrm{int}\left(\frac{4}{3} \pi \left(\frac{2 h}{\Delta r} \right)^3\right)$$

$$n_{neighs} \simeq 2145$$

$$flop_{neigh} \simeq 100$$

$$flop_{particle} \simeq 214500$$

$$n_{particles} \simeq 10000$$

$$\Delta t \simeq 10^{-4} \, \mathrm{s}$$ $$t_{sim} \simeq 10 \, \mathrm{s}$$

$$n_{t} \simeq 10^5$$

$$flop_{particle} \simeq 214500$$ $$n_{particles} \simeq 10000$$ $$n_{t} \simeq 10^5$$

$$flop \simeq 10^{14}$$

Intel Core i7 980 XE = $10^8$ FLOPS

$$t_{i7} \simeq 10^6 \, \mathrm{s} \simeq \color{red}{10 \, \mathrm{days}}$$

We want to use Python (and pyopencl)!!!

Python is the best thing happened to the computation science since...

THE PIZZA!
Pizza & Code

pysph


cdef void get_eigenvalues(double a[3][3], double *result) nogil:
    cdef double m = (a[0][0]+ a[1][1]+a[2][2])/3.0
    cdef double K[3][3]
    memcpy(&K[0][0], &a[0][0], sizeof(double)*9)
    K[0][0], K[1][1], K[2][2] = a[0][0] - m, a[1][1] - m, a[2][2] - m
    cdef double q = det(K)*0.5
    cdef double p = 0
    p += K[0][0]*K[0][0] + 2*K[1][2]*K[1][2]
    p += K[1][1]*K[1][1] + 2*K[0][2]*K[0][2]
    p += K[2][2]*K[2][2] + 2*K[0][1]*K[0][1]
    p /= 6.0
    cdef double pi = M_PI
    cdef double phi = 0.5*pi
    cdef double tmp = p**3 - q**2

    if q == 0.0 and p == 0.0:
        # singular zero matrix
        result[0] = result[1] = result[2] = m
        return
    elif tmp < 0.0 or fabs(tmp) < EPS: # eliminate roundoff error
        phi = 0
    else:
        phi = atan2(sqrt(tmp), q)/3.0
    if phi == 0 and q < 0:
        phi = pi

    result[0] = m + 2*sqrt(p)*cos(phi)
    result[1] = m - sqrt(p)*(cos(phi) + sqrt(3)*sin(phi))
    result[2] = m - sqrt(p)*(cos(phi) - sqrt(3)*sin(phi))

cpdef py_get_eigenvalues(double[:,:] m):
    '''Return the eigenvalues of symmetric matrix.
    '''
    res = numpy.empty(3, float)
    cdef double[:] _res  = res
    get_eigenvalues(&m[0][0], &_res[0])
    return res

THIS IS MORE C THAN PYTHON!!!

I said...

We want to use...

PYTHON!

Writing Python is better than C++
The OpenCL API is really ugly
C++ is faster than Python
OpenCL is faster than C++
Bypassing C++ with PyOpenCL is probably implying a hard to assume computational bureaucracy

“Computational bureaucracy: Overhead caused by a CPU moving memory while the actual computational device is idle”

Vicente Cuellar

Dogma

We want to get a C++ based code
C++ and the OpenCL API should be hidden to the user
If we want to manipulate the particles, OpenCL codes should be applied
Otherwise Python should be the king

A modular application will do the job!

Let's see a practical application

S.P. Singh, S. Mittal

Vortex-induced oscillations at low Reynolds numbers: Hysteresis and vortex-shedding modes

$$y(t) = A \, \sin \left( \frac{2 \pi}{T} t \right)$$ $$\Delta y(t) = y(t) - y(t + \Delta t)$$ $$\frac{dy}{dt}(t) = \frac{2 \pi}{T} A \, \cos \left( \frac{2 \pi}{T} t \right)$$ $$\frac{d^2y}{dt^2}(t) = -\left(\frac{2 \pi}{T}\right)^2 A \, \sin \left( \frac{2 \pi}{T} t \right)$$

Main.xml


(...)
    
    
    
(...)

Motion.xml

Create 3 scalar variables to control the motion
Before finishing the time step execute Motion.py and Motion.cl

Motion.py


import aquagpusph as aqua
import numpy as np

T = 4.0
A = 0.1
Y = 0.0
W = 2.0 * np.pi / T

def main():
    global Y
    t = aqua.get("t")
    y = A * np.sin(W * t)
    dy = y - Y
    dydt = W * A * np.cos(W * t)
    ddyddt = -W**2 * A * np.sin(W * t)
    aqua.set("dy", dy)
    aqua.set("dydt", dydt)
    aqua.set("ddyddt", ddyddt)
    Y = y
    return True

Motion.cl


#ifndef HAVE_3D
    #include "/home/pepe/SPH/Code/aquagpusph.cmake/resources/OpenCL/types/2D.h"
#else
    #include "/home/pepe/SPH/Code/aquagpusph.cmake/resources/OpenCL/types/3D.h"
#endif

__kernel void entry(const __global uint* iset,
                    __global vec* r,
                    __global vec* u,
                    __global vec* dudt,
                    float dy,
                    float dydt,
                    float ddyddt,
                    unsigned int N)
{
    unsigned int i = get_global_id(0);
    if(i >= N)
        return;
    if(iset[i] != 1){
        return;
    }

    r[i].y += dy;
    u[i].y = dydt;
    dudt[i].y = ddyddt;
}