Wednesday 18 September 2013

OpenCL: Unleash the Power of Your GPU

      Hi everyone! In this series of posts, I will introduce you to OpenCL, a cutting-edge API that is being actively developed by technology giants like AMD, Nvidia, Intel and IBM right now. We shall use Python to get introduced to the OpenCL environment, and you will need to revise your C programming basics to write kernel programs. In this post we shall get familiar with the terminology of OpenCL. Why wait!? Let us get started!

     According to Moore's law, the number of transistors on a chip doubles roughly every two years, and it is no surprise that the prediction held up for decades. When we look back a decade, we realise how quickly computing performance has grown. But one must admit that we are approaching the fundamental limits on the size of transistors! We cannot keep improving computing performance simply by packing more transistors onto the same chip! That is precisely where parallel computing offers a ray of hope, and OpenCL is the brightest source of hope right now!


    The real power of OpenCL lies in kernel programs, which are executed in parallel and can run on multiple devices at once. Suppose you have an AMD processor and an AMD graphics card: you may use OpenCL to improve performance by making proper use of the computing power of both. For SIMD (Single Instruction, Multiple Data) workloads like graphics rendering, image processing etc., OpenCL is very handy! The main reasons why OpenCL should be your choice for parallel computing are:


  1. CPU and GPU portability: OpenCL is a very flexible API that allows you to switch between devices for better performance.
  2. OpenGL and DirectX compatibility: OpenCL can interoperate with the graphics APIs for better performance in graphics rendering.
  3. It can even run on a computer without an OpenCL-enabled GPU! Your CPU can still run it, provided an OpenCL runtime for the CPU is installed.


    If you are a hard-core programmer not satisfied with the execution speed of your program, or a graphics developer who wants to improve rendering performance, or an OpenGL/DirectX developer trying to improve performance, or if (I can go on)..., you are in the damn right place!


    Let us dive right in, and get introduced to some basic terms in OpenCL.


  • Device : A device represents a computing resource in the computer that is available to OpenCL, for example a GPU.
  • Kernels : Kernels are the programs that are executed on the device. A kernel basically contains the main instructions that you want to accelerate. It is written in C (with some restrictions and extra keywords). A proper understanding of how OpenCL works helps the coder write efficient kernels.
  • Context : An OpenCL context is the environment in which kernels are executed and data is transferred. Suppose you are adding two vectors: the device, the memory holding the vectors and the kernel that adds them all belong to one context, which the host program creates.
  • Command Queue : Within a specific context, command queues are created at runtime so that the host can instruct a device to carry out operations.
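
To see how these four terms fit together, here is a minimal sketch of the vector addition mentioned above. It assumes the PyOpenCL and NumPy packages are installed and that at least one OpenCL device is available; none of that comes from this post, so if anything is missing the sketch falls back to plain Python. The names `KERNEL_SRC` and `vector_add` are made up for illustration.

```python
# Kernel: OpenCL C source, compiled at runtime for the chosen device.
KERNEL_SRC = """
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *out)
{
    int i = get_global_id(0);   /* each work-item handles one element */
    out[i] = a[i] + b[i];
}
"""


def vector_add(a, b):
    """Add two equal-length lists of floats, on an OpenCL device if possible."""
    try:
        import numpy as np
        import pyopencl as cl

        # Context: created for some available device, without prompting.
        ctx = cl.create_some_context(interactive=False)
        # Command queue: how the host sends work to that device.
        queue = cl.CommandQueue(ctx)

        a_np = np.asarray(a, dtype=np.float32)
        b_np = np.asarray(b, dtype=np.float32)

        # Buffers: copies of the input data living in the context.
        mf = cl.mem_flags
        a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
        b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
        out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)

        # Build the kernel and enqueue it with one work-item per element.
        program = cl.Program(ctx, KERNEL_SRC).build()
        program.add(queue, a_np.shape, None, a_buf, b_buf, out_buf)

        # Copy the result back from the device to the host.
        out_np = np.empty_like(a_np)
        cl.enqueue_copy(queue, out_np, out_buf)
        return [float(x) for x in out_np]
    except Exception:
        # Assumption for this sketch only: with no PyOpenCL, no NumPy or
        # no usable device, fall back to plain Python.
        return [x + y for x, y in zip(a, b)]


print(vector_add([1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
```

Catching every exception is far too broad for real code; it is only there so that the sketch runs on any machine.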
       I shall explain some more basics with a simple analogy, the one that made me understand what OpenCL actually is! Here we go!
   
      Consider a building under construction. Workers are spread across the site, and unless they are organised properly, it is a mess. Now here comes the saviour, the contractor! He arranges the workers into groups and assigns work to every group. Every group does similar work, but at a different place. Each worker has to fetch equipment from the main store at the beginning. The contractor knows every work group, and he can also identify each worker individually. Now we are ready to learn more! Keep an eye on the words in quotes!
   
      The building construction is like the OpenCL context. Every individual worker is known as a 'Work-Item'. The groups formed are called 'Work-Groups', each containing Work-Items. The contractor knows each Work-Group by an ID. Every worker has a 'Global ID', unique across the whole site, and a 'Local ID' specific to the Work-Group. A worker fetching equipment from the store is analogous to a Work-Item fetching data from 'Global Memory'. Work-Items have some equipment of their own in their 'Private Memory'. Work-Groups have some equipment shared within the group, called 'Local Memory'. Finally, the work represents your kernel.
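
      If you want to see the IDs in action, here is a small pure-Python sketch that plays the contractor. It is not OpenCL itself: it runs the work-items one after another instead of in parallel, and the names `launch`, `add_kernel` and `record_ids` are made up for illustration. It assumes a one-dimensional launch where the global size is a multiple of the work-group size.

```python
def launch(kernel, global_size, local_size, *args):
    """Call `kernel` once per work-item, mimicking a 1-D OpenCL launch."""
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    for group_id in range(global_size // local_size):   # each Work-Group
        for local_id in range(local_size):               # each Work-Item in it
            # The Global ID follows from the group ID and the Local ID.
            global_id = group_id * local_size + local_id
            kernel(global_id, local_id, group_id, *args)


def add_kernel(global_id, local_id, group_id, a, b, out):
    """The 'work': every work-item adds exactly one pair of elements."""
    out[global_id] = a[global_id] + b[global_id]


def record_ids(global_id, local_id, group_id, log):
    """Write down who is who, so the ID scheme can be inspected."""
    log.append((global_id, group_id, local_id))


a = [1, 2, 3, 4, 5, 6]
b = [10, 20, 30, 40, 50, 60]
out = [0] * len(a)
launch(add_kernel, len(a), 3, a, b, out)   # 6 work-items in 2 groups of 3
print(out)

ids = []
launch(record_ids, 6, 3, ids)
for global_id, group_id, local_id in ids:
    print("global", global_id, "= group", group_id, "local", local_id)
```

      Here the lists `a`, `b` and `out` play the role of Global Memory, since every work-item can reach them, while the variables inside `add_kernel` are like a worker's Private Memory.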
    
   
   
     Now guess who OpenCL is!? I am sure you have guessed it right! Check out the next post about creating a basic OpenCL environment in Python!

      You are most welcome to leave a comment below!

