Nvidia CUDA


CUDA

You most likely have some idea of what CUDA is all about or you wouldn't be here, but for a general introduction see:

http://en.wikipedia.org/wiki/CUDA

http://www.nvidia.com/object/fermi_architecture.html

In short, CUDA provides a language (CUDA C), a compiler, an SDK and a runtime environment that let you write general-purpose C code which executes in a massively parallel manner on NVIDIA GPUs. It is quite easy to write code for: it requires only basic or intermediate proficiency in C and no knowledge of OpenGL or other graphics APIs whatsoever. If you have an NVIDIA GeForce 8 series or newer card, including many mobile chipsets, it most likely supports CUDA, and you can start writing parallel code for the device more easily than you may realize. It is sitting there quietly, waiting for you to dive in and take advantage of it.

OpenCL

An alternative to Nvidia's CUDA is OpenCL. Newer Nvidia GPUs also support OpenCL when the correct drivers are installed:

http://www.khronos.org/opencl/

Nvidia is one of the companies developing OpenCL, and you can find more information and drivers at the Nvidia OpenCL page:

http://www.nvidia.com/object/cuda_opencl_new.html

AMD also supports OpenCL and describes the implementation of the language at:

http://www.amd.com/us/products/technologies/stream-technology/opencl/pages/opencl-intro.aspx

A short tutorial in OpenCL can be found here.

(You can use the GPU Caps utility to see if your GPU supports OpenCL.)

Tutorials

Here is a good CUDA intro tutorial

There is also the tutorial series "Supercomputing for the Masses" by Rob Farber (a senior scientist at Pacific Northwest National Laboratory) available here and also described in the Linux Journal article.

A 2011 Linux Journal article by Alejandro Segovia is available here

Of course, there is also the Official Programming Guide as a definitive reference.

I would also recommend CUDA by Example: An Introduction to General Purpose GPU Programming by Jason Sanders and Edward Kandrot as an excellent way to get started. I am reading this now and find it quite helpful.

Additionally, there are, of course, the SDK code examples, which you can download and get started with immediately.

CUDA Toolkit 4.0 Update

Documentation is now at:

http://developer.nvidia.com/nvidia-gpu-computing-documentation

Release notes are now here

Getting Started Guide is now here

Earlier versions, release notes, etc. are at http://developer.nvidia.com/cuda-toolkit-archive

CUDA SDK 3.2 Update

UPDATE: I just did a fresh install of Suse 11.3 64-bit and the new CUDA SDK 3.2 and found the following issues:

  • Unfortunately, the CUDA SDK 3.2 release notes do not include the same mention of the missing symlinks as do the 3.1 release notes. Follow the procedure in the 3.1 notes, which is also described below.
  • While I had hoped the new CUDA SDK 3.2 would be compatible with gcc 4.5, it is not, as described in this Nvidia Forum thread. You will still need to install and configure gcc43 and gcc43-c++ as alternatives, as described below in the section GCC Version Issues.
  • Several X11 development libraries are required. On Suse 11.3 install these and their dependencies with the following:
zypper install libXi6-devel libXmu-devel xorg-x11-libXmu-devel xorg-x11-libXext-devel
  • For RHEL the packages are:
yum install libXi-devel libXmu-devel libXext-devel freeglut freeglut-devel

Make Errors:

The error:

86_64-suse-linux/bin/ld: cannot find -lX11

is caused by a missing xorg-x11-libXext-devel package.

Errors like:

iomanip(64): error: expected an expression

are likely caused by an incompatible version of gcc (such as 4.5). This is best resolved by installing gcc43 as an alternative (below) or by applying the workaround described in the Nvidia post:

A simple workaround to compiling the rest of the SDK would be to remove the folder Interval from the SDK's src folder and put it somewhere in your home directory where you will remember it later.

The complete list of programs that don't compile from the 3.2 SDK with GCC 4.5.1 are:
Interval
SobelFilter
FunctionPointers

Errors regarding:

AC_PROG_LIBTOOL

can be solved by installing libtool.

The remaining setup for SDK 3.2 is as given for the previous version.
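If you want to check a failed build against the errors listed above, something like the following could work. This is only a sketch: suggest_fix and scan_build_log are hypothetical helper names, not part of the SDK, and the patterns cover only the three errors described in this section.

```shell
#!/bin/bash
# Sketch: map a line from a failed SDK 3.2 build to the fix described in
# this section. suggest_fix and scan_build_log are hypothetical helpers.
suggest_fix() {
  case "$1" in
    *"cannot find -lX11"*)
      echo "install xorg-x11-libXext-devel" ;;
    *"error: expected an expression"*)
      echo "switch to gcc 4.3 (gcc 4.5 is incompatible)" ;;
    *AC_PROG_LIBTOOL*)
      echo "install libtool" ;;
    *)
      echo "no known fix" ;;
  esac
}

# Read a build log on stdin and print a suggestion for each recognized line.
scan_build_log() {
  local line fix
  while IFS= read -r line; do
    fix=$(suggest_fix "$line")
    if [ "$fix" != "no known fix" ]; then
      echo "$fix"
    fi
  done
}
```

With the functions loaded, a build can be piped through it with: make 2>&1 | scan_build_log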

Install / Compile Issues

NOTE: Please see Nvidia-Settings for information on the Nvidia driver itself, module error codes, install options, etc.

Firstly, follow the very detailed Getting Started Linux guide to get your development environment setup. The documentation is quite good and provides all the basic info on installing the Nvidia driver, the CUDA Toolkit and SDK and setting up your paths for locating the binaries and libraries. All required files are available on the CUDA downloads page.

Note: Ensure your NVIDIA driver meets or exceeds the version required by the SDK. The version in your distro's restricted driver (non-OSS) repository may not be sufficient. I recommend installing the NVIDIA driver from the CUDA downloads area to prevent any potential trouble.
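A quick way to compare the loaded driver against a required version is sketched below. The helper names are hypothetical, the 256.35 minimum is only a placeholder (take the real number from the release notes of your SDK), and the parsing assumes the version is the first dotted number on the "Kernel Module" line of /proc/driver/nvidia/version. It also relies on the -V option of GNU sort.

```shell
#!/bin/bash
# Sketch: check that the installed NVIDIA driver meets a required minimum.
# version_ge and installed_driver_version are hypothetical helpers.
version_ge() {
  # True when $1 >= $2 under version ordering (needs GNU sort -V).
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumption: the driver version is the first dotted number on the
# "Kernel Module" line of the given file.
installed_driver_version() {
  grep -i 'Kernel Module' "${1:-/proc/driver/nvidia/version}" 2>/dev/null |
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

required="256.35"   # assumed minimum; use the value from your release notes
installed=$(installed_driver_version || true)
if [ -z "$installed" ]; then
  echo "NVIDIA driver not loaded"
elif version_ge "$installed" "$required"; then
  echo "driver $installed is new enough"
else
  echo "driver $installed is older than $required"
fi
```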

Missing Symlinks

(Applies to Red Hat, CentOS, OpenSUSE, etc.)

It is important to read the SDK 3.1 release notes - really, stop now and read them. You will find under Section III. (b) Known Issues on CUDA SDK for Linux that a few key libraries often need symlinks created so the linker can find them. In my case this was required for libglut and libGLU. If they are missing you will see errors such as:

/usr/bin/ld: cannot find -lglut
/usr/bin/ld: cannot find -lGLU

In either case, create the required symlinks:

#ln -s /usr/lib/libglut.so.3 /usr/lib/libglut.so
#ln -s /usr/lib/libGLU.so.1 /usr/lib/libGLU.so
#ln -s /usr/lib/libX11.so.6 /usr/lib/libX11.so
(Or /usr/lib64/ if running 64-bit)

If you do not have write access to /usr/lib, create the symlinks in another location and add that directory with -L in the Makefile (or add the directory holding the symlinks to /etc/ld.so.conf and run ldconfig).
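For the no-write-access case, the symlinks can be collected in a private directory. The sketch below assumes a directory of your own choosing; link_dev_lib is a hypothetical helper name, and it simply derives the unversioned .so name from the versioned file.

```shell
#!/bin/bash
# Sketch: create an unversioned .so symlink in a private directory when
# /usr/lib is not writable. link_dev_lib is a hypothetical helper.
link_dev_lib() {
  # $1 = versioned library (e.g. /usr/lib/libglut.so.3)
  # $2 = directory that will hold the symlink
  local src="$1" dir="$2" name
  name=$(basename "$src")
  name="${name%%.so*}.so"        # libglut.so.3 -> libglut.so
  mkdir -p "$dir" && ln -sf "$src" "$dir/$name"
}
```

For example, link_dev_lib /usr/lib/libglut.so.3 "$HOME/cuda-links" creates $HOME/cuda-links/libglut.so, after which -L$HOME/cuda-links can be added to the linker flags in the Makefile.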

If you get a linking error: cannot find -lcuda then create this additional symlink (your library version may vary slightly):

#ln -s /usr/lib/libcuda.so.256.35 /usr/lib/libcuda.so

Compiling Order - Do what the Guide Says

Being an eager beaver, I decided to just compile the deviceQuery program first to make sure everything was working. This was a bad decision as it, like many of the other samples, requires SDK libraries (such as cutil) which had not been built yet. As a result, I was getting compile errors such as:

paracelsus@Callandor:~/NVIDIA_GPU_Computing_SDK/C/src/deviceQuery> make
deviceQuery.cpp:126:11: warning: extra tokens at end of #else directive
deviceQuery.cpp:135:11: warning: extra tokens at end of #else directive
/usr/lib/gcc/i586-suse-linux/4.5/../../../../i586-suse-linux/bin/ld: cannot find -lcutil_i386
collect2: ld returned 1 exit status
make: *** [../../bin/linux/release/deviceQuery] Error 1

I resolved this by compiling in this order:

paracelsus@Callandor:~/NVIDIA_GPU_Computing_SDK/C/common> make
paracelsus@bob:~/NVIDIA_GPU_Computing_SDK/shared> make
paracelsus@Callandor:~/NVIDIA_GPU_Computing_SDK/C/src/deviceQuery> make

The real solution though, oddly enough, is to do what the Getting Started Guide said: You should compile them all by changing to NVIDIA_GPU_Computing_SDK/C in the user's home directory and typing make. The resulting binaries will be installed under the home directory in NVIDIA_GPU_Computing_SDK/C/bin/linux/release

So, just compile them all and avoid such problems:

paracelsus@Callandor:~/NVIDIA_GPU_Computing_SDK/C> make
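If you really only want one or two samples, the manual order shown earlier can be scripted so that it stops at the first failure. This is a sketch only; build_in_order is a hypothetical helper, and the MAKE variable is there so the loop can be dry-run with another command.

```shell
#!/bin/bash
# Sketch: run make in several directories in dependency order and stop
# at the first failure. build_in_order is a hypothetical helper; set
# MAKE to another command (e.g. true) for a dry run.
build_in_order() {
  # $1 = SDK root, remaining arguments = subdirectories in build order
  local root="$1" dir
  shift
  for dir in "$@"; do
    echo "building $dir"
    "${MAKE:-make}" -C "$root/$dir" || {
      echo "build failed in $dir" >&2
      return 1
    }
  done
}
```

The order I used above would be: build_in_order ~/NVIDIA_GPU_Computing_SDK C/common shared C/src/deviceQuery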

GCC Version Issues

(For OpenSUSE 11.3, Fedora 13, Ubuntu 10.04, etc.)

If you are running a distro newer than the latest release of the CUDA Toolkit and SDK, then your versions of gcc and glibc may be newer and not compatible. This is generally resolvable by installing the earlier versions and setting up your system to allow easy selection of which version to compile with. On my OpenSuse 11.3 install I used the following links to implement the solution which follows:

http://lukas.ahrenberg.se/archives/154

http://forums.nvidia.com/index.php?showtopic=157513

http://forums.nvidia.com/index.php?showtopic=170454

http://forums.nvidia.com/lofiversion/index.php?t50404.html

bob:~ # gcc --version
gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292]  ### <== Too new, does not allow code examples to compile!

(Use Yast to install gcc43 gcc43-c++ gcc43-info and any required dependencies. My packages are as follows:)

bob:~ # rpm -qa | grep gcc
gcc-4.5-4.2.i586
gcc45-info-4.5.0_20100604-1.12.noarch
gcc43-4.3.4_20091019-3.1.i586
libgcc45-4.5.0_20100604-1.12.i586
gcc-info-4.5-4.2.i586
gcc45-4.5.0_20100604-1.12.i586
gcc43-c++-4.3.4_20091019-3.1.i586
libstlport_gcc4-4.6.2-6.1.i586
gcc-c++-4.5-4.2.i586
gcc43-info-4.3.4_20091019-3.1.i586
gcc45-c++-4.5.0_20100604-1.12.i586

(Now, set up both versions so you can easily select which one is the default:)

bob:~ #sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.3 60 --slave /usr/bin/g++ g++ /usr/bin/g++-4.3
bob:~ #sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.5 40 --slave /usr/bin/g++ g++ /usr/bin/g++-4.5
bob:~ # update-alternatives --config gcc

There are 2 alternatives which provide `gcc'.

  Selection    Alternative
-----------------------------------------------
*+        1    /usr/bin/gcc-4.3
          2    /usr/bin/gcc-4.5

Press enter to keep the default[*], or type selection number: 1
Using '/usr/bin/gcc-4.3' to provide 'gcc'.
bob:~ # gcc --version
gcc (SUSE Linux) 4.3.4 [gcc-4_3-branch revision 152973]  ## <== You are now set to use CUDA with SDK version 3.1
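To catch the wrong compiler before a long build, a version check along these lines might help. It is a sketch: gcc_supported is a hypothetical helper, and the default ceiling of 4.3 is an assumption based only on the versions that worked and failed for me above, so check the release notes for the toolkit you are using.

```shell
#!/bin/bash
# Sketch: decide whether a gcc version is old enough for the SDK.
# gcc_supported is a hypothetical helper; the 4.3 default ceiling is an
# assumption taken from the versions tried in this section.
gcc_supported() {
  # $1 = gcc version (e.g. 4.3.4)
  # $2 = newest supported major.minor (default 4.3)
  local ver="$1" max="${2:-4.3}"
  local major minor rest max_major max_minor
  major=${ver%%.*}
  rest=${ver#*.}
  minor=${rest%%.*}
  max_major=${max%%.*}
  max_minor=${max#*.}
  [ "$major" -lt "$max_major" ] ||
    { [ "$major" -eq "$max_major" ] && [ "$minor" -le "$max_minor" ]; }
}
```

Usage, with the functions loaded: gcc_supported "$(gcc -dumpversion)" || echo "select gcc-4.3 with update-alternatives --config gcc"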

No X Windows Running?

If you are installing on a server and do not have X running, you will need to ensure the required device files are created. The Getting Started Guide provides the following script to do this. You can save it as nvidia_setup.sh and then invoke it from /etc/rc.local during boot:

#!/bin/bash
/sbin/modprobe nvidia
if [ "$?" -eq 0 ]; then
  # Count the number of NVIDIA controllers found.
  NVDEVS=$(lspci | grep -i NVIDIA)
  N3D=$(echo "$NVDEVS" | grep "3D controller" | wc -l)
  NVGA=$(echo "$NVDEVS" | grep "VGA compatible controller" | wc -l)

  # Create one device node per controller, numbered from 0.
  N=$(expr $N3D + $NVGA - 1)
  for i in $(seq 0 $N); do
    mknod -m 666 /dev/nvidia$i c 195 $i
  done

  mknod -m 666 /dev/nvidiactl c 195 255

  # Set GPUs to persistent mode so driver stays loaded
  nvidia-smi -pm 1
else
  exit 1
fi

MultiGPU Systems

If you have multiple GPUs, you can select which ones are available to the runtime environment with:

$ export CUDA_VISIBLE_DEVICES=0,1,3

This hides GPU 2 from CUDA applications started from that shell; the remaining devices are renumbered consecutively from 0 inside the application. It does not change what nvidia-smi -L shows, which still lists every GPU.
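If you often need to hide a single GPU, the list can be generated rather than typed. A minimal sketch, assuming you already know how many GPUs are installed; visible_devices_excluding is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: build a CUDA_VISIBLE_DEVICES value that hides one GPU.
# visible_devices_excluding is a hypothetical helper name.
visible_devices_excluding() {
  # $1 = total number of GPUs, $2 = index of the GPU to hide
  local total="$1" skip="$2" i=0 list=""
  while [ "$i" -lt "$total" ]; do
    if [ "$i" -ne "$skip" ]; then
      list="${list:+$list,}$i"   # append with a comma unless list is empty
    fi
    i=$((i + 1))
  done
  echo "$list"
}

# Four GPUs, hide GPU 2: same value as the export shown above.
export CUDA_VISIBLE_DEVICES="$(visible_devices_excluding 4 2)"
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"   # prints CUDA_VISIBLE_DEVICES=0,1,3
```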

OS X

Some OS X OpenCL documentation is here

CUDA vs OpenCL

http://www.infoworld.com/d/developer-world/cuda-and-opencl-265

GPU Caps Viewer

This utility polls GPUs and returns the device capabilities.

This utility can run under Wine on Linux as well, at least as of version 1.7:

http://www.geeks3d.com/20090414/gpu-caps-viewer-170-available-with-cuda-support/

http://www.ozone3d.net/gpu_caps_viewer/index.php#screens

Now what?

So what else can you do with CUDA? What applications exist out there to sink the teeth of your GPU into? Well, the list is growing all the time, but I've started a page where I'll be adding some as I discover them, and you can find it at CUDA Applications.

Check out CUDA Data Parallel Primitives Library - CUDPP

Explore the CUDA Performance Profiler

VirtualGL

If you wish to view OpenGL simulations remotely, you will quickly discover that OpenGL is actually rendered on the client. Although the computations may be done on the remote system, all rendering commands are sent to the client to be displayed locally, which essentially kills performance. VirtualGL solves this by rendering on the server, storing the result in a pixel buffer and transferring that to the client. It does so in a clever way: it preloads a library into the application being run, which intercepts the OpenGL calls and redirects them to the server's local GPU.

See the official VirtualGL site for complete details. There is also some useful information in this Sun documentation and this Nvidia forum thread.

(The newest version does not require libjpeg-turbo. Several steps are required to configure X, so see the install guide before trying the commands below.)

To start a VGL forwarded SSH connection from the client:

/opt/VirtualGL/bin/vglconnect -force paracelsus@10.100.10.48

Then, start the OpenGL app on the server with vglrun:

paracelsus@bob:~> cd NVIDIA_GPU_Computing_SDK/C/bin/linux/release/
paracelsus@bob:~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release> /opt/VirtualGL/bin/vglrun ./oceanFFT

Overclocking

Coolbits

You can enable overclocking via the nvidia-settings utility by simply adding an option to your xorg.conf Nvidia Device section:

   Option "Coolbits" "1"

Restart X and the nvidia-settings GUI will now have overclocking options.

http://www.overclockers.com/forums/showthread.php?t=605405

http://www.phoronix.com/scan.php?page=article&item=197&num=1
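To confirm the option is actually present before restarting X, a small check like this may be handy. It is a sketch: coolbits_value is a hypothetical helper, the default path /etc/X11/xorg.conf is an assumption about your system, and it only matches the exact spelling of the option shown above.

```shell
#!/bin/bash
# Sketch: print the Coolbits value configured in an xorg.conf file.
# coolbits_value is a hypothetical helper; the default path is an
# assumption, and only the exact spelling Option "Coolbits" is matched.
coolbits_value() {
  sed -n 's/^[[:space:]]*Option[[:space:]]*"Coolbits"[[:space:]]*"\([0-9]*\)".*/\1/p' \
    "${1:-/etc/X11/xorg.conf}" 2>/dev/null | head -n 1
}
```

With the function loaded: [ -n "$(coolbits_value)" ] || echo "Coolbits is not set"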

NVClock

http://www.linuxhardware.org/nvclock/

Provides basic overclocking capabilities, though newer cards may not be supported. (Unfortunately, the prospects of porting other overclocking tools such as EVGA's Precision utility are not showing much promise at this time. It does not run under Wine, and a Linux implementation is not planned per this post.)

The Qt version seems to be broken in ./configure, but the GTK version compiles fine. The GTK and command-line binaries are:

nvclock0.8b4> src/gtk/nvclock_gtk

nvclock0.8b4> src/nvclock

#!/bin/bash
## This will overclock the GPU to 300 and Memory to 400 – Change accordingly!
nvclock -b coolbits -n 300.000 -m 400.000

Language Bindings

Looking for more ways to leverage CUDA's capabilities from other languages? Try these resources:

Kappa Framework:

http://psilambda.com/

CUDA Library for R:

http://brainarray.mbni.med.umich.edu/brainarray/rgpgpu/

CUDA Library for IDL:

http://gpulib.blogspot.com/

PyCUDA:

http://mathema.tician.de/software/pycuda

Books

Programming Massively Parallel Processors: A Hands-on Approach

ISBN-13: 978-0123814722


CUDA by Example: An Introduction to General-Purpose GPU Programming

ISBN-13: 978-0131387683


Scientific Computing with Multicore and Accelerators

ISBN: 978-1-4398253-6-5
