CUDA

Great Reads

by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
by Maxim Kartavenkov
This article describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#.
by Nick Kopp
This article builds upon the earlier "High Performance Queries: GPU vs. PLINQ vs. LINQ", porting it to also support OpenCL devices and adding benchmarking so you can easily compare performance.
by Ryan S White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of your CUDA code.

Latest Articles

by Intel
This document demonstrates how a linear algebra Jacobi iterative method written in CUDA* can be migrated to the SYCL* heterogeneous programming language.
by Sergiu Ovidiu Oprea
This article is a hands-on look at the process of converting CUDA to SYCL.
by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
by Dhruv__Patel
In this article we compare and contrast SYCL and CUDA, and discuss how the oneAPI compiler can work with SYCL.

All Articles

3 Apr 2022 by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
16 Jul 2012 by Maxim Kartavenkov
This article describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#.
16 Sep 2013 by Nick Kopp
This article builds upon the earlier "High Performance Queries: GPU vs. PLINQ vs. LINQ", porting it to also support OpenCL devices and adding benchmarking so you can easily compare performance.
11 Jul 2020 by Ryan S White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of your CUDA code.
17 Sep 2012 by ObiWan_MCC
A C# SMTP server (receiver).
22 May 2013 by John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
2 May 2017 by Arthur V. Ratz
This article is a practical guide to using the Intel® Threading Building Blocks (TBB) and OpenMP libraries for C++, based on the example of delivering parallel, scalable code that implements the Burrows-Wheeler Transformation (BWT) algorithm.
9 Feb 2013 by Debdatta Basu
Examine the various approaches to implementing Radix sort on the GPU
16 Feb 2016 by Max R McCarty
OWASP's #6 most vulnerable security risk has to do with keeping secrets secret.
16 Sep 2013 by Nick Kopp
Ultra high quality frequency domain image rotation on a GPU.
16 Jan 2021 by Shao Voon Wong
How to convert parallel C++ ray-tracing code to CUDA, then to SYCL 2020 via Intel® DPC++
27 Jul 2022 by Intel
This document demonstrates how a linear algebra Jacobi iterative method written in CUDA* can be migrated to the SYCL* heterogeneous programming language.
3 May 2017 by Intel Corporation
In this blog post, we highlight one particular class of low precision networks named binarized neural networks (BNNs), the fundamental concepts underlying this class, and introduce a Neon CPU and GPU implementation.
10 Nov 2020 by Jeremy C. Ong
A quick 5-minute introduction to porting a CUDA app to Data Parallel C++ (DPC++)
2 Mar 2022 by Dhruv__Patel
In this article we compare and contrast SYCL and CUDA, and discuss how the oneAPI compiler can work with SYCL.
13 Apr 2022 by Sergiu Ovidiu Oprea
This article is a hands-on look at the process of converting CUDA to SYCL.
4 May 2017 by Intel Corporation
Theano is a Python library developed at the LISA lab to define, optimize, and evaluate mathematical expressions, including the ones with multi-dimensional arrays (numpy.ndarray)
2 Apr 2012 by manythreads
The sixth article in a series on portable multithreaded programming using OpenCL™, in which Rob Farber discusses how to calculate data in OpenCL™ and render it with OpenGL within the same application.
14 Sep 2016 by Mike Lanzetta
In this post, I'll walk you through how to get one of the most popular toolkits up and running on Windows, and run through and explain some fun examples.
18 Sep 2017 by Intel Corporation
TotalView includes a set of tools that provide scientific and academic developers with control over processes and thread execution, along with deep visibility into program states and data.
10 Dec 2018 by Apriorit Inc, Vadym Zhernovyi
The experience of improving Mask R-CNN performance six to ten times by applying TensorRT
13 Oct 2012 by Maxim Kartavenkov
This article describes how to make DirectShow filters in .NET; it consists of BaseClasses and a couple of samples.
22 Jul 2016 by Afzaal Ahmad Zeeshan
In this post, I am going to walk you through creating your own central hub to allow your connected devices to authenticate people using a facial recognition system.
9 Dec 2016 by Arthur V. Ratz
In this article, we'll demonstrate an approach that increases the performance (by up to 600%) of code that implements the conventional distribution counting algorithm (DCA) using the NVIDIA CUDA 8.0 Runtime API.
16 Sep 2013 by Nick Kopp
An introduction to using Cudafy.NET to perform processing on a GPU
2 Nov 2018 by Vangos
This post will show you how to build OpenCV for Windows with CUDA.
16 Sep 2013 by Nick Kopp
How to get 30x performance increase for queries by using your Graphics Processing Unit (GPU) instead of LINQ and PLINQ.
20 Sep 2015 by Bartlomiej Filipek
A little guide about modern OpenGL and why it gives us so much value.
26 May 2014 by CatchExAs
How to make the best use of current technology for computationally intensive applications
13 Oct 2012 by Alesiani Marco
A Wave PDE simulation using GPGPU capabilities
25 Oct 2010 by hax_
Introduction to the open-source hxGrid library for distributed computing. Main benefits of the library: the cluster uses only the idle time of Windows 2000/XP/Vista workstations (no dedicated workstations required); it is easy to use; it is free.
13 May 2017 by CMalcheski
64-bit calling convention
10 Jan 2011 by phoaivu
GPU Implementation of Extended Gaussian mixture model for Background Subtraction
10 May 2010 by Kevin Drzycimski
Unroll loops at compile time, deduced by a template argument.
13 Mar 2008 by billconan, kavinguy
This article describes the implementation of a neural network with CUDA.
16 Sep 2013 by Nick Kopp
Performing base64 encoding on a graphics processing unit using CUDAfy.NET (CUDA in .NET).
25 Jul 2016 by Igor Gribanov
Performing linear static analysis on a tetrahedral mesh with a little bit of help from a third-party solver.
18 Dec 2013 by Joren Heit
A Hybrid Framework Code-Generator for CUDA
23 Jul 2020 by wqaxs36
Math explanation and game engine coding.
26 Jul 2012 by headmyshoulder, Denis Demidov
This article shows how ordinary differential equations can be solved with OpenCL. In detail it shows how odeint - a C++ library for ordinary differential equations - can be adapted to work with VexCL - a library for OpenCL. The resulting performance is studied on two examples.
27 Jun 2010 by Wayne Wood
Verify the execution efficiency of a short CUDA program when using the Thrust library
9 Nov 2011 by grilialex
How to embed Xilinx FPGA configuration data in AVRILOS
17 Apr 2016 by Ryan S White
An assembler/compiler for AMD's GCN (Graphics Core Next) architecture assembly language
9 Mar 2012 by Adnan Boz
From spam filters to movie recommendation and face detection, machine learning algorithms are now used everywhere to make the machine think for us. But running these algorithms requires high computation power and, in most cases, supercomputers. This is where the 500 core GPUs step in...
1 Oct 2008 by Andrew Kirillov
This article describes the implementation of parallel computations using plain C#.
8 Sep 2010 by Dan Buskirk
Understanding the organization of a Visual Studio project for CUDA development
22 Jun 2020 by Thomas Daniels
In this article, let’s dive into Keras, a high-level library for neural networks.
11 Jul 2013 by Matthew Faithfull
Querysoft Open Runtime: Architecture compatibility aspect.
15 Mar 2011 by Roman Ginzburg
A text overlay filter and a JPEG/JPEG2000 encoder using transform filters.
10 Jul 2012 by Kerem Kat
Process webcam images on the CPU and GPU with OpenCV, CUDA and C++ AMP
30 Nov 2016 by Dino Konstantopoulos
Running Theano with an Nvidia 1070 GPU on Windows 10, with CUDA 8 and Visual Studio 2015
9 Jan 2013 by Denis Demidov
This article is an introduction to VexCL, a vector expression template library created to ease C++-based OpenCL development.
1 Sep 2009 by ChaoJui
Image processing with a burst of performance from CUDA
24 Dec 2012 by Mark H Bishop
Getting CUDA started on a VS Express budget
1 Jul 2010 by Wayne Wood
Verify the execution efficiency of a series of short .NET 4.0 parallel programming samples
21 Sep 2013 by Mark H Bishop
Tutorial: GPU computing with JCuda and Nsight (Eclipse)
28 May 2011 by grilialex
Flow and tools to convert Xilinx bitstreams to C source code for programming FPGA/CPLD
24 Oct 2017 by Packt Publishing
In this section, we'll take our first steps in using the low-level TensorFlow API.
3 Aug 2014 by Sushil Sh.
How to set up an Android development environment using Eclipse and Android Studio.
19 Oct 2011 by headmyshoulder
odeint v2 - Solving ordinary differential equations in C++
21 May 2012 by Jeff B. Cromwell
Granger Causality in both R and C#.NET with open source libraries.
20 Jan 2015 by Android on Intel
This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add
22 Feb 2013 by Asif Bahrainwala
HPC via Compute Shaders (GPGPU).
12 Apr 2016 by Shao Voon Wong
Finding lexicographical permutations on GPU
6 Jan 2014 by Adam Wojnar
Simple .jp2/.j2k viewer using Kakadu executables demonstration pack for decoding
24 Aug 2003 by Alex Mikunov
Runtime MSIL Code Instrumentation and .NET Metadata Extensions
10 Sep 2009 by ChaoJui
High-performance, good-quality image blurring
24 Jun 2005 by Philippe Kirsanov
A small class representing DateTime in seconds elapsed since "01 Jan, 0001 00:00:00".