CUDA

Great Reads

by Maxim Kartavenkov
This article describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#
by Nick Kopp
This article builds upon the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ and ports this to also support OpenCL devices and adds benchmarking so you can easily compare performance.
by Ryan S White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of your CUDA code.
by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing

Latest Articles

by Shao Voon Wong
How to convert parallel C++ ray-tracing code to CUDA, then to SYCL 2020 via Intel® DPC++
by Jeremy C. Ong
A quick 5-minute introduction to porting a CUDA app to Data Parallel C++ (DPC++)
by Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
by wqaxs36
A time-saving algorithm for building a number generator tool.

All Articles

16 Jul 2012
Maxim Kartavenkov
This article describes how to make an H.264 video encoder DirectShow filter using the NVIDIA encoder API in C#
16 Sep 2013
Nick Kopp
This article builds upon the earlier High Performance Queries: GPU vs. PLINQ vs. LINQ and ports this to also support OpenCL devices and adds benchmarking so you can easily compare performance.
11 Jul 2020
Ryan S White
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels and provides an on-the-fly view of your CUDA code.
27 Aug 2020
Carlos Jiménez de Parga
A reusable Visual C++ framework for real-time volumetric cloud rendering, animation and morphing
17 Sep 2012
ObiWan_MCC
A C# SMTP server (receiver).
22 May 2013
John Michael Hauck
It has never been easier for C# desktop developers to write code that takes advantage of the amazing computing performance of modern graphics cards. In this post I will share some techniques for solving a simple (but still interesting) image analysis problem. Source Code https://www.assembla.com/co
2 May 2017
Arthur V. Ratz
This article is a practical guide to using the Intel® Threading Building Blocks (TBB) and OpenMP libraries for C++, based on the example of delivering parallel, scalable code that implements the Burrows-Wheeler Transformation (BWT) algorithm.
9 Feb 2013
Debdatta Basu
Examine the various approaches to implementing Radix sort on the GPU
16 Feb 2016
Max R McCarty
OWASP's #6 most vulnerable security risk has to do with keeping secrets secret.
16 Sep 2013
Nick Kopp
Ultra high quality frequency domain image rotation on a GPU.
16 Jan 2021
Shao Voon Wong
How to convert parallel C++ ray-tracing code to CUDA, then to SYCL 2020 via Intel® DPC++
3 May 2017
Intel Corporation
In this blog post, we highlight one particular class of low precision networks named binarized neural networks (BNNs), the fundamental concepts underlying this class, and introduce a Neon CPU and GPU implementation.
10 Nov 2020
Jeremy C. Ong
A quick 5-minute introduction to porting a CUDA app to Data Parallel C++ (DPC++)
4 May 2017
Intel Corporation
Theano is a Python library developed at the LISA lab to define, optimize, and evaluate mathematical expressions, including the ones with multi-dimensional arrays (numpy.ndarray)
14 Sep 2016
Mike Lanzetta
In this post, I'll walk you through how to get one of the most popular toolkits up and running on Windows, and run through and explain some fun examples.
2 Apr 2012
manythreads
The sixth article in a series on portable multithreaded programming using OpenCL™, in which Rob Farber discusses how to calculate data in OpenCL™ and render it with OpenGL within the same application.
18 Sep 2017
Intel Corporation
TotalView includes a set of tools that provide scientific and academic developers with control over processes and thread execution, along with deep visibility into program states and data.
10 Dec 2018
Apriorit Inc, Vadym Zhernovyi
The experience of improving Mask R-CNN performance six to ten times by applying TensorRT
13 Oct 2012
Maxim Kartavenkov
This article describes how to make DirectShow filters in .NET; it consists of BaseClasses and a couple of samples
22 Jul 2016
Afzaal Ahmad Zeeshan
In this post, I am going to walk you through creating your own central hub to allow your connected devices to authenticate people using a facial recognition system.
9 Dec 2016
Arthur V. Ratz
In this article, we'll demonstrate an approach that increases the performance (by up to 600%) of code that implements the conventional distribution counting algorithm (DCA) using the NVIDIA CUDA 8.0 Runtime API
16 Sep 2013
Nick Kopp
An introduction to using Cudafy.NET to perform processing on a GPU
16 Sep 2013
Nick Kopp
How to get a 30x performance increase for queries by using your Graphics Processing Unit (GPU) instead of LINQ and PLINQ.
20 Sep 2015
Bartlomiej Filipek
A little guide about modern OpenGL and why it gives us so much value.
26 May 2014
CatchExAs
How to make best use of current technology for computationally intensive applications?
13 Aug 2019
Sau002
How to create C# applications using TensorFlowSharp
13 Oct 2012
Alesiani Marco
A Wave PDE simulation using GPGPU capabilities
25 Oct 2010
hax_
Introduction to the open-source hxGrid library for distributed computing. Main benefits of the library: cluster uses only idle time of Windows 2000/XP/Vista workstation (no dedicated workstations required); easy to use; free.
10 Jan 2011
phoaivu
GPU Implementation of Extended Gaussian mixture model for Background Subtraction
2 Nov 2018
Vangos
This post will show you how to build OpenCV for Windows with CUDA.
10 May 2010
Kevin Drzycimski
Unroll loops at compile time, deduced by a template argument.
13 Mar 2008
billconan, kavinguy
This article describes the implementation of a neural network with CUDA.
16 Sep 2013
Nick Kopp
Performing base64 encoding on a graphics processing unit using CUDAfy.NET (CUDA in .NET).
25 Jul 2016
Igor Gribanov
Performing linear static analysis on a tetrahedral mesh with a little bit of help from a third-party solver.
18 Dec 2013
Joren Heit
A Hybrid Framework Code-Generator for CUDA
23 Jul 2020
wqaxs36
Math explanation and game engine coding.
26 Jul 2012
headmyshoulder, Denis Demidov
This article shows how ordinary differential equations can be solved with OpenCL. In detail it shows how odeint - a C++ library for ordinary differential equations - can be adapted to work with VexCL - a library for OpenCL. The resulting performance is studied on two examples.
27 Jun 2010
Wayne Wood
Verify the execution efficiency of a short CUDA program when using the Thrust library
9 Nov 2011
grilialex
How-To Embed Xilinx FPGA Configuration Data to AVRILOS
3 Apr 2019
Mahsa Hassankashi
A complete, easy-to-understand deep learning convolutional neural network in Python with TensorFlow
17 Apr 2016
Ryan S White
An assembler/compiler for the assembly language of AMD’s GCN (Graphics Core Next) architecture
17 Dec 2017
Vladimir Dorokhov
This article is about building a simple machine learning service using ASP.NET Core, TensorFlow and Azure Cloud
9 Mar 2012
Adnan Boz
From spam filters to movie recommendation and face detection, nowadays machine learning algorithms are used everywhere to make the machine think for us. But running these algorithms requires high computation power and, in most cases, supercomputers. This is where the 500 core GPUs step in...
1 Oct 2008
Andrew Kirillov
This article describes the implementation of parallel computations using plain C#.
8 Sep 2010
Dan Buskirk
Understanding the organization of a Visual Studio project for CUDA development
22 Jun 2020
Thomas Daniels
In this article, let’s dive into Keras, a high-level library for neural networks.
11 Jul 2013
Matthew Faithfull
Querysoft Open Runtime: Architecture compatibility aspect.
15 Mar 2011
Roman Ginzburg
A text overlay filter and a JPEG/JPEG2000 encoder using transform filters.
10 Jul 2012
Kerem Kat
Process webcam images on the CPU and GPU with OpenCV, CUDA and C++ AMP
30 Nov 2016
Dino Konstantopoulos
Running Theano with an Nvidia 1070 GPU on Windows 10, with CUDA 8 and Visual Studio 2015
9 Jan 2013
Denis Demidov
This article is an introduction to VexCL, a vector expression template library created to ease C++-based OpenCL development.
1 Sep 2009
ChaoJui
Image processing with a burst of performance from CUDA
1 Jul 2010
Wayne Wood
Verify the execution efficiency of a series of short .NET 4.0 parallel programming samples
21 Sep 2013
Mark H Bishop
Tutorial: GPU computing with JCuda and Nsight (Eclipse)
28 May 2011
grilialex
Flow and tools to convert Xilinx bitstreams to C source code for programming FPGA/CPLD
24 Oct 2017
Packt Publishing
In this section, we'll take our first steps in using the low-level TensorFlow API.
3 Aug 2014
Sushil Sh.
How to set up an Android development environment using Eclipse and Android Studio.
19 Oct 2011
headmyshoulder
odeint v2 - Solving ordinary differential equations in C++
21 May 2012
Jeff B. Cromwell
Granger Causality in both R and C#.NET with open source libraries.
20 Jan 2015
Android on Intel
This tutorial shows how to use two powerful features of OpenCL™ 2.0: enqueue_kernel functions that allow you to enqueue kernels from the device and work_group_scan_exclusive_add and work_group_scan_inclusive_add
22 Feb 2013
Asif Bahrainwala
HPC via Compute Shaders (GPGPU).
12 Apr 2016
Shao Voon Wong
Finding lexicographical permutations on GPU
6 Jan 2014
Adam Wojnar
Simple .jp2/.j2k viewer using Kakadu executables demonstration pack for decoding
12 Apr 2018
Leandro T C Melo
In this article, I will present PsycheC, a type inference engine for the C language.
24 Aug 2003
Alex Mikunov
Runtime MSIL Code Instrumentation and .NET Metadata Extensions
10 Sep 2009
ChaoJui
High-performance, good-quality image blurring
24 Jun 2005
Philippe Kirsanov
A small class representing DateTime in seconds elapsed since "01 Jan, 0001 00:00:00".