
Using ComputeSharp for C#-Based HLSL

In much of my work, I need a quick-and-dirty way of executing relatively simple math on a GPU for the sake of computation speed and my own sanity. I recently found ComputeSharp, a pretty slick library that lets you write compute shaders in C# and cross-compiles them to HLSL. The packages are available on NuGet and can be easily pulled into existing C# projects. The following are my notes from bringing my first “Hello World” C#/ComputeSharp console app online:


I started by making a new .NET 6 C# console application in VS2022. I then used the NuGet package manager to add both the ComputeSharp and ComputeSharp.Dynamic packages to my project (specifically, the 2.0.0-alpha.19 version).

Within my new solution, I added a new “Shaders.cs” file to the project namespace, with the MultiplyByTwo shader example from the ComputeSharp documentation as an initial test:

using ComputeSharp;

namespace ComputeSharpTests
{
    [AutoConstructor]
    public readonly partial struct MultiplyByTwo : IComputeShader
    {
        public readonly ReadWriteBuffer<int> buffer;

        public void Execute()
        {
            buffer[ThreadIds.X] *= 2;
        }
    }
}

With my Shaders file ready, I opened the Program.cs runner file and added the necessary execution code and a dummy test array to try out my shader:

using ComputeSharp;
using ComputeSharpTests;

Console.WriteLine("Starting ComputeSharp Test:");
int[] array = Enumerable.Range(1, 10).ToArray();

Console.WriteLine("Array before shader execution:");
foreach (int i in array)
{
    Console.Write(i + ",");
}

Console.WriteLine("\nExecuting shader:"); // leading newline ends the comma-separated line above
using ReadWriteBuffer<int> buffer = GraphicsDevice.Default.AllocateReadWriteBuffer(array);
GraphicsDevice.Default.For(buffer.Length, new MultiplyByTwo(buffer));
buffer.CopyTo(array);

Console.WriteLine("Array after shader execution:");
foreach (int i in array)
{
    Console.Write(i + ",");
}

We should now be ready to run our first test of the ComputeSharp shader implementation. Hitting Run in Visual Studio yields the following result:

With a small array, this test clearly works. I decided to then test a much larger 10,000-element set (only logging i%1000==0 to console, but running them all):
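For a larger run like this, eyeballing the console stops being practical, so one option is to compare the output against the same math done on the CPU. The sketch below is plain C# with no ComputeSharp dependency. `CountMismatches` is a hypothetical helper name, and the `output` array is computed on the CPU purely as a stand-in for whatever `buffer.CopyTo(array)` would return.

```csharp
using System;
using System.Linq;

// Hypothetical helper: counts elements where output != 2 * input.
static int CountMismatches(int[] input, int[] output)
{
    int mismatches = 0;
    for (int i = 0; i < input.Length; i++)
    {
        if (output[i] != input[i] * 2)
        {
            mismatches++;
        }
    }
    return mismatches;
}

int[] input = Enumerable.Range(1, 10_000).ToArray();

// Stand-in for the GPU result; in the real test this would be the array
// filled by buffer.CopyTo(array) after the shader runs.
int[] output = input.Select(value => value * 2).ToArray();

// Log only every 1000th element, but check all of them below.
for (int i = 0; i < output.Length; i++)
{
    if (i % 1000 == 0)
    {
        Console.WriteLine($"{i}: {input[i]} -> {output[i]}");
    }
}

Console.WriteLine($"Mismatches: {CountMismatches(input, output)}");
```

A mismatch count of zero gives more confidence than a handful of logged samples, since every element is checked.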

Knowing the basics work, it was time to try something more complex, like processing an image. Since I’m working with a .NET console application, to read a Bitmap image I needed to add the System.Drawing.Common NuGet package to the project. Then I chose a test bitmap image from University of South Carolina’s test dataset page (I used the snail.bmp image, reproduced below):

I want to darken the image by changing the value of all pixels to half their current value. The shader for this would be:

using ComputeSharp;
namespace ComputeSharpTests
{
    [AutoConstructor]
    public readonly partial struct DivideByTwo : IComputeShader
    {
        public readonly ReadWriteBuffer<float4> buffer;

        public void Execute()
        {
            buffer[ThreadIds.X] /= 2;
        }
    }
}

And the runner code would be:

using ComputeSharp;
using ComputeSharpTests;
using System.Drawing;

Console.WriteLine("Starting ComputeSharp Test:");

string path = "E:\\Gitlab\\ComputeSharpTests\\snail.bmp";
string outputPath = "E:\\Gitlab\\ComputeSharpTests\\snail_scaled.bmp";

Bitmap image = new Bitmap(path);
float4[] array = new float4[image.Width * image.Height];
for(int i = 0; i < image.Width; i++)
{
    for (int j = 0; j < image.Height; j++)
    {
        int bufferIndex = i * image.Height + j;
        Color pixel = image.GetPixel(i, j); // read the pixel once instead of four times
        array[bufferIndex] = new float4(pixel.R, pixel.G, pixel.B, pixel.A);
    }
}

Console.WriteLine("\nExecuting shader...");
using ReadWriteBuffer<float4> buffer = GraphicsDevice.Default.AllocateReadWriteBuffer(array);
GraphicsDevice.Default.For(buffer.Length, new DivideByTwo(buffer));
buffer.CopyTo(array);

for (int i = 0; i < image.Width; i++)
{
    for (int j = 0; j < image.Height; j++)
    {
        int bufferIndex = i * image.Height + j;
        float4 result = array[bufferIndex];
        image.SetPixel(i, j, Color.FromArgb((byte)result.A, (byte)result.R, (byte)result.G, (byte)result.B));
    }
}
image.Save(outputPath);
Console.WriteLine("\nFinished ComputeSharp Test\n\n");

Executing the above yields an output snail_scaled.bmp with all channels reduced in value by half versus the source:
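To make the effect on a single pixel concrete, here is a small CPU-only sketch of the same round trip: bytes to floats, divide by two, cast back to bytes. The pixel values are made up for illustration and are not taken from the snail image.

```csharp
using System;

// One pixel as (R, G, B, A) floats, the same channel order the runner packs into float4.
float[] pixel = { 200f, 100f, 51f, 255f };

// Same operation as the shader: divide every channel by 2.
for (int c = 0; c < pixel.Length; c++)
{
    pixel[c] /= 2;
}

// The runner casts back to byte, which truncates the fractional part.
byte r = (byte)pixel[0];
byte g = (byte)pixel[1];
byte b = (byte)pixel[2];
byte a = (byte)pixel[3];

Console.WriteLine($"R={r} G={g} B={b} A={a}"); // prints "R=100 G=50 B=25 A=127"
```

Note that alpha is halved along with the color channels. If the goal is only to darken the color, one option would be to leave the A component untouched in the shader.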

This 1D-buffer approach works well, but what if I want an effect that moves the snail to a new position? It would be easier to dispatch over a 2D range of thread IDs, so we know which pixel we’re working with without having to recover it from a flat index.
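For reference, the math in question is the mapping between a flat buffer index and a pixel coordinate. As a sketch, using the same column-by-column layout as the runner above (`index = x * height + y`), it looks like this. `ToIndex` and `FromIndex` are names made up for this illustration.

```csharp
using System;

// Flatten (x, y) into the 1D layout used by the runner: x * height + y.
static int ToIndex(int x, int y, int height) => x * height + y;

// Recover (x, y) from a flat index in that same layout.
static (int X, int Y) FromIndex(int index, int height) => (index / height, index % height);

int height = 4; // illustrative image height, not the snail's real size
int index = ToIndex(2, 3, height);
(int x, int y) = FromIndex(index, height);

Console.WriteLine($"(2, 3) -> {index} -> ({x}, {y})"); // prints "(2, 3) -> 11 -> (2, 3)"
```

With a 2D dispatch, the division and modulo go away because the X and Y thread IDs already are the pixel coordinate.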

In the shader below, we’ll add an imageHeight integer so we can still compute our buffer index, and we’ll also make use of ThreadIds in both X and Y to see where we are in space quickly. The following shader will move the snail +shift.X pixels in x and +shift.Y pixels in y:

using ComputeSharp;
namespace ComputeSharpTests
{
    [AutoConstructor]
    public readonly partial struct Shift : IComputeShader
    {
        public readonly ReadOnlyBuffer<float4> source;
        public readonly ReadWriteBuffer<float4> destination;
        public readonly int imageHeight;
        public readonly int2 shift;

        public void Execute()
        {
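            // Buffers use the same layout as the runner: index = x * imageHeight + y.
            int offsetSource = ThreadIds.X * imageHeight + ThreadIds.Y;
            int offsetDestination = (ThreadIds.X + shift.X) * imageHeight + (ThreadIds.Y + shift.Y);

            // Note: no bounds check here, so pixels shifted past the bottom edge
            // land at the top of the next column.
            destination[offsetDestination] = source[offsetSource];
        }
    }
}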
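Note that the shader writes to `offsetDestination` without checking that the shifted pixel is still inside the image. One way to reason about that is a CPU version of the same index math with an explicit guard. This is only a sketch for checking results, not the shader itself: `ShiftOnCpu` and its `width` parameter are assumptions added for illustration (the shader only receives `imageHeight`), and it works on a single float channel to keep it short.

```csharp
using System;

// Hypothetical CPU reference for the Shift shader, with a bounds guard added.
// Layout matches the runner: index = x * height + y.
static float[] ShiftOnCpu(float[] source, int width, int height, int shiftX, int shiftY)
{
    // Start from a copy of the source, as the runner does for the destination buffer.
    float[] destination = (float[])source.Clone();

    for (int x = 0; x < width; x++)
    {
        for (int y = 0; y < height; y++)
        {
            int targetX = x + shiftX;
            int targetY = y + shiftY;

            // Skip pixels that would land outside the image.
            if (targetX < 0 || targetX >= width || targetY < 0 || targetY >= height)
            {
                continue;
            }

            destination[targetX * height + targetY] = source[x * height + y];
        }
    }

    return destination;
}

// 3x3 single-channel test image holding the values 0..8.
float[] image = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
float[] shifted = ShiftOnCpu(image, 3, 3, 1, 1);

Console.WriteLine(string.Join(",", shifted)); // prints "0,1,2,3,0,1,6,3,4"
```

The first column and the top row keep their original values because nothing is shifted into them, which matches seeding the destination from the source image.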

The associated runner code needs some changes too, to allow us to pass in our parameters to the shader:

using ComputeSharp;
using ComputeSharpTests;
using System.Drawing;

Console.WriteLine("Starting ComputeSharp Test:");

string path = "E:\\Gitlab\\ComputeSharpTests\\sample.bmp";
string outputPath = "E:\\Gitlab\\ComputeSharpTests\\sample_modified.bmp";

Bitmap image = new Bitmap(path);
float4[] array = new float4[image.Width * image.Height];
for(int i = 0; i < image.Width; i++)
{
    for (int j = 0; j < image.Height; j++)
    {
        int bufferIndex = i * image.Height + j;
        Color pixel = image.GetPixel(i, j); // read the pixel once instead of four times
        array[bufferIndex] = new float4(pixel.R, pixel.G, pixel.B, pixel.A);
    }
    if (i % 100 == 0)
    {
        Console.WriteLine(i);
    }
}

Console.WriteLine("\nExecuting shader...");
using (ReadOnlyBuffer<float4> source = GraphicsDevice.Default.AllocateReadOnlyBuffer(array))
using (ReadWriteBuffer<float4> destination = GraphicsDevice.Default.AllocateReadWriteBuffer(array))
{
    GraphicsDevice.Default.For(image.Width, image.Height, new Shift(source, destination, image.Height, new Int2(25,25)));
    destination.CopyTo(array);
}
Console.WriteLine("\nFinished executing shader...");

for (int i = 0; i < image.Width; i++)
{
    for (int j = 0; j < image.Height; j++)
    {
        int bufferIndex = i * image.Height + j;
        float4 result = array[bufferIndex];
        image.SetPixel(i, j, Color.FromArgb((byte)result.A, (byte)result.R, (byte)result.G, (byte)result.B));
    }
    if (i % 100 == 0)
    {
        Console.WriteLine(i);
    }
}
image.Save(outputPath);
Console.WriteLine("\nFinished ComputeSharp Test\n\n");

The simplicity of working with C#, merged with the speed of HLSL shaders, promises to significantly improve many of my daily “bash-something-out” workloads – which is super exciting.

I’ll likely be writing up some more complex guides/stuff I figure out as I keep playing with this package, so keep an eye out for that. In the meantime, if you have any questions or suggestions let me know in the comments below!
