GPU Geometry Map Rendering – Part 1

January 9, 2010 by Dave Carlile · 1 Comment
Filed under: C#, Procedural Planet 

I spent the past week moving my procedural planet renderer’s geometry map creation code from the CPU to the GPU. It didn’t go as smoothly as I would have liked, but in a way that was a good thing since I gained a much deeper understanding of some render pipeline things that I had been taking for granted. I also learned how to use PIX for shader debugging, which I now realize I should’ve done a long time ago. I hope to walk through some shader debugging in a later post.
 

CPU Geometry Maps

You may recall that a planet is defined as a cube with six faces. Each face can be thought of as a single flat plane when it comes to generating height values, and going forward we’ll do just that by considering only the front cube face.

The cube face is defined with a coordinate system with (-1, -1) on the lower left, and (1, 1) at the upper right. This is very similar to clip-space, but we’ll refer to it as cube-face-space or just face-space.

 

 Cube-Face-Space
 

The geometry map can be thought of as a grid overlaying the cube face. Each position in the grid can also be thought of as a pixel in a heightmap. The grid can be as detailed as needed, but each dimension should be of the form 2^n + 1 – for example: 33×33, 65×65 – in order to help with subdividing. A very commonly used size is 33×33, and this is what I use. However, for this discussion we’ll use 5×5 to make things a bit easier to talk about. The geometry map has its own coordinate space, with (0, 0) on the lower left and (1, 1) on the upper right. Let’s call this geomap-space.

 

Geomap-Space
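The 2^n + 1 size rule works because the number of grid cells per side (size minus one) is then a power of two, so halving a row always lands on an existing vertex. A minimal sketch of that check, assuming a hypothetical helper class (GeometryMapSize is not part of the renderer):

```csharp
using System;

// Hypothetical helper, for illustration only.
static class GeometryMapSize
{
    // True when size == 2^n + 1 for some n >= 1 (3, 5, 9, 17, 33, 65, ...).
    public static bool IsValid(int size)
    {
        int cells = size - 1;                              // grid cells per side
        return cells >= 2 && (cells & (cells - 1)) == 0;   // power-of-two test
    }

    // Index of the vertex shared by both halves when a row is split in two.
    public static int MidIndex(int size)
    {
        return (size - 1) / 2;
    }
}
```

For the 5×5 map used here the shared vertex is at index 2, and for 33×33 it is at index 16.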
 

We need to generate a height value for each position in the geometry map grid. To do this we map from a geomap-space coordinate to a face-space coordinate, and use the result to generate the height at that position and store that height in the geometry map.
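The mapping from geomap-space to face-space is a scale and an offset. Here is a small sketch of it, assuming the same left, top, width, and height meaning used throughout this post; the FaceSpace class and FromGeomap name are made up for illustration:

```csharp
using System;

// Hypothetical helper, for illustration only.
static class FaceSpace
{
    // Maps a geomap-space coordinate (u, v in [0, 1], origin at the lower left)
    // to face-space, given the face-space area left/top/width/height.
    public static void FromGeomap(float u, float v,
                                  float left, float top, float width, float height,
                                  out float x, out float y)
    {
        x = left + u * width;
        y = (top - height) + v * height;   // v = 0 is the bottom edge
    }
}
```

With the whole face (left = -1, top = 1, width = 2, height = 2), geomap (0, 0) maps to face (-1, -1), (0.5, 0.5) maps to (0, 0), and (1, 1) maps to (1, 1).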

Here is a stripped down version of the CPU code I was using to create my geometry maps. The left, top, width, and height parameters define the face-space area we’re generating height values for. The geometry map dimensions come from the GeometryMapWidth and GeometryMapHeight constants. With both set to 5, a geometry map over the entire face would be created like this:
 

CreateGeometryMap(-1, 1, 2, 2);

 

public void CreateGeometryMap(float left, float top, float width, float height)
{
  // Calculate how far we need to move horizontally for each vertex
  // so the first is at "left", and the last is at "left + width".
  // Do the same for the vertical dimension. GeometryMapWidth and
  // GeometryMapHeight define the geometry map dimensions
  float horizontalStep = width / (GeometryMapWidth - 1);
  float verticalStep = height / (GeometryMapHeight - 1);

  float y = top - height;      // start at the bottom

  for (int gy = 0; gy < GeometryMapHeight; gy++)
  {
    float x = left;

    for (int gx = 0; gx < GeometryMapWidth; gx++)
    {
      // row-major: each row of the map holds GeometryMapWidth values
      geometrymap[gy * GeometryMapWidth + gx] = GetHeightAt(x, y);
      x += horizontalStep;
    }

    y += verticalStep;
  }
}

 
Let’s walk through what it’s doing, in just the horizontal dimension since the vertical is exactly the same. This is a key to understanding how the GPU version works, so don’t skip over it.

The code iterates over the geometry map pixels from 0 to 4. The first pixel corresponds to -1 in face-space, the last pixel corresponds to +1 in face-space.  The intervening pixels are found using the horizontalStep variable, which is calculated by dividing the face-space width by the geometry map width minus one. Remember that we passed in -1 for left, and 2 for width, so horizontalStep is 2 / (5 – 1), or 0.5.

The variable x starts out with the value left, or -1.  Walking through the inner loop we get these values for x:

gx = 0, x = -1
gx = 1, x = -0.5
gx = 2, x = 0
gx = 3, x = 0.5
gx = 4, x = 1.0

 

If our geometry map width was 33, horizontalStep would be 2 / (33 – 1) or 0.0625. The first few steps of the loop would look like this:

gx = 0, x = -1
gx = 1, x = -0.9375
gx = 2, x = -0.875
gx = 3, x = -0.8125
gx = 4, x = -0.75
gx = 5, x = -0.6875
gx = 6, x = -0.625

 

And so on out to gx = 32 and x = 1. As mentioned before, determining y works exactly the same.
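The two tables above can be reproduced with a few lines of code. This is only a sketch of the horizontal walk, accumulating the step the same way the inner loop does; the StepWalk class is a made-up name:

```csharp
using System;

// Hypothetical helper, for illustration only.
static class StepWalk
{
    // Returns the face-space x value for each geometry map column,
    // accumulating the step exactly as the CPU loop does.
    public static float[] Columns(float left, float width, int mapWidth)
    {
        float step = width / (mapWidth - 1);
        float[] xs = new float[mapWidth];
        float x = left;

        for (int gx = 0; gx < mapWidth; gx++)
        {
            xs[gx] = x;
            x += step;
        }

        return xs;
    }
}
```

Calling StepWalk.Columns(-1, 2, 5) gives the five values in the first table, and StepWalk.Columns(-1, 2, 33) gives the second. In both cases the step is a power of two, so the repeated additions land exactly on 1 at the last column; that is not guaranteed for arbitrary step values.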

So, that pretty much takes care of the CPU method. It works great, it’s fairly simple, and it’s slower than a turtle crossing an Iowa road in winter. Well, to be fair, the GetHeightAt() function is slow because of the nature of the noise functions. Moving that functionality to the GPU is where we see the huge performance wins. So let’s get to it.

GPU Geometry Maps

On the GPU side, we need to be able to generate exactly the same face-space coordinates as CreateGeometryMap. Why does it have to be exact? Since the GPU only returns the height values, the CPU still has to determine the x and y values since they’re also used to create the actual vertices. If the x and y used by the GPU are different from the ones used by the CPU, bad things happen, like terrain patches shifting a pixel or two when they’re subdivided, and three 18 hour days spent trying to figure out why (yes, this is how I spent my week-long vacation from my “real” job).

When you think about needing the GPU to iterate over something in a general purpose way, what should immediately come to mind is texture coordinates.  I’m enough of a n00b at this stuff that it didn’t immediately come to my mind, so while letting Google do my thinking for me I ran across the Britonia blog, which looks like it will prove to be a very helpful resource when it comes to this whole planet creation business.

The idea I came across on that blog is to have the GPU interpolate texture coordinates in such a way that they match the x and y values on the CPU version. It was a breakthrough moment for me, I ran with it, and soon came to a screeching halt. Let’s start with the “running with it” part though.

First thing is to set up a render target which will hold our geometry map. The render target needs to match the size of our desired geometry map, so we just create it with the same dimensions. (Note that all of this code will be available in the sample linked at the end of part 2 of the article).

 

const int Width = 5;
const int Height = 5;

renderTarget = new RenderTarget2D(GraphicsDevice, Width, Height, 1,
                                  SurfaceFormat.Color, MultiSampleType.None,
                                  0, RenderTargetUsage.DiscardContents);

 
Note that I’m using SurfaceFormat.Color here. In the actual version you want SurfaceFormat.Single so you get 32-bit floating point goodness. In this case Color works out fine since we’re going to be examining the output in PIX and don’t really care what the final format looks like.

Next thing is to set up a full screen quad with pre-transformed vertices. Pre-transformed means we’re defining the vertices in clip space, so no transformation is necessary in the vertex shader. We can just pass the coordinates directly to the vertex shader with no changes. Also, we define texture coordinates that cover the full quad, so the GPU will interpolate them from 0 to 1 for us as it processes each pixel.

 

vertices = new VertexPositionTexture[4];
vertices[0] = new VertexPositionTexture(new Vector3(-1, 1, 0f), new Vector2(0, 1));
vertices[1] = new VertexPositionTexture(new Vector3(1, 1, 0f), new Vector2(1, 1));
vertices[2] = new VertexPositionTexture(new Vector3(-1, -1, 0f), new Vector2(0, 0));
vertices[3] = new VertexPositionTexture(new Vector3(1, -1, 0f), new Vector2(1, 0));

 
Note that “full screen” really means “full render target”. Because the render target is 5×5, and we’re rendering a full screen quad to it, the quad will contain 5×5 pixels, and our pixel shader will be executed for each of those pixels, with the texture coordinates interpolated over the pixels from 0 to 1 in each dimension.  (Yes, the screeching halt will soon be upon us).

The last thing we need is to tell the pixel shader what face-space dimensions to work with. These are the same parameters passed to the CreateGeometryMap function on the CPU.

 

quadEffect.Parameters["Left"].SetValue(-1.0f);
quadEffect.Parameters["Top"].SetValue(1.0f);
quadEffect.Parameters["Width"].SetValue(2.0f);
quadEffect.Parameters["Height"].SetValue(2.0f);

 

To rehash a bit, CreateGeometryMap required the face-space dimensions (left, top, width, height), as well as constant values defining the geometry map dimensions. For our pixel shader, the face-space dimensions are taken care of by the effect parameters, and the geometry map dimensions are taken care of by the render target dimensions.

All that’s left now is drawing the quad.

 

GraphicsDevice.DrawUserPrimitives(PrimitiveType.TriangleStrip, vertices, 0, 2);

 

And, now the screeching halt…

The results were close, but not what I expected. Neighboring geometry maps were off by a pixel or two, and when quad tree nodes split, all the geometry seemed to shift a pixel or two. It was all very distracting and ugly. After spending quite a while walking through code and trying to figure out where I went wrong, I finally decided it was time to install PIX.

And the rest will have to wait until the next post, where I’ll go through the shader itself, and walk through debugging it in PIX to see what’s going on.

Procedural Planet Engine Status

December 30, 2009 by Dave Carlile · 2 Comments
Filed under: C#, Procedural Planet, XNA 

Previously I mentioned I was going to do a mulligan on my procedural planet engine. The few hours I’ve worked on it so far have led to a beautiful new architecture that’s doing most of the same things as before, as well as some major new things, using about 25% of the code.

Here is where things stand currently. I’ll go through some of these in more detail in a later post:

The planet consists of a cube, with the vertices mapped to a sphere. Each of the six cube faces is a quad tree which is used for subdividing the terrain as you move closer to the planet. Each node in the quad tree represents a patch of terrain with 33×33 vertices that are spread out evenly to cover the patch’s area.
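One common way to map cube vertices to a sphere is to normalize each vertex and scale it by the planet radius. The sketch below assumes that approach; the engine may well use a different mapping, and the CubeToSphere name is made up:

```csharp
using System;

// Hypothetical helper, for illustration only. Plain normalization is an
// assumption here, not necessarily the mapping the engine uses.
static class CubeToSphere
{
    // Projects a point on the cube surface onto a sphere of the given radius.
    // Assumes (x, y, z) is not the origin.
    public static void Map(float x, float y, float z, float radius,
                           out float sx, out float sy, out float sz)
    {
        float length = (float)Math.Sqrt(x * x + y * y + z * z);
        sx = x / length * radius;
        sy = y / length * radius;
        sz = z / length * radius;
    }
}
```

The center of the front face, (0, 0, 1), stays where it is on a unit sphere, while a cube corner such as (1, 1, 1) is pulled in until it is exactly one radius from the center.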

In the previous version the quad tree nodes were subdivided synchronously, which resulted in jerkiness when moving slowly, and outright 5 second waits when moving quickly if a lot of nodes needed to be subdivided. That was good enough then since my priorities were elsewhere, but it’s not good enough for the new version. Now, when a node needs to be split the request is queued on a separate thread. The current node will continue to draw until the split is complete. The split requests can be cancelled as well if the camera has moved elsewhere before the split request reaches the head of the queue.

The nice thing about this design is that if you’re moving very fast you end up getting fewer node splits because they’re cancelled before they happen since they’re no longer necessary. Conversely, if you’re moving slowly the splits can easily keep up with your location so you get all of the required detail. On the con side, if you’re moving quickly down to a low level, then stop, it can take a bit for the queue to catch up generating the terrain patches, so the detail can take a while to show up.
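The queue-with-cancellation idea can be sketched in a few lines. This is not the engine’s code: SplitRequest, SplitQueue, and the integer node id are all stand-ins, and the worker thread that would call DequeueNext is left out:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical request type: the node id stands in for a quad tree node.
class SplitRequest
{
    public readonly int NodeId;
    public volatile bool Cancelled;   // set by the render thread if the camera moves on

    public SplitRequest(int nodeId) { NodeId = nodeId; }
}

// Hypothetical queue shared between the render thread and a worker thread.
class SplitQueue
{
    private readonly Queue<SplitRequest> pending = new Queue<SplitRequest>();
    private readonly object sync = new object();

    // Called by the render thread when a node needs to be split.
    public SplitRequest Enqueue(int nodeId)
    {
        SplitRequest request = new SplitRequest(nodeId);
        lock (sync) { pending.Enqueue(request); }
        return request;
    }

    // Called by the worker thread: returns the next request that is still
    // wanted, dropping cancelled ones. Returns null when the queue is empty.
    public SplitRequest DequeueNext()
    {
        lock (sync)
        {
            while (pending.Count > 0)
            {
                SplitRequest request = pending.Dequeue();
                if (!request.Cancelled) return request;
            }
            return null;
        }
    }
}
```

Cancelling is just a flag on the request, so a fast-moving camera can abandon splits cheaply and the worker never spends time generating patches nobody will see.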

Generating a patch currently happens on the CPU using Perlin as a noise basis and various fractal algorithms such as fBm, Turbulence, and Ridged Multifractal. I will be moving this to the GPU over the coming weeks which will vastly improve the “catching up” problem mentioned previously. This will also enable creating procedural normal maps and textures on the fly.
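For readers who haven’t met fBm before, here is a minimal sketch of the idea: sum several octaves of a noise basis, raising the frequency and lowering the amplitude each time. The noise delegate stands in for a Perlin noise function, and the Fractal class and parameter names are assumptions, not the engine’s code:

```csharp
using System;

// Hypothetical helper, for illustration only.
static class Fractal
{
    // Fractional Brownian motion over a 2D noise basis.
    // lacunarity scales the frequency per octave, gain scales the amplitude.
    public static float Fbm(Func<float, float, float> noise, float x, float y,
                            int octaves, float lacunarity, float gain)
    {
        float sum = 0f;
        float frequency = 1f;
        float amplitude = 1f;

        for (int i = 0; i < octaves; i++)
        {
            sum += amplitude * noise(x * frequency, y * frequency);
            frequency *= lacunarity;
            amplitude *= gain;
        }

        return sum;
    }
}
```

Every octave calls the noise function once per height value, which is a large part of why generating a patch on the CPU is slow.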

So, the current version of the app lets me start out in space and fly to an Earth-sized planet down to ground level with ever increasing detail, and absolutely no stalling. The entire planet can be explored, but there is no texturing yet, and lighting is using vertex normals so it’s fairly ugly, but it gets the job done at this stage.

I think the next thing I will do is work on moving the patch generation to the GPU. This seemed like a daunting task 8 months ago, but it should be pretty straightforward now. This is a requirement to allow generating higher resolution procedural normal maps, which will be a big step in improving the look of the terrain.

So, that’s it for now.  In future posts I’ll go through some of these features in more detail and discuss how I did things.

Sprite Sheet Creator

August 25, 2009 by Dave Carlile · 1 Comment
Filed under: C#, Game Programming, Tools, XNA 

When developing the iPhone version of Guardian I manually created my sprite sheets. I used individual sprites up until the end so everything was pretty much set in stone by the time I created the sprite sheet. Even then I ended up having to recreate the sprite sheet two or three times, and let me tell you, manually figuring out the texture coordinates isn’t a particularly pleasant experience. In this case I believe I made the right choice. There were few enough sprites that I would have spent more time creating the tool than I would have saved.

The Xbox version has quite a few more sprites, so I decided that spending time creating a sprite sheet tool was going to be well worth the effort. It didn’t take too long to get it working well enough to use, and not too much longer than that to make it solid enough for distribution.

Sprite Sheet Creator

The application is released as open source under the MIT License.

Download SpriteSheetCreator.zip

Simplified XNA Message Boxes

June 11, 2009 by Dave Carlile · Leave a Comment
Filed under: C#, Game Programming, XNA 

Shawn Hargreaves brings up the subject of how annoying async coding can be.  Calling a “begin” method, dealing with the completion callback function, handling the results – it’s all very ugly to keep track of, and often leads to very ugly code.

He wants to be able to write code like this (and so do I)…

 

int? button = Guide.ShowMessageBox("Save Game",
                                   "Do you want to save your progress?",
                                   new string[] { "OK", "Cancel" },
                                   0, MessageBoxIcon.None);

if (button == 0)
{
    StorageDevice storageDevice = Guide.ShowStorageDeviceSelector();

    if (storageDevice != null)
    {
        using (StorageContainer storageContainer = storageDevice.OpenContainer("foo"))
        {
            ...
        }
    }
}

 

 

It turns out that making async code work almost like this isn’t too bad to do. It basically involves creating a static class to encapsulate all of the various things you need to keep track of. Here is the fairly well commented code for the static class.

class SimpleMessageBox
  {
    private static int? dialogResult = null;
    public static bool Showing { get; set; }

    public static int? ShowMessageBox(string title, string text, IEnumerable<string> buttons, int focusButton, MessageBoxIcon icon)
    {
      // don't do anything if the guide is visible - one issue this handles is showing dialogs in quick
      // succession, we have to wait for the guide to go away before the next dialog can display
      if (Guide.IsVisible) return null;

      // if we have a result then we're all done and we want to return it
      if (dialogResult != null)
      {
        // preserve the result
        int? saveResult = dialogResult;

        // reset everything for the next message box
        dialogResult = null;
        Showing = false;

        // return the result
        return saveResult;
      }

      // return nothing if the message box is still being displayed
      if (Showing) return null;

      // otherwise show it
      Showing = true;
      Guide.BeginShowMessageBox(title, text, buttons, focusButton, icon, MessageBoxEnd, null);
      return null;
    }

    private static void MessageBoxEnd(IAsyncResult result)
    {
      dialogResult = Guide.EndShowMessageBox(result);

      // if no button was pressed then we want the result to be -1
      if (dialogResult == null)
        dialogResult = -1;
    }
  }

 

Using the class involves calling SimpleMessageBox.ShowMessageBox(…) in your Update() method. You continue to call it each frame until it returns a result. This does require some game state information (i.e. your game state is SaveGameState or something similar) so it takes a little extra work, but you have to keep track of those sorts of states anyway.

Here’s a sample of the usage:

protected override void Update(GameTime gameTime)
    {
      base.Update(gameTime);

      if (saveGame)
      {
        // show the message box - we end up calling this each frame as long as we're in the saveGame state - it will
        // return null until the user presses a button or closes the guide - it returns -1 if the guide
        // is closed, otherwise it returns the button number
        int? button = SimpleMessageBox.ShowMessageBox("Save Game", "Do you want to save your progress?",
                                                      new string[] { "OK", "Cancel", "Repeat" }, 0, MessageBoxIcon.None);

        switch (button)
        {
          case -1:
            message = "No Button";
            saveGame = false;
            break;

          case 0:
            message = "Saved";
            saveGame = false;
            break;

          case 1:
            message = "Cancelled";
            saveGame = false;
            break;

          case 2:
            message = "Repeat";
            break;
        }
      }
    }

 

I haven’t used this code in a real project yet (just the sample), but it seems like it would work in quite a few situations. It’s a bit different than doing a message box in Windows since you have to realize you’re calling the ShowMessageBox method each frame. That aside, you can almost imagine that you’re using a blocking message box function.

Download Sample Project

Lens Flare Occlusion Using Texture Masking and XNA

May 30, 2009 by Dave Carlile · Leave a Comment
Filed under: C#, Game Programming, XNA 

I have an article posted on Ziggyware that discusses an alternate method to hardware occlusion queries for checking sun visibility to control lens flare intensity. The article is part of a contest so I can’t post it here until the contest has been over for a while. I can link to it however.

Lens Flare Occlusion Using Texture Masking and XNA

Hope everyone enjoys it.

Sprite Splitting with SpriteBatch

May 5, 2009 by Dave Carlile · 3 Comments
Filed under: C#, Game Programming, XNA 

Someone over in the XNA forums asked a question about how to make sprite explosions like those in the old Defender arcade game, where the sprite is broken into pieces and exploded everywhere.

This effect can be done using nothing more than the XNA SpriteBatch class. One of the overloaded Draw() methods allows you to pass a source rectangle. When drawing a sprite you can use the source rectangle to grab just a part of it. So it’s a simple matter to use multiple draw calls on a single sprite to draw little pieces of it, like so:

// size of each piece in pixels, and how far apart to draw the pieces
int xInc = 8;
int yInc = 8;
float spacing = 1.5f;

// draw parts of the sprite
for (int x = 0; x < face.Width; x += xInc)
  for (int y = 0; y < face.Height; y += yInc)
  {
    Vector2 position = new Vector2(100 + x * spacing, 150 + y * spacing);
    Rectangle source = new Rectangle(x, y, xInc, yInc);
    spriteBatch.Draw(face, position, source, Color.White);
  }

“Multiple draw calls” sounds bad, but SpriteBatch is able to batch up the draw calls so you shouldn’t notice any real effect on performance.
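One thing to watch for: if the sprite’s width or height isn’t an exact multiple of the piece size, the last row or column of source rectangles will extend past the edge of the texture. Here is a sketch of clamping the piece size at the edges. The Piece struct is a stand-in for XNA’s Rectangle so the code runs without the framework, and SpriteSplitter is a made-up name (it is not the SpriteExploder class from the sample):

```csharp
using System;
using System.Collections.Generic;

// Stand-in for XNA's Rectangle so the sketch runs without the framework.
struct Piece
{
    public int X, Y, Width, Height;

    public Piece(int x, int y, int width, int height)
    {
        X = x; Y = y; Width = width; Height = height;
    }
}

// Hypothetical helper, for illustration only.
static class SpriteSplitter
{
    // Builds the source rectangles for a sprite, shrinking the pieces on
    // the right and bottom edges so they never reach outside the sprite.
    public static List<Piece> Split(int spriteWidth, int spriteHeight, int xInc, int yInc)
    {
        List<Piece> pieces = new List<Piece>();

        for (int x = 0; x < spriteWidth; x += xInc)
            for (int y = 0; y < spriteHeight; y += yInc)
            {
                int w = Math.Min(xInc, spriteWidth - x);
                int h = Math.Min(yInc, spriteHeight - y);
                pieces.Add(new Piece(x, y, w, h));
            }

        return pieces;
    }
}
```

A 32×32 sprite with 8×8 pieces produces 16 full-size pieces, while a 20×8 sprite produces three pieces with the last one only 4 pixels wide.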

You can download a sample XNA project to see this in action. The project also includes a SpriteExploder class that will automatically explode your sprite into multiple pieces and throw them about the screen.


Download Sample Project
