OpenGL 5: Hands On

Watching our cube rotate about is quite mesmerising but it’s not very interactive. It’d be much more interesting if we could choose where and how to look at it ourselves. To do that we’re going to need to introduce another coordinate system and get familiar with the event loop.

When we’re done we’ll be able to move around and view our cube from any position or angle. As usual we’re going to look at a preview and you’re going to ask how does it show that? Click into the window below and move around with WASD or your arrow keys! You can get back by right clicking or pressing Escape.

He's Doing It Backwards

If you cast your mind all the way back to A Matter Of Perspective in Chapter 3 we introduced the camera just to create the Projection Transformation. Now we’re going to allow the camera to move around in world space too but how we enable this might surprise you. Unless you read the title of this section of course.

When we move the camera about, say to (0, 0, 1.5) in world space, we still need to convert everything into the Camera Space we discussed before which requires the camera to be at (0, 0, 0). So how do we reconcile this difference? We move everything else by the reverse of the camera’s position!

As we’ve done with every other transformation this is done by multiplying by a matrix. The big advantage of using a matrix here, instead of just subtracting a position vector, is we can also include the camera’s orientation. Rotating the camera clockwise is the same as rotating the entire world anti-clockwise around the camera’s position.

So how do we get this magical reverse transformation matrix? We’re going to Invert the matrix and this will give us exactly what we need. OpenTK provides two methods for this: an in-place Invert() method and an Inverted() method that returns the newly inverted matrix. We’ll be using the second to make things clear.

Matrix4 camera = Matrix4.CreateRotationX(-0.3f)
    * Matrix4.CreateTranslation(0f, 0.5f, 1.5f);
Matrix4 view = camera.Inverted();

If you’re not comfortable with the creation of the camera matrix then check back with Chapter 3. Once it’s inverted we name it view and you’ll see this matrix a lot. This view or viewMatrix completes the chain of transformations to take us from Model Space all the way back to NDC.
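If you’d like to convince yourself that the inversion really does put the camera at the origin, here’s a minimal sketch you could run as a small console program. It assumes only the OpenTK.Mathematics types used above; the names cameraInWorld, cameraInView and roundTrip are made up for the illustration.

```csharp
using System;
using OpenTK.Mathematics;

// Same camera matrix as above: tilt down slightly, then move to (0, 0.5, 1.5).
Matrix4 camera = Matrix4.CreateRotationX(-0.3f)
    * Matrix4.CreateTranslation(0f, 0.5f, 1.5f);
Matrix4 view = camera.Inverted();

// OpenTK uses row vectors, so a point is transformed as point * matrix.
Vector4 cameraInWorld = new Vector4(0f, 0.5f, 1.5f, 1f);
Vector4 cameraInView = cameraInWorld * view;

// A matrix multiplied by its inverse should be (approximately) the identity.
Matrix4 roundTrip = camera * view;

Console.WriteLine(cameraInView); // close to (0, 0, 0, 1)
Console.WriteLine(roundTrip);
```

The camera’s own world position lands on the origin of Camera Space, which is exactly the “move everything else by the reverse” idea from earlier.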

As we’re moving our camera we can remove the translation component from our cube’s transformation matrix.

// you might have different values here
Matrix4 translation = Matrix4.CreateRotationX(-2.2f)
    * Matrix4.CreateRotationY(0.7f);

With the matrix set up on the C# side we need to do the now usual process of sending it to the Vertex Shader. You should know every step of this well by now. We’ll grab the uniform location during setup and upload the matrix to it every frame.

// setup code
int uView = GL.GetUniformLocation(shader.Id, "view");

// loop code
GL.UniformMatrix4f(uView, 1, true, view);

Now we can update our Vertex Shader to match:

// new uniform
uniform mat4 view;

void main ()
{
    uvcoord = uv;
    gl_Position = vec4(position, 1.0) * rotation * view * projection;
}

Make sure you get the multiplication order correct! Try shuffling it around and see how it breaks.
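To see why the order matters without touching the shader, here’s a rough sketch on the C# side. It assumes the same row-vector convention as the shader code above (point first, matrices after); the corner point and the two result names are invented for the example.

```csharp
using System;
using OpenTK.Mathematics;

// A quarter turn for the model and a view made from a translated camera.
Matrix4 rotation = Matrix4.CreateRotationY(MathHelper.PiOver2);
Matrix4 view = Matrix4.CreateTranslation(0f, 0.5f, 1.5f).Inverted();

// One corner of a hypothetical model, in Model Space.
Vector4 corner = new Vector4(1f, 0f, 0f, 1f);

// Model rotation first, then the view: the order used in the shader.
Vector4 correct = corner * rotation * view;

// Swapping the two matrices rotates the point around the wrong origin.
Vector4 swapped = corner * view * rotation;

Console.WriteLine(correct);
Console.WriteLine(swapped);
```

The two printed positions differ, which is the same kind of breakage you’d see on screen after reordering the shader’s multiplication.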

Very Classy

It’s been a moment since we created a new class so let’s fix that with a Camera class. There’s quite a lot of information that goes into a camera that we’ve currently got spread all over the place.

using OpenTK.Mathematics;

class Camera
{
    Vector3 position;
    float yaw;
    float pitch;

    Matrix4 projection;
    Matrix4 transform;
    Matrix4 view;

    float near = 0.1f;
    float far = 100f;

    public Camera(float aspectRatio)
    {
        position = new Vector3(0f, 0.5f, 1.5f);
        yaw = 0f;
        pitch = 0f;

        projection = Matrix4.CreatePerspectiveFieldOfView(MathHelper.DegreesToRadians(90f), aspectRatio, near, far);

        Update();
    }

    // the same as our previous code in the update loop
    private void Update()
    {
        Matrix4 camera = Matrix4.CreateRotationX(-0.3f)
            * Matrix4.CreateTranslation(0f, 0.5f, 1.5f);
        view = camera.Inverted();
    }

    public Matrix4 Projection => projection;
    public Matrix4 View => view;
}

Unlike the other classes we’ve made this one needs to be instantiated a bit earlier. We’ll need to use it when we HandleEvents so we’ll new it up just before then. To provide the correct aspect ratio we’ll need to ask OpenTK to give us the ClientSize width and height. It’s important here to use ClientSize and not Size which includes all of the window decorations like title bars and borders which would give us the wrong aspect ratio.

Toolkit.Window.SetMode(window, WindowMode.Normal);

// this isn’t setup code!
Toolkit.Window.GetClientSize(window, out Vector2i clientSize);
Camera camera = new Camera((float)clientSize.X/clientSize.Y);

void HandleEvents(PalHandle? handle, PlatformEventType type, EventArgs args)

We can then remove the previous projection and view matrix calculations as well as updating our uniform values:

GL.UniformMatrix4f(uProjection, 1, true, camera.Projection);
GL.UniformMatrix4f(uView, 1, true, camera.View);

If you get lost remember you can check the complete code for this chapter.

The First Person

With our camera all set up we can finally get our hands on it. We’ll start by adding a new case to the inside of HandleEvents that will match MouseMoveEventArgs.

switch (args)
{
    case CloseEventArgs close:
        Toolkit.Window.Destroy(window);
        break;
    case MouseMoveEventArgs mouseMove:
        break;
}

Let’s look at what mouseMove has to offer: a Vector2 named ClientPosition and a handle to the Window that generated the event. As we’ve only got one window we don’t have to worry about the second one so let’s take a closer look at the first. ClientPosition comes with some important documentation:

The new position of the mouse cursor in client coordinates. Use IWindowComponent.ClientToScreen(WindowHandle, Vector2, out Vector2) and IWindowComponent.ClientToFramebuffer(WindowHandle, Vector2, out Vector2) to convert to the respective coordinate spaces. When using CursorCaptureMode.Locked this property will contain a virtual mouse position and will not correspond an actual location in client coordinates.

There’s some useful info about handling coordinate transforms but we’re interested in the final part about CursorCaptureMode. Right now this value is going to contain values representing how far the mouse is away from the top left of our window. We need to switch the mode to Locked so it contains uncapped values. To do that we’ll set the mode before we get into the setup code.

Camera camera = new ((float)clientSize.X/clientSize.Y);

// new code
Toolkit.Window.SetCursorCaptureMode(window, CursorCaptureMode.Locked);
Toolkit.Window.SetCursor(window, null);

We’re also going to hide the cursor by passing null into SetCursor so it doesn’t block the view of our favourite cube. With the ClientPosition values now in virtual mode we need to calculate how far the mouse moved each frame as this will determine how much the orientation is changed by. This requires knowing the previous position and that needs to be kept outside of our loop. Add a Vector2 last to the pre-setup code so we can hold onto this data:

Vector2 last = Vector2.Zero;

Finally we can return to HandleEvents to join everything together. We’ll start by calculating the difference in position between the last frame and the ClientPosition. This diff is what we’ll send to a new Look method on our Camera class. Then we need to update the last value to keep everything going. I’ll also add a scaling factor to control the sensitivity which you might need to adjust.

case MouseMoveEventArgs mouseMove:
    Vector2 diff = mouseMove.ClientPosition - last;
    camera.Look(diff / 1000f);
    last = mouseMove.ClientPosition;
    break;

Unfortunately just calling a method doesn’t will it into existence so now head over to our Camera class and add in the matching Look definition:

public void Look(Vector2 delta)
{
    yaw -= delta.X;
    pitch -= delta.Y;

    Update();
}
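If the bookkeeping around last feels a bit abstract, this small sketch replays a made-up stream of cursor positions through the same subtract-and-store steps. The positions array and the sensitivity constant are assumptions for the example, and seeding last with the first position (rather than zero) is just one possible way to avoid a large jump on the very first event, not something the chapter’s code does.

```csharp
using System;
using OpenTK.Mathematics;

// Hypothetical virtual cursor positions, in the order the events might arrive.
Vector2[] positions =
{
    new Vector2(400f, 300f),
    new Vector2(410f, 300f), // moved 10 to the right
    new Vector2(410f, 285f), // moved 15 up
};

Vector2 last = positions[0]; // assumption: seed with the first position
float yaw = 0f;
float pitch = 0f;
const float sensitivity = 1f / 1000f;

foreach (Vector2 current in positions)
{
    // Same steps as the MouseMoveEventArgs case and Look method.
    Vector2 diff = (current - last) * sensitivity;
    yaw -= diff.X;
    pitch -= diff.Y;
    last = current;
}

Console.WriteLine($"yaw = {yaw}, pitch = {pitch}");
```

Moving the mouse right makes yaw go negative and moving it up makes pitch go positive, which matches the subtraction in Look.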

Note that we’re subtracting the delta value not adding it. If you want to invert your mouselook this is one of the places to do that by adding the Y value instead. Finally we need to update our Update to use the yaw and pitch values. To save a bit of compute power I’ll be doing our rotations using a Matrix3 then setting the position directly with the 4th row. I’m also splitting the view matrix, that’s the inverted one, from the transform matrix to use later on.

private void Update()
{
    Matrix3 rotation =
        Matrix3.CreateRotationX(pitch) *
        Matrix3.CreateRotationY(yaw);

    transform = new Matrix4(rotation);
    transform.Row3 = new Vector4(position, 1);

    view = transform.Inverted();
}
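As a sanity check on the Matrix3 shortcut, here’s a small sketch comparing it with the long way round using three Matrix4 multiplications. The yaw, pitch and position values are arbitrary and the compact and full names are invented for the comparison.

```csharp
using System;
using OpenTK.Mathematics;

Vector3 position = new Vector3(0f, 0.5f, 1.5f);
float yaw = 0.4f;
float pitch = -0.3f;

// The approach from Update: rotate with a Matrix3, then write the position into the 4th row.
Matrix3 rotation = Matrix3.CreateRotationX(pitch) * Matrix3.CreateRotationY(yaw);
Matrix4 compact = new Matrix4(rotation);
compact.Row3 = new Vector4(position, 1f);

// The long way round: full Matrix4 rotations followed by a translation.
Matrix4 full = Matrix4.CreateRotationX(pitch)
    * Matrix4.CreateRotationY(yaw)
    * Matrix4.CreateTranslation(position);

// Add up how far apart the rows are.
float difference =
    (compact.Row0 - full.Row0).Length + (compact.Row1 - full.Row1).Length +
    (compact.Row2 - full.Row2).Length + (compact.Row3 - full.Row3).Length;

Console.WriteLine(difference); // expected to be at or very near zero
```

Both routes should give the same transform; the Matrix3 version simply skips multiplying rows and columns that never change.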

With everything now hooked together run the project and see how it feels. You’ll need to quit using the keyboard which we’ll improve in the next part.

Letting Go

As fun as moving our camera around is, permanently losing our mouse whenever the window is open is a bit of a pain. To fix that we’ll require a key to be pressed to enable mouse look. We’ll do this by listening to a pair of EventArgs called, unsurprisingly, KeyDownEventArgs and KeyUpEventArgs. Add these two to your switch:

case KeyDownEventArgs keyDown:
    if(keyDown.IsRepeat) break;
    break;
case KeyUpEventArgs keyUp:
    break;

I’ve already added a check for IsRepeat as KeyDown events are fired by the OS not just when the user presses the key down to begin with but again and again until they release it. We don’t want to do anything with these so if it’s a repeated event break away. KeyDownEventArgs has a number of properties including Key and Scancode which provide similar but not identical information. We’re going to use Scancode and I’m not even going to attempt to explain the difference! Add a switch to both events to work out which key (scancode) is changing:

switch (keyDown.Scancode)
{
    case Scancode.LeftAlt:
        break;
}

I’m going to use LeftAlt to control my mouse look but you can pick any key. Have a look through the list to see the full range available. Changing the CursorCaptureMode is as simple as you expect - set Locked inside the key down and Normal on key up. The cursor needs a little bit more work as we need to get a copy of the default cursor so we can restore it later on. Update the existing grab code by removing it and replacing with:

/* removed code
Toolkit.Window.SetCursorCaptureMode(window, CursorCaptureMode.Locked);
Toolkit.Window.SetCursor(window, null);
*/
// new code
CursorHandle defaultCursor = Toolkit.Cursor.Create(SystemCursorType.Default);

Now we can clear and restore the cursor in our event handlers alongside the capture mode:

case KeyDownEventArgs keyDown:
    if(keyDown.IsRepeat) break;
    switch (keyDown.Scancode)
    {
        case Scancode.LeftAlt:
            Toolkit.Window.SetCursorCaptureMode(window, CursorCaptureMode.Locked);
            Toolkit.Window.SetCursor(window, null);
            break;
    }
    break;
case KeyUpEventArgs keyUp:
    switch (keyUp.Scancode)
    {
        case Scancode.LeftAlt:
            Toolkit.Window.SetCursorCaptureMode(window, CursorCaptureMode.Normal);
            Toolkit.Window.SetCursor(window, defaultCursor);
            break;
    }
    break;

Although this does restore our default mouse behaviour the Camera is still moving around. We need a grabbed value that we can check against in the mouse move section. Add a bool grabbed next to defaultCursor and set it in our two switch cases then finally check for grabbed before calling Look.

case MouseMoveEventArgs mouseMove:
    Vector2 diff = mouseMove.ClientPosition - last;
    if(grabbed) camera.Look(diff / 1000f);
    last = mouseMove.ClientPosition;
    break;

If everything lines up you’ll have control of your mouse unless you press your grab button. Can you make it work the opposite way?

Get A Move On

We’re not avant garde producers that never move the camera so let’s add in this functionality. We’ll be implementing a classic WASD controlled camera so let’s jump straight into expanding our HandleEvents switch:

case KeyDownEventArgs keyDown:
    if(keyDown.IsRepeat) break;
    switch (keyDown.Scancode)
    {
        case Scancode.LeftAlt:
            Toolkit.Window.SetCursorCaptureMode(window, CursorCaptureMode.Locked);
            Toolkit.Window.SetCursor(window, null);
            grabbed = true;
            break;
        case Scancode.W: break;  // forwards?
        case Scancode.S: break;
        case Scancode.A: break;
        case Scancode.D: break;
    }
    break;

Now that we’re happy our W key is down let’s make the camera move forwards! We’ll add a new method to the Camera class to handle movement for us:

public void Move(Vector3 move)
{
    position += move * 0.05f;
    Update();
}

Although we’re only doing forwards and sideways movement we’ll use a Vector3 to make adding vertical movement possible later. As with the camera orientation we need a scaling factor to bring the movement speed down to something sensible. Could you implement some sort of sprinting behaviour with what we’ve covered so far? Now head back to our switch and have a go at hooking it up:

switch (keyDown.Scancode)
{
    // [cursor code]
    // forwards is -Z and right is +X
    case Scancode.W: camera.Move(new Vector3(0f, 0f, -1f)); break;
    case Scancode.S: camera.Move(new Vector3(0f, 0f, 1f)); break;
    case Scancode.A: camera.Move(new Vector3(-1f, 0f, 0f)); break;
    case Scancode.D: camera.Move(new Vector3(1f, 0f, 0f)); break;
}

Run the project and give it a spin. How does moving around work with this setup and what are the problems? The big one is the camera doesn’t go where it’s looking, which we’ll fix in the next section. Another problem is we can only move in one direction at once and we’ll fix this one now.

Head back up to where we store the last value for mouse movement and add a Vector3 named move. As the name suggests we’ll be storing the movement value in this:

Vector2 last = Vector2.Zero;
Vector3 move = Vector3.Zero;

Now in our loop code we can call camera.Move with this value:

// loop code
camera.Move(move);

Finally head back to our HandleEvents and we’ll modify this value instead:

case Scancode.W: move.Z -= 1f; break;
case Scancode.S: move.Z += 1f; break;
case Scancode.A: move.X -= 1f; break;
case Scancode.D: move.X += 1f; break;

Run the project now and awayyyy we go! There’s no stopping us which is a bit of a problem but the solution is straightforward. We’ll add matching cases to the key up event with the reverse values:

case KeyUpEventArgs keyUp:
    switch (keyUp.Scancode)
    {
        case Scancode.LeftAlt:
            Toolkit.Window.SetCursorCaptureMode(window, CursorCaptureMode.Normal);
            Toolkit.Window.SetCursor(window, defaultCursor);
            grabbed = false;
            break;
        case Scancode.W: move.Z += 1f; break;
        case Scancode.S: move.Z -= 1f; break;
        case Scancode.A: move.X += 1f; break;
        case Scancode.D: move.X -= 1f; break;
    }
    break;

If you’re not moving in the direction you expect check the -= and += are correct. The typical values for S and D are positive in OpenGL (Right Handed) but not everywhere else. Can you implement vertical (up and down) movement?
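Here’s a small sketch of how the key down and key up cases balance each other out. The Apply helper and the char keys are stand-ins invented for the example so it can run without a window; the signs match the switch cases above.

```csharp
using System;
using OpenTK.Mathematics;

Vector3 move = Vector3.Zero;

// Hypothetical helper standing in for the key down / key up switch cases.
void Apply(char key, bool down)
{
    float sign = down ? 1f : -1f; // key up applies the reverse value
    switch (key)
    {
        case 'W': move.Z -= sign; break;
        case 'S': move.Z += sign; break;
        case 'A': move.X -= sign; break;
        case 'D': move.X += sign; break;
    }
}

// Hold W and D together: forwards and right at the same time.
Apply('W', true);
Apply('D', true);
Vector3 diagonal = move;

// Release both keys and the movement cancels back to zero.
Apply('W', false);
Apply('D', false);

Console.WriteLine(diagonal); // (1, 0, -1)
Console.WriteLine(move);     // (0, 0, 0)
```

One thing this makes visible is that the diagonal vector is longer than a single-axis one, so diagonal movement is a little faster. Normalising move before using it would be one way to even that out if it bothers you.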

Go Where You’re Looking

As slick as our camera movement is now there’s still a slight massive issue: it doesn’t go where we’re looking! The movement is locked down to the X and Z axes which might be OK for an RTS style camera but for an FPS style free camera it doesn’t cut the mustard. The fix for this is surprisingly simple as we already have everything we need.

To know what we should be adding to position to move in the camera’s orientation we create a new coordinate system built on the camera. All that’s needed to create a new coordinate system are three vectors called front, up and right and for once they have very descriptive names. Cast your mind back to the Update function and remember how we saved a transform matrix? We can use this to extract the vectors we need:

Vector3 right = transform.Row0.Xyz;
Vector3 up = transform.Row1.Xyz;
Vector3 front = transform.Row2.Xyz;

The Xyz property here is called a swizzle and gives a shorthand way of pulling components out of a Vector. There’s a whole bunch of them available both in OpenTK and in GLSL!

We only want to move along these vectors when asked so we’ll multiply them by their matching move component before adding them all together. Our complete Move method now looks like this:

public void Move(Vector3 move)
{
    Vector3 right = transform.Row0.Xyz;
    Vector3 up = transform.Row1.Xyz;
    Vector3 front = transform.Row2.Xyz;

    Vector3 direction = (move.X * right) + (move.Y * up) + (move.Z * front);

    position += direction * 0.05f;

    Update();
}
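To check the row extraction does what we want, this sketch turns a hypothetical camera a quarter turn to the left and asks which way W would now take it. Only the rotation part of the transform is built here as the position doesn’t affect the direction.

```csharp
using System;
using OpenTK.Mathematics;

// A camera that has turned 90 degrees to its left (positive yaw).
float yaw = MathHelper.PiOver2;
Matrix4 transform = Matrix4.CreateRotationY(yaw);

Vector3 right = transform.Row0.Xyz;
Vector3 up = transform.Row1.Xyz;
Vector3 front = transform.Row2.Xyz;

// W sets move.Z to -1 in our event handler.
Vector3 move = new Vector3(0f, 0f, -1f);
Vector3 direction = (move.X * right) + (move.Y * up) + (move.Z * front);

Console.WriteLine(direction); // roughly (-1, 0, 0)
```

A camera that started out facing negative Z and turned left is now facing negative X, so that’s where forwards should take it.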

If we keep looking up or down we’ll go head over heels and turn upside down. Once upside down our yaw movement feels reversed, although it is correct, which can be confusing. Before we wrap up we can add some limits to prevent this. We’ll also make sure our yaw values stay within a sensible range. Both are done in the Look method - remember we use radians:

public void Look(Vector2 delta)
{
    yaw -= delta.X;
    pitch -= delta.Y;

    yaw = MathHelper.NormalizeRadians(yaw);
    pitch = float.Clamp(pitch, -1.5f, 1.5f);

    Update();
}
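If you’re wondering where 1.5 comes from, it’s just a value slightly under a quarter turn. This little sketch keeps pushing the pitch upwards and shows the clamp holding it; the loop and step size are made up for the demonstration and it uses Math.Clamp, which behaves the same as float.Clamp here.

```csharp
using System;

float pitch = 0f;
const float limit = 1.5f; // a little under MathF.PI / 2 (about 1.5708)

// Pretend the mouse keeps moving up for 100 events.
for (int i = 0; i < 100; i++)
{
    float deltaY = -0.05f;
    pitch -= deltaY;
    pitch = Math.Clamp(pitch, -limit, limit);
}

Console.WriteLine(pitch);                     // 1.5
Console.WriteLine(limit * 180f / MathF.PI);   // roughly 85.9 degrees
```

Stopping just short of straight up means the camera never tips over the top, so the yaw never appears to flip.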

Look At Me

We’ve created our view matrix by inverting a transform matrix but I’m going to let you in on a little secret. A lot of people don’t create their view matrix this way. Before I spill on how they do it let’s quickly look at the advantages of our current way.

Creating the camera’s position and rotation matrix the same way as every other object. As your project grows being able to use the same code, however you do it, is always an advantage.
All the calculations are visible. The rotation is applied directly using matrix composition but can be swapped out for any other way of creating a matrix.

Those sound like pretty solid advantages so what’s the alternative? There’s a method for generating the view matrix on Matrix4 called LookAt. As the name suggests this creates a transformation matrix to look at a specific point from another point.

The main advantage of this method is it being a single method and for basic camera operations it can be straightforward to use. The main disadvantage is practically the same thing: it’s a single method that behaves very differently to the other transforms. Another less obvious disadvantage is how it behaves when looking directly up or down as this can produce unintentional roll with the camera suddenly snapping upside down.

With that out of the way let’s have a look at it:

Matrix4 Matrix4.LookAt(Vector3 eye, Vector3 target, Vector3 up)
Build a world space to camera space matrix.

Returns:
A Matrix4 that transforms world space to camera space.

We need to provide three vectors. The eye vector is where the camera is located so that’s our existing position value. I’ll skip over target for a moment as we saw up in our previous camera and it means the same here. Coming back to target, it represents a point in world space that the camera is facing towards. If we wanted a camera to always look at a specific spot this would be that but we want FPS style movement. This requires working out the front vector so we can position the target some distance along it.

As scary as all that sounds the calculations required are basic trigonometry with Sin and Cos coming to our rescue. We’ll work out the front vector first, then do some vector maths to derive the other two from it. Instead of keeping track of the transform matrix we’ll be holding on to our trio of vectors.

private void Update()
{
    float x = MathF.Cos(pitch) * MathF.Sin(yaw);
    float y = MathF.Sin(pitch);
    float z = MathF.Cos(pitch) * MathF.Cos(yaw);

    front = Vector3.Normalize(new Vector3(x, y, z));
    right = Vector3.Normalize(Vector3.Cross(front, Vector3.UnitY));
    up = Vector3.Normalize(Vector3.Cross(right, front));

    view = Matrix4.LookAt(position, position + front, up);
}

public void Move(Vector3 move)
{
    // front now points where the camera looks, so forwards (move.Z = -1) needs flipping
    Vector3 direction = (move.X * right) + (move.Y * up) - (move.Z * front);

    position += direction * 0.05f;

    Update();
}

Note that front now points where the camera is looking, whereas Row2 of our transform matrix pointed behind it, so our move.Z value needs to be negated. With these formulas the camera also faces along positive Z when yaw is zero, so change the starting yaw value to face negative Z by setting it to Pi aka 180 degrees: yaw = MathF.PI;

I’m not including the entire camera class here so you’ll need to do some housekeeping to get this version to compile. With everything working the behaviour should be identical.
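If you want some evidence the two approaches agree before doing all that housekeeping, here’s a rough sketch that builds the view matrix both ways and compares them. It assumes the LookAt version’s yaw is offset by Pi, as with the starting value above, and it passes Vector3.UnitY straight to LookAt rather than the derived up vector. The MaxRowDifference helper and the test values are invented for the comparison.

```csharp
using System;
using OpenTK.Mathematics;

Vector3 position = new Vector3(0f, 0.5f, 1.5f);
float yaw = 0.4f;
float pitch = 0.3f;

// Version 1: invert the camera's transform matrix.
Matrix4 transform = Matrix4.CreateRotationX(pitch)
    * Matrix4.CreateRotationY(yaw)
    * Matrix4.CreateTranslation(position);
Matrix4 viewFromInverse = transform.Inverted();

// Version 2: LookAt, with yaw offset by Pi so yaw = 0 faces -Z in both versions.
float lookYaw = yaw + MathF.PI;
Vector3 front = new Vector3(
    MathF.Cos(pitch) * MathF.Sin(lookYaw),
    MathF.Sin(pitch),
    MathF.Cos(pitch) * MathF.Cos(lookYaw));
Matrix4 viewFromLookAt = Matrix4.LookAt(position, position + front, Vector3.UnitY);

// Largest distance between matching rows of two matrices.
float MaxRowDifference(Matrix4 a, Matrix4 b) => MathF.Max(
    MathF.Max((a.Row0 - b.Row0).Length, (a.Row1 - b.Row1).Length),
    MathF.Max((a.Row2 - b.Row2).Length, (a.Row3 - b.Row3).Length));

Console.WriteLine(MaxRowDifference(viewFromInverse, viewFromLookAt)); // expected to be near zero
```

This only holds while the pitch stays short of straight up or down, which is where LookAt starts to misbehave.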

Recap

Did you feel something was missing from this chapter? There was a tiny change to our shader at the start then everything else happened in C# land. Although an important concept in graphics in general the camera is a convenience for us humans who can’t do matrix composition in our heads and not a fundamental part of OpenGL. We managed to get through 4 whole chapters without ever using one.

How you use your camera will depend on what you need to accomplish. Don’t limit yourself to just one camera either - multiple camera types are common.

We also looked at handling keyboard input. Our basic implementation can’t deal with rebinding keys which is a pretty useful feature. OpenTK has support for joysticks and gamepads - could you replace your mouse input with them?

Continue with Chapter 6: Down The Catwalk

Originally published on 2025/04/27