Tutorial: Introduction to OpenGL


Transformations and animation

Let's spice it up a bit! Type in or copy/paste the following and run it:

dim angle#
while true
    glClear (GL_COLOR_BUFFER_BIT or GL_DEPTH_BUFFER_BIT)
    glLoadIdentity ()
    glTranslatef (0, 0, -4)
    glRotatef (angle#, 0, 1, 0)
    glBegin (GL_TRIANGLE_FAN)
        glColor3f (0,.5, 1): glVertex3f ( 0, 1, 0)
        glColor3f (1, 0, 0): glVertex3f (-1,-1, 1)
        glColor3f (1, 1, 1): glVertex3f ( 1,-1, 1)
        glColor3f (0, 0, 1): glVertex3f ( 1,-1,-1)
        glColor3f (0, 1, 0): glVertex3f (-1,-1,-1)
        glColor3f (1, 0, 0): glVertex3f (-1,-1, 1)
    glEnd ()
    SwapBuffers ()
    angle# = angle# + 1
wend

If all goes to plan, you should see a large spinning pyramid (press Esc to exit when you get sick of it). Pretty cool for 17 lines of code if you ask me, but how exactly does it work?
To answer the question we need to introduce a few new concepts.

The animation loop

To keep the screen animating, the program needs to keep running. It can't just draw a picture then stop. Therefore we've set up a loop using "while true" and "wend". In this case it's an infinite loop, so the pyramid will keep spinning forever (well, until you stop it anyway).

Within this loop, we need to draw a single animation frame. This is the same concept as a cartoon animation. We display a number of individual images (which we call "frames"), each one differing slightly from the one before it. By running them together we create the illusion of movement.

In our example, our animation frames are like this:

Frame number    Image
1               A pyramid
2               The same pyramid rotated by 1 degree
3               The same pyramid rotated by 2 degrees
4               ...

To draw each frame, we have to first clear out the previous image:

glClear (GL_COLOR_BUFFER_BIT or GL_DEPTH_BUFFER_BIT)

The two constants, GL_COLOR_BUFFER_BIT and GL_DEPTH_BUFFER_BIT, are "OR"ed together to create a bitmask that tells OpenGL that we want to clear both of those buffers.
GL_COLOR_BUFFER_BIT means that the colours of the image will be cleared to the current clear colour (black by default), and GL_DEPTH_BUFFER_BIT means that the depth buffer will be cleared as well. (I'll explain what the depth buffer does a little bit later.)
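
To see how the bitmask works, here is a small sketch in plain Python (not Basic4GL). The two hexadecimal values are the ones the standard OpenGL headers define for these constants. The helper buffers_to_clear is a made-up name for illustration only.

```python
# Values as defined in the standard OpenGL headers.
GL_DEPTH_BUFFER_BIT = 0x00000100
GL_COLOR_BUFFER_BIT = 0x00004000

def buffers_to_clear(mask):
    """Hypothetical helper: list which of the two buffers a mask selects."""
    names = []
    if mask & GL_COLOR_BUFFER_BIT:
        names.append("colour")
    if mask & GL_DEPTH_BUFFER_BIT:
        names.append("depth")
    return names

# Each constant has a single bit set, so OR-ing them keeps both bits.
mask = GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT
print(hex(mask))               # 0x4100
print(buffers_to_clear(mask))  # ['colour', 'depth']
```

Because each constant occupies its own bit, the order in which they are OR-ed makes no difference.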

We draw the pyramid with this piece of code:

glBegin (GL_TRIANGLE_FAN)
        glColor3f (0,.5, 1): glVertex3f ( 0, 1, 0)
        glColor3f (1, 0, 0): glVertex3f (-1,-1, 1)
        glColor3f (1, 1, 1): glVertex3f ( 1,-1, 1)
        glColor3f (0, 0, 1): glVertex3f ( 1,-1,-1)
        glColor3f (0, 1, 0): glVertex3f (-1,-1,-1)
        glColor3f (1, 0, 0): glVertex3f (-1,-1, 1)
glEnd ()

Most of this should make sense by now, but there are some new concepts to explain:

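GL_TRIANGLE_FAN tells OpenGL to join vertices up into triangles as follows: vertices 1, 2 and 3 form the first triangle, vertices 1, 3 and 4 the second, vertices 1, 4 and 5 the third, and so on.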

So vertex 1 always forms one point of every triangle, and the other vertices "fan" out around the first.
In our example vertex 1 is the top of the pyramid.
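
The fan rule can be sketched in a few lines of plain Python (not Basic4GL). The function name fan_triangles is an assumption made for this illustration. It only lists which vertices end up in which triangle and draws nothing.

```python
def fan_triangles(vertices):
    """Split a triangle-fan vertex list into individual triangles."""
    hub = vertices[0]   # vertex 1 is shared by every triangle
    return [(hub, vertices[i], vertices[i + 1])
            for i in range(1, len(vertices) - 1)]

pyramid = [
    ( 0,  1,  0),   # vertex 1: top of the pyramid
    (-1, -1,  1),
    ( 1, -1,  1),
    ( 1, -1, -1),
    (-1, -1, -1),
    (-1, -1,  1),   # repeats vertex 2 to close the fan
]

for triangle in fan_triangles(pyramid):
    print(triangle)
```

Six vertices give four triangles, one for each side of the pyramid. The base is left open.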

glColor3f() sets the colour of each vertex. It has three parameters, which correspond to the red, green, and blue intensities of the colour respectively, where 1 equals maximum brightness, 0 equals no brightness, and numbers in between correspond to various shades. The 3 components are mixed together to create the final colour.

Finally, SwapBuffers() is called to make the finished frame visible.

Transformations

Transformations are perhaps one of the hardest areas to grasp for people who are new to 3D graphics. They are a flexible and powerful tool however, and it's very hard to do much useful in 3D graphics without them.

A transformation takes a set of vertices and does "something" to them.
The most common types of transformation are translation (moving), rotation (turning) and scaling (stretching or shrinking).

In the example, we used 2 transformations to make the pyramid spin.

glTranslatef (0, 0, -4)

Move all vertices by vector (0, 0, -4) i.e. 4 units into the screen.

glRotatef (angle#, 0, 1, 0)

Rotate all vertices by angle# (in degrees) around axis (0, 1, 0) i.e. the vertical axis.

OpenGL takes these transformations, combines them and applies them to each vertex that we pass in with glVertex commands. The transformations are applied in reverse order, so each vertex gets spun (rotated) around the vertical axis, then moved (translated) into the screen. And when OpenGL joins up the vertices to form the triangles, the triangles are spun and moved too.

If we were really enthusiastic about mathematics we could have spun the pyramid by calculating where each vertex would end up after rotating it by the necessary angle. This would have worked too, but it's easier to let OpenGL do it for us.
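
For the curious, here is roughly what that hand calculation would look like, as a sketch in plain Python (not Basic4GL). The function names are made up for this illustration, and the rotation formula is the standard one for a rotation around the Y axis.

```python
import math

def rotate_y(vertex, angle_deg):
    """Rotate a vertex around the vertical (Y) axis by angle_deg degrees."""
    x, y, z = vertex
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

def translate(vertex, offset):
    """Move a vertex by the given (x, y, z) offset."""
    return tuple(v + o for v, o in zip(vertex, offset))

def spin_and_move(vertex, angle_deg):
    """What the example's two transformations do to one vertex: rotate, then translate."""
    return translate(rotate_y(vertex, angle_deg), (0, 0, -4))

corner = (1, -1, 1)   # one of the base corners of the pyramid
for angle in (0, 45, 90):
    print(angle, [round(n, 3) for n in spin_and_move(corner, angle)])
```

At 90 degrees the corner ends up at about (1, -1, -5): the rotation swings it round to (1, -1, -1), and the translation then pushes it 4 units into the screen.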

Multiple transformations

Q. So what does OpenGL do when more than one transformation is specified?

A. It applies all of them, in reverse order, to each vertex that it receives.

In our example, we've given OpenGL a "Translate" then a "Rotate". Therefore each vertex we send through will be first rotated, then translated (moved).

We only used two transformations in our example, but we could just as easily have used three or ten or one hundred.

OpenGL stores transformations internally in a matrix, and combines them together using matrix multiplication. This doesn't mean we have to have a degree in mathematics to use them, as long as we know what they do.
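
If you would like to peek under the hood anyway, the following plain Python sketch (not Basic4GL) mimics the idea with 4x4 matrices. All function names are assumptions made for this illustration, and the real OpenGL implementation will differ in detail. What it shows is that each new transformation is multiplied onto the right of the current matrix, which is why the last one specified is the first to touch a vertex.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices (lists of rows)."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def apply(m, vertex):
    """Apply a 4x4 matrix to an (x, y, z) vertex, treating it as (x, y, z, 1)."""
    v = list(vertex) + [1]
    return tuple(sum(m[r][k] * v[k] for k in range(4)) for r in range(3))

def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]

def rotation_y(angle_deg):
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]

# Same order as the example: translate first, then rotate.
current = translation(0, 0, -4)
current = mat_mul(current, rotation_y(90))

# Swapped order, for comparison.
swapped = mat_mul(rotation_y(90), translation(0, 0, -4))

vertex = (1, -1, 1)
print([round(n, 6) for n in apply(current, vertex)])
print([round(n, 6) for n in apply(swapped, vertex)])
```

With the example's order the vertex is rotated and then moved, landing at about (1, -1, -5). With the order swapped it is moved first and then swung around the origin, landing at about (-3, -1, -1).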

The standard OpenGL transformations are:

glTranslate (offset_x, offset_y, offset_z)
glRotate (angle, axis_x, axis_y, axis_z)
glScale (scale_x, scale_y, scale_z)

(Like many OpenGL commands, these commands are suffixed with a letter indicating the type of data they accept. So the float version of glTranslate is glTranslatef, and so on.)

You can keep adding on transformations to your heart's content, and OpenGL will keep on faithfully adding them on to the end of its list.
Eventually however, you want to clear out all the existing transformations and start again from scratch. For this OpenGL has the command:

glLoadIdentity ()

This command - in matrix speak - "loads" an "identity" matrix into the current transformation matrix. For those of us who don't speak matrix, it throws out all the existing transformations and replaces them with a special "do-nothing" transformation.
In our example, we use glLoadIdentity immediately before we build our transformations with glTranslatef and glRotatef.
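
A tiny plain Python sketch (not Basic4GL) of what "do-nothing" means here. The helper names are made up for the illustration. The identity matrix leaves every vertex where it was, and combining it with another transformation leaves that transformation unchanged.

```python
IDENTITY = [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def mat_mul(a, b):
    """Multiply two 4x4 matrices (lists of rows)."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def apply(m, vertex):
    """Apply a 4x4 matrix to an (x, y, z) vertex, treating it as (x, y, z, 1)."""
    v = list(vertex) + [1]
    return tuple(sum(m[r][k] * v[k] for k in range(4)) for r in range(3))

# A translation of 4 units into the screen, written out as a matrix.
move_back = [[1, 0, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, -4],
             [0, 0, 0, 1]]

print(apply(IDENTITY, (1, -1, 1)))              # (1, -1, 1)
print(mat_mul(IDENTITY, move_back) == move_back)  # True
```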

Things to try

This is the first "Things to try" section. This is where you'll find examples of things to try on the current example program.

  1. Try changing the numbers in the glTranslatef command. (Note: if you get a blank screen, you've most likely moved the pyramid somewhere that it cannot be seen, for example behind the "view point".)
  2. Try changing the last 3 numbers in the glRotatef command.
    This is the axis of rotation, in other words, what the pyramid will spin around.
  3. Sometimes the order of transformations isn't important, but usually it is.
    Try swapping the glRotatef and glTranslatef lines around so the glRotatef comes first.
  4. Another transformation is the "Scale", represented in OpenGL by the glScale() command:
    glScalef (x, y, z)
    Where x, y and z are the amounts of scaling to apply in the X, Y and Z axis directions. For example:
    1 = No change
    2 = Scale to twice the size
    .5 = Scale to half size
    0 = Shrink to nothing (i.e. flatten)
    Try adding a glScalef (2,1,1) command after the glTranslatef() and before the glRotatef().
  5. Try adding a glScalef (2,1,1) command after the glRotatef()

Camera movement

First let's have a quick recap of the previous section. We will be re-using many of those principles in this section.

So we can set up transformations before drawing objects in order to move them around, rotate them, stretch them, etc.
However, the camera position is always fixed at the origin (that is, the point (0, 0, 0)) and always looks down the negative Z axis, i.e. into the screen. This isn't always what we want. In first person shooters (Quake 3 for example), the camera moves with you as you move around the level. It would be pretty unplayable otherwise. Likewise with racing games, flight simulations, and basically any 3D first person perspective game where you can move around. So we need the ability to move the camera... But how?

Interestingly enough the answer is you can't!

But you can achieve the same effect as moving the camera by shifting and rotating the entire scene, so that it looks as if the camera has moved. This is actually simpler than it sounds.

For example, let's consider a scene from a driving simulation. The car has moved 15 units forward, 20 units to the right and is facing 70 degrees in the clockwise direction, and we want to draw the scene as if the car was the camera.

The car is 15 units forward and 20 right, so the first thing we do is move the entire scene 15 units backwards and 20 left, i.e. in the opposite direction to the car's position. This effectively pulls the car back to the centre of the universe, and moves everything else accordingly.

The last step is to line up the car again. The car is rotated 70 degrees clockwise, so we rotate the scene 70 degrees in the opposite direction, i.e. anticlockwise.

Now the camera is effectively positioned.
So how do we move and rotate the entire scene?

First we recall that once a transformation is set, it affects all vertices that are drawn thereafter. So we want to set up our camera transformations before any of the drawing takes place (or before any other transformations are set up).
So the obvious choice is to do it at the very beginning of our rendered frame, immediately after clearing the screen and resetting the matrix to the identity.

We performed a translation first, then a rotation. We did things in this order because rotation always occurs around the origin (i.e. the centre of the universe), so we made sure the car was there first!
Recall that OpenGL always performs transformations in the reverse order from which they were specified. Therefore we need to specify the rotation first, then the translation.

Putting this all together, we would get something like this:

' Render scene
glClear (GL_COLOR_BUFFER_BIT or GL_DEPTH_BUFFER_BIT)
glLoadIdentity ()
glRotatef (negative camera angle, camera axis)
glTranslatef (negative camera position)

So after all this brain bending, we really only need two extra lines of code to effectively position the camera.
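
To check the reasoning with actual numbers, here is the driving example worked through in plain Python (not Basic4GL). The function name and the landmark are made up for this sketch. It assumes that "right" is +X, "forward" is -Z, and that anticlockwise angles (seen from above) are positive, so the car's 70 degree clockwise turn becomes -70.

```python
import math

def world_to_camera(point, cam_pos, cam_ang_deg):
    """Move the scene by minus the camera position, then rotate it by minus the camera angle."""
    x = point[0] - cam_pos[0]
    y = point[1] - cam_pos[1]
    z = point[2] - cam_pos[2]
    a = math.radians(-cam_ang_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

# The car: 20 units right, 15 units forward, turned 70 degrees clockwise.
car_pos = (20, 0, -15)
car_ang = -70

# A hypothetical landmark 10 units straight ahead of the car.
ahead = (car_pos[0] - math.sin(math.radians(car_ang)) * 10,
         0,
         car_pos[2] - math.cos(math.radians(car_ang)) * 10)

print([round(n, 6) for n in world_to_camera(car_pos, car_pos, car_ang)])
print([round(n, 6) for n in world_to_camera(ahead, car_pos, car_ang)])
```

The car itself lands on the origin, and the landmark lands 10 units straight down the negative Z axis, which is exactly where the fixed OpenGL camera is looking.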

Enough theory. Let's actually do something!
Type in or cut-and-paste the following example:

' Data
dim camX#, camY#, camZ#, camAng#    ' Camera position and direction
dim x, z                            ' Working variables
camX# = 135
camZ# = 50

' Main loop
while true

    ' Clear screen
    glClear (GL_DEPTH_BUFFER_BIT or GL_COLOR_BUFFER_BIT)
    glLoadIdentity ()

    ' Position camera
    glRotatef (-camAng#, 0, 1, 0)
    glTranslatef (-camX#, -camY#, -camZ#)

    ' Draw a city of pyramids
    for z = 1 to 10
        glPushMatrix ()
        for x = 1 to 10
            glBegin (GL_TRIANGLE_FAN)
                glColor3f (0,.5, 1): glVertex3f (  0, 10,  0)
                glColor3f (1, 0, 0): glVertex3f (-10,-10, 10)
                glColor3f (1, 1, 1): glVertex3f ( 10,-10, 10)
                glColor3f (0, 0, 1): glVertex3f ( 10,-10,-10)
                glColor3f (0, 1, 0): glVertex3f (-10,-10,-10)
                glColor3f (1, 0, 0): glVertex3f (-10,-10, 10)
            glEnd ()
            glTranslatef (30, 0, 0)
        next
        glPopMatrix ()
        glTranslatef (0, 0, -30)
    next
    SwapBuffers ()

    ' Move camera
    while SyncTimer (10)
        if ScanKeyDown (VK_LEFT)  then camAng# = camAng# + 1: endif
        if ScanKeyDown (VK_RIGHT) then camAng# = camAng# - 1: endif
        if ScanKeyDown (VK_UP)    then
            camX# = camX# - sind (camAng#) * .5
            camZ# = camZ# - cosd (camAng#) * .5
        endif
        if ScanKeyDown (VK_DOWN)  then
            camX# = camX# + sind (camAng#) * .5
            camZ# = camZ# + cosd (camAng#) * .5
        endif
    wend
wend

When you run this program, you should see a city of pyramids in front of you.
But there's more...!
Press the arrow keys and you'll see that the camera moves around accordingly.
You can now walk around between the pyramids a little bit like a very simple first person shooter! Okay, so it's not exactly Unreal Tournament (and there's nothing to stop you walking right through a pyramid) but it's a step in the right direction.

Let's have a look at what's going on.

You may have spotted several techniques that we've used in previous examples. We have a basic animation loop in which we clear the screen, draw our frame, swap it to the front buffer (i.e. make it visible), and then repeat. When you press the arrow keys, the camera moves slightly between each frame and the progressively changing images give the illusion of movement.

And this is our camera positioning code:

glRotatef (-camAng#, 0, 1, 0)
glTranslatef (-camX#, -camY#, -camZ#)

It is positioned after the glLoadIdentity() line, right where we decided it should be.
The translation shifts the entire scene so that the camera ends up in the centre (at the origin) again.
The rotate then rotates the entire scene so that the camera is lined up with the Z axis, and our camera is effectively positioned.

We've used a few more new features, so I'll go over them now:

Firstly:

glPushMatrix ()
...
glPopMatrix ()

This accesses the OpenGL "matrix stack". This is storage space where we can save a matrix (set of transformations) that we want to use again, and restore it.
In this program we've been using glTranslatef to "walk" along a 2D grid, stopping at each point to draw a pyramid. glPushMatrix () is used to "save our place" at the start of each row, by saving the corresponding set of transformations.
Then when we've finished drawing the row, we restore our position with glPopMatrix (), move to the start of the next row (using glTranslatef (...)) and continue.
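
The "save our place" idea can be sketched in plain Python (not Basic4GL). This is a simplification under stated assumptions: since the grid only uses translations, the current matrix is stood in for by a single (x, y, z) position, and the matrix stack by an ordinary list.

```python
def translate(position, dx, dy, dz):
    """Stand-in for glTranslatef when only translations are involved."""
    return (position[0] + dx, position[1] + dy, position[2] + dz)

stack = []            # stand-in for OpenGL's matrix stack
current = (0, 0, 0)   # stand-in for the current matrix
drawn_at = []         # where each pyramid ends up

for z in range(10):
    stack.append(current)                    # glPushMatrix: remember the row start
    for x in range(10):
        drawn_at.append(current)             # a pyramid is drawn here
        current = translate(current, 30, 0, 0)
    current = stack.pop()                    # glPopMatrix: back to the row start
    current = translate(current, 0, 0, -30)  # step to the next row

print(len(drawn_at))   # 100
print(drawn_at[0], drawn_at[9], drawn_at[10], drawn_at[-1])
# (0, 0, 0) (270, 0, 0) (0, 0, -30) (270, 0, -270)
```

Without the push and pop, the second row would start where the first one ended (300 units to the right) instead of back at the left-hand edge.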

Moving the camera

At the bottom of the main loop we have a section marked "Move camera".
There isn't any actual OpenGL code in this part. It involves calculating the new position of the camera based on the old position and direction and the keyboard input. Such calculations are necessary with OpenGL programs, because OpenGL is a graphics library only. It will not do camera movement, or collision detection, or any other in-game logic for you, which means to write any decent interactive OpenGL program you invariably need to provide some basic 3D logic (and maths) to drive it.

The camera movement is wrapped in a "while SyncTimer (10) ... wend" loop. This is a Basic4GL timing function and is the easiest way to ensure something occurs 100 times per second regardless of the computer speed. Otherwise the camera would move slowly on older and slower computers and faster on newer faster ones, which is not what we want.
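
To illustrate why this keeps the speed constant, here is a sketch of the general fixed-time-step idea in plain Python (not Basic4GL). The FixedStepTimer class is a hypothetical stand-in written for this tutorial and is not how Basic4GL actually implements SyncTimer. It simply reports an update as due whenever the clock has run at least 10 milliseconds ahead of the updates performed so far.

```python
class FixedStepTimer:
    """Hypothetical stand-in for a SyncTimer-style function; not Basic4GL's own code."""

    def __init__(self, step_ms):
        self.step_ms = step_ms
        self.caught_up_to = 0   # time (ms) that the updates have reached so far

    def sync(self, now_ms):
        """Return True if another update is due, and count it as done."""
        if now_ms - self.caught_up_to >= self.step_ms:
            self.caught_up_to += self.step_ms
            return True
        return False

def count_updates(frame_end_times_ms, step_ms=10):
    """Count how many updates run, given the time at which each frame finishes."""
    timer = FixedStepTimer(step_ms)
    updates = 0
    for now in frame_end_times_ms:
        while timer.sync(now):   # same shape as "while SyncTimer (10) ... wend"
            updates += 1
    return updates

fast_machine = list(range(5, 1001, 5))     # a frame every 5 ms for one second
slow_machine = list(range(50, 1001, 50))   # a frame every 50 ms for one second
print(count_updates(fast_machine), count_updates(slow_machine))  # 100 100
```

The fast machine draws 200 frames and the slow one only 20, but both run the camera update 100 times in the second, so the camera covers the same distance on each.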
Each time around the loop, we add one to the camera angle if the left arrow is pressed (ScanKeyDown (VK_LEFT)) to turn the camera one degree anticlockwise. Likewise we subtract one if the right arrow is pressed.
To move the camera forward we have to take into account the angle of the camera, and the distance we wish to move, and apply some elementary trigonometry to figure out how much to move the X and Z components by:

	camX# = camX# - sind (camAng#) * .5
	camZ# = camZ# - cosd (camAng#) * .5

This moves the camera position forward by .5 units each time.

This is the first algebra section, which describes the equations used to derive the formulas used in the program. Feel free to skip these sections if you're math-intolerant!

Using the properties of a right-angled triangle:

  • Sin (ang) = X / dist
  • Sin (ang) * dist = X
  • X = Sin (ang) * dist

Likewise we can calculate that:

  • Z = Cos (ang) * dist

Therefore we need to move Sin (ang) * dist units to the left and Cos (ang) * dist units forward.
Recall that the X axis vector points to the right and the Z axis vector points backwards (out of the screen), so we therefore need to subtract Sin (ang) * dist and Cos (ang) * dist from the X and Z coordinates respectively.

Note also that in Basic4GL, the Sin and Cos functions operate in radians. We wish to operate in degrees (to match OpenGL's behaviour). Therefore we use the Sind and Cosd functions which are simply Basic4GL equivalents to Sin and Cos that operate in degrees.
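
The same calculation can be checked in plain Python (not Basic4GL). The function name move_forward is made up for this sketch. Python's math.sin and math.cos also work in radians, so the angle is converted from degrees first, which is the job Sind and Cosd do for us in Basic4GL.

```python
import math

def move_forward(cam_x, cam_z, cam_ang_deg, dist):
    """Return the camera's (x, z) after moving dist units in the direction it faces."""
    ang = math.radians(cam_ang_deg)   # math.sin and math.cos expect radians
    return (cam_x - math.sin(ang) * dist,
            cam_z - math.cos(ang) * dist)

# Facing straight into the screen: only Z changes.
print(move_forward(135, 50, 0, .5))

# Turned 90 degrees anticlockwise (to the left): X drops by about .5.
print(move_forward(135, 50, 90, .5))
```

Moving backwards is the same calculation with the signs flipped, which is what the VK_DOWN branch of the program does.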
