A simple physics engine useful for games or other simulations. It sounds good, but it's not as easy as it might look. Or is it? This is not a step-by-step tutorial; it's a very general article describing my own approach when I implemented this as my bachelor's thesis in C++. Take it as a starting point for ideas about where to look and what to use. I have included some links to my favorite, easy-to-understand resources. Here is a video of the final program:

As you can see it's not perfect, but it somehow resembles the movement of real objects. We know that this movement can be described by physical equations (you have probably all heard of Newton's laws), and we also have some data representation of the objects that our programs use. How do we put it all together? Let's split this problem into a few smaller ones.

- Drawing - OpenGL
- Collision detection - SAT
- Scene optimization - Octree
- Physics - Impulse based
- Simulation

One more thing I'd like to suggest: use object-oriented programming patterns. I don't like the C# or Java approach where literally everything is an object and you need a class for even the simplest things, but when creating a project as large as a physics engine or a game, it's really useful to use classes and objects, at least for the most crucial entities such as models, objects, separate modules (physics, renderer, AI, ...), the camera, etc. If you are going to use C++, definitely read about the RAII principle and the C++ Core Guidelines. Also, for the math, definitely use GLM.

## Drawing

How do we actually draw something on the screen? What you need to know is which platforms you are going to support. Let's talk about desktop. I use Linux as my main OS, but I can't simply ignore Windows, which is everywhere. Let's make this multi-platform software. There are two options. First, you create the windows and input listeners using the API provided by each OS environment; then you maintain multiple versions and compile them for each OS separately. The second option is probably more popular. Some people have actually done all this before (in fact, they did everything, so you can go download Unity, Unreal Engine, or CryEngine and skip the rest of this article), so there are libraries that handle window creation and input listening for us. I used SDL in my thesis, but I'd also like to recommend GLFW.

Such libraries are pretty simple to use: always just a bunch of initializations and then the main loop. The main loop is the core of everything; everything happens in it. Here is a simplified pseudo-code example of how this loop looked in my thesis and how it usually looks in similar programs:

```cpp
while (running) {
    frameStart = getCurrentTime();

    // here is sometimes another loop that goes through the events waiting in the queue
    // events are usually actions performed by the user, like a key press or mouse click
    switch (event) {
        // FPS camera control by mouse
        case MOUSE_MOTION: rotateCamera();    break;
        // the handler will check which keys are pressed and perform the needed actions
        case KEY_PRESSED:  keyboardHandler(); break;
        // user tried to close the window, for example
        case QUIT:         running = false;   break;
    }

    // update the positions of the objects (calculate the physics)
    scene->update();
    // draw the scene
    scene->render();

    // this is why the time operations: a simple FPS limiter example
    frameEnd = getCurrentTime();
    frameTime = frameEnd - frameStart;
    // frameExpected would be calculated ahead as 1 second / framerate (30 FPS, 60 FPS...)
    frameLeft = frameExpected - frameTime;
    if (frameLeft > 0)
        sleep(frameLeft);
    else
        print("There's been a lag, bro!");
}
```

I think that was pretty straightforward. One pass of this loop is one frame of our game. To be able to render into the window, we need to connect the windowing library with a graphics library. I prefer OpenGL, but once you get some experience, you should really try the Vulkan API. It's a bit verbose, but it's a great way to dive deeper into the rendering process; it offers many more options to the programmer, but also much more work. DirectX is also worth trying if Microsoft is your friend. The setup is the same again: initialize all the stuff and then pass the OpenGL rendering context to the window library. Now we have OpenGL content in our window. OpenGL is a really broad topic, so please check out the tutorials and documentation for further information.

## Collision detection

This was actually the main topic of my thesis. However, I wanted to present it with everything around it as a working example, hence all the surrounding stuff. We have a 3D model. I think the best way to describe it is via the Wavefront .obj file structure. A boundary 3D model contains vertices (points) and polygons (triangles or quads). Now imagine a data structure with a list of vertices (x, y, z coordinates) and a list of faces containing pointers to the vertices (one face = one triangle = three vertices). And that is pretty much the model: a collection of connected faces. How do we detect a collision between two of them?

Imagine two spheres. How do we know they collide? Let's place them close to each other. What if the distance between their centers is smaller than the sum of their radii? Yes, collision! Now imagine two humanoid objects. Where do we even start? I chose to use the so-called Separating Axis Theorem (SAT). If you want to see some alternatives, take a look at the GJK algorithm. First of all, we are dealing with convex objects. Concave objects can be pre-split into a set of convex ones (there are some nice algorithms for that; in my thesis I did it manually when creating the models).
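To make the sphere case concrete, here is a minimal sketch of the distance test described above (the `Vec3` type and `spheresCollide` name are illustrative, not code from the thesis):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Compare the squared distance against the squared radius sum;
// this avoids the sqrt call entirely.
bool spheresCollide(const Vec3& centerA, float radiusA,
                    const Vec3& centerB, float radiusB) {
    float dx = centerA.x - centerB.x;
    float dy = centerA.y - centerB.y;
    float dz = centerA.z - centerB.z;
    float distSq = dx * dx + dy * dy + dz * dz;
    float radiusSum = radiusA + radiusB;
    return distSq <= radiusSum * radiusSum;
}
```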

SAT tells us that if there is an axis on which the objects' projections don't overlap, the objects cannot collide. How do objects overlap on an axis? Projection! In 2D, imagine an object and a line. (I am using a lot of 2D examples, but 3D is the same, just with one more coordinate.) When we take the object's outermost points and shoot them straight onto that line, we get an interval representing the projection (like a shadow cast by the object onto the line). Take a look at the picture below to make it clearer.

In theory, there is an infinite number of lines, an infinite number of axes we can test. No need to keep secrets: we will use only the face normals of both objects. When you think about it, you'll find that the normals define all the orientations of the objects that matter (you can also include some edge normals for more precise results). The picture above uses a normal as well. So if there is no normal (separating axis) on which the projections of the models don't overlap, we have a potential collision.

What if the objects move? How do we predict the collision? Since we draw at, for example, 60 FPS, we have 60 different positions of the moving objects per second. The objects might collide, but it's not guaranteed that one of those 60 positions will be the collision scenario. This situation is called tunneling. The picture below shows two positions of one object at different times.

See? When the object gains a high enough velocity, or the obstacle is too thin, it might jump through the obstacle and skip the collision. We could of course project the leftmost vertex from frame 1 and the rightmost vertex from frame 2 and use this movement interval. By doing that, however, we get many false detections, because we've actually created a huge bounding box. For example, when we move one object across the whole 2D map, we can get a whole pack of collisions with all the objects there.

The way to solve this problem starts with realizing what we are looking for. How do we know that the objects will collide? We are looking for a moment when the projections of both objects overlap on all the axes (normals). What I did was not only determine whether there is a collision on a particular axis, but also calculate, for each axis, the interval defining the duration of the overlap on that axis. So for each tested axis I saved when within the frame the overlap starts and when it ends. Now all we need to do is check whether those intervals share a common intersection across all the axes. If they do, we have found the moment of the collision.

Is that clear? Here is a piece of pseudo-code again:

```cpp
// loop through all the normals
foreach normal in allNormals {
    // projection might be a for loop over one model's vertices,
    // simply getting the max and min value of dot(vertex, normal)
    intervalA = projection(objectA, normal);
    intervalB = projection(objectB, normal);

    // from this point on, take the first object as moving and the second one as static
    // calculate the velocity of the moving object
    finalVelocity = velocityA - velocityB;

    // in order to prevent tunneling, we expand the interval by the relative velocity
    // (an optimization); if the velocity is > 0 we expand to the right and vice versa
    velocityIntervalA = intervalA;
    if (finalVelocity > 0.0f)
        velocityIntervalA.end = intervalA.end + finalVelocity;
    else
        velocityIntervalA.start = intervalA.start + finalVelocity;

    // if the velocity-expanded intervals overlap, test the times
    if (velocityIntervalA.start <= intervalB.end && intervalB.start <= velocityIntervalA.end) {
        // we need the time intervals of overlap and have to find out whether they
        // overlap with each other: take the latest entry time and the earliest leave
        // time; if maxEntry <= minLeave, there is a collision
        // time is relative to one frame, and since the velocity says how much the
        // object moves in one frame, the max time is 1 and the min is 0
        entry = 0.0f;
        leave = 1.0f;
        amount = 0.0f;

        // interval A is on the left
        if (intervalA.end <= intervalB.start) {
            // interval A is moving away from interval B, no possible collision
            if (finalVelocity <= 0.0f)
                return; // no collision
            // calculate the time of entering and leaving the other interval, t = s/v
            entry = absVal((intervalB.start - intervalA.end) / finalVelocity);
            leave = absVal((intervalB.end - intervalA.start) / finalVelocity);
        }
        // interval B is on the left
        else if (intervalB.end <= intervalA.start) {
            if (finalVelocity >= 0.0f)
                return; // no collision
            entry = absVal((intervalA.start - intervalB.end) / finalVelocity);
            leave = absVal((intervalA.end - intervalB.start) / finalVelocity);
        }
        // intervals already overlap
        else {
            // deciding how they overlap
            if (intervalA.start <= intervalB.start) {
                leave = absVal((intervalB.end - intervalA.start) / finalVelocity);
                amount = intervalA.end - intervalB.start;
            } else {
                leave = absVal((intervalA.end - intervalB.start) / finalVelocity);
                amount = intervalB.end - intervalA.start;
            }
        }

        // we can leave if the collision happens in a future frame
        // (time is higher than the max time of this frame = 1)
        if (entry > 1.0f)
            return; // no collision in this frame

        // save the max and min of the overlap borders
        if (maxEntry < entry) {
            maxEntry = entry;
            minAmount = amount;
            collisionNormal = normal;
        }
        // this is the case of overlap; we can use the amount to push
        // the objects out of each other
        else if (amount < minAmount) {
            minAmount = amount;
            overlapNormal = normal;
        }
        if (minLeave >= leave)
            minLeave = leave;
    }
    // if they don't overlap, we have found the separating axis (the current normal)
    // and we can say there is no collision
    else
        return; // no collision
}
// if there is a time interval that lies inside all the found ones, a collision happened
if (minLeave >= maxEntry)
    foundInterval = true;
```

As a result, we have the time within the frame when the collision starts, the amount of possible overlap, and the normal of the collision; we can also calculate the contact point by moving the objects to the collision time and intersecting the faces (some clipping algorithms). A slightly less accurate but working solution might be to project the models onto the collision normal, store the vertices forming the borders of each model's projected interval, and decide on the relative position of those two intervals. Then you can tell which borders collide (for example, if the start of the first interval is to the left of the start of the second one, the collision is probably between the end of the first interval and the start of the second one). This way you get exactly those vertices whose projections caused the collision. When you have one vertex on one side of the collision, that is the contact point (contact via vertex). When you have two vertices against three or more, it's edge vs. face (the average of the two edge vertices is the contact point), etc.

The collision normal together with the overlap amount forms the MTV (minimum translation vector), which you can use to get the objects out of the collision: simply move them by the amount in the direction of the normal (or the inverse direction, depending on which way your normal points). It's good to always orient the normal the same way (from the second object to the first, for example; you can check that by requiring the dot product of the normal and the vector between the models' geometric centers [first minus second] not to be negative).
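Applying the MTV might be sketched like this (hypothetical names; the correction is split evenly between both objects here, while a real engine might weight it by mass or keep a static object fixed):

```cpp
struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Push two intersecting objects apart along the collision normal.
// 'normal' is assumed unit length; 'amount' is the overlap from SAT.
void resolveOverlap(Vec3& positionA, Vec3& positionB,
                    Vec3 normal, float amount) {
    // Orient the normal from B to A so the push directions are consistent.
    if (dot(normal, positionA - positionB) < 0.0f)
        normal = normal * -1.0f;
    // Split the correction between both objects.
    positionA = positionA + normal * (amount * 0.5f);
    positionB = positionB - normal * (amount * 0.5f);
}
```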

One last thing I forgot to mention: how do we do the projection? The answer is surprisingly simple: the dot product. Long story short, the dot product of a vertex and the axis gives you the projection of the vertex onto the axis. We do it for all the vertices of the model, take the min and max values, and we have the projection interval. It's simple math: take the dot product formula with the cosine function and look carefully at the picture below. Don't forget to normalize the axis.
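The whole projection step can be sketched as follows (illustrative names; the axis is assumed to be already normalized):

```cpp
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
struct Interval { float min, max; };

float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Project every vertex onto the (unit-length) axis and keep the extremes.
Interval project(const std::vector<Vec3>& vertices, const Vec3& axis) {
    Interval result{ std::numeric_limits<float>::max(),
                     std::numeric_limits<float>::lowest() };
    for (const Vec3& v : vertices) {
        float p = dot(v, axis);
        if (p < result.min) result.min = p;
        if (p > result.max) result.max = p;
    }
    return result;
}

// Two intervals overlap when neither lies completely past the other.
bool overlaps(const Interval& a, const Interval& b) {
    return a.min <= b.max && b.min <= a.max;
}
```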

## Scene optimization

We know how to detect a collision between two objects. Now let's use brute force and try all the combinations in the scene. Wait... why is the framerate dropping... noooooo! Yes, time to optimize! There are many structures you can use for this purpose, such as bounding volume hierarchies or binary space partitioning, but I used an octree. A quick introduction: imagine a cube containing the whole scene. Split it along the three main axes X, Y, Z. How many little cubes do we get? Eight, thus octree.

The principle of such structures is that we split the space into smaller areas which contain only the objects close to each other. Then we know which objects we should test for collisions and which are too far away. Here is an example of how you might use a BVH to create a tree structure in 2D.

In other words, each level of the tree is a smaller and smaller box; the root is the whole scene. There are probably more efficient ways, but I used the same SAT as for collisions to decide which object belongs to which cube. The cubes are in fact axis-aligned bounding boxes (AABBs). What if we precalculate the bounding boxes of the objects as AABBs too? How many axes do we need to test to tell whether two AABBs collide? Three: X, Y, Z. If you understood the dot product projection derivation and now recall the coordinates of the three main axes, you can see that the projection is simply the coordinate of the vertex on the given axis (axis x = (1,0,0) => vertex.x). The disadvantage of AABBs is that we need to recalculate them when the object rotates. Here is a simple algorithm I used when inserting a new object into the octree:

- Start from the root and go toward the leaves
- Check the child nodes and detect which ones collide with the AABB of the object
- If the node is not a leaf, do the same for its child nodes
- If the node is a leaf and is not full, insert. Otherwise, split the node into eight and reinsert all its objects into the new nodes, including the current object
- If the object belongs to more nodes, save it to all of them (there are two types of octrees; the other type allows you to store such objects in upper-level nodes)

Note that each node has a defined limit on how many objects it can contain before splitting further. The value depends on the scene, so you need to experiment a bit. It is also good practice to define a maximum tree depth to avoid endless splitting, which can make the tree counterproductive.
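The insertion steps above, including the three-axis AABB overlap test, might look roughly like this (a simplified sketch with made-up names and constants, not the thesis code; the capacity and depth limits would need tuning per scene):

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

struct AABB { float min[3], max[3]; };

// Two axis-aligned boxes intersect iff they overlap on all three axes.
bool overlap(const AABB& a, const AABB& b) {
    for (int i = 0; i < 3; ++i)
        if (a.min[i] > b.max[i] || b.min[i] > a.max[i]) return false;
    return true;
}

struct Node {
    AABB box;
    std::vector<AABB> objects;                      // stored object bounds
    std::array<std::unique_ptr<Node>, 8> children;  // empty while a leaf
    bool isLeaf() const { return !children[0]; }
};

constexpr std::size_t kNodeCapacity = 4;  // tune per scene
constexpr int kMaxDepth = 6;              // avoid endless splitting

void split(Node& node);

void insert(Node& node, const AABB& object, int depth) {
    if (!node.isLeaf()) {
        // An object may straddle several children; insert it into each.
        for (auto& child : node.children)
            if (overlap(child->box, object))
                insert(*child, object, depth + 1);
        return;
    }
    node.objects.push_back(object);
    if (node.objects.size() > kNodeCapacity && depth < kMaxDepth) {
        split(node);
        std::vector<AABB> pending = std::move(node.objects);
        node.objects.clear();
        for (const AABB& obj : pending)
            for (auto& child : node.children)
                if (overlap(child->box, obj))
                    insert(*child, obj, depth + 1);
    }
}

// Create the eight child cubes by halving the node's box on every axis.
void split(Node& node) {
    float mid[3];
    for (int i = 0; i < 3; ++i)
        mid[i] = (node.box.min[i] + node.box.max[i]) * 0.5f;
    for (int c = 0; c < 8; ++c) {
        AABB b;
        for (int i = 0; i < 3; ++i) {
            bool high = (c >> i) & 1;
            b.min[i] = high ? mid[i] : node.box.min[i];
            b.max[i] = high ? node.box.max[i] : mid[i];
        }
        node.children[c] = std::make_unique<Node>();
        node.children[c]->box = b;
    }
}
```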

So now when we want to test collisions, we just get the node of the current object and test only the objects in the same node. OK, but the object moves. It moves with incredible speed, so in one frame it travels across the whole map, visiting other nodes. What now? What about SAT again? We find the time intervals of AABB collisions between the AABB of the moving object and the AABBs of the tree nodes and their children. Basically, we get a stack of node pointers; I call them tracks. The moving object leaves a track in each node it visits during this test. Then we go through those marked nodes and check for collisions with the moving object. Depending on the scene, you might also want to exclude moving objects from the tree structure (if most of the scene objects are static) to avoid reinserting them every frame. Don't forget that there might be multiple tracks made by multiple objects in one node; that is the case when two moving objects meet at a certain node. Everything is demonstrated on the simplified quadtree below.

One last note: such trees can also be used for drawing optimization. The idea is that we take the camera position, find its node, and find all the nodes intersecting the camera's viewing frustum. We then render only the objects in those nodes.

## Physics

The first idea brought me back to high school. I think everyone who took physics classes did those exercises about elastic and inelastic collisions. I was pretty sure it would work well, so I implemented those simple equations, and indeed, the object behaved quite well once I also added the so-called coefficient of restitution, which lowers the energy after the collision. Then I added more objects and the hell began: weird explosions of objects, objects unable to come to rest on the ground, and so on. I started looking for a more sophisticated technique and found one: impulse-based collision response.

Impulse is defined as the integral of a force over time. When two objects collide, a force affects each object for some time interval. Since we can't really simulate this behavior of continuously applying a force that eventually changes the velocity (really complex physics down at the level of atoms), we need something that changes it instantly. Our objects are simply about to intersect, or are already intersecting, and we really want them to change their movement.

Using an impulse, we can instantly change the object's momentum, which depends on the object's mass and velocity. To determine the direction of the impulse vector, we also need to know the collision normal (which we have from SAT) and the original velocity vectors. I don't feel confident enough to go into much detail here; for a more thorough explanation and derivation, take a look at the very good papers by Chris Hecker on rigid body dynamics.

First we define the relative velocity of the two objects (one velocity for both of them).[eq. 1] Since our SAT returns the collision normal, we use it as well, projecting the velocity onto the normal (we need the vector that will get the objects out of the collision, which is the normal).[eq. 2] When two objects collide, your intuition probably tells you that the velocities change, not only in direction but also in magnitude, which is lower than before the collision. The ratio of the final to the original velocity is called the coefficient of restitution. A value of 1, for example, means a perfectly elastic collision. In reality, objects always lose some energy due to deformations, the sound created by the impact, or generated heat. In our equation, we simply multiply the new velocity by the negative value of the coefficient (Newton's law of restitution). The coefficient has different values for different materials; you can find tables on the Internet. When two objects collide, people usually take the lower coefficient of the two. In my simulation, I got better results using their average.

We need to add the new response velocity that came out of the collision to the original one. Since we are expressing the equations in terms of two different objects (using the relative velocity above), we get two new equations.[eq. 3] Now put it all together along with the coefficient of restitution and isolate the impulse *j*. It's a lot of equations and math, which is a good exercise, so take the papers and go for it :-). (If you're lazy, it's in the paper too.) The result is pretty nice looking.[eq. 4] And here we go: we have the impulse that we can easily use to calculate the translational velocities. OK, but the objects rotate. Welcome to hell!
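Ignoring rotation for now, the resulting impulse can be sketched as follows (hypothetical names; using inverse masses, so a static object can simply pass 0, and assuming the normal points from the second object toward the first):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
Vec3 operator*(Vec3 v, float s) { return {v.x * s, v.y * s, v.z * s}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// j = -(1 + e) * (vRel . n) / (1/mA + 1/mB)
// 'normal' points from object B to object A and is unit length.
void applyImpulse(Vec3& velA, float invMassA,
                  Vec3& velB, float invMassB,
                  Vec3 normal, float restitution) {
    Vec3 relative = velA - velB;
    float along = dot(relative, normal);
    if (along > 0.0f) return;  // already separating, no impulse needed
    float j = -(1.0f + restitution) * along / (invMassA + invMassB);
    velA = velA + normal * (j * invMassA);
    velB = velB - normal * (j * invMassB);
}
```

With equal masses, a head-on elastic collision (restitution 1) simply swaps the velocities, which is a handy sanity check.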

Rotations are a bit tricky, and not only because of the physics. When predicting collisions for moving objects, how would one predict the right orientation, which might actually change the whole situation? When one long object is moving, we can already tell whether it will collide in this frame, but what if it rotates by 90 degrees? The geometry relative to the second object has changed! One solution is a mechanism that pulls the objects out of intersections (given the information from SAT); another is to sample the movement with different orientations and use them all to calculate SAT at the given times. It's also possible to interpolate the positions of the vertices (at the extremes of the geometry) and use that to decide. In other words, when dealing with rotations, computers usually guess or somehow approximate. Usually the rotation is not so fast that it matters (little inaccuracies are not visible within one frame); a problem would be, for example, the movement of a helicopter propeller. I used the first approach: fixing the intersections when they happen.

What we need to do is convert the equations into their angular velocity *ω* forms. We also need the vector *r* from the object's center of mass to the collision point. And last but not least, we need the moment of inertia *I*. It describes the inertia of the rotational movement according to the distribution of mass in the object. There are different formulas for this moment for different shapes and types of objects (look for some tables again). Putting it all together, we get a similar equation as before.[eq. 5] Again, combining all the equations and isolating the impulse results in a slightly scary but working equation.[eq. 6]

The above equations were, however, primarily for 2D space (you can actually get them somehow working in 3D as well). In a 3D scene, the situation is a little more complicated. The moment of inertia is not a single scalar but a tensor (basically a 3x3 matrix) that describes how easily the object rotates around a specific axis. It's more or less like mass in translations. Be careful: when you have an inertia tensor for irregular shapes (not a sphere but an ellipsoid, for example), you'll need to rotate it according to the object's rotation. There are nice Wikipedia articles on both inertia tensors and the derivation of the final impulse equation. The principle is the same; we just add some cross products, which give us the vector perpendicular to two given vectors, to be able to derive the rotations around the center of mass. (Make sure you have the normal always pointing toward the same object, be it the first one or the second; here I use the second-to-first orientation. It affects the addition/subtraction of the velocities.) Don't forget to multiply the impulse magnitude by the normal to give it the proper direction.
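A sketch of the 3D impulse magnitude with the rotational terms might look like this (illustrative names, not the thesis code; the inverse inertia tensors are assumed to be already transformed into world space, and the normal points from object B to object A):

```cpp
#include <array>

struct Vec3 { float x, y, z; };
Vec3 operator+(Vec3 a, Vec3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
Vec3 operator-(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Row-major 3x3 matrix, used for the world-space inverse inertia tensor.
struct Mat3 { std::array<float, 9> m; };
Vec3 mul(const Mat3& M, Vec3 v) {
    return { M.m[0] * v.x + M.m[1] * v.y + M.m[2] * v.z,
             M.m[3] * v.x + M.m[4] * v.y + M.m[5] * v.z,
             M.m[6] * v.x + M.m[7] * v.y + M.m[8] * v.z };
}

// rA/rB run from each center of mass to the contact point.
float impulseMagnitude(Vec3 vA, Vec3 wA, Vec3 rA, float invMassA, const Mat3& invInertiaA,
                       Vec3 vB, Vec3 wB, Vec3 rB, float invMassB, const Mat3& invInertiaB,
                       Vec3 n, float restitution) {
    // velocity of the contact point on each body: v + w x r
    Vec3 rel = (vA + cross(wA, rA)) - (vB + cross(wB, rB));
    float num = -(1.0f + restitution) * dot(rel, n);
    // rotational contributions to the effective mass at the contact
    Vec3 termA = cross(mul(invInertiaA, cross(rA, n)), rA);
    Vec3 termB = cross(mul(invInertiaB, cross(rB, n)), rB);
    float den = invMassA + invMassB + dot(termA + termB, n);
    return num / den;
}
```

With zero inverse inertia and zero contact offsets, this collapses back to the purely linear impulse from the previous section, which is a useful way to test it.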

What if the object hits the ground? No, we don't need the ground to move away as well. You can of course calculate the velocity of the ground but avoid moving it in the end. An easier and more correct approach is to simply eliminate all the second object's terms from the equation. In my application I have two code paths: one where both objects are dynamic, and a second for a dynamic and a static object (the second path also saves some computation time).

One more note regarding the stabilization of the objects. When the objects are about to stop moving (like when lying on the floor), their velocity gets smaller and smaller, which starts to produce very small values, in my case stored in floats. Be careful when working with these (especially when testing the values); definitely read something about floating-point precision! To help the stabilization, I also found a fluid trick very helpful: I slow all the objects down a bit during each frame, as if they were floating in a very sparse fluid (hmm, air? :D).
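The fluid trick itself is tiny; a sketch might look like this (the damping and epsilon constants are made up and would need tuning per scene; snapping tiny velocities to zero also sidesteps the float-precision issue mentioned above):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

// Damp velocities a little each frame, as if moving through thin air,
// and snap tiny float values to zero so resting objects actually rest.
void stabilize(Vec3& velocity, float damping = 0.995f, float epsilon = 1e-4f) {
    velocity.x *= damping;
    velocity.y *= damping;
    velocity.z *= damping;
    float speedSq = velocity.x * velocity.x
                  + velocity.y * velocity.y
                  + velocity.z * velocity.z;
    if (speedSq < epsilon * epsilon)
        velocity = {0.0f, 0.0f, 0.0f};
}
```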

## Simulation

Everything somehow works now. However, what if three objects are about to collide in one frame? Let's say object A collides with B slightly before C rushes in. The AB collision might make A and B bounce away from each other, so when C arrives, it won't actually collide with anything. How do we determine which collisions come first and, more importantly, which we need to recheck after solving the previous ones? So again, we are dealing with one frame that also has some time duration. This is the well-known problem in simulations: discrete machine time vs. continuous real time. So the update function from the first code snippet above has another loop inside, with the following algorithm:

```cpp
// to keep track of the in-frame time <0.0; 1.0>
left = 1.0;
elapsed = 1.0;
end = false;
// gravity etc.
applyEnvironmentals();
while (!end) {
    findCollisions();
    // the physics will set elapsed to something greater than 1.0
    // if the collision happens in the next frame
    // the physics module also changes the velocity vectors of the objects
    // according to the collision responses
    elapsed = solvePhysics(left);
    if (elapsed > 1.0)
        break;
    // move the objects (according to their velocities,
    // some might be modified by the physics solving)
    moveObjects(elapsed);
    // octree update
    rebuildTree();
    left -= elapsed;
}
// move the objects to the end of the frame
moveObjects(left);
rebuildTree();
```

So in other words, we have the rendering steps defined by the FPS, and inside each rendering step we have one or more physics steps. I implemented a stack, or we can call it a calendar of events, where the found collisions are sorted by the time of collision (yes, time to remember the sorting algorithms), so the physics module picks the collisions and solves them in chronological order. Now we stand before a difficult problem: should we continue solving the remaining collisions within this one step? What if the objects change their trajectories, as described above? The later collisions can change completely. There are two ways. The first, which I chose, is to prevent any unwanted states caused by wrong collisions. What does that mean? Simply take the first time in the calendar, solve all the collisions happening at that time, throw away the rest, and start another physics step. The second option doesn't consume so much computing time but brings other problems: we can try to solve more than just the first-time collisions, but the price is dealing with wrong responses, which might, for example, leave objects stuck inside each other.
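The first strategy, taking only the earliest batch from the calendar, can be sketched like this (hypothetical names; a real engine would store richer collision data than just two object indices):

```cpp
#include <algorithm>
#include <vector>

struct CollisionEvent {
    float time;  // in-frame time of impact, <0.0; 1.0>
    int objectA, objectB;
};

// Sort the frame's detected collisions chronologically, then keep only those
// sharing the earliest time; the rest are thrown away and rediscovered in the
// next physics step, after the responses have changed the trajectories.
std::vector<CollisionEvent> earliestBatch(std::vector<CollisionEvent> events) {
    std::sort(events.begin(), events.end(),
              [](const CollisionEvent& a, const CollisionEvent& b) {
                  return a.time < b.time;
              });
    if (events.empty()) return events;
    float first = events.front().time;
    auto pastFirst = std::find_if(events.begin(), events.end(),
        [first](const CollisionEvent& e) { return e.time > first; });
    events.erase(pastFirst, events.end());
    return events;
}
```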

## Farewell

And that's pretty much it! Some of the techniques can probably be done much better than I described, and there is really a lot of room for optimization, but I just wanted to share the approach that gave me a working result, as you can see in the video. Please check the resource links for more specific examples and code. I am not publishing the sources now since they need some modifications; I might publish them someday if I ever get to rewriting them.

## Resources

RAII principle in C++

C++ Core Guidelines

SDL library homepage

GLFW library homepage

GLM library homepage

OpenGL tutorial

Vulkan tutorial

DirectX tutorial

Structure of Wavefront .obj file

GJK algorithm explanation

SAT paper

Rigid Body Dynamics (Impulse based response papers)

Impulse based response on Wikipedia

Moments of inertia on Wikipedia

Impulse based response tutorial

The problems with floats and how to solve them