Collision detection
From Academic Kids

In physical simulations, video games and computational geometry, collision detection includes algorithms from checking for intersection between two given solids, to calculating trajectories, impact times and impact points in a physical simulation.
Contents 
Overview
In physical simulation, we wish to conduct experiments, such as playing billiards. The physics of bouncing billiard balls are well understood, under the umbrella of rigid body motion and elastic collisions. An initial description of the situation would be given, with a very precise physical description of the billiard table and balls, as well as initial positions of all the balls. Given a certain impulsion on the white ball (probably resulting from a player hitting the ball with his cue), we want to calculate the trajectories, precise motion, and eventual resting places of all the balls with a computer program. A program to simulate this game would consist of several portions, one of which would be responsible for calculating the precise impacts between the billiard balls. This particular example also turns out to be numerically unstable: a small error in any calculation will cause catastrophic changes in the final position of the billiard balls.
Video games have similar requirements, with some crucial differences. While physical simulation needs to simulate realworld physics as precisely as possible, video games need to simulate realworld physics in a believable way, in real time and robustly. Compromises are allowed, so long as the resulting simulation is satisfying to the game player.
Computational geometers are interested in precise collision detection algorithms (much like physical simulators.) However, computational geometer are more interested in algorithms that have provably good running times. Unfortunately, most algorithms used in physical simulations and video games do not have very satisfying worstcase running times. An example problem is the ray tracing problem: given a list of objects in three dimensional space, as well as the initial position and velocity of a particle, find the first solid object the particle will hit. It is very obvious how to do this in time proportional to the number of objects in the scene, however it is very difficult to do better than this, at least in the worstcase (or even expected case) sense.
It turns out that one can do significantly better for the raytracing problem. Using big O notation, the naïve algorithm works in <math>O(n)<math> time, without any preprocessing. However, there are algorithms for solving this problem in <math>O(\log n)<math> time. The problem, however, is that a precomputation step needs to be performed. The idea is that the set of objects is first given to the program, the precomputation occurs, and then for each subsequent query of a particle with an initial position and velocity, the time it takes to find the first object hit is of order <math>O(\log n)<math>. However, the precomputation generates a data structure of size <math>O(n^{4+\epsilon})<math> for any desired <math>\epsilon>0<math> which makes these algorithms completely unusable in practice. (See, for instance, P.K. Agarwal and J. Matousek. Ray shooting and parametric search. SIAM Journal on Computing, 22(4):794806, 1993.)
On the other hand, for the purpose of physical simulation, data structures were created which work well in practice. In all cases, these algorithms do not have especially interesting running times in the big O sense, however it is found that in practice they perform very well. The University of North Carolina, Chapel Hill has a group who have investigated this problem extensively, please see http://www.cs.unc.edu/~geom/collide/index.shtml.
Collision detection in physical simulation
Physical simulators usually function one of two ways, we shall refer to them as the a posteriori and a priori methods. In addition to the a posteriori and a priori distinction, almost all modern collision detection algorithms are broken into a hierarchy of algorithms.
A posteriori vs a priori
In the a posteriori case, we advance the physical simulation by a small time step, then check if any objects are intersecting, or are somehow so close to each other that we deem them to be intersecting. At each simulation step, a list of all intersecting bodies is created, and the positions and trajectories of these objects is somehow "fixed" to account for the collision. We say that this method is a posteriori because we typically miss the actual instant of collision, and only catch the collision after it has actually happened.
In the a priori methods, we write a collision detection algorithm which will be able to predict very precisely the trajectories of the physical bodies. The instants of collision are calculated with high precision, and the physical bodies never actually interpenetrate. We call this a priori because we calculate the instants of collision before we update the configuration of the physical bodies.
The main benefits of the a posteriori methods are as follows. In this case, the collision detection algorithm needs not be aware of the myriad physical variables; a simple list of physical bodies is fed to the algorithm, and the program returns a list of intersecting bodies. The collision detection algorithm doesn't need to understand friction, elastic collisions, or worse, nonelastic collisions and deformable bodies. In addition, the a posteriori algorithms are in effect one dimension simpler than the a priori algorithms. Indeed, an a priori algorithm must deal with the time variable, which is absent from the a posteriori problem.
On the other hand, a posteriori algorithms cause problems in the "fixing" step, where intersections (which aren't physically correct) need to be corrected. In fact, there are some who believe that such an algorithm is inherently flawed and unstable, especially when it comes to resting contacts (need to provide a reference here...)
The benefits of the a priori algorithms are increased fidelity and stability. It is difficult (but not completely impossible) to separate the physical simulation from the collision detection algorithm. However, in all but the simplest cases, the problem of determining ahead of time when two bodies will collide (given some initial data) has no closed form solution  a numerical root finder is usually involved.
Given that exact numbers are impossible to obtain, one can also use a simple a posteriori algorithm, and then use a binary search to attempt to compute the first moment of collision, if any. However, aside from the fact that this approach may miss some collisions, binary search is known to be relatively inefficient compared to other rootfinding algorithms such as Newton's method.
Some objects are in resting contact, that is, in collision, but neither bouncing off, nor interpenetrating, such as a vase resting on a table. In all cases, resting contact requires special treatment, however, some believe that it poses special problems in a posteriori algorithms.
nbody pruning
For problems where several bodies are moving at the same time (such as billiards,) a preliminary pruning step serves to reduce the number of pairs of objects we need to consider for collision.
If one has <math>n<math> objects then there are <math>n(n1)/2<math> pairs of objects that could possibly collide. One can iterate through all such pairs, and this gives an <math>O(n^2)<math> algorithm. In our billiards example, say there are thirteen balls on the table, then seventyeight pairs of balls need to be checked. However, if there are seven balls in the north half of the table, and six in the south half, then we only need to check the seven balls in the north half against each other, and the six balls in the south half against each other. In this case, there are only thirtysix pairs of balls left to check.
The problem of course is that for large objects very close to one another, it is not necessarily easy to find a line (such as in the billiards example) which separates the objects into two sets that don't intersect. This can be appeased somewhat, and the algorithm can be made recursive, and one obtains a program which seems to work faster than the naïve algorithm for certain data when <math>n<math> is large.
From the point of view of worstcase behavior, we note that if all <math>n<math> objects occupy the same point in space, then necessarily there will be <math>n(n1)/2=O(n^2)<math> pairs to check. Instead, we're interested in output sensitive algorithms: algorithms whose running time is appealing if written in terms of the size of their output. In our case, these are algorithms which will run faster when the actual number of collisions is small.
Temporal coherence
It has been observed that for the purpose of physical simulation, the configuration of physical bodies from one time step to the next changes very little. Algorithms were designed so that the calculations done in a preceding time step can be reused in the current time step, resulting in faster algorithms.
For example, M. C. Lin suggests to find axisaligned bounding boxes for all n bodies in the scene. Each box is represented by the product of three intervals (i.e., a box would be <math>I_1 \times I_2 \times I_3=[a_1,b_1] \times [a_2,b_2] \times [a_3,b_3]<math>.) We observe that two such boxes, <math>I_1 \times I_2 \times I_3<math> and <math>J_1 \times J_2 \times J_3<math> intersect if, and only if, <math>I_1<math> intersects <math>J_1<math>, <math>I_2<math> intersects <math>J_2<math> and <math>I_3<math> intersects <math>J_3<math>. We suppose that, from one time step to the next, <math>I_k<math> and <math>J_k<math> intersect, then it is very likely that at the next time step, they will still intersect. Likewise, if they did not intersect in the previous time step, then they are very likely to continue not to.
So we reduce the problem to that of tracking, from frame to frame, which intervals do intersect. We have three lists of intervals (one for each axis) and all lists are the same length (since each list has length <math>n<math>, the number of bounding boxes.) In each list, each interval is allowed to intersect all other intervals in the list. So for each list, we will have an <math>n \times n<math> matrix <math>M=(m_{ij})<math> of zeroes and ones: <math>m_{ij}<math> is 1 if intervals <math>i<math> and <math>j<math> intersect, and 0 if they do not intersect.
By our assumption, the matrix <math>M<math> associated to a list of intervals will remain essentially unchanged from one time step to the next. To exploit this, the list of intervals is actually maintained as a list of labeled endpoints. Each element of the list has the coordinate of an endpoint of an interval, as well as a unique integer identifying that interval. Then, we bubble sort the list by coordinates, and update the matrix <math>M<math> as we go. It's not so hard to believe that this algorithm will work relatively quickly if indeed the configuration of bounding boxes does not change significantly from one time step to the next.
In the case of deformable bodies such as cloth simulation, it may not be possible to use a more specific pairwise pruning algorithm as discussed below, and an nbody pruning algorithm is the best that can be done.
Physically based temporal coherence
If an upper bound can be placed on the velocity of the physical bodies in a scene, then pairs of objects can be pruned based on their initial distance and the size of the time step.
Pairwise pruning
Once we've selected a pair of physical bodies for further investigation, we need to check for collisions more carefully. However, in many applications, individual objects (if they are not too deformable) are described by a set of smaller primitives, mainly triangles. So now, we have two sets of triangles, <math>S={S_1,S_2,...,S_n}<math> and <math>T={T_1,T_2,...,T_n}<math> (for simplicity, we will assume that each set has the same number of triangles.)
The obvious thing to do is to check all triangles <math>S_j<math> against all triangles <math>T_k<math> for collisions, but this involves <math>n^2<math> comparisons, which is displeasing. If possible, it is desirable to use a pruning algorithm to reduce the number of pairs of triangles we need to check.
The most widely used family of algorithms used is known as the hierarchical bounding volumes method. As a preprocessing step, for each object (in our example, <math>S<math> and <math>T<math>) we will calculate a hierarchy of bounding volumes. Then, at each time step, when we need to check for collisions between <math>S<math> and <math>T<math>, the hierarchical bounding volumes are used to reduce the number of pairs of triangles under consideration. For the sake of simplicity, we will give an example using bounding spheres, although it has been noted that spheres are undesirable in many cases.
If <math>E<math> is a set of triangles, we can precalculate a bounding sphere <math>B(E)<math>. There are many ways of choosing <math>B(E)<math>, we only assume that <math>B(E)<math> is a sphere that completely contains <math>E<math> and is as small as possible.
Ahead of time, we can compute <math>B(S)<math> and <math>B(T)<math>. Clearly, if these two spheres do not intersect (and that is very easy to test,) then neither do <math>S<math> and <math>T<math>. This is not much better than an nbody pruning algorithm, however.
If <math>E={E_1,E_2,...,E_m}<math> is a set of triangles, then we can split it into two halves <math>L(E):={E_1,E_2,...,E_{m/2}}<math> and <math>R(E):={E_{m/2+1},...,E_{m1},E_m}<math>. We can do this to <math>S<math> and <math>T<math>, and we can calculate (ahead of time) the bounding spheres <math>B(L(S)),B(R(S))<math> and <math>B(L(T)),B(R(T))<math>. The hope here is that these bounding spheres are much smaller than <math>B(S)<math> and <math>B(T)<math>. And, if, for instance, <math>B(S)<math> and <math>B(L(T))<math> do not intersect, then there is no sense in checking any triangle in <math>S<math> against any triangle in <math>L(T)<math>.
As a precomputation, we can take each physical body (represented by a set of triangles) and recursively decompose it into a binary tree, where each node <math>N<math> represents a set of triangles, and its two children represent <math>L(N)<math> and <math>R(N)<math>. At each node in the tree, as a we can precompute the bounding sphere <math>B(N)<math>.
When the time comes for testing a pair of objects for collision, their bounding sphere tree can be used to eliminate many pairs of triangles.
Many variants of the algorithms are obtained by choosing something other than a sphere for <math>B(T)<math>. If one chooses axisaligned bounding boxes, one gets AABBTrees. Oriented bounding box trees are called OBBTrees. Some trees are easier to update if the underlying object changes. Some trees can accommodate higher order primitives such as splines instead of simple triangles.
Exact pairwise collision detection
Once we're done pruning, we are left with a number of candidate pairs to check for exact collision detection. In the simplest case, we have to perform triangletriangle checks. One check is given here.
A basic observation is that for any two convex objects which are disjoint, one can find a plane in space so that one object lies completely on one side of that plane, and the other object lies on the opposite side of that plane.
To turn this into a triangletriangle intersection check, one further observes that two triangles collide essentially only when they can not be separated by a plane going through three vertices. That is, if the triangles are <math>{v_1,v_2,v_3}<math> and <math>{v_4,v_5,v_6}<math> where each <math>v_j<math> is a vector in <math>\Bbb R^3<math>, then we can take three vertices, <math>v_i,v_j,v_k<math>, find a plane going through all three vertices, and check to see if this is a separating plane. If any such plane is a separating plane, then the triangles are deemed to be disjoint. On the other hand, if none of these planes are separating planes, then the triangles are deemed to intersect. There are twenty such planes.
If the triangles are coplanar, this test is not entirely successful. One can either add some extra planes, for instance, planes that are normal to triangle edges, to fix the problem entirely. In other cases, objects that meet at a flat face must necessarily also meet at an angle elsewhere, hence the overall collision detection will be able to find the collision.
For many more sophisticated primitives, it is impossible to find a closedform solution to the intersection problem. To complicate matters further, many have found that using a blackbox root finder to locate intersections often leads to numerical catastrophe. Some have suggested that replacing a highorder surface with a triangulation is the most numerically stable approach.
A priori collision detection
Pruning is also desirable here, both nbody pruning and pairwise pruning, but the algorithms must take time and the types of motions used in the underlying physical system into consideration.
When it comes to the exact pairwise collision detection, this is highly trajectory dependent, and one almost has to use a numerical root finding algorithm to compute the instant of impact.
As an example, consider two triangles moving in time <math>{v_1(t),v_2(t),v_3(t)}<math> and <math>{v_4(t),v_5(t),v_6(t)}<math>. At any point in time, the two triangles can be checked for intersection using the twenty planes previously mentioned. However, we can do one better, since these twenty planes can all be tracked in time. If <math>P(u,v,w)<math> is the plane going through points <math>u,v,w<math> in <math>\Bbb R^3<math> then there are twenty planes <math>P(v_i(t),v_j(t),v_k(t))<math> to track. Each plane needs to be tracked against three vertices, this gives sixty values to track. Using a root finder on these sixty functions produces the exact collision times for the two given triangles and the two given trajectory. We note here that if the trajectories of the vertices are assumed to be linear polynomials in <math>t<math> then the final sixty functions are in fact cubic polynomials, and in this exceptional case, it is possible to locate the exact collision time using the formula for the roots of the cubic. Some numerical analysts suggest that using the formula for the roots of the cubic is not as numerically stable as using a root finder for polynomials.
Spatial partitioning, miscellanea
Alternate algorithms are grouped under the spatial partitioning umbrella, which includes octrees, binary space partitioning (or BSP trees) and other, similar approaches. If one splits space into a number of simple cells, and if two objects can be shown not to be in the same cell, then they need not be checked for intersection. These algorithms are generally older, and less popular, than the more modern algorithms described above.
M. C. Lin also proposed in her Ph.D. thesis an exact pairwise algorithm for convex polyhedra that exploited temporal coherence. She noted that it is very easy to track, from one time step to the next, the closest features of a pair of convex object, using a variant of a Voronoi diagram. This algorithm is tricky to implement, and only works on convex polyhedra. Since concave polyhedra are very interesting in practice, attempts were made to adapt M. C. Lin's algorithm to concave situations. The natural approach is to write each concave polyhedron as a (not necessarily disjoint) union of convex polyhedra. However, in theory of computation and computational geometry, this is known as the minimal convex cover problem, and it is known to be NPhard.
As if that were not enough, there are examples of very useful concave polyhedra typically approximating curved surfaces (for instance, a torus) which can only be written as a degenerate union of an extremely large number of convex polyhedra. Hence, this algorithm is not used for more general scenes.
Collision detection in video games
Video games have to split their very limited computing time between several tasks. This added to the limited resources of the programmers, the less strict goals of believable (and not necessarily exact) simulation and (perhaps most importantly) the realtime requirement have combined in such a way that video games, for the most part, have used relatively primitive collision detection algorithms, although most have been highly successful in creating a robust system.
For a long time, video games had a very limited number of objects to treat, and so checking all pairs was not a problem. In twodimensional games, in some cases, the hardware was able to efficiently detect and report overlapping pixels between sprites on the screen. In other cases, simply tiling the screen and binning each sprite into the tiles it overlaps provides sufficient pruning, and for pairwise checks, bounding rectangles or circles are used and deemed sufficiently accurate.
Three dimensional games have used spatial partitioning methods for <math>n<math>body pruning, and for a long time used one or a few spheres per actual 3d object for pairwise checks. Exact checks are very rare, except in a few genres such as simulation. Even then, exact checks are not necessarily used in all cases.
Because games use simplified physics, stability is not as much of an issue. Almost all games use a posteriori collision detection, and collisions are often resolved using very simple rules. For instance, if the protagonist finds himself embedded in a wall, he might be simply moved back to his last known good location. Some games will calculate the distance the protagonist can walk before getting embedded into a wall, and only allow him to move that far.
A slightly more sophisticated and striking effect is ragdoll physics. If a video game character is disabled, instead of playing a preset animation, a simplified skeleton of the character is animated as if it were a rag doll. This rag doll falls limp, and might collide with itself and the environment, in which case it should behave appropriately.
In many cases for video games, approximating the protagonists by a point is sufficient for the purpose of collision detection with the environment. In this case, binary space partition trees provide a viable, efficient and simple algorithm for checking if a point is embedded in the scenery or not. Such a data structure can also be used to handle "resting position" situation gracefully when a protagonist is running along the ground. Collisions between characters, and collisions with projectiles and hazards, are treated separately.
A robust simulator is one that will react to any input in a reasonable way. For instance, if we imagine a high speed racecar video game, from one simulation step to the next, it is conceivable that the cars would advance a substantial distance along the race track. If there is a shallow obstacle on the track (such as a brick wall), it is not entirely unlikely that the car will completely leap over it, and this is very undesirable. In other instances, the "fixing" that the a posteriori algorithms require isn't implemented correctly, and characters find themselves embedded in walls, or falling off into a deep black void. These are the hallmarks of a mediocre collision detection and physical simulation system.