Sunday, November 04, 2012

Back-substitution and inverting matrices

Matrix triangulation and back-substitution algorithms can be used in combination with Gaussian elimination to solve systems of equations or to find the inverse of a matrix. I previously covered Gaussian elimination; continuing on, we can now solve the system of equations using back-substitution.

The matrix we had to solve was:
    1     2     1     4    13 
    0    -4     2    -5     2 
    0     0    -5  -7.5   -35 
    0     0     0    -9   -18 
First we normalise the upper-triangle matrix, by simply dividing each row such that the leading coefficient is one:
    1     2     1     4    13 
    0     1  -0.5  1.25  -0.5 
    0     0     1   1.5     7 
    0     0     0     1     2 
(this simplifies the back-substitution, but we can skip/combine this step with the back-substitution)

For back-substitution we work our way backwards from the bottom of the matrix to the top, progressively eliminating each variable. As with Gaussian elimination, we select a pivot row and subtract multiples of it from the rows above. We start with the last row as the pivot, and subtract 1.5 times that row from the row above.
    1     2     1     4    13 
    0     1  -0.5  1.25  -0.5 
    0     0     1     0     4 <-- subtract pivot row * 1.5
    0     0     0     1     2 <-- pivot 
Similarly, we continue on to the second row, subtracting 1.25 times the pivot row, and the top row, subtracting four times.
    1     2     1     0     5 <-- subtract pivot row * 4
    0     1  -0.5     0    -3 <-- subtract pivot row * 1.25
    0     0     1     0     4 
    0     0     0     1     2 <-- pivot
Again, we repeat the process for the third column:
    1     2     0     0     1 
    0     1     0     0    -1 
    0     0     1     0     4 <-- pivot
    0     0     0     1     2 
And finally, the second column:
    1     0     0     0     3 
    0     1     0     0    -1 <-- pivot
    0     0     1     0     4 
    0     0     0     1     2 
Now we have our solution to the system of equations from our original Gaussian elimination problem:
a = 3, b = -1, c = 4 and d = 2.
In words/pseudo-code, the process is:
  • Pivot through all the rows, starting from the bottom to the top
  • For each row above the pivot, calculate how many times we need to subtract the pivot row from this row.
  • For each element in the row, subtract the corresponding element from the pivot row, multiplied by the value above.
In code:
for (int p=n-1;p>0;p--) { //pivot backwards through all the rows
    for (int r=p-1;r>=0;r--) { //for each row above the pivot
        float multiple = mat[r][p] / mat[p][p]; //how many multiples of the pivot row do we need (to subtract)?
        for (int c=p;c<m;c++) { //from the pivot column onwards (entries to the left of the pivot are already zero)
            mat[r][c] = mat[r][c] - mat[p][c]*multiple; //subtract the pivot row element (multiple times)
        }
    }
}
(complete code here)
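
As a rough, self-contained sketch of the normalise and back-substitute steps described above (not the linked complete code), the following C++ function works on an augmented upper-triangular matrix. The `Matrix` alias and the `backSubstitute` name are my own for illustration, and the sketch assumes every pivot is non-zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // illustrative alias, rows of columns

// Normalise an upper-triangular augmented matrix so that each pivot is 1,
// then back-substitute so only the diagonal and the augmented column(s) remain.
// Assumes every pivot mat[p][p] is non-zero (no singular or ill-conditioned input).
void backSubstitute(Matrix& mat) {
    const std::size_t n = mat.size();      // number of rows
    assert(n > 0 && mat[0].size() >= n);   // needs at least n columns
    const std::size_t m = mat[0].size();   // number of columns, including augmented part

    // Normalise: divide each row by its leading coefficient.
    for (std::size_t r = 0; r < n; ++r) {
        const double pivot = mat[r][r];
        for (std::size_t c = r; c < m; ++c) {
            mat[r][c] /= pivot;
        }
    }

    // Pivot backwards through the rows, from the bottom row up to row 1.
    for (std::size_t p = n; p-- > 1;) {
        for (std::size_t r = 0; r < p; ++r) {      // each row above the pivot
            const double multiple = mat[r][p];     // pivot is 1 after normalising
            for (std::size_t c = p; c < m; ++c) {
                mat[r][c] -= mat[p][c] * multiple;
            }
        }
    }
}
```

Calling `backSubstitute` on the 4x5 triangular matrix from the start of this post should leave the solution in the last column of each row.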

This process can be applied to find the inverse of a general matrix. Beginning with any matrix we want to invert, we augment it with the identity matrix. For example:
    2     4    -2     1     0     0 
    4     9    -3     0     1     0 
   -2    -3     7     0     0     1 
Now we can apply Gaussian elimination to generate:
    2     4    -2     1     0     0 
    0     1     1    -2     1     0 
    0     0     4     3    -1     1 
Then normalise the upper triangle to get:
    1     2    -1   0.5     0     0 
    0     1     1    -2     1     0 
    0     0     1  0.75 -0.25  0.25 
And finally, back-substitution to get our solved inverse:
    1     0     0  6.75 -2.75  0.75 
    0     1     0 -2.75  1.25 -0.25 
    0     0     1  0.75 -0.25  0.25 
In this entire discussion I have left out ill-conditioned and singular matrices, but I'll leave modifying the code for that as an exercise for the reader.
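
For readers who want a starting point, here is a minimal sketch of the whole inversion pipeline: augment with the identity, eliminate, normalise, back-substitute. One common way to cope with zero or tiny pivots is partial pivoting (swapping in the row with the largest entry in the pivot column), which this sketch adds; that step is my addition and not part of the code discussed above. The `invert` name, the `Matrix` alias and the `eps` tolerance are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // illustrative alias

// Writes the inverse of the square matrix `a` into `inv` and returns true.
// Returns false if a pivot is (nearly) zero, i.e. the matrix looks singular.
// `eps` is an arbitrary absolute tolerance chosen for illustration.
bool invert(const Matrix& a, Matrix& inv, double eps = 1e-12) {
    const std::size_t n = a.size();
    assert(n > 0 && a[0].size() == n);

    // Augment with the identity matrix: [A | I].
    Matrix mat(n, std::vector<double>(2 * n, 0.0));
    for (std::size_t r = 0; r < n; ++r) {
        for (std::size_t c = 0; c < n; ++c) mat[r][c] = a[r][c];
        mat[r][n + r] = 1.0;
    }

    // Gaussian elimination with partial pivoting.
    for (std::size_t p = 0; p < n; ++p) {
        std::size_t best = p;
        for (std::size_t r = p + 1; r < n; ++r) {
            if (std::fabs(mat[r][p]) > std::fabs(mat[best][p])) best = r;
        }
        if (std::fabs(mat[best][p]) < eps) return false;  // no usable pivot
        std::swap(mat[p], mat[best]);
        for (std::size_t r = p + 1; r < n; ++r) {
            const double multiple = mat[r][p] / mat[p][p];
            for (std::size_t c = p; c < 2 * n; ++c) mat[r][c] -= mat[p][c] * multiple;
        }
    }

    // Normalise each row so its pivot is 1.
    for (std::size_t r = 0; r < n; ++r) {
        const double pivot = mat[r][r];
        for (std::size_t c = r; c < 2 * n; ++c) mat[r][c] /= pivot;
    }

    // Back-substitution, pivoting from the bottom row upwards.
    for (std::size_t p = n; p-- > 1;) {
        for (std::size_t r = 0; r < p; ++r) {
            const double multiple = mat[r][p];
            for (std::size_t c = p; c < 2 * n; ++c) mat[r][c] -= mat[p][c] * multiple;
        }
    }

    // The right-hand half now holds the inverse.
    inv.assign(n, std::vector<double>(n, 0.0));
    for (std::size_t r = 0; r < n; ++r) {
        for (std::size_t c = 0; c < n; ++c) inv[r][c] = mat[r][n + c];
    }
    return true;
}
```

An absolute tolerance like this is a crude test; it only catches matrices that are singular or very close to it, and says little about ill-conditioning in general.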

Gaussian Elimination

Gaussian Elimination is a sequence of elementary row operations that converts a matrix into triangular (row echelon) form; a further back-substitution pass takes it the rest of the way to row-reduced echelon form (RREF). It forms the basis of a number of operations in linear algebra to solve systems of equations, invert matrices, and minimize systems of equations, among other things (I'll cover these in later posts). The Gaussian Elimination algorithm itself is straightforward (you probably learnt it in high school). Given a system of equations, e.g.
  a + 2b +  c + 4d = 13
 2a +      4c + 3d = 28
 4a + 2b + 2c +  d = 20
-3a +  b + 3c + 2d = 6
We can form an augmented matrix to represent it, and use Gaussian elimination to solve it. The goal is to produce a triangle-matrix representation, so that we can solve the equations by back-substitution. In other words, we want to (eventually) have one row represent each variable, and for all other rows, that variable should be zero. (i.e. solved). Gaussian elimination takes us part of the way there by giving us a set of equations with a starting point which we can then later solve.

Representing the above equations as a matrix, we have:
    1     2     1     4    13 
    2     0     4     3    28 
    4     2     2     1    20 
   -3     1     3     2     6 
The first step is to select a pivot row, which we can use to eliminate/reduce the other rows. When we eliminate a variable from the other rows, we want that variable's coefficient to be 0. In this example, we pick the first row, and subtract it twice from the row below, so that the row below has zero a's.
    1     2     1     4    13 <-- pivot
    0    -4     2    -5     2 <-- subtract pivot row * 2
    4     2     2     1    20 
   -3     1     3     2     6 
Likewise, we subtract four times the pivot row from the third row, and negative three times the pivot row from the final row.
    1     2     1     4    13 <-- pivot
    0    -4     2    -5     2 
    0    -6    -2   -15   -32 <-- subtract pivot row * 4
    0     7     6    14    45 <-- subtract pivot row * -3
Great. Our first variable (a) has been eliminated. We now repeat this step, starting from the second row, with the variable 'b'. We don't want to use the first row, as we want to preserve that row's representation of the 'a' variable.
    1     2     1     4    13 
    0    -4     2    -5     2 <-- pivot
    0     0    -5  -7.5   -35 <-- subtract pivot row * 1.5
    0     0   9.5  5.25  48.5 <-- subtract pivot row * -1.75
Now, we repeat the process again, starting from the third row.
    1     2     1     4    13 
    0    -4     2    -5     2 
    0     0    -5  -7.5   -35 <-- pivot
    0     0     0    -9   -18 <-- subtract pivot row * -1.9
Done. In pseudo-code/words, the algorithm is:
  • For each row (except the last), select a pivot. (In my example, I just take the first available row each time)
  • For each row that is below the pivot, calculate the number of times we need to subtract the row (i.e. divide)
  • For each element in this row, subtract the corresponding element in the pivot row, multiplied by the value we calculated above.
The code to achieve this is:
//input: an n (row) by m (col) matrix ('mat')
    //p is the pivot - which row we will use to eliminate
    for (int p=0;p<n-1;p++) { //pivot through all the rows
        for (int r=p+1; r < n; r++) { //for each row below the pivot
            float multiple = mat[r][p] / mat[p][p]; //how many multiples of the pivot row do we need (to eliminate this row)?
            for (int c = 0; c<m; c++) { //for each element in this row
                mat[r][c] = mat[r][c] - mat[p][c]*multiple; //subtract the pivot row element (multiple times)
            }
        }
    }
(full code here) Next time, we continue on to solve the equations - available here! (The solution is a = 3, b = -1, c = 4, d = 2.)
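
The elimination loop above can be wrapped into a small self-contained function, sketched here in C++. The `Matrix` alias and the `gaussianEliminate` name are illustrative, and like the snippet above it always takes the next row as the pivot, so a zero pivot would cause a division by zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // illustrative alias

// Reduce an n-row augmented matrix to upper-triangular form in place.
// No row swapping is done: every pivot mat[p][p] is assumed to be non-zero.
void gaussianEliminate(Matrix& mat) {
    const std::size_t n = mat.size();      // number of rows
    assert(n > 0 && mat[0].size() >= n);   // needs at least n columns
    const std::size_t m = mat[0].size();   // number of columns

    for (std::size_t p = 0; p + 1 < n; ++p) {          // pivot through all rows but the last
        for (std::size_t r = p + 1; r < n; ++r) {      // each row below the pivot
            const double multiple = mat[r][p] / mat[p][p];
            // Columns left of p are already zero in both rows, so start at p.
            for (std::size_t c = p; c < m; ++c) {
                mat[r][c] -= mat[p][c] * multiple;
            }
        }
    }
}
```

Applied to the 4x5 augmented matrix in this post, it should reproduce the final triangular matrix shown above, up to floating-point rounding.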

Sunday, September 30, 2012

Mining Robotics: An overview survey

Robotics is typically associated with manufacturing robotics (e.g. PUMA arm), military robotics (e.g. Predator UAV), and more recently consumer robots (e.g. Roomba), medical/healthcare (e.g. Da Vinci) and the automotive industry (e.g. driverless cars). Not many are aware of the prevalence of robotics in the mining industry and the steps the industry has taken towards automation and autonomous robots.
Autonomous mining

The mining industry is a world leader in autonomy. For example, Rio Tinto's Western Australia operations have the world's largest fleet of autonomous vehicles (150 autonomous trucks), significantly larger than any operational system in the military. Rio's Western Australia operations are all controlled from an operations centre, which controls 40 mines, 30 pits, trains, power stations, and ports, all based thousands of kilometres away. In terms of data, the WA system generates around 2.4 terabytes of data per minute. There is quite a lot of intelligence and innovation involved.

Overall, mining can be broadly broken up into a few key phases:
Mining process
  • Exploration, assessment and planning. In this phase, new resources are identified and a new mine site is designed and constructed.
  • Drill and blast, where material is extracted from the ground.
  • Load and haul, material is taken from the point of extraction to the processing plant.
  • Processing, where the material is converted/crushed into a more useful (sellable) form.
  • Transportation, where the product is loaded and transported, usually via rail to a port and then on to a ship to its final destination.
  • Stockpiling, occurs at various points in varying quantities where appropriate.
At each of these steps some kind of machine is involved, and I'll give you a brief overview of the machines and some of the relevant research or commercial automation systems available.

Exploration and remote sensing is a massive research area in other industries, and mining is no different. UAVs are seeing use in aerial surveys on mine sites, with large data sets fusing visual (photogrammetry), infrared, LIDAR, InSAR, gradiometry, seismic and other geodetic measurements.
West Angelas mine LIDAR scan
On the ground, new sensor fusion systems are being developed to classify the mine and ore structure and to identify the richest ore deposits. Combining all this data into an overall mine model is a difficult machine learning task. The Rio Tinto Center for Mine Automation is doing active research in this field, and the Gatewing X100 is an example of a UAV used for mapping in mining.

Drill and blast is a mining-specific operation, and there has been significant advancement in robotics in this area due to the operational hazards involved in this line of work. Robots can accurately drill holes that won't collapse and are easy to load. Atlas Copco and Flanders both have commercial automation systems for drilling, which are well on their way to delivering autonomous drill rigs in the near future (trial drilling systems have been in use on production sites since 2008). Atlas Copco first started work on automated drill rigs in the 1980s and now has over 2,500 machines running their control system technology.

The load and haul stage is perhaps the most interesting as it is the first area where autonomous vehicles are used in regular production environments. Whilst autonomous loading is still an area of research (See these CSIRO projects on dragline and shovel loading automation), there are plenty of commercial automation systems for haul vehicles.
These include CAT Minestar Command, Atlas Copco Scooptram, Sandvik Automine and Komatsu Frontrunner. Mineware provide shovel and dragline automation systems, with LIDAR systems that build digital terrain maps on the fly. Autonomous Solutions has a number of autonomous vehicles, including trucks and dozers.
Continuous miners and longwall mining have seen multiple automation systems, including commercial systems from Eickhoff and CAT. Excavators are no stranger to automation: CMU automated excavators and truck dumping back in the late 90s, and work is ongoing at PWRI in Japan and Hyundai research. The range of commercially available autonomous mining vehicles puts military UGVs and automotive companies to shame.

Transmin Rocklogic
Processing plants have been fully automated, although for many metals (e.g. iron ore) there isn't much to the process in the first place. Companies such as Metso have fully automated crushers and conveyors, and also include computer vision systems to identify and classify rocks/froth/bubbles, etc. FLSmidth and Calibre Transmin have developed automation systems for rockbreakers, allowing the rockbreaker to automatically park and deploy. In-Pit Crushing and Conveying (IPCC) systems allow parts of the plant to be mobile, and even these systems have been largely automated by companies such as Sandvik.

Autonomous train
Transporting material from the mine is usually performed by train, and autonomous trains have been around for a while. In fact, LKAB have been running driverless trains since the 1970s. The main difference in modern mining applications is that the goal is now fully autonomous operation, and that the trains can stretch many kilometres in length, making control a more difficult problem. Major miners such as Rio Tinto are automating the trains in Western Australia, with companies such as Ansaldo STS and New York Air Brake providing the technology.

Finally, with stockpiling, stacker-reclaimers have been automated, with companies such as ThyssenKrupp and iSAM leading the way.

Rio Tinto - Remote Operation Center
Overall, there is a large range of automated and autonomous mining equipment available, and projects such as Rio Tinto's Mine of the Future at the West Angelas and Yandicoogina sites, Vale's Carajas Serra Sul S11D site and Nautilus's Solwara underwater mining are all pushing towards fully autonomous sites, where we may see no humans involved in operating future mine sites.

So if you want to find out more about robotics and automation research in mining there are a few great places to start:
The future of mining is autonomous robots, and we are well on our way!

Sunday, August 05, 2012

Programming links

Well overdue for a catchup post on the non-graphics programming side of things, so here we go:

Tuesday, July 31, 2012

Mid year point

Well, it's a bit over the mid-year point, and the blog posts have been fewer than usual. A number of things have happened in the last six months that I will do longer posts on:
  • The WAMbot Journal of Field Robotics article was accepted and published, which has been the subject of a number of previous posts on MAGIC2010. I've put together a few posts on the navigation system (A*, Elastic bands, DWA), but still nothing on the system architecture, hardware, exploration, AI, HMI, comms, SLAM, and overall experiences. So plenty of material left to go.
  • A paper on the Navigation system has been accepted for publication.
  • A paper on using Physics Abstraction Layer for evolving robot control programs has been accepted for publication.
  • I finally uploaded the code to ImprovCV and SubSim.
  • I gave a guest lecture on realtime raytracing with WebGL, and another on intelligent systems and automation in mining
  • I've been doing a little bit of HTML5 and three.js work, which hopefully will turn into a few posts
  • As per usual, I've been collecting a large list of interesting links from around the web, that will form a number of catchup posts.
Hopefully more from me soon...

Thursday, June 07, 2012

GPU Technology Conference 2012

nVidia's GPU Technology Conference is over, and a number of presentation slides have been uploaded. There were quite a few interesting talks relating to graphics, robotics and simulation:
  • Simon Green from nVidia and Christopher Horvath from Pixar presented 'Flame On: Real-Time Fire Simulation for Video Games'. It starts with a recent history of research on CG fluid systems, and gives five tips on better looking fire: 1. Get the colors right (e.g. radiation model), 2. Use high quality advection (not just bilinear filtering), 3. Post process with glow and motion blur. 4. Add noise. 5. Add light scattering and embers. They then go into more detail on Tip #1 looking at the physics behind the black-body radiation in a fire, and the color spectrum.
  • Elmar Westphal of PGI/JCNS-TA Scientific IT-Systems presented 'Multiparticle Collision Dynamics on one or more GPUs', about multiparticle collision dynamics GPU code. He starts by explaining the overall algorithm, and explains step-by-step what performs well on the GPU. Specific GPU optimisations explained include spatial subdivision lists, reordering particles in memory, hash collisions, and finally dividing workload between multiple GPUs. An interesting read.
  • Michal Januszewski from the University of Silesia in Katowice introduces 'Sailfish: Lattice Boltzmann Fluid Simulations with GPUs and Python'. He explains lattice boltzmann fluid simulation, and some of the different configurations of lattice connectivity and collision operators. Moves into code generation examples, and gives a brief explanation of how the GPU implementation works.
  • Nikos Sismanis, Nikos Pitsianis and Xiaobai Sun (Aristotle University, Duke University) cover 'Efficient k-NN Search Algorithms on GPUs'. Starts with an overview of sorting and K-Nearest Neighbour (KNN) search algorithm solutions, including ANN (approximate NN) and lshkit and moves into results including a comparison of thrust::sort with Truncated Bitonic sort. Software is available at
  • Thomas True of nVidia explains 'Best Practices in GPU-Based Video Processing' and covers overlapping copy-to-host and copy-to-device operations, and an example of processing bayer pattern images.
  • Scott Rostrup, Shweta Srivastava, and Kishore Singhal from Synopsys Inc. explain 'Tree Accumulations on GPU' using parallel scatter, parallel reduce and parallel scan algorithms.
  • Wil Braithwaite from nVidia presents an interesting talk on 'Interacting with Huge Particle Simulations in Maya using the GPU'. He begins with a brief runthrough of the workings of the CUDA SPH example, and then moves on to the particle system, including Maya's body forces (uniform, radial, vortex), shape representations (implicit, convex hull, signed distance fields, displacement maps), collision response, SPH equations, and finally data transfer. Ends with a brief overview of rendering the particles in screen space. Neat.
  • David McAllister and James Bigler (nVidia) cover the OptiX internals in 'OptiX Out-of-Core and CPU Rendering', including PTX code generation and optimisation, and converting the OptiX backend to support CPUs via Ocelot and LLVM. An interesting result: LLVM does better at optimising "megafunctions" than small functions, but this is not entirely unexpected given how LLVM works. The presentation finishes with an overview of paging and a tip on bounding volume hierarchies. Good to see Ocelot in the mainstream.
  • Eric Enderton and Morgan McGuire from nVidia explain 'Stochastic Rasterization' (ala 'screen door transparency' rendering) via MSAA for motion blur, depth of field and order-independent transparency, by using a geometry shader to bound the shape and motion of each tri in screen space, and setting up the MSAA masks. Nice.
  • Cliff Woolley presents 'Profiling and Tuning OpenACC Code' (by adding pragmas to C / Fortran code, ala OpenMP) using an example of Jacobi iteration, and there were a number of other talks on the topic.
  • Christopher Bergström introduced 'PathScale ENZO' the alternative to CUDA and OpenCL.
  • Phillip Miller from nVidia gave a broad overview of 'GPU Ray Tracing'. He starts with myths and claimed facts on GPU raytracing, highlights some commercial GPU raytracers (and the open source OpenCL LuxRenderer) and goes into some details that are better explained in the OptiX Out-of-Core presentation.
  • Phillip Miller follows with 'Advanced Rendering Solutions', where he takes a look at nVidia's iray, and where they believe they can introduce new capabilities for design studios and find a middle ground with re-lighting and physically based rendering.
  • Peter Messmer presents 'CUDA Libraries and Ecosystem Overview', where he provides an overview of the linear algebra cuBLAS and cuSPARSE libraries performance, then moves to signal processing with cuFFT and NPP/VSIP for image processing, next is random numbers via cuRAND and finally ties things up with Thrust.
  • Jeremie Papon and Alexey Abramov discuss the 'Oculus real-time modular cognitive visual system' including GPU accelerated stereo disparity matching, likelihood maps and image segmentation with a parallel metropolis algorithm.
  • Jérôme Graindorge and Julien Houssay from Alyotech present 'Real Time GPU-Based Maritime Scenes Simulation', beginning with ocean simulation and rendering from FFT based wave simulation using HF and LF heightmap components. They then cover rendering the mesh, scene illumination and tone mapping, and a sneak peek at boat interaction. The ocean simulation video is neat.
  • Dan Negrut from the Simulation-Based Engineering Lab at the University of Wisconsin–Madison gives an overview of the lab's multibody dynamics work in 'From Sand Dynamics to Tank Dynamics', including friction, compliant bodies, multi-physics (fluid/solid interactions), SPH, a GPU solution to the cone complementarity problem, ellipsoid-ellipsoid CCD, multi-CPU simulation, and finally vehicle track simulation in sand. Wow. Code is available on the Simulation-Based Engineering Lab website.
  • Max Rietmann of USI Lugano looks at seismology (earthquake simulation) in 'Faster Finite Elements for Wave Propagation Codes' and describes parallelising FEM methods for GPUs in SPECFEM3D.
  • Dustin Franklin from GE introduces GE's MilSpec ruggedised Kepler-based GPU solutions and Concurrent Redhawk6 in 'Sensor Processing with Rugged Kepler GPUs'. Looks at some example applications including hyperspectral imaging, mosaicing, 360 degree vision, synthetic aperture radar processing, and space-time adaptive processing for moving target identification.
  • Graham Sanborn of FunctionBay presents 'Particle Dynamics with MBD and FEA Using CUDA' and gives a brief overview of their combined CPU/GPU multi-body FEA system and briefly describes the contact, contact force, and integration steps.
  • Ritesh Patel and Jason Mak of University of California-Davis cover the Burrows-Wheeler Transform, Move-to-Front Transform and Huffman Coding in 'Lossless Data Compression on GPUs'. They find merge sort for BWT performs best on the GPU, explain the parallel MTF transform and Huffman in illustrative detail and tie things up with benchmarks. Unfortunately, the GPU is 2.78x slower than the CPU.
  • Nikolai Sakharnykh and Nikolay Markovskiy from NVIDIA provide an indepth explanation of their GPU implementation of solving ADI with tridiagonal systems in '3D ADI Method for Fluid Simulation on Multiple GPUs'.
  • Enrico Mastrostefano, Massimo Bernaschi, and Massimiliano Fatica investigate breadth first search in 'Large Graph on multi-GPUs' and describe how best to parallelise it across multiple GPUs by using adjacency lists and level frontiers to minimise the data exchange.
  • Bob Zigon from Beckman Coulter presents '1024 bit Parallel Rational Arithmetic Operators for the GPU' and covers exact 1024 bit rational arithmetic (add,sub,mul,div) for the GPU. Get the 1024 bit arithmetic code here.
  • Roman Sokolov and Andrei Tchouprakov of D4D Technologies discuss 'Warped parallel nearest neighbor searches using kd-trees' where they take a SIMD style approach by grouping tree searches via voting (ballot)
  • David Luebke from nVidia takes a broad look at CG in 'Computational Graphics: An Overview of Graphics Research @ NVIDIA' and provides an overview of research which is featured in a number of previous talks and other GTC talks including edge aware shading, ambient occlusion via volumes and raycasting, stochastic rendering, improved image sampling and reconstruction, global illumination, and CUDA based rasterization.
  • Johanna Beyer and Markus Hadwiger from King Abdullah University of Science and Technology discuss 'Terascale Volume Visualization in Neuroscience', where each cubic mm of the brain scanned with an electron microscope generates 800 terabytes of data. The idea here is to leverage the virtual memory manager to do all the intelligent caching work, rather than a specialised spatial data structure for the volume rendering.
  • Mark Kilgard introduces the NV_path_rendering extension in 'GPU-Accelerated Path Rendering', and demonstrates using the GPU to render PDF, flash, clipart, etc. Contains some sample code.
  • Janusz Będkowski from the Warsaw University of Technology presented 'Parallel Computing In Mobile Robotics For RISE', a full GPGPU solution for processing mobile robot laser scan data through to navigation. Starts with data registration into a decomposed grid, which is then used for scan matching with point-to-point Iterative Closest Point. Next is estimating surface normals using principal component analysis, demonstrated on Velodyne datasets. This is used to achieve point-to-plane ICP and he demonstrates a 6D SLAM loop-closure. Finishes it all off with a simple gradient based GPU path planner.
Note that in recent days more presentation PDFs have been uploaded, so there is still plenty to look through, and with so much content it's difficult to cover it all - take a look yourself! I'll leave you with a video from the GTC 2012 keynote on rendering colliding galaxies:

Wednesday, May 09, 2012

Graphics links post

It's been a while since I've done a link post (almost five months), so time to catch up! From the demoscene and WebGL world we have: In games we have: In tools and research we have:
  • A link collection of all the SIGGRAPH course notes
  • PowerVR articles on everything from OpenRL (open ray tracing library) to PVR texture compression and parallax bumpmapping. A mini (PowerVR) version of ATI and nVidia's article and tools collection.
  • GNU Plotting covers tips and tricks for generating graphs and plots with GNU plot.
  • Autodesk's free photofly lets you create 3d models from photos.
  • Insight3D is an open source image based modeling (3D models from photos) software package.
  • openFrameworks is a cross-platform C++ toolkit for making realtime visual productions, interfacing to OpenGL, GLEW, FMOD, FreeType, Quicktime, etc.
  • FXGen is an open source procedural texture generator.
  • libnoise is an open source noise (e.g. perlin) generator.
  • Robert Schneider maintains a list of mesh generation software for all your triangulation and surface, grid, tetrahedron generation needs.
  • Cyril Crassin posted his thesis on GigaVoxels.
  • 3D voxel sculpting with 3d-coat.
  • Sculptris is 3D sculpting software, similar to Zbrush.
  • AN IMAging Library supports a number of file formats, and also simple image operations such as distance transforms.
  • Pedro Felzenszwalb has some image distance transform code
  • Sander van Rossen's open source Winforms library for node/graph based user interfaces.
And other interesting things: I'll leave you with the winning 64k intro from Revision:

Tuesday, May 08, 2012

Dynamic Window Algorithm motion planning

Most robots have a set of navigation algorithms for motion planning that execute at different frequencies: global path planners (e.g. A*, ~0.1 Hz), mid-level path deformation (e.g. elastic band, ~5 Hz), and collision / obstacle avoidance algorithms (~20 Hz), which are the last step before actuator control.
For MAGIC 2010, we used the Dynamic Window Approach.
(Note the ROS navigation stack offers the same algorithm configurations, but of course it didn't exist at the time we had to develop the WAMbot codebase). There are three common approaches used for local trajectory planning:
  • Potential-field based, where each obstacle has an obstacle 'force field' for repelling the robot, and the goal has an attraction field. (Similar approaches are 'Vector-fields' and Virtual Force Field)
  • Dynamics based, where the algorithm considers the robot's dynamics in calculating a solution. (e.g. Velocity Obstacles and Dynamic Window Approach)
  • Sampling based, where various collision free states are sampled and then combined. (e.g. Reachability graph, Probabilistic roadmaps)
The Dynamic Window Approach is a velocity-based local planner that calculates the optimal collision-free ('admissible') velocity for a robot required to reach its goal. It translates a cartesian goal (x,y) into a velocity (v,w) command for a mobile robot.

There are two main goals: calculate a valid velocity search space, and select the optimal velocity. The search space is constructed from the set of velocities which produce a safe trajectory (i.e. allow the robot to stop before colliding), restricted to the set of velocities the robot can achieve in the next time slice given its dynamics (the 'dynamic window'). The optimal velocity is selected to maximize the robot's clearance, maximize the velocity and obtain the heading closest to the goal.

It's easier to explain if we look at the code first. In pseudo-code the DWA is:
BEGIN DWA(robotPose,robotGoal,robotModel)
   desiredV = calculateV(robotPose,robotGoal)
   laserscan = readScanner()
   allowable_v = generateWindow(robotV, robotModel)
   allowable_w = generateWindow(robotW, robotModel)
   optimal = -infinity  //best objective value found so far
   for each v in allowable_v
      for each w in allowable_w
         dist = find_dist(v,w,laserscan,robotModel)
         brakeDist = calculateBrakingDistance(v)
         if (dist > brakeDist)  //can stop in time
            heading = hDiff(robotPose,robotGoal,v,w)
            clearance = (dist-brakeDist)/(dmax - brakeDist)
            cost = costFunction(heading,clearance,abs(desiredV - v))
            if (cost > optimal)  //higher is better
               best_v = v
               best_w = w
               optimal = cost
   set robot trajectory to best_v, best_w
END

Now to explain:
  1. First, we calculate the desired velocity to the goal based on our current position and the destination. (e.g. go fast if we are far away, slow if we are close. Use Equations of Motion, see Circular motion for a mobile robot).
  2. Select the allowable velocities (linear 'v', and angular 'w') given the vehicle's dynamics, e.g. allowable_v ranges from the current velocity minus the robot's maximum deceleration * timeslice to the current velocity plus the robot's maximum acceleration * timeslice, or more compactly: [v-a.t,v+a.t]. Likewise for angular velocity.
  3. Search through all the allowable velocities.
  4. For each velocity, determine the closest obstacle for the proposed robot velocity (i.e. collision detection along the trajectory).
  5. Determine if the distance to the closest obstacle is within the robot's braking distance. If the robot will not be able to stop in time, disregard this proposed robot velocity.
  6. Otherwise, the velocity is 'admissible', so we can now calculate the values required for the objective function. In our case, the robot's heading and clearance.
  7. Calculate the 'cost' for the proposed velocity. If the cost is better than anything else so far, set this as our best option.
  8. Finally, set the robot's desired trajectory to the best proposed velocity.
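
To make the steps above a little more concrete, here is a minimal C++ sketch of the search loop. It is not the WAMbot implementation: the `RobotModel` fields, the `Command` struct, the callback signatures, the sample count and the objective weights are all assumptions made for illustration. The obstacle distance and heading calculations are passed in as callbacks, and the braking distance uses the standard constant-deceleration formula v^2/(2a).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <limits>

// Hypothetical robot limits; the field names are illustrative only.
struct RobotModel {
    double maxV;         // maximum linear velocity (m/s)
    double maxW;         // maximum angular velocity (rad/s)
    double maxAccel;     // maximum linear acceleration/deceleration (m/s^2)
    double maxAngAccel;  // maximum angular acceleration (rad/s^2)
    double timeslice;    // planning interval (s)
    double dmax;         // largest obstacle distance considered (m)
};

struct Command {
    double v;
    double w;
    bool admissible;  // false if no safe velocity was found
};

// Callback taking a candidate (v, w) and returning a distance or a score.
using VelocityFn = std::function<double(double v, double w)>;

Command dynamicWindow(double robotV, double robotW, double desiredV,
                      const RobotModel& model, const VelocityFn& findDist,
                      const VelocityFn& headingScore, int samples = 11) {
    assert(samples >= 2 && model.maxAccel > 0.0 && model.maxV > 0.0);
    const double dv = model.maxAccel * model.timeslice;
    const double dw = model.maxAngAccel * model.timeslice;
    // Step 2: the dynamic window [v-a.t, v+a.t], clamped to the robot's limits.
    const double vMin = std::max(0.0, robotV - dv);
    const double vMax = std::min(model.maxV, robotV + dv);
    const double wMin = std::max(-model.maxW, robotW - dw);
    const double wMax = std::min(model.maxW, robotW + dw);

    Command best{0.0, 0.0, false};
    double optimal = -std::numeric_limits<double>::infinity();
    for (int i = 0; i < samples; ++i) {       // step 3: search the window
        const double v = vMin + (vMax - vMin) * i / (samples - 1);
        for (int j = 0; j < samples; ++j) {
            const double w = wMin + (wMax - wMin) * j / (samples - 1);
            const double dist = std::min(findDist(v, w), model.dmax);  // step 4
            const double brakeDist = v * v / (2.0 * model.maxAccel);
            if (dist <= brakeDist) continue;  // step 5: cannot stop in time
            // Step 6: terms of the objective function, each roughly in [0, 1].
            const double clearance = (dist - brakeDist) / (model.dmax - brakeDist);
            const double speed = 1.0 - std::fabs(desiredV - v) / model.maxV;
            // Step 7: weighted objective (weights are arbitrary); higher is better.
            const double score = 0.6 * headingScore(v, w) + 0.2 * clearance + 0.2 * speed;
            if (score > optimal) {
                optimal = score;
                best.v = v;
                best.w = w;
                best.admissible = true;
            }
        }
    }
    return best;  // step 8: the caller sends (best.v, best.w) to the robot
}
```

In a real system `findDist` would trace the circular arc for (v, w) through the laser scan, and `headingScore` would compare the predicted heading with the direction to the goal; both are left abstract here.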
If you've read all the other posts, you should now know enough to implement your own mobile robot navigation system, or at least understand a bit more about the algorithms in the ROS navigation stack.

Monday, April 16, 2012

Transmin Rocklogic wins WAITTA 2012 Innovation Award

Transmin's rockbreaker automation product, Rocklogic, has won the 2012 WAITTA Innovation award.
ScienceWA covered the story.

Rocklogic features a number of world-first innovations, including tight integration with vehicle automation products allowing the rockbreaker to efficiently interleave operations with haul trucks or loaders. In addition, Rocklogic is the first system for hydraulic booms to allow automated parking and deploy, as well as an active collision avoidance system.

It's been the product of many hours of hard work to bring to market, so it is fantastic to receive this recognition.

On to the nationals!

Sunday, April 15, 2012

Farbrausch Demo Tools

Farbrausch released the source code to all their demo tools, including kkrunchy, ktg and werkkzeug3. And following the move, Moppi Productions released the Demopaja source code. A good day for demoscene coders.

For more info on these tools take a look at the Demopaja documentation and werkkzeug3 documentation. (FYI, werkkzeug is a German/elite-speak word for "tool", or literally a work-thing.) Dirk Jagdmann has a nice online presentation on the Farbrausch/Elitegroup demotool approach, and Dierk 'Chaos' Ohlerich has a number of good presentations:
Over the years I've collected a number of great snippets of tips'n'tricks from various bits of FR code that have been opened (e.g. this gem by Dierk 'Chaos' Ohlerich on FPU intrinsics); now you can get it all in one big collection. As the FR release is just a big dump of all the material, here are a few relevant posts to help you decipher what is going on:

I'll leave you with some of Farbrausch and Moppi's favourite productions to inspire you to decipher the code:

Saturday, April 07, 2012

Game Developers Conference 2012 - Technical summary

GDC2012 is over, and this year there are a huge number of available presentations. You can download the Game Developer Conference 2012 presentations from the GDC vault; Jare / Iguana has also kept a link collection from GDC 2012. I've looked over all the technical publications available and put together this summary post. (Edit: I've updated this to cover some maths, physics, and graphics material I missed on the first pass - thanks Johan & Eric)

I'll start with graphics.

Louis Bavoil / nVidia and Johan Andersson / DICE have a presentation on "Stable SSAO in Battlefield 3 with Selective Temporal Filtering". Ambient occlusion is a well established technique now, but they present a quick way to use past data and the differences in Z-buffer states between frames to intelligently reuse the AO results. They also look at filters and optimising blur functions. The approach is similar to established tricks in the realtime raytracing demoscene.

Eban Cook / Naughty Dog presented "Creating Flood Effects in Uncharted 3", a technical artist's look at water effects. Unfortunately realtime fluid simulation wasn't used; instead Houdini was used to pre-generate the game content. An overview of the shaders for water, water particles, froth particles, and lighting is given.
Light probe interpolation

Robert Cupisz / Unity discussed light probes in "Light probe interpolation using tetrahedral tessellations", in terms of choosing the appropriate probes and weights using Delaunay triangulation / tetrahedra and barycentric coordinates by dividing scenes into convex hulls. He also covers projecting onto the nearest convex hull, all with a fair bit of maths, so this would be of interest to physics / collision detection programmers too. There is a collection of nice links and some sample code at the end.

Matthijs De Smedt / Nixxes covers "Deus Ex is in the Details" using DX11 tech. Covers AA (FXAA, DLAA, MLAA), SSAO, DOF (Gaussian blur), tessellation and soft shadows.

Colt McAnlis / Google investigates post-compressing DXT textures in his talk "DXT is not enough", trying to out-do zipped DXT's with delta encoding. More info at this blog post or skip it all and download the DXT CRUNCH compressor here.

Matt Swoboda / Sony & Fairlight delves into Signed Distance Fields, a demoscene hot-topic last year, with the talk "Advanced Procedural Rendering in DirectX 11". Investigates converting polygon mesh data and particle data into signed distance fields. Takes an in-depth look into an optimised marching cubes implementation for a fluid simulation with smoothed particle hydrodynamics (SPH), and how to use signed distance fields to do ambient occlusion.
Physically based rendering in realtime 

Yoshiharu Gotanda / tri-Ace research makes a case for physically based rendering with a Blinn-Phong model in the presentation "Practical Physically Based Rendering in Real-time". An in-depth look at the BRDF formulation they use.

Wolfgang Engel, Igor Lobanchikov and Timothy Martin / Confetti present "Dynamic Global Illumination from many Lights"; the slides are just a bunch of pictures without much information.

Carlos Gonzalez Ochoa / Naughty Dog covers "Water Technology of Uncharted". Covers the shader, animating the normal maps flow, and simulating the ocean water with Gerstner waves, b-spline waves, and wave particles. They go on to look at LOD with "Irregular Geometry Clipmaps" including fixing T joints, and then culling, skylights and underwater fog. Next physics, attaching objects (buoyant), and point queries. Finally, SPU optimization. Quite comprehensive.
Water technology of Uncharted

Ben Hanke / SlantSixGames describes the bone code in "Rigging a Resident Evil". Transforms are described with 9 functions and processed with an optimising compiler, allowing fast retargeting of animations.

Scott Kircher / Volition Inc expands on Inferred Lighting in "Lighting & Simplifying Saints Row: The Third" by looking into lighting for rain, foliage, dynamic decals, and radial ambient occlusion. Then moves on to automated mesh simplification using iterative edge contraction and takes an in-depth look at selecting an appropriate error metric.

Nathan Reed / Sucker Punch Productions discusses "Ambient Occlusion Fields and Decals in Infamous 2", going into depth on how to solve the artifacts of this approach.

Marshall Robin / Naughty Dog covers the effects system tools in "Effects Techniques Used in Uncharted 3: Drake’s Deception".

Niklas Smedberg / Epic Games looks at the PowerVR GPU processing pipeline and capabilities in "Bringing AAA graphics to mobile platforms" and provides a number of tricks'n'tips on optimising the performance of the mobile GPU, and highlights the cheap operations. In short: AA (fast), hidden surface removal (fast), alpha test (slow), render targets (slow), texture lookups (slow). Takes a more detailed look at material shaders, god rays, and character shadows. All in all, pretend it's ~2002, and you'll be right.

Mickael Gilabert / Ubisoft and Nikolay Stefanov / Massive cover the GI system in Far Cry 3 in "Deferred Radiance Transfer Volumes". Light probes get precomputed directional radiance transfer data from a custom raytracer stored using spherical harmonics. Source code for the relighting system is presented, along with optimisations by using volume textures.

John McDonald / nVidia explains CPU/GPU synching for buffers in "Don’t Throw it all Away: Efficient Buffer Management" and provides advice on buffer creation flags.

Bryan Dudash / nVidia suggests using average normals to overcome tessellation issues in "My Tessellation Has Cracks!".
Mastering DX11 with Unity

Renaldas Zioma / Unity and Simon Green / nVidia present "Mastering DirectX 11 with Unity". Starts by looking into Unity's physically based shaders (Oren-Nayar, Cook-Torrance, and energy conservation, then blurry reflections and combining normal maps). Next up, Catmull-Clark subdivision, tetrahedra light probes (see Robert Cupisz's talk), HBAO, APEX destruction, hair simulation with guide hairs, explosions using signed distance fields with noise and color gradients, and finally velocity buffer motion blur.

Tobias Persson / Bitsquid discusses lighting billboards in "Practical Particle Lighting". Looks at normal generation and per-pixel lighting for billboards (including code snippets), applying shadow maps with a domain shader, and shadow casting.

Karl Hillesland / AMD investigates realtime Ptex (per-face texture mapping) in "Ptex and Vector Displacement in AMD Demos", and efficient retrieval from the texture atlas, including all MIPs.

Jay McKee / AMD presents the "Technology Behind AMD’s Leo Demo". He details some of the code behind the forward rendering of 3000 dynamic light sources using a depth pre-pass, light culling (tile-based compute shader to output light list), and light accumulation with materials phase. Basically moves the light management code from CPU to GPU.
Terrain in Battlefield 3

Mattias Widmark / DICE presents "Terrain in Battlefield 3: A modern, complete and scalable system". Begins with an overview of the features for the terrain system (heightfield based, procedurally generated, spline decals, decoration (tree, rock, grass), destruction, water), and presents their quadtree terrain data structure, paying particular attention to LOD. Next, CPU/GPU performance is investigated, and a clip-map based virtual texture system is presented. The large terrain data set is managed by intelligently streaming data to the required detail ('blurriness'), and co-locating data (heightfield/color/mask lumped together, next to the next level of relevant LOD data), which is also compressed (RLE/DXT1). Nodes are then prioritized based on distance, culling, and updates (e.g. destruction). Finally, mesh generation, stitching and tessellation with displacement on the GPU.

Moving on to physics.

Erin Catto covers "Diablo 3 Ragdolls", including representing ragdoll bones, initialising ragdolls from animations, and interacting kinematic and dynamic objects.

François Antoine / Epic talks about Gears of War 3 destruction physics in "Pushing for Large Scale Destruction FX" and suggests using particles for dust and debris, and simplifying meshes for destruction.

Stephen Frye / EA looks at ragdolls in the presentation "Tackling Physics". Highlights aspects of ragdolls that look unrealistic, and suggests adding joint limits and motorized constraints at joints to simulate muscles. Gives two approaches to solving the control problem, first using external forces, second calculating the appropriate torque from world space.

Graham Rhodes / Applied Research Associates presents "Computational Geometry" where he looks at half-edge data structures for triangulating a polygon, splitting a face, splitting an edge, intersection of an edge and a plane and generating a convex hull.

Richard Tonge / nVidia covers "Solving Rigid Body Contacts" and starts with a gentle introduction to rigid body state space and progressively builds a signal-block-diagram of solving a single contact constraint. Then looks at each block in the diagram and deciphers the physics behind it. He then looks at solving multiple contacts, explains why you can't apply a linear solver to the problem (contacts break), and presents the LCP and an alternative approach: sequential impulses. He then gives a whirlwind tour of GPU solvers.

Gino van den Bergen / DTECTA presents "Collision Detection", first covering shapes, then configuration space, distance tests, Separating Axis Tests, and takes a closer look at the GJK algorithm.

Jim Van Verth / Insomniac gives a nice introduction to Navier-Stokes in "Fluid Techniques", breaking down the terms for external forces, viscosity, advection and pressure visually. Then looks at three major representations for fluids: grid, particle and surface (wave) based.

Takahiro Harada / AMD examines how heterogeneous compute architectures can achieve large scale dynamic simulations in "Toward A Large Scale Simulation". Begins with an overview of GPU architecture, and GPU rigid body simulation in three key phases: broad-phase, narrow-phase and constraint solving for a system of 128,000 particles and 12,000 convex bodies. He presents a design for overcoming data transfer and minimising synchronisation points whilst dividing the workload between CPU and GPU.

Erwin Coumans / AMD investigates destructive physics in the aptly titled "Destruction". He begins with generating Voronoi diagrams for shattering geometry and boolean operations, and moves into generating collision shapes with convex decomposition and tetrahedralization. Then moves on to realtime approaches with real-time booleans and breakable constraints and finite element

Looking at AI.

Bobby Anguelov / IO Interactive, Gabriel Leblanc / Eidos-Montréal and Shawn Harris / Big Huge Games present "Animation-Driven Locomotion For Smoother Navigation". They start with the standard motion graphs and transitioning/blending between animation cycles. Then take an in-depth look at footstep planning (IK, foot sliding) and come up with a system for deciding where steps should be taken to fulfil the navigation goal. They then investigate modifying navigation paths to better fit the animation cycles, and finish by looking into collision avoidance.

Daniel Brewer / Digital Extremes looks at agent perception, reaction, combat chatter, buddy systems and collision avoidance using velocity space Optimal Reciprocal Collision Avoidance in "Building Better Baddies".

Brian Magerko / Georgia Tech covers "How to Teach Game AI from Scratch" including competitions (Mario AI, Google Ants, Poker AI, Starcraft AI).

Dave Mark / Intrinsic and Kevin Dill / Lockheed Martin investigate some examples (snipers, guards) of Utility-Based AI in "Embracing The Dark Art of Mathematical Modeling in Game AI".

Kasper Fauerby / IO Interactive explains "Crowds in Hitman: Absolution" including cell maps, boids, animation and PS3 implementation details. The crowd AI uses a state machine with steering behaviours (pending walk, walk, panic), and behaviour 'zones' with information from the navigation system to select behaviours. Near-optimal Character Animation with Continuous Control was used for animation.

Elan Ruskin / Valve looks at empowering writers and dialog in TF2, Left4Dead, etc, in "Rule Databases for Contextual Dialog and Game Logic". Begins with player triggered lines (extended by environment, memory etc.) and avoiding fill-in-the-blank dialog by using databases. Rules, queries, responses and writers' tools are examined next, and the talk ties things off with database query optimisations.

Mike Robbins / Gas Powered Games examines "Neural Networks in Supreme Commander 2", with 34 inputs and 15 output actions and a single hidden layer (98 neurons), with a fitness function composed from 17 inputs trained to control combat platoons.

Ben Sunshine-Hill / Havok investigates LOD for AI in "Perceptually Driven Simulation", and makes a case for using probability of noticing a difference instead of distance as a LOD measure, and presents a market-based "LOD trader" for selecting the appropriate LOD given the constraints on hand.

Moving along to programming and math.

Adisak Pochanayon / Netherrealm covers debugging and timing issues in "Runtime CPU Spike Detection using Manual and Compiler-Automated Instrumentation". First up, manual instrumentation and wrapper functions, then detours, and automated instrumentation (compiler flags) with an in-depth look at the 360. Finally, profiling with threshold functions.

Pete Isensee / Microsoft details how rvalue references in C++11 (T&&) can eliminate temporaries in "Faster C++: Move Construction and Perfect Forwarding".

Scott Selfon / Microsoft reviews audio compression technologies in "The State of Ady0 Cmprshn", starting with time-domain compression with PCM (raw, A-Law, U-Law, ADPCM), then frequency-domain compression and discusses the artifacts generated by both, then evaluates the performance of different codecs.

Robin Green / Microsoft and Manny Ko / Dreamworks present "Frames, Quadratures and Global Illumination: New Math for Games". Begins with a review of spherical harmonics, Haar wavelets, and radial basis functions. Builds up to 'Spherical Needlet' wavelets, by exploring different basis functions ('frames').

Gino van den Bergen / DTECTA presents dual numbers in "Math for Game Programmers: Dual Numbers", beginning with a look at complex numbers. Automatic differentiation with dual numbers is then described, with code, and examined in curve tangents, directed line geometry (triangle/ray intersections, Plücker coordinates, angles), and rigid body transforms/skinning (dual quaternions).

Jim Van Verth / Insomniac explains rotation formats in "Understanding Rotations", including angle (2D), Euler angles, axis-angle, matrix (2D/3D), complex (2D) and quaternion (3D). Interpolation is considered for each case (including slerp).

Eric Lengyel / Terathon presents exterior (Grassmann) algebra in "Fundamentals of Grassmann Algebra". This includes the wedge product, bivectors, trivectors and multivectors. Moves on to cross product transforms, dual-basis 'anti-vectors', regressive 'antiwedge' product, and demonstrates how these can be used in homogeneous and Plücker coordinate systems. This leads on to basic intersections (line, plane, point) and distances (point plane, two lines) and finally ray-triangle intersection using bivectors to avoid barycentric coordinates.

Squirrel Eiserloh / TrueThought presents "Interpolation and Splines". Takes us back to basics by looking at averaging and blending, and moves onto interpolation. Begins with quadratic and cubic Bézier curves, then moves into splines and discusses continuity. Cubic Hermite splines are up next, and how to convert between Bézier and Hermite, then Catmull-Rom splines and finishes with the more general Cardinal splines.

John O’Brien / Insomniac covers "Math for Gameplay / AI". Starts with object intersection tests (sphere-sphere, sphere-plane, AABB-AABB, AABB-ray, capsule-capsule, capsule-ray) and projecting onto a plane in a gun turret AI example. Next up, Bayes' Theorem and conditional probability, followed by fuzzy logic.

The Web up next.

Corey Clark and Daniel Montgomery present "Building a Multi-threaded Web-Based Game Engine" covering both client side (WebGL, WebSockets, etc) and server side (NodeJS, Hosting, etc).

Michael Weilbacher / Microsoft looks at server issues in "Dedicated Servers in Gears of War 3".

Michael Goddard presents "Developing a Javascript Game Engine" using a component-based architecture. Takes an in-depth look at events/promises and loading content.

Mike Dailly / YoYo investigates packing textures and command list execution for improving performance in "The Voodoo Art of Dynamic WebGL".

Marc O’Morain / Swrve takes a look at a number of issues (including iOS multitouch) in "Building Browser Based Games Using HTML5".

And rounding up everything else.

Daniel Caruso explains the "Forza Motorsport Pipeline" for importing assets into the game.

Jean-Francois Guay investigates sound diffraction and absorption in "Real-time sound propagation".

Mike Lewis presents the challenges of multithreading for MMOs in "Managing the Masses".

Sean Ahern looks at building better game engine tools in "It stinks and I don't like it".

Clara Fernández-Vara, Jesper Juul, and Noah Wardrip-Fruin make a case that "Game Education Needs Game History".

Chris Jurney presents his idea "Motion Blobs", a fast and crude Kinect data "gesture" system, essentially an extension of the typical 2D approach to 3D. Steps are to calculate motion via background subtraction, filtering (open/close), labeling, and then correlation.

Alexander Lucas explains automated testing at Bioware in "The Automation Trap And How Bioware Engineers Quality".

Alex Mejia looks at camera movement in "Saints Row: The Third real time capture tools".

Scott Philips presents "Designing Over the Top SAINTS ROW: THE THIRD Postmortem", and highlights the importance of pre-visualization and playtesting.

Ron Pieket / Insomniac looks at eliminating downtime in "Developing Imperfect Software" via a 'Structured Binary' approach to building engine data by taking advantage of a Data Definition Language.

Benson Russell takes a look at Naughty Dog's approach to polish in "The last 10, going from good to awesome", in essence longer alpha and beta tests.

Luke Muscat takes a look at the lessons learnt while updating Fruit Ninja in "Iterating Design And Fighting Fires: Updating Fruit Ninja And Jetpack Joyride".

Tatyana Dyshlova talks about managing 300+ artists working on 500 car models in "Racing to the Finish".

Quite a collection this year, but overall there seems to be less exciting content than in previous years. For graphics, signed distance fields and physically based rendering seem to be the new themes. AI is still playing catch-up, and character animation cycles are still a hot topic. Following that theme, physics is also looking at characters and ragdolls, with destruction being the hot topic, and the web is focusing on WebGL.

Thursday, March 29, 2012

Elastic Band - Realtime pathfinding deformation

I've covered global path planners before, including A* path planning and Dijkstra. These path planners will give you a result, but if something alters the desired path (such as an object crossing the path, which you need to avoid) you need to recompute the whole path again. Algorithms such as D* lite allow efficient re-computation. However, the elastic band method allows the existing path to be used, and just adjusted to handle deformations. Or, more formally, elastic band path planning enables realtime modifications to a precomputed path that consider additional obstacles (or cost functions) not considered during the original path's computation.
Elastic band - Global path in blue, bubbles as colored circles, and the red path is the one generated by the elastic band algorithm.

The easiest way to describe it in games terminology is to have an entire path turned into particles that follow boid-like (flocking) rules. The best part is that, since it is essentially based on particles, you can re-use code from a particle system or physics engine you may have available. If you're from a games background, you can use player/entity interchangeably with my use of the word 'robot'.

The 'elastic band' is created from a set of 'bubbles'. A bubble contains a radius, a position, the coordinates of the closest obstacle to the bubble and a velocity.

The way it works is that an 'elastic band' is initialised with the partial path from the global path planner. The centre of each bubble is assigned to a point along the global path. (For optimisation purposes, a sparse version of the global path is typically used).

If the elastic band does not contain the robot's current position, then the robot's present position is inserted, provided it will align with the desired path.

For every bubble in the path we determine the total external and internal forces acting on it, and apply them and create/remove bubbles as necessary.
This is done by:
  • Creating a new bubble if the distance to the next bubble is greater than a predefined constant. In effect this constant determines the discretization of the path.
  • Determining the closest obstacle to the bubble and calculating the radius. This is achieved by using a spatial partitioning algorithm to determine the closest obstacle by its Euclidean distance. The radius is assigned to this distance, limited to a maximum.
  • Calculating the external repulsion force from the closest obstacle. The magnitude of this force is scaled to be a function of its distance limited by an upper bound, such that closer obstacles exert a greater pressure. (i.e. (max - r) / r )
  • Calculating the internal cohesion force from the previous and following bubbles in the path. This is simply the sum of the normalised vectors to the previous and following bubbles' centres.
  • Updating the bubble state by applying the forces and velocities to calculate the new bubble position. I used a higher order integrator with hand-tuned damping terms. In addition, a low pass filter is applied to the position update to reduce the influence of the forces.
Finally, bubbles are removed if they are too close to each other. The entire band is then checked for continuity, ensuring that each bubble is larger than the robot's size. If this criterion is not fulfilled the robot is halted, as it will not be able to get to its destination without colliding.
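As a rough illustration of the radius and force steps listed above, here is a minimal Python sketch. The function names are hypothetical, the closest-obstacle search is a brute-force loop rather than a spatial partitioning structure, and the small guard against a zero radius is my own assumption; the integrator and low pass filter are left out.

```python
import math

def unit(dx, dy):
    """Normalise a 2D vector, returning (0, 0) for a zero-length input."""
    n = math.hypot(dx, dy)
    return (dx / n, dy / n) if n > 1e-12 else (0.0, 0.0)

def bubble_radius(pos, obstacles, r_max):
    """Closest obstacle (brute-force search) and the bubble radius, capped at r_max."""
    closest = min(obstacles, key=lambda o: math.dist(pos, o))
    return closest, min(math.dist(pos, closest), r_max)

def external_force(pos, closest, radius, r_max):
    """Repulsion away from the closest obstacle with magnitude (r_max - r) / r."""
    ux, uy = unit(pos[0] - closest[0], pos[1] - closest[1])
    magnitude = (r_max - radius) / max(radius, 1e-6)  # assumed guard against r = 0
    return ux * magnitude, uy * magnitude

def internal_force(prev_pos, pos, next_pos):
    """Cohesion: sum of the unit vectors towards the neighbouring bubble centres."""
    ax, ay = unit(prev_pos[0] - pos[0], prev_pos[1] - pos[1])
    bx, by = unit(next_pos[0] - pos[0], next_pos[1] - pos[1])
    return ax + bx, ay + by
```

A bubble that bulges out of line with its neighbours gets pulled back towards them by `internal_force`, while `external_force` fades to zero once the closest obstacle is at least `r_max` away.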

The bubble that is a set distance in front of the robot is selected to generate the next trajectory input. The angle between the current bubble and the next bubble is calculated and used to modulate the vehicle's desired velocity. This causes the vehicle to reduce its speed when it approaches a sharp turn.
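One possible way to do the speed modulation is sketched below in Python. The cosine falloff, the minimum speed `v_min` and the function name `modulated_speed` are assumptions for the example, not necessarily what was used on the robot; the angle is taken as the difference between the robot's heading and the bearing from the current bubble to the next.

```python
import math

def modulated_speed(robot_heading, current, nxt, v_max, v_min=0.1):
    """Scale the desired speed down as the turn towards the next bubble sharpens."""
    bearing = math.atan2(nxt[1] - current[1], nxt[0] - current[0])
    diff = bearing - robot_heading
    turn = abs(math.atan2(math.sin(diff), math.cos(diff)))  # wrapped to [0, pi]
    scale = max(0.0, math.cos(turn))  # assumed falloff: 1 straight ahead, 0 at 90 degrees
    return max(v_min, v_max * scale)
```

With this choice the robot drives at `v_max` when the next bubble is dead ahead and creeps at `v_min` for turns of 90 degrees or more.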

In pseudo-code the algorithm is as follows:
eb[0] = path.start
i = 0
for each point along path
  if dist(eb[i], point) > c
    i = i + 1
    eb[i] = point //add a bubble to the sparse band

while robot not at path.goal
  for each bubble b in eb
    f_e = b.pos - closestObstacle(b) //external repulsion
    f_i = (eb[b.i - 1].pos - b.pos) + (eb[b.i + 1].pos - b.pos) //internal cohesion
    b.integrate(f_e + f_i, b.vel) //update velocity from forces
    b.dampen(b.vel) //filter state
    dist = abs(b.pos - eb[b.i + 1].pos)
    if dist > c
      insert a new bubble between b and eb[b.i + 1]
    if dist < c_min
      remove eb[b.i + 1]
  tmp_goal = eb.closestbubble(path.robot + lookahead) //set goal from robot position
  set robot trajectory to tmp_goal
Thanks to Sushil who did all the hard work in implementing this for MAGIC2010.