One quick post: Unity is now free. Of course, this isn't the full edition with all the tools you would need if you are developing anything bigger than a one-man game.
Still, the price is right.
If you're the free-as-in-speech kind, try the Blender game engine.
Thursday, October 29, 2009
Wednesday, October 28, 2009
Boston Dynamics: PETMAN
Boston Dynamics, famous for BigDog (check out the BigDog YouTube video), are now working on a biped, PETMAN.
It's still early on in the project, but they seem to be making good progress on the mechanical side of things. The noisy motor might prove to be a bit of an issue.
I wonder how long it will be until the Japanese fighting-toy robots get to use this kind of equipment. It's probably still quite some time, since robot prices don't tend to drop much over the years. I'm predicting it will still be a very long wait until consumers see anything beyond toy bipedal robots.
Tuesday, October 27, 2009
Apple Mac OS X Utilities
Everyone needs tools to use their PC. I covered the essentials for a Windows PC previously. Some good (free) tools for your Mac:
- StuffIt Expander, your OS X WinRAR/WinZip equivalent.
- MacFUSE, this extension enables file systems in user space for the Mac. Ever want to read NTFS?
- NTFS-3G, the NTFS plug-in for MacFUSE, no more 'Items could not be moved because XXX cannot be modified'. Why OS X doesn't support writing to NTFS out of the box is beyond me. I have been using the NTFS-3G plug-in with no problems for about a year now.
- Opera, while not a Mac-only utility, having the Opera web browser is essential. Especially when Safari acts up, or you want to use IRC, or BitTorrent, or RSS, or email, or anything really. It's an all-in-one solution.
- Parallels, lets you run your Boot Camp Windows partition seamlessly inside the Mac, and with graphics acceleration to boot. Nifty. While not free, I feel it's worthwhile. The free alternative is VirtualBox.
- Nocturne, this lets you dim and otherwise adjust your screen display. Very useful for late night computing.
- Small Image, for quick image resizing.
- Paintbrush for Mac.
- Perian and Flip4Mac WMV for extending media file support in OS X.
Tuesday, October 20, 2009
Timing square root on the GPU
Inspired by the post by Elan Ruskin (Valve) on x86 SQRT routines, I thought I would revisit this for my supercomputing platform of choice, the GPU. I left this kind of low-level trickery behind after finishing with RMM/Pixel-Juice some time around 2000, having decided that the 3dNow! reciprocal square root routines were more than good enough.
A note on the measurements in the results table below: total time is the time, measured by the CPU, that the GPU took to launch the kernel and calculate the results. The clock ticks are meant to be a more accurate measurement using the GPU's internal clock, but I find them dubious.
Anyway, a brief overview of how we can do square roots:
- Calculate it with the FPU (however that was implemented by the chip manufacturer).
- Calculate it with Newton-Raphson. This allows you to control the accuracy of the sqrt (or, typically, rsqrt). This comes in two flavours:
  - Use an initial estimate and refine, à la the Greg Walsh / John Carmack / Quake 3 approach.
  - Use a lookup table, then refine. This is probably an obvious approach, but I think AMD did a lot of pioneering work on it. (Well, at least, I learned these tricks from them.) See nVidia's lookup table sqrt code.
- Calculate it from the inverse. This comes in two flavours:
  - Calculate the reciprocal square root, then invert it (1/rsqrt(x)); this gives you correct results.
  - Multiply it by the input value (x*rsqrt(x)); this gives you faulty results around 0, but saves you a costly divide.
Note:
1.0f / rsqrtf(0.0f) = 1.0f / infinity = 0.0f
0.0f * rsqrtf(0.0f) = 0.0f * infinity = NaN
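To make the estimate-and-refine idea and the behaviour at zero concrete, here is a minimal CPU-side sketch in Python (this is not the CUDA benchmark code). It emulates the float32 bit trick with `struct`; the function names are illustrative, the constant 0x5f3759df is the widely published Quake 3 one, and `rsqrt` is a stand-in that is assumed to return infinity at zero, as in the note above.

```python
import math
import struct


def carmack_rsqrt(x: float, iterations: int = 1) -> float:
    """Approximate 1/sqrt(x) for x > 0 with a bit-trick estimate plus refinement.

    The float32 reinterpretation is emulated with struct; the Newton-Raphson
    steps run in Python's double precision, so results can differ slightly
    from a pure float32 implementation.
    """
    # Reinterpret the float32 bit pattern of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Initial estimate from the magic constant.
    bits = 0x5F3759DF - (bits >> 1)
    y = struct.unpack("<f", struct.pack("<I", bits))[0]
    # Newton-Raphson on f(y) = 1/y^2 - x; more iterations, more accuracy.
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * x * y * y)
    return y


def rsqrt(x: float) -> float:
    """Stand-in reciprocal square root that returns +infinity at zero."""
    return math.inf if x == 0.0 else 1.0 / math.sqrt(x)


def sqrt_by_divide(x: float) -> float:
    """sqrt(x) as 1/rsqrt(x): costs a divide, but is well behaved at zero."""
    return 1.0 / rsqrt(x)


def sqrt_by_multiply(x: float) -> float:
    """sqrt(x) as x*rsqrt(x): no divide, but 0 * infinity gives NaN at zero."""
    return x * rsqrt(x)


if __name__ == "__main__":
    for n in range(4):
        print(n, "iterations:", carmack_rsqrt(4.0, iterations=n))
    print("1/rsqrt(0) =", sqrt_by_divide(0.0))
    print("0*rsqrt(0) =", sqrt_by_multiply(0.0))
```

Raising `iterations` is the accuracy control mentioned in the list: each extra step costs a few multiplies and shrinks the error considerably.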
I decided to test three routines for the GPU:
- native sqrt
- native rsqrt
- Carmack's rsqrt
I did my best to generate reliable results by testing block sizes from 2 to 256 and performing 2.5 million sqrt operations. Here are the results from my nVidia 9800GX2:
Method | Total time | Max. ticks per float | Avg. ticks per float | Std. Dev. | Avg. Error |
---|---|---|---|---|---|
GPU intrinsic SQRT | 1.285ms | 5.99 | 3.99 | 0.00 | 0.00% |
GPU intrinsic RSQRT * x | 1.281ms | 5.99 | 3.99 | 0.00 | 0.00% |
Carmack RSQRT * x | 2.759ms | 6.28 | 4.26 | 0.01 | 0.09% |
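The post does not spell out how the Avg. Error column was computed. One plausible reading, and it is only an assumption, is the mean relative error against a higher-precision reference. A minimal Python sketch of that idea follows; the helper names `to_float32` and `mean_relative_error` are made up for illustration, and it runs on the CPU rather than in CUDA.

```python
import math
import struct


def to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def mean_relative_error(approx_sqrt, samples) -> float:
    """Average of |approx - reference| / reference over positive samples.

    math.sqrt in double precision is used as the reference value.
    """
    total = 0.0
    count = 0
    for x in samples:
        reference = math.sqrt(x)
        total += abs(approx_sqrt(x) - reference) / reference
        count += 1
    return total / count


if __name__ == "__main__":
    # Deterministic positive test inputs (an arbitrary choice).
    samples = [1.0 + 0.37 * k for k in range(1000)]
    # A single-precision square root, emulated by rounding the double result.
    float32_sqrt = lambda x: to_float32(math.sqrt(x))
    error = mean_relative_error(float32_sqrt, samples)
    print(f"single-precision sqrt: {100.0 * error:.2f}% average error")
```

Under this definition a correctly rounded single-precision square root shows up as 0.00% at two decimal places, which is at least consistent with the intrinsic rows in the table.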
The conclusions to take from these results are simple: Carmack's inverse and other trickery isn't going to help, and using the GPU RSQRT function as opposed to the inbuilt SQRT function saves you about a clock tick or two at most. (Probably because nVidia's SQRT is implemented as 1/RSQRT, as opposed to X*RSQRT.)
I'm happy to say, low-level optimization tricks are still safely a thing of the past.
You can get the code for the CUDA benchmark here: GPU SQRT snippet.
Thursday, October 15, 2009
AMD OpenCL for GPU
Not one to be left behind by nVidia's news, AMD/ATI have released the AMD OpenCL beta v4, which now supports OpenCL for AMD GPUs! Some highlights:
- First beta release of ATI Stream SDK with OpenCL GPU support.
- ATI Stream SDK v2.0 OpenCL is certified OpenCL 1.0 conformant by Khronos.
- Microsoft Windows 7 and native Microsoft Windows® 64-bit support
Fabulous news! Now you can do OpenCL on OS X and Windows (32- and 64-bit), for nVidia and ATI GPUs and AMD CPUs. It doesn't get any better than this, well, at least until next year when Intel enters the fray.
It's not all good news though: it seems some of AMD's GPUs don't support double precision.
Still, it's better than nVidia's lot, and I'm happy to see AMD finally making a serious effort in this space. (Not that the previous efforts weren't impressive, just not so focused...)
nVidia has also released the new version of Cg, v2.2. I wonder how much OpenCL will replace the use of Cg.
Wednesday, October 14, 2009
Google Building Maker
Google has just released their Building Maker. It looks like they finally found a use for the VideoTrace technology they acquired from the University of Adelaide.
This should help with putting content together quickly for simulators, etc.
Check it out:
Nearest Neighbor
A bunch of links on the nearest-neighbor problem, for higher dimensions:
http://www.mit.edu/~andoni/LSH/
http://www.cs.umd.edu/~mount/ANN/
http://www.cs.sunysb.edu/~algorith/implement/ranger/implement.shtml
And, for good measure, a set of comparisons of optical flow algorithms:
http://vision.middlebury.edu/flow/eval/
I'll probably stick to the OpenCV defaults anyway, but it's nice to know there are options. It would seem Bruhn et al. is the most accurate.
Monday, October 12, 2009
Talking Piano and AI
Daniel Wedge sent me this interesting link on a talking piano!
A great idea; I wonder if it has been done before.
In other news, the International Joint Conference on Artificial Intelligence archive has been opened to the public, covering everything from 1969 to 2007!
Thursday, October 08, 2009
nVidia: OpenCL, Nexus, Fermi
There's been a fair bit of news flowing out of nVidia, biggest first:
nVidia has released a GPU OpenCL implementation compatible with all devices that support CUDA (no surprise).
You can get nVidia's OpenCL implementation from the nVidia OpenCL Developer Website.
Next in nVidia news, Nexus has been released. I haven't had a chance to try it, but apparently it allows you to debug GPU programs via Microsoft Visual Studio in the 'normal' way. This would certainly make GPU programming a little easier.
Finally, nVidia has released information on Fermi, their next-generation architecture. Basically it seems to be more of the same (which is always good) without all the bad bits for GPGPU programming (even better!). The biggest changes are allowing multiple kernels to execute in parallel, and having decent double-precision support. This should really open up scientific and engineering computing to GPGPU, and will probably do good things for getting accelerated raytracing happening. AnandTech has a good write-up on Fermi, although it looks like we will see Larrabee before Fermi...
I had a chance to play with three C1060 Tesla boards for the GPU fluid simulation project on a 64-bit machine. This threw up a whole bunch of problems, since I was using MSVC Express Edition, which (apparently) does not support 64-bit. The problem was solved by using the 32-bit CUDA wizard, redirecting the CUDA libraries to the 32-bit versions (i.e. c:\CUDA\lib, not C:\CUDA\lib64), and some other tweaks.
Catchup
I've been quite occupied with ECU (exams and mid-semester marking), Transmin, GPU pathfinding, and the DSTO MAGIC 2010 competition. The WA team I've been organizing has teamed up with Flinders University and received some significant assistance from Thales Australia. The submission has been made; we will have to see if we are successful...