## Thursday, April 30, 2009

### Math - Random numbers and splines

I came across my old favourite website on random number generation today whilst browsing so that inspired this post.

Random numbers are a key component to so many things, from generating interesting textures, to selecting chromosome pairs for a genetic algorithm optimization problem. I've found the choice of random number generator can have a massive impact on some GAs.

Here is a list of Linear congruential generator (LCG) random number generators with spectral analysis, discussions and properties. It's a fantastic resource.

iq of rgba has a great article on small and fast random number generators for floating point numbers.

Splines are another great mathematical tool for any programmers bag of tricks. They are used in animating characters to robot control systems. The GDC 2009 maths tutorial has a great introduction to splines. Cubic provide a good introduction with pseudo-code for Hermite (the best) splines.

A really interesting find today was Time constant splines. I'm a bit skeptical about it on how it works (re-interpolating and re-fitting splines to get 'smooth' interpolation steps) but the author claims good results, and I'm sure it would be good enough for camera paths etc.

### University Rankings

Having lead a relatively academic life so far and being involved with lots of teaching people often ask me what the good universities are. It's hard to come up with something on the spot that you can really stand by.

Often people are considering traveling and living somewhere at great expense - you don't want this time, effort and money to go to waste. Luckily I have come up with a relatively fool-proof way of answering this question.

Any university with 20+ nobel prize laureates is probably quite decent. If your studying a nice solid established field then this should be the guide you use. If your in a new field (CS) then perhaps pay a bit of attention to the Academic Ranking of World Universities compiled by Shanghai Jiao Tong University. This ranking I believe acts as a bit of a 'predictor' for future nobel-success, since the nobel prizes seem to lag a bit behind the times, regardless of their original intention.

Without further ado, here's the list:
University of Cambridge (4th Academic Ranking 2008 - SJTU)
University of Chicago (9th Academic Ranking 2008 - SJTU)
Columbia University (7th Academic Ranking 2008 - SJTU)
Massachusetts Institute of Technology (5th Academic Ranking 2008 - SJTU)
University of Paris
University of Oxford (10th Academic Ranking 2008 - SJTU)
Stanford University (2nd Academic Ranking 2008 - SJTU)
University of California, Berkeley (3rd Academic Ranking 2008 - SJTU)
Georg August University of Göttingen
Harvard University (1st Academic Ranking 2008 - SJTU)
Cornell University
Ludwig Maximilians University of Munich
Yale University
New York University
Princeton University (8th Academic Ranking 2008 - SJTU)
Johns Hopkins University
California Institute of Technology (aka CALTECH) (6th Academic Ranking 2008 - SJTU)
ETH Zurich
University of Heidelberg
Humboldt University Berlin
University of Illinois at Urbana-Champaign
University of Manchester
Washington University in St. Louis
University of Zurich
University of Pennsylvania
Technical University of Munich
University of London
...
Noteable CS-strong sub-20's:

Carnegie Mellon University (17)
University of California, San Diego (17)
Uppsala University (15)
Imperial College London(14)
University of California, Los Angeles (10)
University of Texas at Austin (9) + Dallas (3)
University of Tokyo (9)
University of Vienna (9)
University of North Carolina, Chapel Hill (7)

Australian Universities:
University of Melbourne (6)
University of Adelaide (6)
The Australian National University (3)
University of Sydney (3)
University of Western Australia (2)

Of course, if your interested in a specific area, eg: GPGPU's you can look further down the list (eg: Georgia Institute of Technology (aka GATECH) (2)). CS is usually one of the more tricky ones since Maths/CS/etc don't get any nobel prizes. But that doesn't make Cambridge, MIT, and Stanford any less good...

## Wednesday, April 29, 2009

Multithreading is often a difficult concept to explain, but I've seen this excellent explanation over on gamasutra, "OMG, Multi-Threading is Easier Than Networking".

Finally a good use for cats.

Of course, anyone who wants some quick boosts from multithreading should take a closer look at OpenMP. This technology is seriously under-rated. And since MSVC supports it out of the box now, there really isn't any good reason to not use it.

Intel pushes their Threaded Building Blocks pretty hard, but I don't much see the point of it. I don't want my applications to scale so much anyway (the only real benefit as far as I can see). At some point MPI just makes more sense. And of course OpenCL if it lives up to its promises..

## Tuesday, April 28, 2009

### Torrents and Vivaldi

Downloading torrents is always a good way to churn bandwidth. My favourite client is uTorrent, but Vuze will often be a better cross-platform solution. Of course, for the ocasional rare download the Opera browser will do a fantastic job as an all-in-one (web, email, news, torrent, irc, WOW!)

One way to preserve your download quota is through using local peers, you can do this via these handy
IP addresses for Australian peer networks (eg: WAIX).

Another alternative is to use something like Vivaldi (paper includes pseudo-code). Bittorrent clients normally choose the highest bandwidth connection over lower-latency (ie: closer) connections. Ono is another option/plugin for Vuze that will let you prefer closer clients.

The algorithms are quite straight foward although computationally intensive. A good choice for GPGPU?

### GPGPU Workshops

So some big news, finally nVidia released OpenCL, you can apply to become a developer:
http://www.nvidia.com/opencl

Some news locally:

Next thursday, 7th of May, "GPU Computing Workshop" at UWA, all day.
Talks from nVidia, UWA, Xenon about a range of topics.

Next friday, 8th of May, "GPU Computing Tutorial" at WASP at UWA, 9am - 1pm.
A hands on session from nVidia.

nVidia has sent over Mark Harris (founder of GPGPU.org) for the two events.

Should be good!

## Wednesday, April 22, 2009

### AMD GDC09 Presentations

The AMD/ATI GDC09 presentations are up now, it is interesting to see that ATI is slowly becoming more open source / open tech friendly, I think the AMD merger helped in this respect. Since then a lot of good low level tech from ATI has been opened up including AMCL GPU.

A few articles caught my eye, the AMD Stream Physics sounded interesting, but was an empty presentation, more or less saying 'We also think OpenCL is neat'. At least AMD and Intel are both quite committed to OpenCL so it ought to be a success.

The most interesting presentations were ones that covered new DX functionality like "GDC2009: Shader Model 5.0 and Compute Shader". Other good ones covered applying old techniques to new GPU capabilities such as "GDC2009: D3D11 Tessellation" (bezier patches for tessellation, what a great idea, why didn't I think of that?), and "GDC2009: Authoring for Real-Time Tessellation and Displacement Mapping" covers the technical artists side of this.

"GDC2009: The A to Z of DX10 Performance" is an interesting read as it comes from ATI and nVidia together and cover some interesting performance aspects. Apparently rendering a single screen tri is faster than a single screen quad, and dynamic branches are preferable now. News to me!

My favorite was "High Quality Direct3D 10.0 & 10.1 Accelerated Techniques". Mostly because of the real world examples of modern (HD) SSAO. One of my favorite tools. Mind you, it would be interesting to see how this stack up against Accumulative SSAO...

## Saturday, April 18, 2009

### Soft Bodies

Since Bullet physics engine added soft body support that brings the number of publicly freely available physics engines that have soft body support to 3. PhysX, OpenTissue and Bullet. So that seemed like enough to justify adding softbody support to PAL.

PAL now supports cloth/patch based soft bodies, and tetrahedral deformable bodies.

Bullet has basically designed the softbody system as completely separate from the rigid body system, so this makes it a great system to abstract since it will make it possible to implement a softbody system that does not directly rely on the underlying physisc engine, bringing softbodies to ODE, Newton, and friends that do not have it yet. Same plan is for the SPH fluid code I've written, but those are both still on the TODO list to integrate to PAL.

Of course, the remaining issue is how to generate the tetrahedral descriptions.

Some options I've investigated:

• The PhysXViewer should be able to do it.
• Apparently Qhull can generate tetrahedron (how convenient!) using the 'qhull d QJ i' option. This would be a good option since Qhull is a widely employed library.
• TetGen, seems to be a rather large library, given its special purpose nature (generating tetrahedron), but says it is optimized for finite volume/element methods.
• GNU Triangulated Surface Library says it can create tetrahedra, but I'm not sure if this is only for isosurfaces.
• NETGEN claims to create tetrahedral meshes from CSG
• Cubit is a massive library from sandia that claims to generate tetrahedral meshes.
• libmesh can work with tet4, but I've not investigated what inputs it can take.

And during my googling I came across SLFFEA an open source finite element analysis tool. Cool!

## Monday, April 13, 2009

### Larrabee

I have to say, I bought into the hype that surrounded Larrabee. I thought it would be awesome. So, excuse my disappointment when they finally released the full Larrabee instruction set. See: Larrabee @ GDC (rasterization) and an article over at Dr. Dobbs Journal on Larrabee.

One of the first bits of production coding I did was optimizing rendering code for MMX/3dNow. It was a painful experience, the code I'd hand optimized took months to write, and by that time computers and compilers had gotten so much faster that it would have been easier just to buy a faster PC. I stopped all pretense of writing assembly code for any practical purpose around the time the eight version of the Intel C++ compiler came out. It did vectorization, and it did it,.. not-bad. (This translates to f\$#in awesome for anyone who dealt with 'vectorizing compilers' before)

Having a look at Larrabee, it seems immediately clear to me that it's just another Itanium (The wonderchip that wasn't). The instruction set (gather/scatter is cool!) is far too complex for a compiler to do well at, so to get good performance you'll need to go down to the metal. And not even Carmack does that, he hires someone else to do it. I don't think there will be many people taking advantage of this technology. Nevertheless if Intel and Microsoft get together to write the DX software renderer in Larrabee assembly, we might end up with half decent low power intel GPU's in laptops. (One presumes Michael and Tom can do a decent job of that!) At least that would be nice.

### Shared Libraries

Shared libraries, dynamic linked libraries, .so, .dll, .dylib, whatever you might call them seem to be a bit of a black art of programming. Especially when you mix C++ into the picture.

One key part of PAL is the self-registering plugin system. This allows extra functionality to be extended to the system on demand. It was very useful to allow the development of various simulation systems where fixes or alterations to the physics system can be loaded on demand.

It also greatly simplifies the whole process of creating/managing objects. The big catch is getting it to work in a portable cross platform manner.

The basics of the system is this:
Abstract factory which will have plug-ins self register via either a static constructor, or add themselves to a list of available constructors that can be retrieved via a shared object function call.

For staticly constructed objects, its not a big deal, just have a single static version of each object which registers itself with the factory. This works fine as long as those static variables are initialized. Unfortunately, the Microsoft compiler does not link those symbols from a static library, unless a function in the static library is explicitly called. So your options are to link all the objects to your program, or have a 'stub' in the static library, that you are forced to call to populate all the objects. PAL follows this latter approach. (Note, if you put that static stub into a separate header file, you can save yourself a lot of include header problems)

For dynamic libraries, well, it's not such a big deal for Windows. Under MSVC you just create a DLL project, and load the library with LoadLibrary and GetProcAddress. Ofcourse you need to remember to '__declspec(dllexport)' the functions you need to access, or write the def file (far too complicated). Then make sure the compiler exports the symbols.

MacOSX didn't pose great problems either, again, just use dlopen and dlsym. Remember to compile with '-dynamiclib'.

That brings my to GCC/Linux. This one was a bit tricky, and I could only solve this with the kind help of B. Watson over on #slackware. The problem with GCC is the way it handles run time type information. You need RTTI for dynamic_cast's etc, to work. This is relatively critical for design patterns such as abstract factories and virtual inheritance, which are heavily employed in PAL.

Under Linux to open a shared object, dlopen and dlsym will do, however you need to specify RTLD_GLOBAL, otherwise the type info will be stored locally and not globally. Now this normally doesn't cause a problem, but with GCC class types are compared by pointer references and not strings, so of course, with each object having its own local address space, the code just wont work!

That fixes one problem, the other is getting the symbols to export. By default GCC won't export all the symbols that you need. To force GCC to do this we need to pass through to the linker from the compiler the '-rdynamic' flag. But, alas, we are hampered again, we need to actually make sure the symbols are visible. GCC has introduce the ability to have private/local symbols, and if only one person specifies something to be private then the whole subsystem becomes private. No major dramas though, just specify '__attribute__ ((visibility("default")))' everywhere.

So now finally your ready to compile a linux shared object with position independent code:
-fPIC -shared -rdynamic
But it still doesn't work!

Yes, for some reason I don't know, you need to make sure your MAIN program is also compiled with the '-rdynamic' flag.

I guess the lesson for the end of the day was, '-rdynamic' is your friend.

Heres a useful snippet that summerizes all this:

REMEMBER: Export ALL symbols (shared or not!)
Linux: RTLD_GLOBAL & rdynamic!

`#if defined (OS_WINDOWS) || defined(WIN32)#    define DYNLIB_HANDLE hInstance#    define DYNLIB_LOAD( a ) LoadLibrary( a )#    define DYNLIB_GETSYM( a, b ) GetProcAddress( a, b )#    define DYNLIB_UNLOAD( a ) !FreeLibrary( a )struct HINSTANCE__;typedef struct HINSTANCE__* hInstance;#elif defined (OS_LINUX) || defined(OS_OSX)#    define DYNLIB_HANDLE void*#    define DYNLIB_LOAD( a ) dlopen( a, RTLD_LAZY|RTLD_GLOBAL )#    define DYNLIB_GETSYM( a, b ) dlsym( a, b )#    define DYNLIB_UNLOAD( a ) dlclose( a )#endif`

To compile:
`OSX:main: g++ main.cpp -o main DLL:g++ shared.cpp -o shared.dylib -dynamiclibLinux:main:g++ main.cpp -o main -ldl -rdynamic -fPICg++ shared.cpp -o shared_single.so -fPIC -shared -rdynamic`

And would you believe it took me almost 3 hours to figure this all out. (At least I didn't end up having to write my own RTTI system like the last guy.

You can check out the PAL self-registering abstract pluggable factory in the code. (see also the test programs and makefiles).

## Wednesday, April 08, 2009

### Essential free utilities for setting up a new PC

Every once in a while you get a new computer and you need to go through the usual process of setting it up.

Here are my favorite bits of free software to install.

First of all, a new browser. Everyone seems to like Firefox, but I prefer Chrome and Opera. Make sure you install Gears for offline gmail, and flash, etc. too. You might want FileZilla for your FTP needs, and if you want an email client, Thunderbird is a good choice.

WinRAR, for all your unzipping needs. Archive file formats supported are ZIP, RAR, ACE, CAB, ARJ, LZH, TAR, GZ, BZ2, UUE, JAR, ISO, 7Z and Z.

Notepad++ and OpenOffice or StarOffice or of course MS Office (if your a student) for all that document and spreadsheet work you need to do.
Adobe Acrobat PDF reader, or FoxIt (great for mobile devices too!).
PDF Creator to print to PDF.

For your photo collection, Picasa and Paint.NET for image editing. Irfanview is a nice small powerful image utility.

If your not happy with windows media player (and who is?), try
VLC for all your media streaming needs, and MediaPlayer Classic, Mplayer, for a few more codecs. Finally, VirtualDub for video editing. foobar2000 is great as a lightweight video player that can also convert audio formats. (You might also want LAME).

To get all those video and audio goodies there is of course µTorrent, the lighterweight BitTornado and SoulSeek.

To stay safe, AVG antivirus, just because it happens once every three years, and then your happy you have it. (Also, install some data recovery software, for the same reason.) You can try Recuva, personally I use a commercial product.

Then there are those additional annoyances to deal with: Disabling IE7 auto-update and Disabling IE8. Tweak UI and CMD here powertoys. Tweak UI will let you change annoying default behavior, (like autoplay) and remove additional icons.
To remove additional annoyances, right click on the clock to customize notifications and remove some of those additional startup options, and right click properties on the start button to get the quick launch etc. Finally strip those additional start menus clean by going through your Documents and Settings\All Users\Start Menu\Programs

Finally, Synergy will help you work effectively with your old and new PC.

Obviously I'm not covering details for Vista, because why would anyone want to use that?