There is a new paper from the Pacific Graphics conference on hybrid parallel continuous collision detection (HPCCD). I've been wanting to write a GPU collision detection system for a while, but was always held back by not knowing how to do it efficiently and easily (i.e., without much effort on my part). The hybrid approach proposed sounds great: do all the hard (not-very-easily-made-efficiently-parallel) work on the CPU, and just use the GPU for some edge-edge and vertex-face primitive tests. This keeps the GPU doing what it does best, and the CPU doing what it does best.
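To make that division of labor concrete, here is a rough sketch in plain Python. It is not the paper's algorithm or API: every name is made up for illustration, the CPU stage is a brute-force swept-box cull where a real system would presumably traverse a bounding volume hierarchy, and the per-pair test is only a plane-crossing check standing in for a real continuous vertex-face test. The point is just the shape of the pipeline: irregular culling on the CPU, then a flat batch of independent tests of the kind that could be shipped to the GPU.

```python
from itertools import product

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def swept_aabb(points_start, points_end):
    """Box enclosing a primitive's points at the start and end of the time step."""
    pts = list(points_start) + list(points_end)
    lo = tuple(min(p[i] for p in pts) for i in range(3))
    hi = tuple(max(p[i] for p in pts) for i in range(3))
    return lo, hi

def boxes_overlap(a, b):
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[i] <= bhi[i] and blo[i] <= ahi[i] for i in range(3))

def cpu_cull(vertices, triangles):
    """CPU stage (assumption: brute force instead of a hierarchy traversal).
    Returns index pairs (vertex, triangle) whose swept boxes overlap."""
    vboxes = [swept_aabb([v0], [v1]) for v0, v1 in vertices]
    tboxes = [swept_aabb(t0, t1) for t0, t1 in triangles]
    return [(i, j)
            for i, j in product(range(len(vertices)), range(len(triangles)))
            if boxes_overlap(vboxes[i], tboxes[j])]

def plane_side(p, tri):
    """Signed side of point p relative to the triangle's plane (unnormalized)."""
    a, b, c = tri
    return dot(cross(sub(b, a), sub(c, a)), sub(p, a))

def elementary_test(vertex, triangle):
    """Stand-in for a vertex-face test: does the vertex change sides of the
    triangle's plane during the step? A real test would also solve for the
    contact time and check that the point lies inside the triangle."""
    (v0, v1), (t0, t1) = vertex, triangle
    return plane_side(v0, t0) * plane_side(v1, t1) <= 0.0

def batch_tests(vertices, triangles, pairs):
    """'GPU' stage: each pair is independent, so this loop is what would
    become one thread per pair in a kernel."""
    return [elementary_test(vertices[i], triangles[j]) for i, j in pairs]

# Each primitive is (positions at start of step, positions at end of step).
tri = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 1.0))
triangles = [(tri, tri)]  # one static triangle
vertices = [
    ((0.2, 0.2, 1.0), (0.2, 0.2, -1.0)),  # passes through the plane
    ((5.0, 5.0, 1.0), (5.0, 5.0, 2.0)),   # far away, culled on the CPU
    ((0.5, 0.1, 0.9), (0.5, 0.1, 0.5)),   # nearby, but stays on one side
]

pairs = cpu_cull(vertices, triangles)
results = batch_tests(vertices, triangles, pairs)
print(pairs, results)
```

The far-away vertex never reaches the batch stage, which is the whole appeal: the GPU only sees a dense list of small, uniform tests.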
The paper is available from the above link, and you can also download the collision detection API. UNC maintains some collision detection benchmark scenes.