The LLVM group has just got a new logo, a modernized version of the 'dragon book' dragon. But the really exciting news I saw on GPGPU.org is a project called 'GPUocelot'. It translates compiled PTX programs (produced by NVIDIA's CUDA) via the just-in-time LLVM compiler to any targeted backend, for example the PS3's Cell processor. All you need to do is add an AMD backend to LLVM and, hey presto, instant CUDA-for-ATI. That could potentially put a dent in OpenCL's plans.
The papers from the High Performance Graphics conference are also out. One that caught my eye was Understanding the Efficiency of Ray Traversal on GPUs, not really because I thought the paper was fundamentally groundbreaking, but because it explained a few neat tricks on NVIDIA's part. In particular, it gives a good explanation of persistent threads in CUDA, which break down non-uniform workloads by having threads pull work from a global pool. (E.g. they used it on a per-ray basis, so that fast and slow rays don't "block" each other.)
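To make the global-pool idea a bit more concrete, here is a rough CPU-side analogy in C++ rather than actual CUDA, and not the paper's code. A fixed set of long-lived workers each grab the next ray index from a shared atomic counter, so a worker that finishes a cheap ray immediately picks up another instead of waiting on a slow neighbour. The names `trace_ray` and `trace_all`, and the fake per-ray cost, are my own assumptions for illustration; on the GPU the equivalent would presumably be a kernel loop around an `atomicAdd` on a global counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for tracing one ray. The loop length depends on
// the ray id, so some "rays" are much slower than others (non-uniform work).
inline long trace_ray(int ray_id) {
    long acc = 0;
    const int steps = 1 + (ray_id % 7) * 1000;
    for (int i = 0; i < steps; ++i) {
        acc += (ray_id + i) % 3;
    }
    return acc;
}

// Start a fixed number of long-lived workers. Each one repeatedly fetches
// the next unprocessed ray index from a shared atomic counter (the
// "global pool") and exits once the pool is exhausted.
inline std::vector<long> trace_all(int num_rays, int num_workers) {
    assert(num_rays >= 0 && num_workers > 0);

    std::vector<long> results(static_cast<std::size_t>(num_rays), 0L);
    std::atomic<int> next_ray{0};

    auto worker = [&]() {
        for (;;) {
            // Atomically claim one ray; no two workers get the same index.
            const int ray = next_ray.fetch_add(1, std::memory_order_relaxed);
            if (ray >= num_rays) {
                break;  // pool is empty, this worker is done
            }
            // Each index is written by exactly one worker, so no lock is needed.
            results[static_cast<std::size_t>(ray)] = trace_ray(ray);
        }
    };

    std::vector<std::thread> pool;
    for (int w = 0; w < num_workers; ++w) {
        pool.emplace_back(worker);
    }
    for (auto& t : pool) {
        t.join();  // join also makes the workers' writes visible here
    }
    return results;
}
```

Calling something like `trace_all(100, 4)` should give the same results as tracing the rays one by one, just with the work shared out dynamically instead of being split into fixed chunks up front.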