The default command queue is created with a context that contains all GPU devices in the platform. Because kernels are compiled only on first invocation, switching between GPU devices works, but switching to a CPU device afterwards raises an exception because the kernel was never compiled for the CPU. Should we provide more options and expose more interfaces to the user?
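One possible direction, sketched as a toy model only: cache compiled kernels per device and compile on demand when a new device is first used, rather than assuming the first build covers every device. This is plain Python and makes no OpenCL calls. `KernelCache`, `fake_compile`, and the device names are hypothetical stand-ins, and a real fix would presumably also need a context that includes the CPU device.

```python
class KernelCache:
    """Toy model: compiles a kernel lazily, once per (source, device) pair."""

    def __init__(self, compile_fn):
        # Hypothetical compile hook: (source, device) -> compiled binary.
        self._compile_fn = compile_fn
        self._binaries = {}

    def get(self, source, device):
        key = (source, device)
        if key not in self._binaries:
            # First use on this device: compile now instead of failing later.
            self._binaries[key] = self._compile_fn(source, device)
        return self._binaries[key]


compiled = []  # records every (fake) compilation, in order


def fake_compile(source, device):
    # Stand-in for a real per-device program build.
    compiled.append(device)
    return f"{source}@{device}"


cache = KernelCache(fake_compile)
cache.get("vec_add", "gpu0")  # compiled for gpu0
cache.get("vec_add", "gpu0")  # served from the cache, no rebuild
cache.get("vec_add", "cpu0")  # new device: compiled on demand
```

Keying the cache on the device as well as the kernel source is what lets a later switch to a different device trigger a build instead of an exception.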