Efficient Work Stealing

Datacenter servers must balance load across many cores, sometimes dozens. Existing techniques such as work stealing perform well for long tasks but can be inefficient for short tasks that take only a couple of microseconds. At these timescales, cores may spend a significant fraction of their time looking for work rather than doing useful work; this wastes CPU resources that could be used by other applications on the same server. We are exploring techniques to perform load balancing more efficiently, so that requests are handled faster and cores waste fewer cycles looking for work.
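
To make the cost of "looking for work" concrete, here is a toy sketch of conventional random-victim work stealing. It is not the technique this project is exploring, only an illustration of the baseline. The function name simulate, the round-based model, and every parameter are assumptions made for this example: all tasks start on core 0, each task takes one round, and an idle core probes one randomly chosen victim per round.

```python
import random
from collections import deque


def simulate(num_cores=8, num_tasks=64, seed=0):
    """Single-threaded, round-based toy model of work stealing.

    Hypothetical setup for illustration only: every task starts on
    core 0 and takes exactly one round to run. In each round, a core
    with local work runs one task; an idle core probes one random
    victim and, if the victim has work, steals one task to run later.

    Returns (rounds, executed, successful_steals, failed_probes).
    """
    rng = random.Random(seed)
    queues = [deque() for _ in range(num_cores)]
    queues[0].extend(range(num_tasks))

    rounds = executed = successful_steals = failed_probes = 0
    while executed < num_tasks:
        rounds += 1
        for core in range(num_cores):
            if queues[core]:
                # The owner takes work from one end of its own queue.
                queues[core].pop()
                executed += 1
                continue
            # Idle core: pick a random victim other than itself.
            victim = rng.randrange(num_cores - 1)
            if victim >= core:
                victim += 1
            if queues[victim]:
                # The thief takes work from the opposite end.
                queues[core].append(queues[victim].popleft())
                successful_steals += 1
            else:
                # The turn was spent looking for work and finding none.
                failed_probes += 1
    return rounds, executed, successful_steals, failed_probes


if __name__ == "__main__":
    rounds, executed, steals, failed = simulate()
    turns = rounds * 8  # 8 is the default num_cores
    print(f"rounds={rounds} executed={executed} "
          f"steals={steals} failed_probes={failed}")
    print(f"share of core-turns spent on failed probes: {failed / turns:.2f}")
```

In this model every core-turn is spent in exactly one of three ways: running a task, stealing a task, or probing an empty queue. The failed-probe count is therefore a rough stand-in for cycles spent looking for work instead of doing it. A real system would use concurrent deques and would also pay cache and synchronization costs that this sketch ignores.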