Yahoo! and probably all other large installations of Hadoop have to deal with upgrading their Hadoop clusters. Scheduled rolling upgrades is the strategy applied everywhere, but depending on the size of the cluster this can take way too much time. Yahoo! has developed internally an interesting tool that can help with Hadoop upgrades:
Groundhog is an automated testing tool to help ensure backwards compatibility (in terms of API, functionality, and performance) between releases of Hadoop before deploying a new release onto clusters with a high QoS. Groundhog does this by providing an automated mechanism to capture user jobs (currently limited to pig scripts) as they are run on a cluster and then replay them on a different cluster with a different version of Hadoop to verify that they still produce the same results.
This is the sort of tool I always wanted for most of the applications I’ve developed: a system able to capture complete or percentage of the real traffic and then replay it. At every layer of the application.