__tz_convert takes a lock, tzset_lock. While profiling a real-world application, I noticed that ~5% of our context switches came from this lock.
There should be a more efficient way to implement this locking -- for example, if the timezone can be represented as a 32-bit integer, an atomic compare-and-swap could be used to change it. Alternatively, a sequence lock approach could be used.
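
To make the two suggestions concrete, here is a rough sketch in C11 of what I have in mind. This is not glibc code: tz_id, tz_id_replace, struct tz_seq, tz_seq_write and tz_seq_read are all hypothetical names, and the two int32_t fields stand in for whatever state __tz_convert actually reads, which is probably larger. The sequence lock part assumes writers are still serialized by some other means (for example the existing mutex), so only the readers become lock-free.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Option 1: the whole zone fits in one 32-bit id (hypothetical encoding). */
static _Atomic uint32_t tz_id;

/* Replace the id only if it still equals `expected`; true on success. */
static bool tz_id_replace(uint32_t expected, uint32_t desired)
{
    return atomic_compare_exchange_strong(&tz_id, &expected, desired);
}

static uint32_t tz_id_get(void)
{
    return atomic_load(&tz_id);
}

/* Option 2: sequence lock around a small block of state (hypothetical
   fields). seq is odd while a write is in progress, even otherwise. */
struct tz_seq {
    _Atomic unsigned seq;
    _Atomic int32_t utc_offset;
    _Atomic int32_t is_dst;
};

static struct tz_seq tz_state;  /* static storage, so zero-initialized */

/* Writer. Assumes callers are already serialized against each other. */
static void tz_seq_write(struct tz_seq *s, int32_t off, int32_t dst)
{
    unsigned seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

    /* Mark the write as in progress (odd), then fence so the data
       stores below cannot become visible before the odd value. */
    atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);

    atomic_store_explicit(&s->utc_offset, off, memory_order_relaxed);
    atomic_store_explicit(&s->is_dst, dst, memory_order_relaxed);

    /* Publish: back to even, ordered after the data stores. */
    atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader. Never blocks; retries if a write overlapped the read. */
static void tz_seq_read(struct tz_seq *s, int32_t *off, int32_t *dst)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&s->seq, memory_order_acquire);
        *off = atomic_load_explicit(&s->utc_offset, memory_order_relaxed);
        *dst = atomic_load_explicit(&s->is_dst, memory_order_relaxed);
        /* Keep the data loads from being reordered after the re-check. */
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&s->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);
}
```

With something like this, a reader calls tz_seq_read(&tz_state, &off, &dst) and never sleeps on a mutex, so it should not cause a context switch; it only spins again if a timezone change happened to overlap the read. The compare-and-swap variant is simpler but only works if the zone really can be squeezed into a single word.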