HT probably dissipates less power than the OOO circuitry, and HT had a better "ROI".
Hardware multithreading is at least an order of magnitude
simpler than OOO execution and is a good match for a
lightweight, relatively simple, relatively high-frequency
in-order CPU core, RISC or x86. For many integer workloads
an Atom core could be stalled for half of all CPU cycles,
so a second thread that itself can exploit half of the first
thread's stalled cycles will increase throughput by 50%
in theory (useful cycles rise from 50% to 75% of the total).
In reality it will be less because the two threads will
knock some of each other's code and data out of cache,
thus increasing miss rates.
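
The arithmetic behind the 50% figure can be sketched as a toy model. This is only an illustration, not a measurement of any real core: the function name, its parameters, and the way cache contention is modeled (as a flat `extra_stall` added to the first thread's stall fraction) are all assumptions made for the example.

```python
def smt_throughput_gain(stall_fraction, reclaim_fraction, extra_stall=0.0):
    """Relative throughput gain from adding a second hardware thread.

    Toy model (hypothetical, for illustration only):
      stall_fraction   -- share of cycles thread 1 is stalled when running alone
      reclaim_fraction -- share of thread 1's stalled cycles that thread 2
                          turns into useful work
      extra_stall      -- additional stall share imposed on thread 1 by
                          cache contention once thread 2 is running
    """
    if not 0.0 <= stall_fraction < 1.0:
        raise ValueError("stall_fraction must be in [0, 1)")
    if not 0.0 <= reclaim_fraction <= 1.0:
        raise ValueError("reclaim_fraction must be in [0, 1]")
    if extra_stall < 0.0 or stall_fraction + extra_stall > 1.0:
        raise ValueError("stall_fraction + extra_stall must stay within [0, 1]")

    # Useful cycles with a single thread.
    baseline = 1.0 - stall_fraction

    # With two threads, thread 1 stalls more often because of cache contention.
    stalled = stall_fraction + extra_stall
    thread1_useful = 1.0 - stalled

    # Thread 2 fills some share of the cycles thread 1 leaves idle.
    thread2_useful = stalled * reclaim_fraction

    return (thread1_useful + thread2_useful) / baseline - 1.0


# The idealized case from the text: half the cycles stalled, half reclaimed.
ideal = smt_throughput_gain(0.5, 0.5)

# Same case with an assumed 10% extra stall from cache contention.
contended = smt_throughput_gain(0.5, 0.5, extra_stall=0.1)

print(f"ideal gain: {ideal:.0%}, with contention: {contended:.0%}")
```

With no contention the model reproduces the theoretical 50% gain. Any positive `extra_stall` pulls the result below that, which is the direction of the effect described above, though the size of the penalty on real hardware depends on the workload and cache sizes.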