CFEngine plus a VCS (Git, Mercurial, SVN…) to store your configurations and keep their history.
Try to make every change to your infrastructure testable, and use your monitoring tool as the automated test for it.
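As a rough illustration of using monitoring-style checks as a test after a change, here is a minimal Python sketch. It assumes the services you care about can be probed over TCP. The names `check_tcp` and `verify_change` are made up for this example, and the 0/2 status codes follow the common Nagios plugin convention rather than any particular tool's API.

```python
import socket

# Status codes in the Nagios plugin convention (assumed here for illustration).
OK, CRITICAL = 0, 2


def check_tcp(host, port, timeout=3.0):
    """Probe a TCP service and return (status, message), monitoring-plugin style."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK - {host}:{port} accepts connections"
    except OSError as exc:
        # Covers refused connections, timeouts and name resolution failures.
        return CRITICAL, f"CRITICAL - {host}:{port} unreachable ({exc})"


def verify_change(checks):
    """Run every (host, port) check and return the list of failure messages."""
    failures = []
    for host, port in checks:
        status, message = check_tcp(host, port)
        if status != OK:
            failures.append(message)
    return failures
```

Calling `verify_change([("web01.example.org", 443)])` (a placeholder host) right after a configuration push returns an empty list when everything still answers, so the change can be rolled back as soon as the list is non-empty. In practice the same checks would usually live in the monitoring system itself, so the test keeps running after the change as well.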
This ensures that small tasks are handled as soon as possible and keeps people working on long-term tasks focused.