June 2010 logged 1,892 pushes, just short of our previous record of 1,971 in January. Note that this June number *under-reports* TryServer usage, because we accidentally lost the TryServer usage logs for 01-10 June. We assert, without proof, that we would easily have set a new record if we had the missing 10 days of data for TryServer, our busiest branch. Even with 10 of 30 days of data missing, TryServer was still the busiest branch of the entire infrastructure in June, compared with full-month data for the other branches.
The numbers for this month are:
- 1,892 code changes to our mercurial-based repos, which triggered 234,387 jobs:
- 35,308 build jobs, or ~49 jobs per hour.
- 111,513 unittest jobs, or ~154 jobs per hour.
- 87,566 talos jobs, or ~121 jobs per hour.
- Losing logs for 1/3 of the month for our busiest branch means we are under-reporting for June. Hopefully the work catlee/nthomas/anamarias are doing to automate reports will be live soon, to prevent this from happening again.
- Our unittest and Talos load remains high, as it was last month, and we expect it to jump further as more OSes are added to Talos.
- We’re still double-running unittests for some OSes, running both unittest-on-builder and unittest-on-tester, while developers and QA work through the issues. Whenever unittest-on-test-machine is live and green, we disable unittest-on-builders to reduce wait times for builds.
- The trend of “what time of day is busiest” changed again this month. Not sure what this means, but worth pointing out that each month seems to be different. This makes finding a “good” time for a downtime almost impossible.
- The entire series of these infrastructure load blogposts can be found here.
- We are still not tracking any l10n repacks, nightly builds, release builds or any “idle-timer” builds.
Here’s how the math works out (descriptions of the build, unittest and performance jobs triggered by each individual push are here):
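As a quick sanity check on the totals and per-hour rates quoted above, here is a minimal sketch that recomputes them. It assumes a 30-day June (720 hours) and that the per-hour figures are rounded down, which appears to match the numbers in this post; the names `JOBS`, `total_jobs` and `jobs_per_hour` are made up for illustration and are not part of any reporting tool.

```python
# Job counts for June 2010, copied from the list above.
JOBS = {
    "build": 35_308,
    "unittest": 111_513,
    "talos": 87_566,
}

# Assumption: rates are averaged over the whole month, 30 days * 24 hours.
HOURS_IN_JUNE = 30 * 24


def total_jobs(jobs):
    """Sum the job counts across all job types."""
    return sum(jobs.values())


def jobs_per_hour(count, hours=HOURS_IN_JUNE):
    """Average jobs per hour, rounded down (assumed rounding rule)."""
    return count // hours


if __name__ == "__main__":
    print(f"total jobs: {total_jobs(JOBS):,}")
    for kind, count in JOBS.items():
        print(f"{kind}: {count:,} jobs, ~{jobs_per_hour(count)} per hour")
```

Because the TryServer logs for 01-10 June are missing, any per-push figure derived from the 1,892 pushes would be skewed, so the sketch deliberately stops at totals and hourly rates.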
[UPDATE: thanks to jhford for catching some copy-paste typos! joduinn 15-jul-2010]