I am currently preparing for a system load test in a customer project. This involves an awful lot of conversion operations on more than 10 million files: for example, converting legacy SGML into XML with some additional calculated content, or generating dummy records to find the limits of our target architecture, then packing, transporting and unpacking the data again on the other side. I knew all this was going to take days and would have lots of traps waiting for me.
Of course I am using a series of PHP-CLI scripts, triggered one after another by shell scripts… so I can do other stuff in the meantime. So much for the theory.
In order to have a potent development box, I bought an up-to-date gaming machine at www.arlt.com. I would like to mention the great service at Arlt: they allowed me to test whether I could install Debian on the hardware before I actually checked out. Very cool! If you have ever had trouble with incompatible chipsets when installing Linux, you will definitely appreciate this. I took the 2nd machine (850 EUR).
After some hours of running conversion scripts, the system suddenly went into read-only mode and could no longer be operated from the command line. What happened? At this point I lost a lot of time and nerves hunting for all sorts of ghosts…
It all looked like a hardware problem, so I exchanged parts. Some days later the customer told me that everything was fine with the first batch of test files on his 14-drive RAID 10. Okay!
So I tried the following:
- Used a second hard disk and mounted it on /var, where my data was stored.
- Completely opened the box, opened the window and let air circulate to support cooling. If you have a fan, point it at the box (see picture of my final working setup).
- Included pauses with usleep() in the loops of my scripts (1 s after every 1000 cycles and 5 ms after each loop cycle).
It finally worked out for me this way without any more hardware failures.
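For illustration, the pauses could look roughly like the sketch below. This is not my original script, just a minimal reconstruction: the function names pauseMicrosecondsFor() and processThrottled() are made up for this example, and it assumes the 1 s pause comes on top of the 5 ms pause on every 1000th cycle. The sleep function is passed in as a parameter only so the loop can be tried out without actually waiting; by default it is plain usleep().

```php
<?php
declare(strict_types=1);

/**
 * Pause length in microseconds after finishing cycle $cycle (1-based).
 * Assumption: 5 ms after each cycle, plus an extra 1 s after every 1000th.
 */
function pauseMicrosecondsFor(int $cycle): int
{
    $pause = 5000;               // 5 ms after each loop cycle
    if ($cycle % 1000 === 0) {
        $pause += 1000000;       // extra 1 s after every 1000 cycles
    }
    return $pause;
}

/**
 * Run $work on every item and pause after each one.
 * $sleep is injectable (hypothetical helper parameter); defaults to usleep().
 * Returns the number of processed items.
 */
function processThrottled(iterable $items, callable $work, ?callable $sleep = null): int
{
    $sleep = $sleep ?? 'usleep';
    $cycle = 0;
    foreach ($items as $item) {
        $work($item);
        $cycle++;
        $sleep(pauseMicrosecondsFor($cycle));
    }
    return $cycle;
}

// Example call (convertFile and the path are placeholders, not real names):
// processThrottled(glob('/var/data/*.sgml'), 'convertFile');
```

With these numbers, the pauses alone add about 6 seconds per 1000 files, so on 10 million files they cost several hours of extra runtime. That was a price I was happy to pay for a box that stays up.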
Lesson learned when using ‘minor equipment’ for dev purposes: back up often. And if you have really heavy stuff running for many hours or even days with lots of file I/O involved, build pauses into your scripts and get additional cooling. Or buy yourself the RAID 10 with 14 hard disks, which is faster, has more style and is more fun anyway.