Hello all, long time reader, first time poster. I believe I have a rather unique situation to solve that pertains to my company's server.
We currently have an HP ProLiant ML350 G5 server on which we keep all important project files for clients. Over the last few years I've noticed some funny things going on, and lately some problems that have me quite concerned.
Some of our employees opt to work from home either during bad weather or on the weekends using our VPN. Three weeks ago an employee called me to tell me he had been trying to log on all day but had no luck connecting. I tried to connect from my machine to verify, and no dice for me either. He had work that needed to be done ASAP, so he came into the office so he could access the files he needed directly. When he walked in, our server was shut down, and the APC BE350R Battery Backup/surge protector we have the server on was emitting a constant tone with no lights on. In the manual this apparently indicates: Overload Shutdown - During On Battery operation a battery power supplied outlet overload was detected. To me this means: the surge protector was tripped because a spike in voltage (surge) was detected from the wall outlet.
After some poking around I learned that the company we lease the building from has a diesel generator backup for our office. Every Sunday at 7 pm the utility power is cut to our office and the generator runs power for 20 minutes or so and shuts down. I don't know exactly what steps it takes, or why the power even has to be cut in the first place, but I'm contemplating coming in and observing what happens. That's of course something I would have done already if I didn't live 28 miles from work. I know the generator running is a regular thing, but the server going down is not. More often than not it seems fine Monday morning. Also, we have the same battery backup on virtually all of the machines in our office, but the only one that ever gets "tripped" is the one in the server room.
So I guess my questions are one part IT, one part electrician:
I know there are ways to set a machine to reboot on a schedule, but can I set it up so it will shut down and stay off for say an hour and then boot? I know this won't solve the surge protector/backup from tripping, but I'd like to at least have the server shut down before the spike occurs and power is lost while I find a solution to the generator problem.
Is there something I can use to prevent the APC from tripping when the generator comes on? Would you even agree that the generator is the culprit from what I've described?
I know the upper and lower voltage limits of the APC unit can be increased or decreased for allowable limits via the PowerChute software... is it OK to tweak these values or better left alone?
Anything I may be missing or overlooking?
I know this has happened a few times and I know pulling the plug on a running machine is never good. We have had some problems in the past that I think may be attributed to this (bad hdd, blown psu) so it's important I nip it in the bud.
Any help would be appreciated, if additional information is needed let me know. I originally had all kinds of details listed but the wall of text was massive.
Answer to question 1) Yes - most HP servers have an Integrated Lights-Out (iLO) card. This allows you to access the system even if it's powered off - so long as it's plugged in. It also allows you to remotely control power operations, so you don't have to drive in. It's a regular Ethernet interface. If I remember right, you press F8 when the server boots and it will prompt you to enter the iLO config at some point. Just connect this iLO port to your network, use a strong password, and give it valid IP settings.
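On the timing side of question 1, here is a rough Python sketch of how the shutdown and power-on times could be worked out around the Sunday 7 pm generator test. It only does the date arithmetic. The 15-minute safety margin and the function name next_maintenance_window are my own assumptions, and the one-hour off period is simply the figure from the question.

```python
from datetime import datetime, timedelta

GENERATOR_TEST_WEEKDAY = 6               # Sunday (Monday is 0 in Python's datetime)
GENERATOR_TEST_HOUR = 19                 # 7 pm, as described in the post
SAFETY_MARGIN = timedelta(minutes=15)    # assumption: shut down 15 minutes early
OFF_DURATION = timedelta(hours=1)        # "stay off for say an hour" from the question


def next_maintenance_window(now):
    """Return (shutdown_time, power_on_time) around the next generator test."""
    # Days until the coming Sunday (0 if today is Sunday).
    days_ahead = (GENERATOR_TEST_WEEKDAY - now.weekday()) % 7
    test_start = (now + timedelta(days=days_ahead)).replace(
        hour=GENERATOR_TEST_HOUR, minute=0, second=0, microsecond=0
    )
    shutdown = test_start - SAFETY_MARGIN
    if shutdown <= now:
        # This week's shutdown time has already passed, so plan for next week.
        shutdown += timedelta(days=7)
    return shutdown, shutdown + OFF_DURATION


if __name__ == "__main__":
    shutdown, power_on = next_maintenance_window(datetime.now())
    print("Shut down at:", shutdown)
    print("Power on at: ", power_on)
```

The shutdown time could be fed into whatever task scheduler the server's OS provides. How the power-on is actually triggered depends on the hardware, for example doing it remotely through iLO as described above.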
I have several of the ML350s in production. Most of mine are G6. Not certain if your G5 may have the same problem as mine did, but... The power supplies in ours are high-efficiency 460 Watt. The UPS they require must produce a pure sine wave and not a stepped-approximation sine wave like many of the APC units. We had to replace our UPS with one from HP (model HP T1500 G3). Not really any more expensive than a decent one from APC, but when we get a surge or a flicker, the server doesn't shut off. Thought I'd throw in my two cents.