You didn't tell us the maximum frequency possible for any other part (like the CPU or the bus), so it's hard to say what you should do for the most benefit.
You're drawing the wrong conclusion if you think it's important not to drop the HTT to 3X; anything else will make as much difference. Yes, most often you can run 1024 without any problems, but don't focus on getting the HTT higher. Focus on the other parameters and let the HTT end up wherever it lands as a result of maximizing the others.
In particular, get your RAM speed up. Find out at which frequency you would need to raise (loosen) the timings to go higher, and then how much higher the memory will go at those looser timings.
While doing all this, drop the HTT multi and leave it down for the time being, so you isolate each subsystem and find its peak speed capability. Maybe you've already gone through all this, but one part that still sticks out is that you're only running a 250 bus, and at a cost to the memory (which is being underclocked).
If I had to guess, I'd say the thing to do is raise the bus above 250 while leaving the memory ratio the same. Drop the CPU multiplier so it's out of the equation for the time being and see how high the bus and memory can go while remaining stable, THEN raise the CPU multiplier again. Whatever you end up with, THEN decide which HTT multi you can use, without caring whether it's maxed near 1000 or not.
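
To put rough numbers on that last step, here's a quick sketch of the arithmetic. The HTT link speed is just the bus speed times the HTT multi, so once you know how high the bus goes you pick the biggest multi that stays under whatever link speed your board tolerates. The 1000 ceiling, the list of available multis, and every bus speed other than 250 are assumptions for illustration, not values from your system, and the function names are made up.

```python
def htt_link_mhz(bus_mhz, htt_multi):
    """HTT link speed = bus (reference) clock x HTT multiplier."""
    return bus_mhz * htt_multi


def pick_htt_multi(bus_mhz, ceiling_mhz=1000, multis=(5, 4, 3, 2, 1)):
    """Return the highest multiplier that keeps the link at or below the ceiling.

    ceiling_mhz and multis are assumed values; substitute what your board offers.
    """
    for multi in sorted(multis, reverse=True):
        if htt_link_mhz(bus_mhz, multi) <= ceiling_mhz:
            return multi
    raise ValueError("no multiplier keeps the link at or below the ceiling")


# 250 is the bus speed from this thread; the others are hypothetical results
# of raising the bus with the CPU and HTT multipliers dropped out of the way.
for bus in (250, 280, 300, 340):
    multi = pick_htt_multi(bus)
    print(f"bus {bus} -> HTT {multi}x = {htt_link_mhz(bus, multi)}")
```

Notice that as the bus goes up the multi has to come down and the link speed lands wherever it lands, which is the point: the HTT number is a result, not a target.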