Best CPU for a web crawling application?

thrashertm

Distinguished
Sep 16, 2008
11
0
18,510
I am using a web spidering application for a new business I am developing, and I need to improve the performance. I am wondering what the most likely bottleneck is. I am guessing it's CPU, and am thinking about buying a new computer to run the spidering app.

The application runs several threads and scans websites for links, then follows those links and scans the linked sites for more links - similar to Google or the original WebCrawler, I presume. I'm running this app on my old desktop, an Athlon XP 2500+ with 1GB RAM, on a fast cable connection. My cable connection does not seem to be bogged down - while the spider is running I can browse the web, use BitTorrent, etc. on my other computer without any performance issues.
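For reference, the scan-and-follow loop described above can be sketched roughly as follows. This is only an illustration, not the actual spidering software: the names `LinkExtractor`, `extract_links`, and `crawl` are made up for this sketch, the page download is left to a caller-supplied `fetch` function, and it runs single-threaded for simplicity.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they came from.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link found in `html`, resolved against `base_url`."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl. `fetch` is a hypothetical function: url -> HTML text."""
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for link in extract_links(url, fetch(url)):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

In a loop like this, `fetch` is the network-bound part and `extract_links` is the CPU-bound part, so timing the two separately is one way to see which of them dominates.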

So, I can buy a new computer for $200 to $300, which should help things along. I haven't really been plugged in to the latest CPU discussion, though I understand Intel pretty much dominates at the high end. That said, I'd like to spend no more than $300 for the entire new computer - so we're looking at a bang/buck play.

Is CPU the bottleneck, or is it another component(s)?
What is the best bang/buck CPU play for this application?
Is this primarily an integer or floating point task?
What common benchmarks would be the closest match to this application?


Thanks!
 
I am pretty sure your internet/network is the problem. That will be your biggest bottleneck, even with a high-end quad core. Also, you will probably need a bigger budget to get a new PC. A Q6600 by itself is $150-190. You can't really upgrade your current rig any further.
 

thrashertm

Do you think it's an issue of overall bandwidth or speed of fetching a page? Like I said - bandwidth seems to be fine.
 

thrashertm

BTW, I am currently searching with 10 threads. The software I'm using can run up to 100 threads, but I find it is unstable or too slow with that many threads going.
 

snarfies1

Distinguished
Dec 31, 2007
226
0
18,680
CPU-wise, I wouldn't think you'd need more than an old P4 or Athlon 64, single core. I doubt you'd even need that much - I'd think a P3 should do the job nicely, quite frankly. I used to do spidering at an old dialup ISP/webhost over 10 years ago with no speed problems - but I was sitting on a T1 at the time. So at the time I was probably working on a Pentium I.
 

thrashertm

In checking CPU utilization, the app is using up 98%, and it's doing about 2 pages per second. I am still thinking that the CPU is the bottleneck.
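As a rough back-of-the-envelope check on those two numbers, the sketch below works out how much CPU time each page costs. It assumes a single core and that essentially all of the CPU time goes to the spider; the function name is made up for illustration.

```python
def cpu_seconds_per_page(cpu_utilization, pages_per_second, cores=1):
    """CPU time consumed per page, given utilization (0.0-1.0) and throughput."""
    return cpu_utilization * cores / pages_per_second


# Figures from the post: 98% CPU utilization at about 2 pages per second.
per_page = cpu_seconds_per_page(0.98, 2.0)
print(f"{per_page:.2f} CPU-seconds per page")  # prints "0.49 CPU-seconds per page"
```

Roughly half a second of CPU time per page is consistent with the spider being CPU-bound on this machine, although it does not by itself say whether the cost comes from parsing or from overhead in the software.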
 

thrashertm

Anyone care to take a stab at answering my last 3 questions -

What is the best bang/buck CPU play for this application?
Is this primarily an integer or floating point task?
What common benchmarks would be the closest match to this application?

Thanks!
 
The best bang for the buck would probably be a Phenom 9550. Since you're on a really small budget, though, I'd go with an Athlon 5000X2 or a similar model, depending on what you can get on sale or in a combo deal with a motherboard :D.

For your application Integer performance is probably more important.

As for benchmarks: SiSoftware Sandra XI.
 
Wait a sec, exactly how intense is this app???

If it has 4+ threads you will certainly benefit from a quad. Judging by your current XP 2500 setup, you need a totally new rig (new CPU, DDR2 RAM, etc.).
 

thrashertm

The app is pretty intense - it currently uses 99% of my CPU. I don't know if quad core or whatever will have any benefit, as the software is fairly rudimentary and not optimized for multiple cores. The threads are all running in software, and the count can be set anywhere from 1 to 100.
 

thrashertm

Hmmm, thanks for your input Megaman. I don't see how the Phenom 9550 is the best bang/buck.
Based on this - http://www.tomshardware.com/charts/cp the Phenom 9500 and X2 5000 are neck and neck in performance, and the X2 is about $75 on Pricewatch, while the Phenom is $140. I'd say that makes the X2 far and away the bang/buck winner.
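To put a number on that comparison, here is a small sketch using the two prices quoted above. Treating the two chips as exactly equal in performance is an assumption taken from the "neck and neck" reading of the charts, not a measured figure, and the function name is made up for illustration.

```python
def performance_per_dollar(relative_performance, price_usd):
    """Relative performance units obtained per dollar spent."""
    return relative_performance / price_usd


# Prices quoted in the post; equal performance (1.0) is an assumption.
x2_value = performance_per_dollar(1.0, 75)       # Athlon X2 5000
phenom_value = performance_per_dollar(1.0, 140)  # Phenom 9500

advantage = x2_value / phenom_value
print(f"X2 delivers about {advantage:.2f}x the performance per dollar")
```

Under that equal-performance assumption the advantage is simply the price ratio, 140 / 75, or about 1.87.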
 

thrashertm

I'm planning on running software not designed for dual cores, let alone quad cores, on Windows XP. Will I get any benefit from the extra cores?
 
@OP: Don't bother upgrading. You will NEED to rebuild. Socket A CPUs aren't made any more, and DDR is also dead. A Phenom or other dual-core AMD requires Socket 939/AM2, or LGA775 if you go with Intel.

Assuming you reuse the old HDD, OS, case, DVD-RW, etc. (everything except the PSU):

E2180 ($60) - assumes you will OC to 3GHz+ (E2180 @ 3GHz =~ E8400 @ stock). You could also step up to the E5200 ($85).
P35-DS3L ($80)
Good budget 400W ( http://www.newegg.com/Product/Product.aspx?Item=N82E16817153023 ) $31 after MIR
Cheap video card ( http://www.newegg.com/Product/Product.aspx?Item=N82E16814130098 ) ~$25

2*1GB DDR2 800 ~$40

XIGMATEK SD964 ($12 after MIR) - only if OCing to 3GHz+

Total=~ $250 give or take $20.
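As a quick check of that total, the sketch below simply adds up the prices listed above. The dictionary keys are shorthand labels for this sketch; the dollar figures are the ones quoted in the post (after rebates where noted), not current prices.

```python
# Prices in USD as quoted in the build list above.
parts = {
    "E2180 CPU": 60,
    "P35-DS3L motherboard": 80,
    "400W PSU (after MIR)": 31,
    "video card": 25,
    "2x1GB DDR2 800": 40,
    "XIGMATEK SD964 cooler (after MIR)": 12,
}

total = sum(parts.values())
total_without_cooler = total - parts["XIGMATEK SD964 cooler (after MIR)"]
total_with_e5200 = total - parts["E2180 CPU"] + 85  # E5200 quoted at $85

print(total, total_without_cooler, total_with_e5200)  # prints "248 236 273"
```

So the listed parts come to $248 with the cooler, $236 without it, and $273 if the E5200 replaces the E2180, which fits the "give or take $20" estimate.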

======
Someone else can do an AMD build, as I haven't looked much at AMD systems since the C2Ds. You may be able to get a good X2-based rig for a good price.

See:
http://www.newegg.com/Product/ProductList.aspx?Submit=ENE&N=40000343&Description=AMD%20X2&name=Processors%20-%20Desktops&Order=PRICE
To get started on CPU.

Imo, if you are willing to OC, the E2180 is the way to go. (Ppl here will help out with the OCing. There is also a motherboard-specific OCing guide for the P35-DS3L, written by me, in the forums; look it up if you are planning to get the P35-DS3L.)
 

thrashertm

Pretty good for $250. My budget isn't that strict, I just don't really see the need to spend much for a computer to run this app.

I was actually planning on keeping my 2500 running, and buying or building this as a standalone rig.
 

thrashertm

Yeah, $500 is probably fine. But remember, this comp isn't for gaming or multimedia, just for running this web crawling app. I would think that $300 ought to be enough to get me a vastly more powerful computer than my 2500 XP.