I worked at a government office that had a similar task. They ended up outsourcing it to professionals, who used some kind of dual SLR setup to photograph the documents (a lot of them were very frail and needed extra care). I think there was a team of 50 operators working for nearly 10 weeks to get the job done, and it was very expensive.
The scanner you will want to purchase will run around $3,500 to $5,000. It will likely not get through 1 million documents without some form of maintenance. The most I have seen scanned in is 60,000 documents a week. Based on that, you can figure about 6 months to have it all scanned in accurately.
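As a rough sanity check on that timeline, here is a minimal sketch of the arithmetic. The 1 million total and the 60,000 a week peak come from the post above; the function name and the weeks-per-month conversion are assumptions made for illustration.

```python
def weeks_to_scan(total_docs, docs_per_week):
    """Weeks needed at a constant weekly throughput."""
    return total_docs / docs_per_week

TOTAL_DOCS = 1_000_000      # size of the job under discussion
PEAK_PER_WEEK = 60_000      # best weekly rate quoted above
WEEKS_PER_MONTH = 52 / 12   # average month length, an assumption

weeks = weeks_to_scan(TOTAL_DOCS, PEAK_PER_WEEK)
months = weeks / WEEKS_PER_MONTH
print(f"{weeks:.1f} weeks, about {months:.1f} months at peak rate")
```

At the peak rate the raw scanning comes out closer to four months, so a six-month estimate leaves room for prep work, rescans, and downtime.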
Look into commercial-grade OCR scanning devices. You may even want to look at outsourcing it, depending on your budget.
Newegg mainly carries consumer-grade equipment. I would contact CDW (cdw.com) and talk with a specialist about it.
You can't go wrong with a Fujitsu 6240. This thing is blazing fast and has no paper jams. Newegg also carries it for around a couple grand. We converted our doctor's office to EMR and bought the next model down too, and it also works great. We have 2 people scanning all day.
When I worked in a bank's digitization department we used something similar to the Kodak i5200. It's difficult to recommend anything without knowing a price limit. The i5200 or similar costs around $25k.
There are a lot of places that can do what you are asking. It is expensive.
Please be more specific about exactly what you are scanning, how you need it archived, what you are going to do with it once it is archived, and how you would like to access the data afterwards, and I can make some suggestions.
OK, I have been in a similar situation. A very large insurance office had to scan in docs going back to 1970, and they kept EVERYTHING, from larger-than-legal-size docs to little handwritten scraps. The best scanners for the money that could do the job were the Visioneers, and PaperPort was good document management and archiving software. (If you can, I would find the older versions from before Nuance bought them out and fired the Xerox team that made and developed everything. I think it was around PaperPort version 10 that Nuance bought them out and FUBARed everything.) The 9450 was a great scanner with an ADF, so you could stack probably 8-10 sheets at a time. Money-wise, for one $5,000 scanner you could buy 10 Visioneers. I don't know how many people you have working on this project. That having been said, it is a HUGE job and will take a LOT of man-hours, and you may even wear out a couple of scanners. We had 6 scanners and 6 people scanning 8 hours a day, and it probably took a couple of months total, but we are talking big filing cabinets stuffed to the brim, like 6,000 pounds of paper. Bear in mind you also have to remove old staples and paper clips, be careful about sticky notes, and check each doc to make sure it scanned and the scan is readable. OCR software SUCKS no matter who or what makes it. It can BARELY do a typed doc correctly sometimes, and if you're trying to scan forms that have people's handwriting on them, FORGET it. You're basically going to scan everything as a picture. That was another thing: I worked with some of those $5,000 scanners, and for B&W they were fine, but most could not do color, and we had to scan color pictures as well, everything from old Polaroids to 35mm. That may have changed, and more expensive scanners might be doing color these days; I haven't looked at them recently.
Also, do you already have a contact management program, a network, servers? Are you going to integrate the phone system so caller ID pops up the client file? Are you going to access the scanned docs daily, weekly, or just pull them up whenever you need to? It's a big, expensive job, but when it's done it's pretty cool to be able to just type in a name and have EVERY doc from that file at your fingertips.
^ Lots of paper is right. One skid of standard 8.5x11 paper, that's 40 cartons, is 200,000 sheets and weighs between 1,600 and 1,800 pounds on average depending on bond. So if these documents are paper, and you do in fact have a million of them, this is a daunting task to tackle and is best left to a professional service with the equipment and technology to handle it. Either that, or plan on having a full-time team work on it for many months. The equipment and software, as stated already, are spendy as well.
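To put the skid figures in terms of the whole job, here is a small sketch that scales them up to a million sheets. It assumes one sheet per document, which the thread does not confirm; the constant names are made up for illustration.

```python
SHEETS_PER_SKID = 200_000        # 40 cartons, per the figure above
SKID_WEIGHT_LB = (1600, 1800)    # low and high estimate, depends on bond
TOTAL_SHEETS = 1_000_000         # assumes one sheet per document

skids = TOTAL_SHEETS / SHEETS_PER_SKID
low_lb, high_lb = (skids * w for w in SKID_WEIGHT_LB)
print(f"{skids:.0f} skids, roughly {low_lb:,.0f} to {high_lb:,.0f} lb of paper")
```

Under those assumptions the job is about five skids, or four to four and a half tons of paper to move, prep, and store.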
Go professional. Fujitsu just did this for the UK Border Agency, and to quote them: "The project required a complete redesign and implementation of the physical site. In parallel, a new application, network and hosting infrastructure was designed, developed, tested and deployed. Although the timescales were very challenging, the tremendous commitment and teamwork enabled all contracted milestones to be met on time."
Another advantage of hiring a third party is that you don't have to figure out what to do with the hardware after the scanning job is done. If you buy something like 20 scanners and won't use them later on, you'll waste between 1600 and 1800 tons of cash.
There is no way you can do this yourself without a large facility. Scanners with a duty cycle allowing you to complete this task in a timely manner are not exactly everywhere. Make sure the professionals also provide OCR service. Running OCR on a million documents yourself will be painful.
Well, one person doing this on a single sheet scanner:
1 document per minute (generous) would take nearly two years without stopping.
Duty cycle is about 500 a day, so you may be performing maintenance three times a day. Let's set aside 30 minutes a day for maintenance, 8 hours for sleeping, and weekends off: you're looking at closer to 3.5 years.
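The single-sheet estimate can be sketched as below. The rate, sleep, maintenance, and weekend figures are taken from the post; treating them as a fixed five-day schedule is one possible reading, and the variable names are made up for illustration.

```python
TOTAL_DOCS = 1_000_000
DOCS_PER_MINUTE = 1            # the "generous" rate above

# Case 1: scanning around the clock with no breaks.
nonstop_days = TOTAL_DOCS / DOCS_PER_MINUTE / 60 / 24
print(f"nonstop: {nonstop_days / 365:.1f} years")

# Case 2: 24 h minus 8 h sleep and 0.5 h maintenance, five days a week.
hours_per_day = 24 - 8 - 0.5
docs_per_workday = DOCS_PER_MINUTE * 60 * hours_per_day
workdays = TOTAL_DOCS / docs_per_workday
calendar_days = workdays * 7 / 5   # stretch workdays over full weeks
print(f"with breaks: {calendar_days / 365:.1f} years")
```

This reading gives about 1.9 years nonstop and roughly 4 years with breaks, which is in the same ballpark as the estimate above. The exact figure depends on how the off-hours are counted.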
At $4,000, an HP Scanjet N9120 has a duty cycle of 5,000 pages a day and scans at 100 ipm (OCR to be done separately), so you can cut the 42 months down to 12 days (including sleep and maintenance). However, you will be performing maintenance up to 28 times a day on this sucker. Most likely you would need a full spare unit and all the common replaceable parts before you begin.
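Here is the same kind of back-of-the-envelope check for the faster scanner. The 100 ipm and 5,000 pages a day figures are the ones quoted above; one image per document and a 15.5-hour working day are assumptions carried over for illustration.

```python
IMAGES_PER_MINUTE = 100     # rated speed quoted above
DUTY_CYCLE_PER_DAY = 5000   # rated pages per day quoted above
TOTAL_IMAGES = 1_000_000    # assumes one image per document
HOURS_PER_DAY = 15.5        # 24 h minus sleep and maintenance, an assumption

images_per_hour = IMAGES_PER_MINUTE * 60
scan_days = TOTAL_IMAGES / (images_per_hour * HOURS_PER_DAY)
# How many daily duty cycles a full 24 hours at rated speed would consume.
duty_cycles_per_day = images_per_hour * 24 / DUTY_CYCLE_PER_DAY
print(f"{scan_days:.1f} days, {duty_cycles_per_day:.1f} duty cycles per 24 h")
```

That works out to about 11 days of feed time, so 12 days with a little slack is plausible. Running flat out for 24 hours would burn through nearly 29 days' worth of rated duty cycle, which is where the maintenance count comes from.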
Small scanners would take a very long time, but for that many documents your budget is probably enough to buy better equipment. Larger scanners can take a stack of documents and automatically feed them and dump them into a large PDF file.
OTOH someone mentioned a digital camera. I had to copy a small book recently at someone else's facility, and their copier was broken that day. I had my old digital camera with me, so I put the book outside on the sidewalk and photographed it. I did about 30 pages in just a few minutes. I brought the JPGs back and printed them out, and the book is perfectly usable for us. The problem is that I wasn't very careful with framing and so forth, so the pages are turned a little bit. Someone doing a lot of documents would of course buy a copy stand to mount the camera on, which would control the framing, lighting, and alignment.