Archived from groups: microsoft.public.win2000.file_system
In article <61952ab6.0406101042.40d5828d@posting.google.com>,
usenetacct@lycos.com <usenetacct@lycos.com> wrote:
>sure, we need to keep files created by jobs online for several months
>& that may grow to a year. This is currently 120,000 new files per
>day & could go up dramatically very soon with a new client. Their
>sizes range from 1KB to 4MB, mostly on the smaller end (<50KB). We
>constantly delete files to make room for new ones, so each day we're
>deleting anywhere from 120,000 to 500,000 of these files while adding
>those I mention above. They're stored in a directory structure based
>on year-month-day-hour-minute-job#. When I say millions of files, I
>am being literal. Tens of millions, to be accurate.
>
>We also have a single directory that contains the original files the
>work is based on, & we have a week's worth of data in that, which is
>~150,000 files in that single directory. We delete old data from here
>every day to make room for the new as well. This system runs 24/7 and
>we cannot take it down to defragment, run chkdsk, or anything else.
>If we have disk corruption, we have to format & restore from tape,
>because with the sheer number of files, chkdsk takes many days to
>run.
>
>At the moment, we can store 3 months of data on an 880GB array only
>if we literally let the system run constantly at extremely high
>utilization rates. We may need to expand it to allow a year's worth.
>At current volumes, that would mean a single 4 terabyte volume. If we
>continue growing at current rates (very fast), 4 terabytes is only
>the beginning.
>thanks
I assume you are running NTFS file systems, formatted at the largest
cluster size that still supports compression (4KB?). I recommend using
file compression. With files that small, some of them will be stored
right in the MFT, and compression will give you back space on the
rest.
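If you want a rough picture of how much of your data would actually
benefit before you compress the whole tree, a quick Python script
along these lines would tally the size distribution (the root path and
the roughly-1KB MFT-resident threshold are just my guesses; adjust
them for your setup):

import os

ROOT = r"D:\jobs"          # hypothetical job-output root; change this
MFT_RESIDENT = 1024        # approximate; the real resident limit is a
                           # bit under 1KB with default MFT records

counts = {"mft_resident": 0, "under_50k": 0, "larger": 0}
total_bytes = 0

for dirpath, dirnames, filenames in os.walk(ROOT):
    for name in filenames:
        try:
            size = os.path.getsize(os.path.join(dirpath, name))
        except OSError:
            continue       # file may have been deleted mid-scan
        total_bytes += size
        if size <= MFT_RESIDENT:
            counts["mft_resident"] += 1
        elif size < 50 * 1024:
            counts["under_50k"] += 1
        else:
            counts["larger"] += 1

print(counts, "total GB:", round(total_bytes / 2.0**30, 1))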
Why/when do you get "corruption"?
I'm no expert, but I've worked on projects where multi-platform file
system performance was an issue, and we had a couple of people who
wrote production file system code in prior lives. I saw a demo of
NTFS (NT 4.0) vs. Solaris with an identical 100,000 files in a single
folder on an NT server and a Solaris server. NT/NTFS crawled, while
the Solaris response was close to instantaneous for something as
simple as a DIR/ls command.
How often do you access 90-day-old data? Consider some sort of
hierarchical storage management that rolls the third month's data off
to a 300GB SATA disk ($250 and dropping) in a hot-swappable tray that
you put on the shelf until needed.
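Something along these lines, run as a nightly scheduled task, would do
a poor man's version of that roll-off in Python (the paths, the 60-day
cutoff, and the assumption that each job folder is written once and
then left alone are all mine, not something I know about your setup):

import os
import shutil
import time

LIVE_ROOT = r"E:\jobs"        # hypothetical live array
ARCHIVE_ROOT = r"F:\archive"  # hypothetical hot-swappable SATA disk
CUTOFF = time.time() - 60 * 24 * 3600   # roughly 60 days ago

for entry in os.listdir(LIVE_ROOT):
    src = os.path.join(LIVE_ROOT, entry)
    if not os.path.isdir(src):
        continue
    # Relies on the folder's mtime not changing after the job finishes.
    if os.path.getmtime(src) < CUTOFF:
        dst = os.path.join(ARCHIVE_ROOT, entry)
        shutil.move(src, dst)  # copies across volumes, then removes src
        print("archived", entry)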
>
>>
>> Can you give us some specifics about your system?
>>
>> You don't state that you are having system performance problems; is
>> this a hypothetical question? Not that there's anything wrong with
>> that. With "millions" of files you might run into other file system
>> bottlenecks unrelated to fragmentation.
--
Al Dykes
-----------
adykes at p a n i x . c o m