Cluster size and performance

Bob

Distinguished
Dec 31, 2007
3,414
0
20,780
Archived from groups: microsoft.public.win2000.active_directory

I have a programmer on my staff who is giving me difficulties.

We have a database server (Progress) on Win2000 SP4: quad processor, 4 GB
RAM, RAID 0+1 (3-drive stripe set, mirrored) with the default 4K cluster
size.

This server's disk I/O performance is abysmal.

First, the guy argues that there's no value in defragging.

Second, I mentioned that since we are using a DB application that supports
block sizes of 8K or greater, we should consider increasing the cluster
size on the volume.

So here's my dilemma. I need to find an article to PROVE to him that larger
cluster sizes CAN give a performance increase in situations like this.

Does anyone have any links to articles about cluster size and performance?

I tried googling, and all I can come up with is generic or non-specific
info. In other words, they say that increasing cluster size COULD increase
performance for large files (multimedia, database), but none of them
actually PROVE it.

I need to find a doc that I can show him to stuff up his little...uh...you
get the picture.

Anyway, I suggested we do an in-house performance test: set up a test
server with 4K clusters and benchmark it, reformat with 8K/16K clusters,
benchmark again, and compare. He thinks it is a waste of time (mainly
because it might prove that I know what I'm talking about).
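
An in-house test like that could start from something as small as the sketch below. It is only a rough harness under stated assumptions: the function name `random_read_benchmark` and its parameters are made up for illustration, it measures raw random reads from a file rather than Progress itself, and the OS file cache will flatter the numbers unless the test file is much larger than RAM.

```python
import os
import random
import time


def random_read_benchmark(path, io_size, reads=2000, seed=0):
    """Time `reads` random reads of `io_size` bytes from `path`; return MB/s.

    Hypothetical helper for comparing the same file on volumes formatted
    with different cluster sizes. It does not bypass the OS file cache.
    """
    file_size = os.path.getsize(path)
    if file_size < io_size:
        raise ValueError("test file is smaller than one read")
    rng = random.Random(seed)          # fixed seed: same offsets on every run
    slots = file_size // io_size       # offsets are aligned to io_size
    total = 0
    # buffering=0 removes Python's own buffer so each read() hits the OS.
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(reads):
            f.seek(rng.randrange(slots) * io_size)
            total += len(f.read(io_size))
        elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)
```

Copy the same large test file to a volume formatted with 4K clusters and to one formatted with 16K or 64K clusters, call the function with the database's block size as `io_size` on each, and compare the MB/s figures over several runs.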

I've since learned that this guy won't take my word for it no matter what
I say, so I have to find a documented expert source to prove it.

I searched MS's KB. Couldn't find anything.

I'm desperate to prove to this guy that my suggestion to increase cluster
size could help us with disk I/O. (I've done enough perfmonning to
demonstrate clearly that it's not processor, memory, or network. It's disk.)

We've got a 7 GB database that can be set to 8K/16K block sizes. I
suggested we set the file system volume to 16K/32K/64K clusters, set the DB
appropriately, and give it a real-world test.

He's not interested.

So... I have to find a reputable whitepaper / source that clearly shows
that a larger cluster size in a DB / NTFS file environment could help
improve disk performance.

If I'm wrong, then I've learned something. It seems to me that if a DB has
to read several pages of data 4K at a time rather than 64K at a time, we
are not being efficient. It seems to me that if we could utilize larger
clusters, it'd be a potential boost to disk performance.
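
The intuition above can be put into rough numbers. The sketch below simply counts how many requests it would take to read the whole 7 GB database if every request moved exactly one unit of the given size. That is a simplifying assumption: the cluster size is an allocation unit, and the actual I/O size is chosen by the database and the cache manager, so this is an upper bound on the difference, not a prediction. The helper name `read_requests` is made up for illustration.

```python
def read_requests(total_bytes, unit_bytes):
    """Requests needed to read total_bytes when each request moves unit_bytes."""
    return -(-total_bytes // unit_bytes)  # ceiling division


KIB = 1024
GIB = 1024 ** 3
db_size = 7 * GIB  # the 7 GB database from the post

for unit in (4 * KIB, 8 * KIB, 16 * KIB, 64 * KIB):
    print(f"{unit // KIB:>2} KB units: {read_requests(db_size, unit):,} requests")
```

Under that assumption a full read takes 1,835,008 requests at 4 KB and 114,688 at 64 KB, a factor of 16. Whether any of that shows up as real throughput is exactly what a benchmark has to answer.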

Help me either 1) prove I'm wrong, or 2) get him to see the light. I
hate situations like this, but I guess it's the real world. They don't want
to take my word for it. Fine.

HELP! I'm an IT pro in distress who's being told he doesn't have a clue!
Help me find the clue!
 
Guest

"Bob" <someone@somewhere.com> wrote in message
news:efh%23Ts2IFHA.2648@TK2MSFTNGP14.phx.gbl...
>I have a programmer on my staff who is giving me difficulties.

If he's your employee, make this test a condition of his employment.
Seriously, there are too many "whiz-bang wizards" in this industry who are
convinced they know everything and refuse to look further than the end of
their nose for answers. Feel free to use my favorite quip for this sort of
situation: "You're confusing what can't be done with what you don't know how
to do again."

4K matches the memory page size and is a good cluster size for any number
of situations. But matching the cluster size to the block size used by the
database app, and/or the stripe size used by the disk subsystem (I'm
assuming you're running on some sort of hardware RAID) will usually make for
some performance improvement.

If SQL is using 64K blocks, and the file system is formatted with 4K
clusters, the NTFS driver needs to manage and keep track of 16 clusters for
each SQL block. You will at the very least get diminished cache hit
performance, as very often 1 or 2 of those 16 clusters get flushed while
the other 14 or 15 (now useless) clusters hang around for whatever reason.
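
The "16 clusters per block" figure is just arithmetic, and the same arithmetic shows why alignment matters. This is a minimal sketch, assuming the file's data starts on a cluster boundary; `clusters_touched` is a made-up helper, not part of any NTFS or SQL API.

```python
def clusters_touched(offset, block_size, cluster_size):
    """Number of clusters a block of block_size bytes at file offset spans.

    Assumes file offset 0 sits on a cluster boundary.
    """
    first = offset // cluster_size
    last = (offset + block_size - 1) // cluster_size
    return last - first + 1


KIB = 1024
print(clusters_touched(0, 64 * KIB, 4 * KIB))         # 64K block on 4K clusters
print(clusters_touched(0, 64 * KIB, 64 * KIB))        # same block on 64K clusters
print(clusters_touched(4 * KIB, 64 * KIB, 64 * KIB))  # misaligned 64K block
```

An aligned 64K block covers 16 clusters at 4K and a single cluster at 64K, while a block that starts 4K into a 64K cluster straddles two of them.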

On the physical disk subsystem, if the stripe size is 64 or 128K, that much
data will *always* be read from the disk and be held in the disk
controller's cache even if NTFS.sys only reads and caches 4K of that data.
Again, performance will often increase if the cluster size matches.

But a lot of this depends on your data and how you access it. Which is why
you're not finding more than a bunch of handwaving "it often helps"
references.
 
Guest

Hello,

Thanks for your post.

It is hard to find the kind of article you are asking for. I could only
find the following, which has information about default cluster sizes.
Please have a look.

140365 Default Cluster Size for FAT and NTFS
http://support.microsoft.com/?id=140365

Hope this helps!

Best regards,

Frances He


Microsoft Online Partner Support
 

Bob

Oh my Gawd! My brother, I am in love with your quip!

THAT SPEAKS VOLUMES!

I'm LMAO! I can't wait to try that one! Because in my case here, that's
EXACTLY the issue...!

;-)


"Doug Frisk" <PublicNews@removeme.fazwak.com> wrote in message
news:uw$BIo4IFHA.904@tk2msftngp13.phx.gbl...
> Feel free to use my favorite quip for this sort of situation: "You're
> confusing what can't be done with what you don't know how to do again."
 
Guest

Hi Bob.

First off, shoot the programmer.

1. NTFS was designed to depend much less on defragging. While defragging
can help, you won't see the huge performance increase that you'd see with
one of the FAT-based file systems.

2. The cluster size depends on a lot of things. Remember that MS SQL uses
8K pages -- the catch is that it allocates them in extents of 8 pages
(64K). So for the data volume, you should set 64K clusters. This may not be
the same with VFox or other DB apps.

3. The volume with the transaction log should be set to the standard 4K
clusters to allow for the performance needed there.

4. You mentioned one RAID set. If I/O is really a problem, the OS should be
on one mirrored set of 2 drives with standard settings, the database on
another RAID stripe set of different spindles, and the log files on a
mirror/stripe set of two or more spindles.
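
The page and extent arithmetic behind point 2 can be sketched as follows. The 8K page and 8-page extent come from the post; the function names are made up for illustration, and the alignment check assumes the data file starts on a cluster boundary.

```python
PAGE_SIZE = 8 * 1024                         # MS SQL page size
PAGES_PER_EXTENT = 8                         # pages are allocated 8 at a time
EXTENT_SIZE = PAGE_SIZE * PAGES_PER_EXTENT   # 65536 bytes = 64K


def extent_of_page(page_no):
    """Index of the extent that holds the given page number."""
    return page_no // PAGES_PER_EXTENT


def extent_fits_one_cluster(extent_no, cluster_size):
    """True if the extent starts on a cluster boundary and fits inside one.

    Assumes file offset 0 sits on a cluster boundary.
    """
    start = extent_no * EXTENT_SIZE
    return start % cluster_size == 0 and EXTENT_SIZE <= cluster_size


print(EXTENT_SIZE)                                # size of one extent in bytes
print(extent_of_page(17))                         # extent holding page 17
print(extent_fits_one_cluster(2, 64 * 1024))      # 64K clusters
print(extent_fits_one_cluster(2, 4 * 1024))       # 4K clusters
```

With 64K clusters every extent maps onto exactly one cluster, which is the reasoning behind the 64K recommendation for the data volume. With 4K clusters each extent is spread over 16 clusters.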

If you are still having I/O problems, use the query analyzer to tune your
indexing, and move frequently used tables to different spindles so that
separate data stores can be accessed simultaneously.

Hope this helps.
--
Ryan Hanisco
MCSE, MCDBA
FlagShip Integration Services