Friday, March 30, 2012

proportional fill using multiple files

As part of my ongoing testing of the performance impact of multiple
files vs. a single file (here is the link, comments would be
appreciated: http://linktrim.com/yq), I have come across some
inconsistencies in how "proportional fill" works. According to Books
Online:
"Filegroups use a proportional fill strategy across all the files
within each filegroup. As data is written to the filegroup, Microsoft®
SQL Server™ 2000 writes an amount proportional to the free space in
the file to each file within the filegroup, rather than writing all the
data to the first file until full, and then writing to the next file.
For example, if file f1 has 100 megabytes (MB) free and file f2 has 200
MB free, one extent is allocated from file f1, two extents from file
f2, and so on. This way both files become full at about the same time,
and simple striping is achieved."
Based on some of my testing this is not what is occurring. I created 3
files in one filegroup for my DB. The first number is the file size,
the second number is the used size after the data load.
1. Test one
File1 - 3000 - 520
File2 - 2000 - 520
File3 - 2000 - 520
2. Test two
File1 - 3000 - 293
File2 - 2000 - 293
File3 - 3000 - 293
3. Test three
File1 - 4000 - 416
File2 - 2000 - 232
File3 - 2000 - 232
As you can see, in tests one and two the used size is the same across
files even though the 3 data files are not the same size. Not until I
made one file twice as large as the others did I see a difference in
used space.
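(For reference, one way to read those per-file numbers is FILEPROPERTY
against sysfiles; this query is a minimal sketch, not necessarily how the
figures above were gathered. Both values are in 8 KB pages, so divide by
128 for MB.)
SELECT name,
       size / 128 AS size_mb,
       FILEPROPERTY(name, 'SpaceUsed') / 128 AS used_mb
FROM sysfiles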
Can anyone explain this behavior?
Bert
Were all the files completely empty? What units are those sizes in? Is
it MB, GB, KB, etc.?
Andrew J. Kelly SQL MVP
"bert" <bertcord@.gmail.com> wrote in message
news:1107271506.112735.198080@.f14g2000cwb.googlegr oups.com...
As part of my ongoing testing of the performance impact of multiple
files vs. a single file. (Here is the link, comments would be
appreciated http://linktrim.com/yq) I have come across some
inconsistencies in how "proportional fill" works. According to books
online
"Filegroups use a proportional fill strategy across all the files
within each filegroup. As data is written to the filegroup, Microsoft
SQL ServerT 2000 writes an amount proportional to the free space in
the file to each file within the filegroup, rather than writing all the
data to the first file until full, and then writing to the next file.
For example, if file f1 has 100 megabytes (MB) free and file f2 has 200
MB free, one extent is allocated from file f1, two extents from file
f2, and so on. This way both files become full at about the same time,
and simple striping is achieved."
Based on some of my testing this is not what is occurring. I created 3
files in one filegroup for my DB. The first number is the file size,
the second number is the used size after the data load.
1. Test one
File1 - 3000 - 520
File2 - 2000 - 520
File3 - 2000 - 520
2. Test two
File1 - 3000 - 293
File2 - 2000 - 293
File3 - 3000 - 293
3. Test three
File1 - 4000 - 416
File2 - 2000 - 232
File3 - 2000 - 232
As you can see in test one and two the used size is the same across
files even thought the 3 data files are not the same size. Not until I
make one file twice as large as the others did I sea difference in
used space.
Can anyone explain this behavior?
Bert
|||Hey Andrew, thanks for taking the time to respond. Yes, the files were
completely empty, and the sizes are in MB. Here is the DDL for the
database:
CREATE DATABASE [testdb]
ON
(NAME = N'testdb_Data',
 FILENAME = N'e:\sqldata\mssql\data\testdb_Data.MDF',
 SIZE = 3000, FILEGROWTH = 10%),
(NAME = N'testdb_Data1',
 FILENAME = N'e:\sqldata\mssql\data\testdb_Data1.NDF',
 SIZE = 2000, FILEGROWTH = 10%),
(NAME = N'testdb_Data2',
 FILENAME = N'e:\sqldata\mssql\data\testdb_Data2.NDF',
 SIZE = 3000, FILEGROWTH = 10%)
LOG ON
(NAME = N'testdb_Log',
 FILENAME = N'g:\sqldata\mssql\data\testdb_Log.LDF',
 SIZE = 500, FILEGROWTH = 10%)
I reran the tests, as I was wondering whether the batch size for my bulk
insert would have an effect. I ran the test twice: once without setting
a batch size, and a second time with a batch size of 10000. Both tests
resulted in the data files being filled equally, even though the second
file is smaller than the others.
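(For reference, this is the kind of load being described; the table name
and file path here are hypothetical:)
BULK INSERT dbo.TestTable
FROM 'e:\loads\testdata.txt'
WITH (BATCHSIZE = 10000)   -- omit BATCHSIZE for the single-batch run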
|||OK, this is a guess, but it makes a lot of sense when you think about
it. The results you had are listed below. Since the files are filled at
the extent level, it can't split an extent and put half in one file and
half in another. So in your tests, until you get differences in free
space that are exact multiples of each other, it will round down. So with
file sizes of 2000 (file2) and 3000 (file3), it would allocate one extent
to file2 and one to file3. When you get double the free space, as with
File1 and File2 in test three, it can then allocate in a 2-to-1 fashion.
I believe the algorithm has to be very simple to keep it efficient. That,
and the fact that it allocates in extents, leads to the behavior you see
(see the sketch after the numbers below).
1. Test one
File1 - 3000 - 520
File2 - 2000 - 520
File3 - 2000 - 520
2. Test two
File1 - 3000 - 293
File2 - 2000 - 293
File3 - 3000 - 293
3. Test three
File1 - 4000 - 416
File2 - 2000 - 232
File3 - 2000 - 232
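(A rough illustration of that truncation, not the engine's actual code:
T-SQL integer division rounds down the same way the free-space ratios are
described above.)
-- Tests one and two: free space 3000/2000/2000 or 3000/2000/3000 MB
-- truncates to a 1:1:1 extent ratio
SELECT 3000 / 2000 AS w1, 2000 / 2000 AS w2, 2000 / 2000 AS w3
-- Test three: free space 4000/2000/2000 MB gives 2:1:1
SELECT 4000 / 2000 AS w1, 2000 / 2000 AS w2, 2000 / 2000 AS w3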
Andrew J. Kelly SQL MVP
"bert" <bertcord@.gmail.com> wrote in message
news:1107287428.961190.60820@.z14g2000cwz.googlegro ups.com...
> Hey Andrew thanks for taking the time to respond. Yes the files were
> completely empty, and the size is in MB. Here is the DDL for the
> table.
> CREATE DATABASE [testdb]
> ON (NAME = N'testdb_Data',
> FILENAME = N'e:\sqldata\mssql\data\testdb_Data.MDF' ,
> SIZE = 3000, FILEGROWTH = 10%), (NAME = N'testdb_Data1',
> FILENAME = N'e:\sqldata\mssql\data\testdb_Data1.NDF' ,
> SIZE = 2000, FILEGROWTH = 10%), (NAME = N'testdb_Data2',
> FILENAME = N'e:\sqldata\mssql\data\testdb_Data2.NDF' ,
> SIZE = 3000, FILEGROWTH = 10%) LOG ON (NAME = N'testdb_Log',
> FILENAME = N'g:\sqldata\mssql\data\testdb_Log.LDF' ,
> SIZE = 500, FILEGROWTH = 10%)
> I reran the tests a second time as I was wondering if checking the
> batch size for my bulk insert would have an affect. I ran the test
> twice once without setting the batch size and a second time of setting
> the batch size of 10000. Both tests resulted in the data files being
> filled equally, even though the second file is smaller than the others.
>
|||This is correct. In test 1, if you keep adding data, when the remaining
free space in file 1 versus file 2 becomes a ratio of 2:1, then we
allocate 2 pages in file 1 for every page in file 2.
Please also note that when a file becomes very large, or when there are a
lot of concurrent transactions, we use some performance-improvement
techniques with the side effect that the free space calculation is not
100% accurate.
Wei Xiao [MSFT]
SQL Server Storage Engine Development
http://weblogs.asp.net/weix
This posting is provided "AS IS" with no warranties, and confers no rights.
"Andrew J. Kelly" <sqlmvpnooospam@.shadhawk.com> wrote in message
news:eHQruWNCFHA.2072@.TK2MSFTNGP10.phx.gbl...
> OK this is a guess but makes a lot of sense when you think about it. The
> results you had are listed below. Since the files are filled at the
> extent level it can't split an extent and put half in one file and half in
> another. So in your tests until you get differences in free space that are
> exact multiples of each other it will round down. So with a file size of
> 2000 (file2) and 3000 (file3) it would allocate one extent to file 2 and
> one to file 3. When you get double the free space as in File1 and file 2
> it can then allocate in a 2 to 1 fashion. I believe the algorithm has to
> be very simple to keep it efficient. That and the fact it allocates in
> extents lend to the behavior you see.
> 1. Test one
> File1 - 3000 - 520
> File2 - 2000 - 520
> File3 - 2000 - 520
> 2. Test two
> File1 - 3000 - 293
> File2 - 2000 - 293
> File3 - 3000 - 293
> 3. Test three
> File1 - 4000 - 416
> File2 - 2000 - 232
> File3 - 2000 - 232
>
> --
> Andrew J. Kelly SQL MVP
>
> "bert" <bertcord@.gmail.com> wrote in message
> news:1107287428.961190.60820@.z14g2000cwz.googlegro ups.com...
>
|||Ok, I see: so basically it will write equal pages to the files until
the free space is at a 2:1 ratio. I wasn't thinking about it on an
extent-by-extent basis. I was thinking that the end result could be
calculated as (worked out after the example below):
(FileFreeSpace / TotalFilegroupFreeSpace) * TotalDataToLoad
So in this example
File1 - 3000 - 293
File2 - 2000 - 293
File3 - 3000 - 293
Results should be
File1 - 3000 - 329.625
File2 - 2000 - 219.75
File3 - 3000 - 329.625
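(Worked out: 3 x 293 = 879 MB was actually written in test two, so the
formula would have predicted:)
DECLARE @loaded decimal(9,3)
SET @loaded = 293 * 3                         -- 879 MB written in test two
SELECT 3000.0 / 8000 * @loaded AS file1_mb,   -- 329.625
       2000.0 / 8000 * @loaded AS file2_mb,   -- 219.750
       3000.0 / 8000 * @loaded AS file3_mb    -- 329.625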
Wei, maybe you can answer this... as data is allocated, at what level is
it determined which file should be written to? Is it at the extent
level? When using multiple files, are the extents grouped into larger
chunks based on file?
bert
|||Bert,
There are no cases that I know of where you would break apart an extent.
Each extent always has 8 pages allocated to it (whether or not they all
contain data). An extent may have pages from multiple objects (a Mixed
Extent), but this is rare and only used when the object has fewer than 8
pages. You can read and write at the page level, but the page always
lives on an extent. So if you are writing to a file and the current
extent is full, it will create a new extent in the proper file and begin
writing to it. It will not write half an extent here and half there, so
to speak.
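(One way to watch the per-file extent allocation during a load is DBCC
SHOWFILESTATS; note it is undocumented, so treat this as a sketch:)
DBCC SHOWFILESTATS
-- returns one row per data file in the current database,
-- including TotalExtents and UsedExtents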
Andrew J. Kelly SQL MVP
"bert" <bertcord@.gmail.com> wrote in message
news:1107362088.012050.183690@.l41g2000cwc.googlegr oups.com...
> Ok I see so basically it will write equal pages to the files until the
> free space is at a 2:1 ration. I wasn't thinking about it on an extent
> by extent basis. I was thinking that the end result could be
> calculated as such.
> (FileSize/TotalFileGroupSizeFreeSpace) * TotalDataToLoad
> So in this example
> File1 - 3000 - 293
> File2 - 2000 - 293
> File3 - 3000 - 293
> Results should be
> File1 - 3000 - 329.625
> File2 - 2000 - 219.75
> File3 - 3000 - 329.625
> Wei maybe you can answer this... as data is allocated at what level is
> it determined what file should be written to? IS it at the extent
> level? When using multiple files are teh extents grouped into larger
> chunks based on file?
> bert
>
|||I was wondering if it does the opposite... not smaller than an extent,
but whether proportional fill ever allocates in groups of extents. For
example, under load will it write 4 extents to file1 and then 4 extents
to file2, or does it always go extent by extent, switching between files?
I am trying to figure out the best configuration for my new server. I
have a database that is about 3TB in size. We currently load about
300-500 million records a month into several tables, and I want to make
sure I set up the files and filegroups correctly.
My new server has 2 PERC4 cards with 2 channels each. Each channel has a
RAID10 array that is 1TB in size; this gives me four drives of 1TB each:
F and G on one channel, and H and I on another. I have other drives for
tempdb and the tlog that are mirrored.
We are using partitioned tables for data with new tables created each
month.
The strategy I am leaning towards is as follows.
When new tables are created, create 2 new filegroups. For this month
they would be FebruaryData and FebruaryIndex, with one file in each
filegroup (a sketch of the DDL follows below):
F:\ FebruaryData
H:\ FebruaryIndex
Next month I would create:
G:\ MarchData
I:\ MarchIndex
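(A minimal sketch of that monthly DDL; the database name, file sizes, and
paths are illustrative:)
ALTER DATABASE mydb ADD FILEGROUP FebruaryData
ALTER DATABASE mydb ADD FILEGROUP FebruaryIndex
ALTER DATABASE mydb ADD FILE
    (NAME = N'FebruaryData1',
     FILENAME = N'f:\sqldata\FebruaryData1.ndf',
     SIZE = 1000, FILEGROWTH = 10%)
TO FILEGROUP FebruaryData
ALTER DATABASE mydb ADD FILE
    (NAME = N'FebruaryIndex1',
     FILENAME = N'h:\sqldata\FebruaryIndex1.ndf',
     SIZE = 1000, FILEGROWTH = 10%)
TO FILEGROUP FebruaryIndex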
What I am trying to determine is whether there is any advantage to
creating multiple files within each filegroup, for example:
F:\ FebruaryData-File1
H:\ FebruaryIndex-File1
G:\ FebruaryData-File2
I:\ FebruaryIndex-File2
The current database has multiple files located on the same drives. I
have already determined that this does not give better performance.
Any input would be appreciated
Thanks
Bert
|||I suspect that it is possible they can allocate more than one extent at
a time per file when it makes sense. Wei Xiao said something to the
effect that they do make adjustments in certain circumstances.

Is this going to be a DW or a large OLTP system? Will you be doing more
seeks or range scans? I wouldn't worry as much about getting the data in
as working with it afterwards. While you may find that one technique is
slightly faster than others for loading the data, it is usually a small
percentage of the overall time spent in a system. So one thing you need
to decide is whether the load time or the read time is more important.
Then figure out how the data will be read. If you do a lot of queries
that scan ranges of indexes, it will most likely benefit from separating
the data from the indexes onto different arrays. If you do a lot of scans
(partial or total), it may be better to create 4 files for the filegroup
and place one on each of the RAID 10's. Of course there are no hard and
fast rules that you can count on in this situation. It really depends a
lot on how you use the data, and most of the time you can only find the
optimal solution by trial and error. That usually takes more time and
resources than people want to give, but that is the way it goes.

How long do you plan on keeping the data? Is it a yearly process in which
you drop one month out and bring in the next one? Having a filegroup for
each month in that case may not be too bad of an idea. It allows you to
back up at the filegroup level, and that may help if the older months
don't have update activity. Again, without knowing exactly how you will
use the data it's a guess, but why only create 1 file per filegroup? If
you have 4 drives, why not 2 files per FG, with 1 file on each drive: two
for the indexes and two for the data.
Andrew J. Kelly SQL MVP
"bert" <bertcord@.gmail.com> wrote in message
news:1107403443.038720.258700@.c13g2000cwb.googlegr oups.com...
>I was wondering if it does the opposite...not smaller than an extent
> but if the proportional fill ever allocates in groups of extents. For
> example under load will it write 4 extents to file1 and then 4 extents
> to file2 or does it always go extent by extent switching between files.
> I am trying to figure out the best configuration for my new server. I
> have a database that is about 3TB in size. We currently load about
> 300-500 million records a month into several tables and want to make
> sure I set up the files and file groups correctly.
> My new server has 2 PERC4 cards with 2 channels each. Each channel has
> a RAID10 array that are each 1TB in size, this gives me four drives
> with 1TB each F,G on one channel and H,I on another. I have other
> drives for tempdb and tlog that are mirrors.
> We are using partitioned tables for data with new tables created each
> month.
> The strategy I am leaning towards is as follows.
> When new tables are created create 2 new file groups. For this month
> they would be FebuaryData and FeburaryIndex, create one file in each
> filegroup
> F:\ FebuaryData
> H:\ FeburaryIndex
> Next month I would create
> G:\MarchData
> I:\MarchIndex
> What I am trying to determine if there is any advantage to creating
> multiple files within each file group, for example.
> F:\ FebuaryData-File1
> H:\ FeburaryIndex-File1
> G:\ FebuaryData-File2
> I:\ FebuaryData-File2
> F:\ FebuaryData1
> H:\ FeburaryIndex1
> G:\ FebuaryData2
> I:\ FeburaryIndex2
> The current database has multiple files located on the same drives. I
> have already determined that this does not give better performance.
> Any input would be appreciated
> Thanks
> Bert
>
|||>>I suspect that it is possible they can allocate more than one extent
at a[vbcol=seagreen]
I guess the main thing I wanted to know is when they would fill at
differnet rates. IT makes it import to plan ahead as I dont want to
run in to a situation where I have the files spread out but they are
not being written to at the same rate. Would make it hard to plan for
space.
> Is this going to be a DW or a large OLTP system?
OLTP; the system will do real-time processing of firewall and IDS event
data. A set of queries will always be running, processing the data and
looking for attackers.
Yep, that is what we are doing. This is actually one of 4 servers that
will be loading and processing the same data. We are going to set up the
other servers to act as our "backup" and they will use different
indexing strategies. This is why I am spending so much time trying to
make sure I make the right file layout choice.
Some of our queries will span several months. With 4 drives I was
thinking I could have month 1's data on one drive and its index on
another, and then do the same for month 2. After thinking about this
some more and looking at some of the query patterns, I have determined
that most queries target data in the last month, so this layout would
not be ideal.
Ok, last question for you ;) ... I have decided on using 2 filegroups
for now, LargeTables and LargeTablesIndex. Do I...
A. Each filegroup gets one file on each array, for a total of eight
files
Pro - Extra files distribute the IO across all disks
Con - Index IO could impact table IO and vice versa
F:\ LargeTables - File1, LargeTablesIndex - File1
H:\ LargeTables - File2, LargeTablesIndex - File2
G:\ LargeTables - File3, LargeTablesIndex - File3
I:\ LargeTables - File4, LargeTablesIndex - File4
B. Each filegroup has 2 dedicated arrays
Pro - Each filegroup has dedicated drives
Con - Not fully utilizing resources
F:\ LargeTables - File1
H:\ LargeTables - File2
G:\ LargeTablesIndex - File1
I:\ LargeTablesIndex - File2
I think I am leaning towards A... I am also going to perform some more
tests to back up this theory. I am almost done writing a multi-threaded
utility that can be used to load test the DB; once this is done it will
make my testing much easier.
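(A minimal sketch of option A; all names, sizes, and paths are
illustrative:)
ALTER DATABASE mydb ADD FILEGROUP LargeTables
ALTER DATABASE mydb ADD FILEGROUP LargeTablesIndex
ALTER DATABASE mydb ADD FILE
    (NAME = N'LargeTables1', FILENAME = N'f:\sqldata\LargeTables1.ndf', SIZE = 1000),
    (NAME = N'LargeTables2', FILENAME = N'g:\sqldata\LargeTables2.ndf', SIZE = 1000),
    (NAME = N'LargeTables3', FILENAME = N'h:\sqldata\LargeTables3.ndf', SIZE = 1000),
    (NAME = N'LargeTables4', FILENAME = N'i:\sqldata\LargeTables4.ndf', SIZE = 1000)
TO FILEGROUP LargeTables
-- the same four-file pattern would repeat for LargeTablesIndex; then
-- place the data and the nonclustered indexes on the two filegroups:
CREATE TABLE dbo.FirewallEvent (
    EventId   int IDENTITY PRIMARY KEY,
    EventTime datetime NOT NULL,
    SrcIp     varchar(15) NOT NULL
) ON LargeTables
CREATE NONCLUSTERED INDEX IX_FirewallEvent_EventTime
    ON dbo.FirewallEvent (EventTime) ON LargeTablesIndex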
