Friday, March 30, 2012

Proposed Architecture

I would be most grateful for any comments regarding the following
proposed datawarehouse architecture:
1. Backroom Box
This is where all the ETL work will be done.
Daily & monthly imports of data from a number of primary data sources.
Lots of surrogate key transformations.
- Instance of SQL Server 2000 (Enterprise) with the Data Staging Area
database.
2. Frontroom Box
This is where the star-schema, cubes & web application (XLCubed Web)
will be.
- Instance of SQL Server (Enterprise) with the Datawarehouse database.
- Instance of Analysis Services.
- Instance of XLCubed Web.
All the fact tables in the star-schema will probably total less than
50Gb.
The updates (whether daily or monthly) are likely to be less than 1Gb.
There will be max. 50 concurrent users across 4 sites.
There will be about 20 cubes (most will be small; i.e. less than 1Gb)
The biggest will have a fact table of about 2Gb (less than 20m rows &
max. 60 columns).
There will be no distinct counts.
There will be no virtual cubes.
The bigger cubes will have many measures (about 40).
With about another 20 calculated measures/members.
The biggest cube will have about 12 dimensions (all having 2 or 3
levels).
Some dimensions will be parent-child.
Some of the smaller cubes will have writeback ability which will be
done through XLCubed Excel version.
Does it make any difference where SQL Agent is run from?
Backroom instance makes most sense?
Please advise on what servers to buy.
Processors, RAM, disk, etc.
What matters most for the backroom server? Processor speed?
What matters most for the frontroom server? RAM?
2*3Ghz CPU, 4Gb RAM, 250Gb disk?
Good estimate! This is quite normal to have two dedicated boxes for
Staging and Production. Now, you need to be careful about update rate.
If it is 1 GB per day then you may need more space or need to come up
with horizontal partition and clustering but 1 GB per month is alright
although you may need partition and clustering down the track.
Cubes are seems OK. Write back is really depends on your application
but as you now, you need to keep track of your write back values which
means merge them back to fact table with DTS when write back activity
happens. Quite normal for Budgeting and Forecasting cubes and should be
OK.
SQL Agent can be run from anywhere but I think backroom box makes more
sense, please check with your system admin people.
Regarding buying server, have you actually done any research based on
price and performance yet? If you not then I strongly suggest that you
consider HP Itanium Box. SQL Server needs at least 4 CPU so that it can
share resources and at least 4 GB of RAM which you planned anyway.
Itanium is a 64 bit box, specially built for Windows 2003 server and
SQL Server. Also, IBM blade is OK but very expensive. I found Itanium
is more practical and value for money. It will be around 20-25 thousand
US Dollars these days.
Front room box should have same or more in it since it be will accessed
all the time.
Your overall estimate is quite legitimate.
Hope this helps!
Regards.
|||In my case, I prefer to use 1 big server instead of 2 dedicated.
Why?
simple, generally the loading process run during the night when the user are
not here, second a big server load data more quickly, specially if there is
a lot of and complex transformation in the staging.
also, a big server provide better response time for the end user.
another advantage if there is any data quality issue I can do some job
during working hour in a shorter time due to the better performance.
also when there are some cubes between the database and the user, playing in
the database don't impact the users.
using enterprise edition of Windows 2003 you have a tool to manage the
resources on the server (CPU, memory). so you can insure enough resources
available for the end users during the day and maximize the staging part
during the night.
Itanium CPU are good, but Dual Core CPU are good too!
regarding the license model of SQL server (per processor), dual core is
interesting (you pay for 1 CPU but you have 2 CPUs...)
I have found the AMD opteron CPU very efficient with SQL Server.
focus only on 64bits or x64 systems with at least 16Gb of memory.
Also try to use SAS drives, you can plug SATA drives on these controllers,
so you can save historical data on low price SATA drives.
"Dip" <soumyadip.bhattacharya@.gmail.com> wrote in message
news:1140392471.587633.100670@.g44g2000cwa.googlegr oups.com...
> Good estimate! This is quite normal to have two dedicated boxes for
> Staging and Production. Now, you need to be careful about update rate.
> If it is 1 GB per day then you may need more space or need to come up
> with horizontal partition and clustering but 1 GB per month is alright
> although you may need partition and clustering down the track.
> Cubes are seems OK. Write back is really depends on your application
> but as you now, you need to keep track of your write back values which
> means merge them back to fact table with DTS when write back activity
> happens. Quite normal for Budgeting and Forecasting cubes and should be
> OK.
> SQL Agent can be run from anywhere but I think backroom box makes more
> sense, please check with your system admin people.
> Regarding buying server, have you actually done any research based on
> price and performance yet? If you not then I strongly suggest that you
> consider HP Itanium Box. SQL Server needs at least 4 CPU so that it can
> share resources and at least 4 GB of RAM which you planned anyway.
> Itanium is a 64 bit box, specially built for Windows 2003 server and
> SQL Server. Also, IBM blade is OK but very expensive. I found Itanium
> is more practical and value for money. It will be around 20-25 thousand
> US Dollars these days.
> Front room box should have same or more in it since it be will accessed
> all the time.
> Your overall estimate is quite legitimate.
> Hope this helps!
> Regards.
>
sql

No comments:

Post a Comment