Showing posts with label series. Show all posts
Showing posts with label series. Show all posts

Wednesday, March 21, 2012

promotional mailing response prediction

I am trying to use the time series algorithm to predict responses to promotion mailings for subscription renewals. The problem i am having is that response is largely influenced by the number of mailings that are sent out. Can anyone give me any ideas on how i can structure the dataset so that it would take into account how many promotions were sent out? any help would be greatly appreciated.

Frank

You could have two columns in your mining model, one containing the number of mailings sent out (and used as Input) and another one containing the number of responses for the same period (used as Predict or Predict Only). You could also create a derived, calculated column, representing responses as percentage of mailings

Actually, it may be worth it to create one model with the responses modeled as Predict and another one with responses as PredictOnly

|||

Thanks, Bogdan

more detail on what I'm trying to do:

I have promotional mailings that go out periodically for subscription renewals. The goal is to predict responses to the promotion by week from the mailing date of the promotion. What I would like to do is to create a model that looks at the returns from promotions that are similar, also taking into account some other attributes(Publication,MailQty,etc).

Below is how the test data looks:

Pub Source MailQty Week1 Week2 Week3 Week4 Week5 ..................... Week20

Test Renewal 12000 45 123 234 200 34

Test2 Direct 33000 23 67 577 411 122

Test3 Renewal 22000 73 167 237 221 55

What I need to do is to use this historical response data to predict response to a future promotion. Is this possible?

Frank

|||

My guess is that you really don't want to use the time series algorithm here. You can create a dataset that uses previous response rates, calendar information and the amount of promotions mailed to predict how many responses you are going to get. For example, creating columns such as

PromotionID
DayOfWeek
WeekOfMonth
MonthOfYear
PreviousResponseRate1
PreviousReponseRate2
CurrentMailingAmount
Response

I'm sure you can think of other related variables as well - I used previous response rates in the example, but you can just as easily use absolute numbers as well. To start I would use Decision Trees (actually will create a regression tree) to predict Response, and I would also make CurrentMailingAmount a forced regressor (a parameter available in enterprise edition), to guarantee that it appears in the regression equation.

If you have the calendar attributes as numbers (which makes sense), I would also turn off the REGRESSOR flag on those columns so you don't get equations like Response = 2.45 + 0.86 * Monday

HTH

-Jamie

|||

Thanks Jamie,

Not sure if my example was as clear as it could have been. In my test data example, Week1 represents the total responses that were received 1 week from the promotion mail date. week2 represents the responses that were received the 2nd week from the mail date. i need to predict the responses by week for a future promotion by modeling past promotions. I would not have any response data for the future promotion at that point, just the other attributes such as mail quantity, day of week, monthofyear, source, etc. is this possible?

Frank

|||

Of course! This is what you are trying to predict! The example I gave is simply a sample of how you should arrange your data. For example:

(PRR = PreviousResponseRatio, CMA = CurrentMailingAmount)

PromotionID

Pub

DayOfWeek

WeekOfMonth

MonthOfYear

PRR1

PRR2

CMA

Response

P1W1

Test

4

2

3

<week 3 response/week 3 mail qty>

<week 4 response/week 4 mail qty>

<week 5 mailiing amount>

<week 5 reponse>

P1W2

Test

3

3

3

<week 4 response/week 4 mail qty>

<week 5 response/week 5 mail qty>

<week 6 mailiing amount>

<week 6 response>

In this model, one row represents a weeks mailing for a single publication. As I said before you can add additional variables as you like, such as actual previous responses, rather than a ratio - you just have to experiment.

You have other options as well. You can not have the "Pub" column at all, you could make independant models for each Pub, or you could do as above and create a model using Pub as an input. The advantage of the first option is that you could use the model to predict responses for new pubs, but you may want to add an attribute such as "TotalNumberofWeeks Sent" that describes the "newness" of a campaign.

sql

promotional mailing response prediction

I am trying to use the time series algorithm to predict responses to promotion mailings for subscription renewals. The problem i am having is that response is largely influenced by the number of mailings that are sent out. Can anyone give me any ideas on how i can structure the dataset so that it would take into account how many promotions were sent out? any help would be greatly appreciated.

Frank

You could have two columns in your mining model, one containing the number of mailings sent out (and used as Input) and another one containing the number of responses for the same period (used as Predict or Predict Only). You could also create a derived, calculated column, representing responses as percentage of mailings

Actually, it may be worth it to create one model with the responses modeled as Predict and another one with responses as PredictOnly

|||

Thanks, Bogdan

more detail on what I'm trying to do:

I have promotional mailings that go out periodically for subscription renewals. The goal is to predict responses to the promotion by week from the mailing date of the promotion. What I would like to do is to create a model that looks at the returns from promotions that are similar, also taking into account some other attributes(Publication,MailQty,etc).

Below is how the test data looks:

Pub Source MailQty Week1 Week2 Week3 Week4 Week5 ..................... Week20

Test Renewal 12000 45 123 234 200 34

Test2 Direct 33000 23 67 577 411 122

Test3 Renewal 22000 73 167 237 221 55

What I need to do is to use this historical response data to predict response to a future promotion. Is this possible?

Frank

|||

My guess is that you really don't want to use the time series algorithm here. You can create a dataset that uses previous response rates, calendar information and the amount of promotions mailed to predict how many responses you are going to get. For example, creating columns such as

PromotionID
DayOfWeek
WeekOfMonth
MonthOfYear
PreviousResponseRate1
PreviousReponseRate2
CurrentMailingAmount
Response

I'm sure you can think of other related variables as well - I used previous response rates in the example, but you can just as easily use absolute numbers as well. To start I would use Decision Trees (actually will create a regression tree) to predict Response, and I would also make CurrentMailingAmount a forced regressor (a parameter available in enterprise edition), to guarantee that it appears in the regression equation.

If you have the calendar attributes as numbers (which makes sense), I would also turn off the REGRESSOR flag on those columns so you don't get equations like Response = 2.45 + 0.86 * Monday

HTH

-Jamie

|||

Thanks Jamie,

Not sure if my example was as clear as it could have been. In my test data example, Week1 represents the total responses that were received 1 week from the promotion mail date. week2 represents the responses that were received the 2nd week from the mail date. i need to predict the responses by week for a future promotion by modeling past promotions. I would not have any response data for the future promotion at that point, just the other attributes such as mail quantity, day of week, monthofyear, source, etc. is this possible?

Frank

|||

Of course! This is what you are trying to predict! The example I gave is simply a sample of how you should arrange your data. For example:

(PRR = PreviousResponseRatio, CMA = CurrentMailingAmount)

PromotionID

Pub

DayOfWeek

WeekOfMonth

MonthOfYear

PRR1

PRR2

CMA

Response

P1W1

Test

4

2

3

<week 3 response/week 3 mail qty>

<week 4 response/week 4 mail qty>

<week 5 mailiing amount>

<week 5 reponse>

P1W2

Test

3

3

3

<week 4 response/week 4 mail qty>

<week 5 response/week 5 mail qty>

<week 6 mailiing amount>

<week 6 response>

In this model, one row represents a weeks mailing for a single publication. As I said before you can add additional variables as you like, such as actual previous responses, rather than a ratio - you just have to experiment.

You have other options as well. You can not have the "Pub" column at all, you could make independant models for each Pub, or you could do as above and create a model using Pub as an input. The advantage of the first option is that you could use the model to predict responses for new pubs, but you may want to add an attribute such as "TotalNumberofWeeks Sent" that describes the "newness" of a campaign.

Monday, February 20, 2012

Programatically created DataFLow task fails

Hi,

I have a package in which i have programatically created dataflow task. It used to work fine but now it fails with series of errors out of which one is below

Error 30002: Type 'MainPipe' is not defined. Line 63 Columns 33-40 Line Text: Dim DataFlowTask As MainPipe = CType(DataFlowTaskHost.InnerObject, MainPipe)

It seems all the function and classes referd from following dlls is not working:

Microsoft.SqlServer.DTSRuntimeWrap.dll

Microsoft.SqlServer.DTSPipelineWrap.dll

I say so because i get the following error also:

Error 30652: Reference required to assembly 'Microsoft.SqlServer.DTSRuntimeWrap, Version=9.0.242.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' containing the type 'Microsoft.SqlServer.Dts.Runtime.Wrapper.IDTSConnectionManager90'. Add one to your project. Line 86 Columns 33-64 Line Text: DtsConvert.ToConnectionManager90(ChildPackage.Connections(""Source""))

These dlls are in GAC and in C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies with same version but still it gives above error.

Thanks

Mohit

Mohit,

If it used to work and now it doesn't (as you suggest) then if it were me I would try and track down what got changed in the interim.

-Jamie

|||

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

|||

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

|||

Jamie Thomson wrote:

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

Any pointers to the article resolving this specific error will be very helpful.

Mohit

|||

MohitGupta wrote:

Jamie Thomson wrote:

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

Any pointers to the article resolving this specific error will be very helpful.

Mohit

Have you added the reference that it talks about?

-Jamie

|||

Jamie Thomson wrote:

MohitGupta wrote:

Jamie Thomson wrote:

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

Any pointers to the article resolving this specific error will be very helpful.

Mohit

Have you added the reference that it talks about?

-Jamie

Yes the dlls are refered from the following location: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies

|||

MohitGupta wrote:

Yes the dlls are refered from the following location: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies

That's wrong. Reference them from the GAC.

-Jamie

|||

Jamie Thomson wrote:

MohitGupta wrote:

Yes the dlls are refered from the following location: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies

That's wrong. Reference them from the GAC.

-Jamie

When the dll reference is added there is no option to browse and add assembly. It shows .NET assemblies which are from GAC out of which both the dll have this path: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies in the property box. This is on my development machine. But in dtsx file, it does not have any reference of this path. Below is the code generated used in production server:

<Reference

Name = "Microsoft.SqlServer.DTSPipelineWrap"

AssemblyName = "Microsoft.SqlServer.DTSPipelineWrap"

/>

<Reference

Name = "Microsoft.SQLServer.DTSRuntimeWrap"

AssemblyName = "Microsoft.SqlServer.DTSRuntimeWrap"

/>

I assume it is coming from GAC.

-Mohit

|||

MohitGupta wrote:

Jamie Thomson wrote:

When the dll reference is added there is no option to browse and add assembly. It shows .NET assemblies which are from GAC out of which both the dll have this path: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies in the property box. This is on my development machine. But in dtsx file, it does not have any reference of this path. Below is the code generated used in production server:

<Reference

Name = "Microsoft.SqlServer.DTSPipelineWrap"

AssemblyName = "Microsoft.SqlServer.DTSPipelineWrap"

/>

<Reference

Name = "Microsoft.SQLServer.DTSRuntimeWrap"

AssemblyName = "Microsoft.SqlServer.DTSRuntimeWrap"

/>

I assume it is coming from GAC.

-Mohit

This "property box" that you speak of. What object does that contain the properties of? On my machine I check properties of Microsoft.SqlServer.DTSPipelineWrap in c:\windows\assembly and there is no mention of a path anywhere. I'm confused.

The .dtsx file will not contain paths to DLLs, that's the whole point of GACing them.

-Jamie

|||

Finally I got resolution to this issue Smile

Cause:

Security Update for Microsoft .Net Framework 2.0 (KB928365) is installed on the server which causes SSIS packages to fail (Specifically custom data flow tasks using GACed dll)

Resolution:

Install this KB: SQLServer2005-KB932557-x86-ENU.exe on the server.

Link: http://support.microsoft.com/kb/932557/

This KB while installing gives option to Install the patch only for SQL Server Integration Services 2005 and SQL Server Tools and Workstation Components 2005. This KB is not required to be installed for SQL Server Instances. (Restart not required)

Known Issue:

After installing this KB, few warnings may be displayed which are known issues with SSIS.

Thanks

Mohit

Programatically created DataFLow task fails

Hi,

I have a package in which i have programatically created dataflow task. It used to work fine but now it fails with series of errors out of which one is below

Error 30002: Type 'MainPipe' is not defined. Line 63 Columns 33-40 Line Text: Dim DataFlowTask As MainPipe = CType(DataFlowTaskHost.InnerObject, MainPipe)

It seems all the function and classes referd from following dlls is not working:

Microsoft.SqlServer.DTSRuntimeWrap.dll

Microsoft.SqlServer.DTSPipelineWrap.dll

I say so because i get the following error also:

Error 30652: Reference required to assembly 'Microsoft.SqlServer.DTSRuntimeWrap, Version=9.0.242.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' containing the type 'Microsoft.SqlServer.Dts.Runtime.Wrapper.IDTSConnectionManager90'. Add one to your project. Line 86 Columns 33-64 Line Text: DtsConvert.ToConnectionManager90(ChildPackage.Connections(""Source""))

These dlls are in GAC and in C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies with same version but still it gives above error.

Thanks

Mohit

Mohit,

If it used to work and now it doesn't (as you suggest) then if it were me I would try and track down what got changed in the interim.

-Jamie

|||

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

|||

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

|||

Jamie Thomson wrote:

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

Any pointers to the article resolving this specific error will be very helpful.

Mohit

|||

MohitGupta wrote:

Jamie Thomson wrote:

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

Any pointers to the article resolving this specific error will be very helpful.

Mohit

Have you added the reference that it talks about?

-Jamie

|||

Jamie Thomson wrote:

MohitGupta wrote:

Jamie Thomson wrote:

MohitGupta wrote:

Hi Jamie,

There is no change in code because it was deployed in production. Is there possibility of this error being caused by any SQL server service pack or hotfix on the server.

Thanks

Mohit

yes, its certainly a possibility. That's what I meant by identify what has changed. Code changes? Service pack changes? Windows Update changes? They're all changes that could *potentially* cause problems.

-Jamie

Any pointers to the article resolving this specific error will be very helpful.

Mohit

Have you added the reference that it talks about?

-Jamie

Yes the dlls are refered from the following location: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies

|||

MohitGupta wrote:

Yes the dlls are refered from the following location: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies

That's wrong. Reference them from the GAC.

-Jamie

|||

Jamie Thomson wrote:

MohitGupta wrote:

Yes the dlls are refered from the following location: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies

That's wrong. Reference them from the GAC.

-Jamie

When the dll reference is added there is no option to browse and add assembly. It shows .NET assemblies which are from GAC out of which both the dll have this path: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies in the property box. This is on my development machine. But in dtsx file, it does not have any reference of this path. Below is the code generated used in production server:

<Reference

Name = "Microsoft.SqlServer.DTSPipelineWrap"

AssemblyName = "Microsoft.SqlServer.DTSPipelineWrap"

/>

<Reference

Name = "Microsoft.SQLServer.DTSRuntimeWrap"

AssemblyName = "Microsoft.SqlServer.DTSRuntimeWrap"

/>

I assume it is coming from GAC.

-Mohit

|||

MohitGupta wrote:

Jamie Thomson wrote:

When the dll reference is added there is no option to browse and add assembly. It shows .NET assemblies which are from GAC out of which both the dll have this path: C:\Program Files\Microsoft SQL Server\90\SDK\Assemblies in the property box. This is on my development machine. But in dtsx file, it does not have any reference of this path. Below is the code generated used in production server:

<Reference

Name = "Microsoft.SqlServer.DTSPipelineWrap"

AssemblyName = "Microsoft.SqlServer.DTSPipelineWrap"

/>

<Reference

Name = "Microsoft.SQLServer.DTSRuntimeWrap"

AssemblyName = "Microsoft.SqlServer.DTSRuntimeWrap"

/>

I assume it is coming from GAC.

-Mohit

This "property box" that you speak of. What object does that contain the properties of? On my machine I check properties of Microsoft.SqlServer.DTSPipelineWrap in c:\windows\assembly and there is no mention of a path anywhere. I'm confused.

The .dtsx file will not contain paths to DLLs, that's the whole point of GACing them.

-Jamie

|||

Finally I got resolution to this issue Smile

Cause:

Security Update for Microsoft .Net Framework 2.0 (KB928365) is installed on the server which causes SSIS packages to fail (Specifically custom data flow tasks using GACed dll)

Resolution:

Install this KB: SQLServer2005-KB932557-x86-ENU.exe on the server.

Link: http://support.microsoft.com/kb/932557/

This KB while installing gives option to Install the patch only for SQL Server Integration Services 2005 and SQL Server Tools and Workstation Components 2005. This KB is not required to be installed for SQL Server Instances. (Restart not required)

Known Issue:

After installing this KB, few warnings may be displayed which are known issues with SSIS.

Thanks

Mohit