Tuesday, March 20, 2012

Rerunning package with Parallel tasks

Hello,

I have done a search and have read some of the posts, but am left more confused than before. I am fairly new to SSIS. Here is my situation and what i am trying to accomplish.

I have a package that has a sequence container, in which there are multiple SQL tasks (about 20) running in parallel. I have checkpoints enabled, and FailPackageOnFailure enabled as well. If the package fails, when i re-run the package it will run the last task as well as all the other tasks. What I am looking to accomplish is when the package is re-run, have the SQL tasks that failed ran and not the previous successful tasks.

I think the best way would be via disabling tasks on successful completion of a task, where it writes the name of the SQL task to a temp table, but I am skeptical.

Can anyone point me in a direction to help me accomplish what I am looking for please.

Thanks in Advance.

As far as best practices in a situation like this I wouldn't know. Not sure how long each SQL task takes but have you thought of making the sequence containter have the transaction property set to required. If the containter fails then it would rollback and wouldn't matter if it everything ran over again.

|||I thought about this, however, this package takes about a day and half to run. Thus if I have to start over again, then I have lost all that time. It is more feasible to re-start the package at the point the package failed...|||You could use what I'll call "hard" checkpointing. That is, before each phase that you'd like to be able to restart from, you issue an Execute SQL task to write to a table. How you implement it would be up to you, but basically something like this:

TASK1 > TASK2 > TASK3 > TASK4

WOULD BE:

SQL (select step from checkpoint table) > TASK1 > SQL (update checkpoint table with new step value) > TASK2 > SQL (update checkpoint table with new step value) > TASK3 > SQL (update checkpoint table with new step value) > TASK4 ........ > SQL (Reset checkpoint table with default value so that next run starts at beginning)

Basically, when you start the package, you select from the checkpoint table the step number contained within. Store this number in a variable. Then, using precedence constraints right before each TASK, you can evaluate that variable to decide if the task should be skipped or not.

When a task completes successfully, it moves on to an Execute SQL task to update the "step" value in the checkpoint table, so that if the next task fails, when the package is restarted it can resume where it left off.

Make sense?|||

Thanks for the reply Phil.

From what you are proposing above, it looks very similar to running packages in sequence, where you have one task run then update a sql table, then move onto the next task. Basically, isnt that the purpose of checkpoints- to log to a file a restart point, thus if you added checkpoints to all 4 tasks, and it failed on task 3 it would restart at 3?

Your idea did get me to think. Rather than use "TASK" in your diagram, what if I used Sequence containers with tasks in it. But it still raises the issue of what happens if a task in the sequence container fails, and when the package is re-run to only execute those tasks that failed (after they have been modified)..

No comments:

Post a Comment