Running dependent tasks in Ruby with TSort

I was intrigued to learn more about the TSort module after reading about it from a blog post.

In summary, TSort implements toplogical sorting and determine the order to run a list of dependencies. You may have come across such scenarios when using tools such as Bundler and Rake.

After using other task runners such as Grunt.js I wonder if I could implement a very simple task runner in Ruby using TSort:

 1 require 'tsort'
 3 TASKS={
 4   :one => [],
 5   :two => [],
 6   :three => [],
 7   :four => [:one, :three],
 8   :five => [:four, :two]
 9 }
11 def one
12   puts "Task one ran!"
13   command "date '+TASK 1 DATE: %m/%d/%y%nTASK 1 TIME:%H:%M:%S' > t1"
14 end
16 def two
17   puts "Task two ran!"
18   command "date '+TASK 2 DATE: %m/%d/%y%nTASK 2 TIME:%H:%M:%S' > t2"
19 end
21 def three
22   puts "Task three ran!"
23   command "date '+TASK 3 DATE: %m/%d/%y%nTASK 3 TIME:%H:%M:%S' > t3"
24 end
26 def four
27   puts "Task four ran!"
28   command 'cat t1 t3 > t4'
29 end
31 def five
32   puts "Task five ran!"
33   command 'cat t4 t2 > t5'
34 end
36 def command(arg)
37   puts "Running command: #{arg}"
38   system arg
39   puts
40 end
42 def task_to_run
43   yield :five
44 end
46 def dependent_tasks(t, &block)
47   TASKS[t].each(&block)
48 end
50 each_node = method(:task_to_run)
51 each_child = method(:dependent_tasks)
53 puts "Running tasks in the following order:"
55 puts TSort.tsort(each_node, each_child)
57 puts
59 TSort.each_strongly_connected_component_from(:five, each_child) {|scc|
60   method(scc.first).call
61 }

In the example, the TASKS hash constant defines the list of tasks that needs to be completed. The key refers to the task name and its value is a list of dependent tasks that needs to be completed before it can run.

For example, TASKS[:one] is an empty array which means it has no dependencies and can be run first. TASKS[:four] is dependent on the completion of tasks :one and :three. Each hash key is a symbol which corresponds to a local method.

We require ‘tsort’. Then we invoke the module method each_strongly_connected_component_from, which takes two arguments: the starting node, which in this case, is the task :five from the TASKS hash.

The second argument must be an object which responds to call. ‘each_child’ is a method object which references the ‘dependent_tasks’ method. The dependent_tasks method yields the children of each node into a block which is processed by TSort.

tsort returns a list of sorted nodes from children to parents.

If we run the above script, this is the output we see, with task :five set as the default:

 1 Running tasks in the following order:
 2 one
 3 three
 4 four
 5 two
 6 five
 8 Task one ran!
 9 date '+TASK 1 DATE: %m/%d/%y%nTASK 1 TIME:%H:%M:%S' > t1
11 Task three ran!
12 date '+TASK 3 DATE: %m/%d/%y%nTASK 3 TIME:%H:%M:%S' > t3
14 Task four ran!
15 cat t1 t3 > t4
17 Task two ran!
18 date '+TASK 2 DATE: %m/%d/%y%nTASK 2 TIME:%H:%M:%S' > t2
20 Task five ran!
21 cat t4 t2 > t5

From the output, TSort sees that TASK[:five] is dependent on tasks :four and :two. Task :four has two children: tasks :one and three. Hence, tasks :one and :three are executed first, followed by task :four. It proceeds to run task :two and since it has no child nodes, task :five is then run.

If any of the tasks encounter an exception, execution stops at that point and no further tasks are run. For example, if an error is raised in task one method call:

1 def one
2   puts "Task one ran!"
3   raise "Error!"
4   command "date '+TASK 1 DATE: %m/%d/%y%nTASK 1 TIME:%H:%M:%S' > t1"
5 end

none of the other tasks are executed:

 1 Task one ran!
 2 tasks.rb:13:in `one': Error! (RuntimeError)
 3   from tasks.rb:62:in `call'
 4   from tasks.rb:62:in `block in <main>'
 5   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:420:in `block (2 levels) in each_strongly_connected_component_from'
 6   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:420:in `block (2 levels) in each_strongly_connected_component_from'
 7   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:429:in `each_strongly_connected_component_from'
 8   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:419:in `block in each_strongly_connected_component_from'
 9   from tasks.rb:53:in `each'
10   from tasks.rb:53:in `dependent_tasks'
11   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:413:in `call'
12   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:413:in `each_strongly_connected_component_from'
13   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:419:in `block in each_strongly_connected_component_from'
14   from tasks.rb:53:in `each'
15   from tasks.rb:53:in `dependent_tasks'
16   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:413:in `call'
17   from /Users/chee/.rvm/rubies/ruby-2.2.0/lib/ruby/2.2.0/tsort.rb:413:in `each_strongly_connected_component_from'
18   from tasks.rb:61:in `<main>'

Although one could have structured the above using just arrays, TSort automatically works out where it should start from, rather than having to work it out manually.