Friday, April 6, 2012

Prerequisites Problem aka Task Priority with Lexicographic Order

People who are familiar with TopCoder might have run into the problem "Prerequisites". I won't rewrite the complete statement here as it is exclusive property of TopCoder, but I'll give my own definition of it.
The problem statement is relatively simple: N tasks are given; each task depends on a set of tasks that must be performed before the given one. The time required to perform a task is irrelevant and the tasks must be performed one at a time. The requirement is to sort the tasks so that all the dependencies are satisfied. In case of ties (i.e., two or more tasks could be performed at the same time), the tasks must be sorted according to a given lexicographic criterion (e.g., in alphabetical order). It's assumed that there are no conflicts between dependencies, i.e., the sorted list always exists and is unique.
By analyzing the structure of the input data, we easily recognize a directed acyclic graph (DAG) in which each task is a vertex and each edge represents a dependency of a task from another (A depends on B if and only if that there is an edge going from B to A). The requested solution is a particular topological sorting of the graph (examples of algorithms used for topological sorting are Kahn's algorithm or DFS). However, enforcing the lexicographic order to tied tasks requires complex topological sorting algorithms.
An interesting and very intuitive solution can be devised by applying the greedy strategy: always making the choice that looks best at the moment. This approach works if the problem has optimal substructure and I'll do my best to explain that it does in this case. Since there are no dependency conflicts (i.e., no cycles), there must be one or more tasks with no dependencies at all (no incoming edges). Once we found all of these tasks, we can sort them according to the lexicographic order to decide which one must be performed first. We append the first task in this sorted set to the output list. This first task will also be removed from the graph together with all its outgoing edges (i.e., the dependencies of all the other tasks from it: since this task is performed first, we can ignore all these dependencies). At this point we are left with one sub-problem that's absolutely analogous to the previous one (here's the optimal substructure!) and we can proceed to solve it as we did before, appending to the output list one task per sub-problem.
Follows a java implementation of this algorithm. There are N tasks. For simplicity, the tasks are identified by their name only. The input is composed of N strings. Each string is in the form "TASK: DEP1 DEP2 ... ", where "TASK" is the name of a task and "DEP1", "DEP2", etc. are its dependencies (there can be zero up to N-1). Each task will always appear only once as the head of the string.
 import java.util.Collections;  
 import java.util.Iterator;  
 import java.util.LinkedList;  
 import java.util.regex.Pattern;  
 public class TaskPriority {  
      public String[] orderTasks(String[] inputTasks) {  
           // this array will contain the sorted tasks  
           String[] sortedTasks = new String[inputTasks.length];  
           int index = 0;  
           // create a list of tasks and populate it  
           LinkedList<Task> tasks = new LinkedList<Task>();  
           for (String s : inputTasks)  
                tasks.add(new Task(s));  
           // list of independent tasks  
           LinkedList<Task> independentList = new LinkedList<Task>();  
           while (!tasks.isEmpty()) {  
                // find elements with no dependency, add them to the list of  
                // independent tasks and remove them to the task list  
                for (Iterator<Task> it = tasks.iterator(); it.hasNext();) {  
                     Task c = it.next();  
                     if (c.hasNoDependencies()) {  
                          // this task is independent  
                          independentList.add(c);  
                          // we can remove it from the list of tasks since it will  
                          // stay in the list of independent tasks instead  
                          it.remove();  
                     }  
                }  
                // sort the independent tasks  
                Collections.sort(independentList);  
                // poll the first independent task and store it into the return  
                // array  
                Task c = independentList.poll();  
                sortedTasks[index++] = c.toString();  
                // remove the dependencies to the task we just processed  
                for (Task a : tasks)  
                     a.removeDependency(c.toString());  
           }  
           // store the remaining tasks in the list of independent tasks into the  
           // output array. note that they are already properly sorted  
           while (!independentList.isEmpty())  
                sortedTasks[index++] = independentList.poll().toString();  
           return sortedTasks;  
      }  
      private final Pattern splitPattern = Pattern.compile("\\s");  
      private String[] getTokens(String s) {  
           return splitPattern.split(s);  
      }  
      private class Task implements Comparable<Task> {  
           private final String name;  
           private final LinkedList<String> dependencies;  
           @Override  
           public String toString() {  
                return name;  
           }  
           @Override  
           public boolean equals(Object o) {  
                if (o == null)  
                     return false;  
                if (this == o)  
                     return true;  
                if (o instanceof Task) {  
                     Task c = (Task) o;  
                     return name.equals(c.name);  
                }  
                return false;  
           }  
           public Task(String c) {  
                String[] tokens = getTokens(c);  
                name = tokens[0].substring(0, tokens[0].length() - 1);  
                dependencies = new LinkedList<String>();  
                for (int i = 1; i < tokens.length; i++) {  
                     dependencies.add(tokens[i]);  
                }  
           }  
           public boolean hasNoDependencies() {  
                return dependencies.isEmpty();  
           }  
           public boolean removeDependency(String s) {  
                return dependencies.remove(s);  
           }  
           public int compareTo(Task arg0) {  
                return this.name.compareTo(arg0.name);  
           }  
      }  
 }  
Note that the Task class might seem unnecessary since we are only dealing with strings and a simple alphabetical order. However, the lexicographic ordering can be more complex than that (it is in fact more complex in the original problem statement) and the Task class nicely encapsulate the properties of the graph making the algorithm much more readable and easy to understand.

No comments:

Post a Comment