A similar interface to the executor service was added in Java 7 that included functionality to split a task into smaller subtasks and re-enqueue them into the thread pool. This is particularly useful for more intensive tasks and actually implements parallelism to do it (we’ll touch on this more in the next exercise).
This new interface is part of a Fork-Join framework and is called ForkJoinPool
, which distributes a task across several worker threads and waits for a result.
Another big difference is the type of task object we’ll be passing to this pool. Instead of implementing the Runnable
class, we’ll actually be extending either RecursiveAction
or RecursiveTask
depending on the implementation we want.
Here is the difference between the two:
RecursiveAction
: does not return any resultRecursiveTask
: returns a result
As the namesake suggests, the Fork-Join framework utilizes recursion to perform its subtask splitting, which we’ll limit by defining a set number of threads. This controls how many times the task is allowed to split itself since each subtask will be assigned to an individual thread.
Additionally, instead of calling execute()
as we did before to add new tasks to the thread pool queue, we’ll be calling invoke()
. Since we’re splitting a singular task across the worker threads, we only have to call invoke()
once, which means we can invoke the task in the same place we create the pool itself:
private static final int N = 4; ForkJoinExecutor pool = new ForkJoinPool(N); pool.invoke(task);
Instructions
For this exercise, we’re going to do something generally very computationally heavy. We’re going to compute the factorial of a very, very large number! Let’s go out on a limb and say… 5000!
!
That probably sounds scary, but don’t worry, a simple for
loop can actually compute this serially in less than 100 ms. The result however is so big that we need to utilize Java’s BigInteger
class to compute and display it, otherwise the normal primitives overflow.
MakeBigIntArray.java has been completed for you, and given the complexity of this exercise, most of RecursiveFactorial.java has also been provided. We’re going to focus on the overridden compute()
method inside RecursiveFactorial.java.
- In
compute()
, create anif
/else
statement. Since we’re utilizingRecursiveAction
to compute our results recursively, we’ll want a stopping point. To do this, make yourif
conditionlist.length == 1
. If this is true, setresult = list[0]
.
In the else
case, we have a lot more work to do. For instance, we need to perform the actual forking
and joining
as we reduce our problem set. But first, the reduction process.
- Inside your
else
statement, create a local int calledmidpoint
and calculate the midpoint index oflist
(you can letint
naturally truncate this). Using thismidpoint
, create two newBigInterger
arrays and set their values as follows:
BigInteger[] l1 = Arrays.copyOfRange(list, 0, midpoint); BigInteger[] l2 = Arrays.copyOfRange(list, midpoint, list.length);
- Now, the recursive part. Still within the
else
, create two newRecursiveFactorial
objects calledrf1
andrf2
and passl1
inside the first one’s parameter andl2
inside the second one. Once created,fork()
bothrf1
andrf2
thenjoin()
them in the same order after forking.
The act of doing this will cascade the recursive aspects of our code via creating new threads and waiting for them to finish.
- Lastly, add this as the final line of the
else
block:
result = rf1.result.multiply(rf2.result);
This performs the factorial multiplication each step of the way utilizing BigInteger
s mulitply()
method.
So, with all that uncontrollable forking, how do we actually limit how many threads exist at any given time? We’ll be using a ForkJoinPool
to do this. Think of this like the Executor
framework’s thread pool.
A quick way to avoid eating up too much RAM is to just make a call to the running JVM and ask it how many available processors it has. This way, the program will never try to use more resources than it has.
We can do this by calling this line and storing it in an int
called nThreads
:
int nThreads = Runtime.getRuntime().availableProcessors();
- In Main.java, print
nThreads
to get a grasp of how many processors the current JVM has during runtime (note: you don’t have to write any code for this step, you can just run Main.java).
Run your code and see what that number is.
Now that we’ve established a safe limit for our number of threads, create a
RecursiveFactorial
object calledrSum
in Main.java and pass ittest.getList()
. For context,test
currently contains an array ofBigInteger
objects that range from1
to5000
.After, create a
ForkJoinPool
calledpool
and pass itnThreads
. This will establish that safe upper bound we found earlier.To invoke the pool, call:
pool.invoke(rSum);
This acts like execute
from the Executor
framework and enqueues the recursive action we coded into the pool’s queue. The threads now process that task and continuously break it down into subtasks until it’s done.
Because we’ve embedded some joins into the recursive class being processed, calling invoke()
will naturally wait until all processing is done.
- After invoking the pool, create a
BigInteger
calledresult
and assign it the valuerSum.result
. Then printresult
as follows:
System.out.println("Pooled Result: " + result);
Run your code and compare the result to the serially printed result computed in the for
loop. Take note of the computed time in ms as well.
- Let’s see how the
ForkJoinPool
compares to serial processing. On the lines directly before and afterpool.invoke(rSum)
, add these:
start = clock.millis(); pool.invoke(rSum); stop = clock.millis();
- Print this time in ms on the line preceding the pooled result:
System.out.println("Time in ms: " + (stop - start));
Run your code and compare the speed difference between the methods.