How to perform an operation only once at the end of a Scalding job?
The execution order in a Scalding job is a bit tricky:
- The initializer statements in the Job class are executed, and the operation tree (which connects Pipes, Taps, etc.) is built.
- The tree is handed off to the optimizer, which creates the execution plan.
- The job starts executing. The Map and Reduce steps of the Hadoop jobs are kicked off according to the plan.
- The main program waits for everything to complete and exits.
Based on your code, the println statement executes in step 1, while the job is still being set up, so it runs before any of the Hadoop steps start.
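
To make the ordering concrete, here is a minimal sketch in plain Scala with no Scalding dependency. `ToyJob`, `MyToyJob`, and `OrderDemo` are hypothetical names invented for illustration, not the real `com.twitter.scalding.Job` API. The sketch only models the fact that statements in a class body run at construction time, while anything placed after the call that runs the job happens once everything has finished.

```scala
import scala.collection.mutable.ListBuffer

// Toy stand-in for a Scalding job (hypothetical, NOT the real Job class).
class ToyJob(log: ListBuffer[String]) {
  // Constructor body: runs while the job object is being built (step 1).
  log += "build operation tree"

  // Stands in for planning and executing the Hadoop steps (steps 2 to 4).
  def run(): Boolean = {
    log += "plan and execute"
    true
  }
}

class MyToyJob(trace: ListBuffer[String]) extends ToyJob(trace) {
  // Like a println in the job body: runs during construction,
  // before any step has executed.
  trace += "statement in job body"

  // Code placed after super.run() runs once, after the job has completed.
  override def run(): Boolean = {
    val ok = super.run()
    trace += "after completion"
    ok
  }
}

object OrderDemo {
  // Builds the toy job, runs it, and returns the recorded order of events.
  def events(): List[String] = {
    val log = ListBuffer.empty[String]
    val job = new MyToyJob(log)
    job.run()
    log.toList
  }

  def main(args: Array[String]): Unit =
    events().foreach(println)
}
```

In this model the statement in the job body is recorded before "plan and execute", and only the code after `super.run()` is recorded last. Scalding's own `Job` class has a `run` method that can be overridden in a similar way, but check the signature against the version you are using before relying on it.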