Congratulations! You’ve just finished your first coding adventure with PySpark! In this lesson, we learned that:
- RDDs are the foundational data structure of Spark
- RDDs are fault-tolerant, partitioned, and operated on in parallel
- Transformations are lazy and do not execute until an action is called
We also learned how to:
- Transform and summarize RDDs with transformations and actions
- Send information to all nodes with broadcast variables
- Debug our work across nodes with accumulator variables