AWS blog post with nice flowcharts / node graphs (very high-level)
Lessons from Using Spark to Process Large Amounts of Data – Part I