Resources

Blog Posts and Journals

  1. The Modern Data Stack Repository
  2. Medium Blog Posts - Data Engineering
  3. Start Data Engineering Blog Posts
  4. High Scalability
  5. The GitHub Blog
  6. Engineering at Quora
  7. Yelp Engineering Blog
  8. Twitter Engineering
  9. Facebook Engineering
  10. Yammer Engineering
  11. Etsy Code as Craft
  12. Foursquare Engineering Blog
  13. Airbnb Engineering
  14. WebEngage Engineering Blog
  15. LinkedIn Engineering
  16. The Netflix Tech Blog
  17. BankSimple Simple Blog
  18. Square The Corner
  19. SoundCloud Backstage Blog
  20. Flickr Code
  21. Instagram Engineering
  22. Dropbox Tech Blog
  23. Cloudera Developer Blog
  24. Bandcamp Tech
  25. Oyster Tech Blog
  26. The Reddit Blog
  27. Groupon Engineering Blog
  28. Songkick Technology Blog
  29. Google AI Blog
  30. Google Developers Blog
  31. Pinterest Engineering Blog
  32. Twilio Engineering Blog
  33. Bitly Engineering Blog
  34. Uber Engineering Blog
  35. GoDaddy Engineering
  36. Splunk Blog
  37. Coursera Engineering Blog
  38. PayPal Engineering Blog
  39. Nextdoor Engineering Blog
  40. Booking.com Development Blog
  41. Microsoft Engineering Blog
  42. Scalyr Engineering Blog
  43. Myntra Engineering Blog
  44. Fastly Blog
  45. AWS Architecture Blog
  46. Lyft Engineering Blog
  47. Wish Engineering
  48. DoorDash Engineering
  49. Snowflake Blog
  50. Palantir Blog
  51. Awesome Data Engineering

Data Engineering

  1. 97 Things Every Data Engineer Should Know
  2. Data Engineering with AWS [code]
  3. Data Engineering with Google Cloud Platform [code]
  4. Scalable Data Streaming with Amazon Kinesis [code]
  5. Fundamentals of Data Engineering
  6. Designing Data-Intensive Applications [code]
  7. Data Engineering with Python [code]
  8. Simplifying Data Engineering and Analytics with Delta [code]
  9. Azure Data Engineering Cookbook [code]
  10. Data Engineering with Apache Spark, Delta Lake, and Lakehouse [code]
  11. Data Pipelines Pocket Reference [code]
  12. Serverless Analytics with Amazon Athena [code]
  13. Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications [code]
  14. Apache Spark 3 for Data Engineering and Analytics with Python [code]
  15. Data Pipelines with Apache Airflow

Spark

  1. Mastering Big Data Analytics with PySpark [code]
  2. PySpark Cookbook [code]
  3. Learning Spark, 2nd Edition [code] [Alternative]
  4. Spark: The Definitive Guide [code]
  5. Spark Programming in Python for Beginners with Apache Spark 3 [code]
  6. Beginning Apache Spark 3: With DataFrame, Spark SQL, Structured Streaming, and Spark Machine Learning Library
  7. Data Algorithms with Spark [code]
  8. Scaling Machine Learning with Spark
  9. Modern Data Engineering with Apache Spark: A Hands-On Guide for Building Mission-Critical Streaming Applications
  10. Apache Spark 3 Advance Skills for Cracking Job Interviews
  11. Apache Spark 3 for Data Engineering and Analytics with Python [code]
  12. Advanced Analytics with PySpark [code]
  13. Spark in Action, Second Edition [code]
  14. High Performance Spark [code]
  15. Real-Time Stream Processing Using Apache Spark 3 for Python Developers [code]

Hadoop

  1. The Ultimate Hands-On Hadoop [code]
  2. Hadoop: The Definitive Guide, 4th Edition [code]
  3. Mastering Hadoop 3 [code]
  4. Sams Teach Yourself Hadoop in 24 Hours
  5. Modern Big Data Processing with Hadoop [code]
  6. Moving Hadoop to the Cloud [code]
  7. Hadoop with Python [code]