  • 814 views
  • Published 8 years ago by Databricks

Testing Apache Spark—Avoiding the Fail Boat Beyond RDDs - Holden Karau

As Spark continues to evolve, we need to revisit our testing techniques to support Datasets, streaming, and more. This talk expands on "Beyond Parallelize and Collect" (which you do not need to have seen) to discuss how to create large-scale test jobs while supporting Spark's latest features. We will explore the difficulties of testing streaming programs, options for setting up integration testing with Spark beyond just local mode, and best practices for acceptance tests. Session hashtag: #EUeco4