Spark Training | BULUT BİLİŞİM VE BÜYÜK VERİ ARAŞTIRMA LABORATUVARI

Duration of Training	3 Days
Prerequisites	Familiarity with data, data analysis, mathematics, statistics, computer science, databases, and database queries. Basic Linux knowledge.
Audience	Software developers, analysts, and data scientists who need to apply data science and machine learning with Spark. Also suitable for people who want to: collect, analyze, and interpret very large amounts of data; use advanced analysis technologies; use various analysis and reporting tools to collect and analyze data and to identify patterns, trends, and relationships in data sets.
Training Goals	Learning the basics of Python programming; learning the basics of big data history, Hadoop fundamentals, and the core technologies of the Hadoop ecosystem; learning the basics of the distributed file system (HDFS) at the core of Hadoop, and the features and usage of YARN, which provides resource management; learning the basics of the SQL, DataFrame, Machine Learning, and GraphX libraries of Spark, which is used for in-memory analysis and analytical studies on big data; learning the basics of using machine learning algorithms with Spark.
Syllabus	Introduction to Python; Big Data Basics; Core Hadoop: HDFS and YARN; Spark Architecture; Spark Low Level API (RDD); Spark High Level API (DataFrame, Dataset, SQL); DataFrame and Dataset Persistence; Spark Streaming; Spark Structured Streaming; Spark Distributed Processing; Writing, Configuring, and Running Spark Applications; Performance Tuning; Spark ML; Deep Learning with Spark