Audience | Suitable for people who want:
- To design highly scalable distributed systems that handle big data using various open source tools,
- To understand how algorithms work and to create high-performance algorithms,
- To work on processes such as collecting, parsing, managing, analyzing, and visualizing data in complex big data projects,
- To determine the necessary hardware and software requirements and to design processes according to these decisions.
|
Training Goals |
- Learning the basics of big data history, Hadoop fundamentals, and the core technologies in the Hadoop ecosystem,
- Learning the basics of the project life cycle, data collection, data evaluation, data transformation, and data analysis,
- Learning the basics of HDFS, the distributed file system at the core of Hadoop, and the features and usage of YARN, which provides resource management,
- Planning a big data cluster and learning about cluster setup, configuration, and management with Ambari,
- Learning about the usage scenarios and basic components of Kafka and NiFi, which form the basis of the data transfer technologies,
- Learning the basics of Flume and Sqoop, which are used to transfer data into the Hadoop environment,
- Learning the basics of Hive, which enables running query scripts against files in the distributed file system,
- Learning the basics of the SQL, DataFrame, machine learning, and GraphX libraries of Spark, which is used to perform in-memory analysis and analytical studies on big data,
- Learning the basics of the Pig Latin scripting language for data analysis,
- Learning the basics of ZooKeeper, a service manager in the big data ecosystem, and Oozie, a workflow scheduler,
- Learning basic information about NoSQL databases and their usage.
|