Big Data refers to information of high volume, high velocity, and high variety, along with concerns about data veracity. To deal with big data challenges, new professions have emerged in the market: data engineers and data scientists. Data engineers are the builders and masters of various data-oriented information systems. Data scientists, on the other hand, are people who know how to extract knowledge or insights from these large volumes of data. In the first three weeks of the course, you will learn the basics of data engineering using Hadoop-ecosystem tools. The last week will be dedicated to the basics of machine learning and data science.
During this course you will learn:
- The data engineer’s skill set needed to build and master various data-oriented information systems
- Different Hadoop ecosystem tools: HDFS, Hive, Spark, MapReduce, Sqoop etc.
- Basic programming with Scala and Spark
- SQL for big data: Hive
- NoSQL: HBase, MongoDB, Cassandra
- Streaming solutions such as Flume and Kafka
- Work on a small project
- Cloud platforms: AWS, Azure, Google Cloud
- Machine learning: data preparation, introduction to supervised and unsupervised learning, model development and evaluation.
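To give a flavor of the MapReduce paradigm that the Hadoop and Spark weeks cover, here is a minimal word-count sketch in plain Python. All names here are illustrative, and a real Hadoop or Spark job would distribute the map, shuffle, and reduce steps across a cluster rather than run them in one process:

```python
from collections import defaultdict

def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word.lower(), 1) for word in line.split()]

def reduce_phase(pairs):
    """Shuffle + reduce step: group pairs by word and sum the counts."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

# Toy input standing in for files stored on HDFS
lines = ["Big data needs big tools", "big data big ideas"]
pairs = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(pairs)
print(word_counts["big"])  # "big" appears 4 times across the two lines
```

The same shape (a map over records, then a grouped aggregation) underlies the Hive queries and Spark transformations taught in the course.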
The tools and platforms mentioned above are used on a daily basis in industry, and learning them ensures a broad enough understanding to interact and work together with data engineers, data scientists, DevOps, cloud-automation, and data-visualization specialists.
Prerequisites for effective learning during the course:
- Basic understanding of programming principles (e.g., Java, Python, Scala), some programming experience
- Basic understanding of SQL
- Basic understanding of shell scripting (basic Unix commands)
Apply by June 27th!