Big Data

[IBM Africa Skills Academy] Welcome to the Big Data Specialist course. This course provides all the student guides and exercises that are essential for the Big Data Specialist.
After completing each unit, you must take its checkpoint test to mark the unit as completed. This course presents a holistic view of Big Data, taking both a top-down and a bottom-up approach to questions such as: What is Big Data? How do we tackle Big Data? Why are we interested in it? What is a Big Data platform?

The course emphasizes that we study Big Data to gain insight that helps people throughout the enterprise run the business better and provide better service to customers. Rather than implementing a single open-source system such as Hadoop, the course recommends processing Big Data on a platform that can handle the variety, velocity, and volume of data by using a family of components that require integration and data governance. Big Data is NoHadoop (“not only Hadoop”) as well as NoSQL (“not only SQL”).

Enroll now in the IBM prerequisite courses for certification!


Goals

After completing this course, students will be able to:

  • Describe Big Data concepts and architecture considerations.
  • Select an appropriate hardware platform for Big Data.
  • Design and implement Big Data handling using MapReduce.
  • Design and implement Big Data handling using Hive.
  • Design and implement Big Data handling using HBase.
  • Design and implement Big Data handling using Big SQL.
  • Design and implement Big Data handling using JAQL.
  • Design and implement Big Data handling using AQL and Text Analytics.

Outline

  • Unit 1. Big Data Overview (30 minutes)
  • Unit 2. Hadoop big data analysis tool (45 minutes)
  • Unit 3. HDFS (60 minutes)
  • Unit 4. MapReduce (60 minutes)
  • Unit 5. Hadoop Query Languages (45 minutes)
  • Unit 6. Hive (45 minutes)
  • Unit 7. HBase (60 minutes)
  • Unit 8. Big SQL (45 minutes)
  • Unit 9. JAQL (60 minutes)
  • Unit 10. Application Development (45 minutes)
  • Unit 11. Browser-based data analytics tool (45 minutes)
  • Unit 12. Text Analytics (60 minutes)
  • Unit 13. AQL Syntax (45 minutes)
  • Unit 14. Streams (45 minutes)
  • Unit 15. Data Explorer overview (30 minutes)

Prerequisites and related courses

A basic knowledge of relational databases and data warehousing, and an interest in SQL programming, are required for this course.

Having taken the courses on database administration and Java development can also be useful.

Language and material

The classes will be given in French by default. Slides will be in French/English and available as PDF files.

Bibliography

  • coming soon

Tentative Schedule

Installing IBM InfoSphere BigInsights Quick Start Edition (Download)

DATE (DD/MM/YY) TIME CONTENT MATERIAL
Week 1, 14:00 to 16:00

  • Unit 1. Big Data Overview (30 minutes)
  • Unit 2. Hadoop big data analysis tool (45 minutes)
  • Unit 3. HDFS (60 minutes)
  • Unit 4. MapReduce (60 minutes)
Week 2, 14:00 to 16:00

  • Unit 5. Hadoop Query Languages (45 minutes)
  • Unit 6. Hive (45 minutes)
  • Unit 7. HBase (60 minutes)
Week 3, 14:00 to 16:00

  • Unit 8. Big SQL (45 minutes)
  • Unit 9. JAQL (60 minutes)
  • Unit 10. Application Development (45 minutes)
Week 4, 14:00 to 16:00

  • Unit 11. Browser-based data analytics tool (45 minutes)
  • Unit 12. Text Analytics (60 minutes)
  • Unit 13. AQL Syntax (45 minutes)
Week 5, 14:00 to 16:00

  • Unit 14. Streams (45 minutes)
  • Unit 15. Data Explorer overview (30 minutes)