[Wiki] In computing, a data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis, and is considered as a core component of business intelligence. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data and are used for creating analytical reports for knowledge workers throughout the enterprise. Examples of reports could range from annual and quarterly comparisons and trends to detailed daily sales analysis.
The data stored in the warehouse is uploaded from the operational systems (such as marketing or sales). The data may pass through an operational data store for additional operations before it is used in the DW for reporting.
Best Project
2017:
RACHIDI-ALAOUI Ilham & DIOURI Mahmoud (Download here)
Goals
After completing this course, students will be able to:
- Describe data warehouse concepts and architecture considerations.
- Design and implement a data warehouse.
- Select an appropriate hardware/software platform for a data warehousing project.
- Implement Data Flow using Talend, Pentaho BA, Microsoft BI Suite …
- Implement Control Flow using Talend, Pentaho BA.
- Implement an ETL solution that supports incremental data extraction using Talend/Pentaho Data Integration.
- Implement an ETL solution that supports incremental data loading using Talend/Pentaho Data Integration.
- Implement data cleansing and enforce data integrity.
- Describe how IBM Cognos solutions can consume data from the data warehouse.
Outline
- Chapitre 1: Introduction to decision Information systems
- Chapitre 2: Data warehousing
- Chapitre 3: Multidimensional analysis
- Chapitre 4: Reporting (Optional training academy for IBM certification)
- Chapitre 5: Cas study
Prerequisites and related courses
Basics about Relational Data Bases and a certain taste for SQL programming are required for this course.
Having attended the courses on DB Administration and on Web development can be useful and reveal interesting.
Language and material
The classes will be given in French by default. Slides will be in French/ English and available in PDF.
Bibliography
- « Entrepôts de données, Guide paratique de modélisation dimentionnelle » Ralph Kimballe et Margy Ross, ed Vuibert, 2eme édition 2005.
- « Le Datawarehouse, Guide de conduite de projets » Ralph Kimballe, Laura Reeves, Margy Ross, ed Eurolles 2005.
- « Business intelligence avec SQL Server 2008 » Mise en oeuvre d’un projet décisionnel, Auteur : Bertrand Burquier, Editeur : Dunod, Septembre 2009
- « Pentaho Solutions Business Intelligence and Data Warehousing with Pentaho and MySQL » Roland Bouman Jos van Dongen, Wiley Publishing, Inc, Sep 2009
Tentative Schedule
DATE (DD/MM/YY) | CONTENT | material |
---|---|---|
Week 1 | Introduction to Decision Support Systems + Introduction to ETL Labs | Chapter 1 + Lab1 + Data for Lab |
Week 2 | Data warehousing + Exercices series n° 1 + Advanced ETL Labs | Chapter 2 + Exercices n°1 + Lab2 |
Week 3 | Design and implement a data warehouse + Exercices series n° 1 + GIMSI | Chapter 2 + Exercices n°1 + Lab2 |
Week 4 | Multidimensional Analysis + Exercices series n° 3 | Chapter 3 + Exercices n°2 + Lab3 + Dump |
Week 5 | IBM Cognos Business Analytics (training academy for IBM certification) | Introduction + Cognos BA Fundamentals 1.1 |
Week 6 | IBM Cognos Business Analytics (training academy for IBM certification) | Lab4 + Cognos BA Fundamentals 1.2 |
Week 7 | Cas study | Case study Pentaho BA + Script Dump |
New Project Download here:
Project requirements:
• Team of 2 students at most;
• The work must be provided as a report and CD containing the data and Oracle and talend scripts before: Sunday, May 22, 2017;
• The date of the individual oral evaluation will be communicated later.