Génie industriel - Education section - 2022

UE Data analytics for industrial engineering - WGUS2092

  • Number of hours

    • Lectures 17.25
    • Projects -
    • Tutorials 12.75
    • Internship -
    • Laboratory works -
    • Written tests 1.0

  • ECTS 3.0

Goal(s)

Part I
Students will learn about data preparation methods for machine learning, and how to integrate them into data science projects.
Students will know how to manage, sort, and organize their data efficiently. They will be able to present relevant visualizations of their data and results. They will have adopted the practices of a responsible and ethical data engineer.

Part II
The course introduces the first tools for processing quantitative and qualitative data using machine learning.
The analysis methods taught cover automatic classification, construction of predictive models, evaluation of method performance, and diagnosis of the limits of applicability of these methods.

Course lead(s)

Iragael JOLY, Pierre LEMAIRE

Content(s)

Part I
.0 Introduction: Data Science Project Management
Steering data science projects, based on CRISP-DM
.1 Data handling & Data Engineer responsibilities (ethics, security, etc.)
.1.1 Technical data management
Data format, variable formats; basic operations (reads, writes; sorts; selections, projections, filters; merges)
.1.2 Technical management of results (visualization)
Types of graphics, principles of good visualization
Make and discuss technical choices and representations
.1.3 Societal management
Legal aspects (GDPR), sustainability aspects (risks to people [customers and staff] as well as environmental costs), security (who holds the data, spying...).
.1.4 Implementation: Micro-project
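
The basic operations listed under .1.1 (reads, writes, sorts, selections, projections, filters, merges) can be sketched in plain Python as follows. This is only a minimal illustration, not course material: the sample tables, the column names (`order_id`, `machine`, `duration_min`, `site`) and the helper functions are all hypothetical, and a real project would more likely rely on a dataframe library.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
ORDERS_CSV = """order_id,machine,duration_min
1,M1,42
2,M2,17
3,M1,30
"""
MACHINES_CSV = """machine,site
M1,Grenoble
M2,Lyon
"""


def read_table(text):
    """Read CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def write_table(rows, columns):
    """Write rows back to CSV text using the given column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def select_rows(rows, predicate):
    """Selection (filter): keep the rows for which predicate is true."""
    return [row for row in rows if predicate(row)]


def project(rows, columns):
    """Projection: keep only the requested columns."""
    return [{col: row[col] for col in columns} for row in rows]


def merge(left, right, on):
    """Inner join of two tables on a shared column."""
    index = {}
    for row in right:
        index.setdefault(row[on], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[on], [])]


orders = read_table(ORDERS_CSV)
machines = read_table(MACHINES_CSV)

# Variable formats: CSV fields are read as strings, so convert explicitly.
for row in orders:
    row["duration_min"] = int(row["duration_min"])

long_orders = select_rows(orders, lambda r: r["duration_min"] >= 30)
by_duration = sorted(long_orders, key=lambda r: r["duration_min"])
joined = merge(by_duration, machines, on="machine")
result = project(joined, ["order_id", "site", "duration_min"])
print(write_table(result, ["order_id", "site", "duration_min"]))
```

The explicit `int` conversion is the step most often forgotten: without it, sorting and filtering would compare strings rather than numbers.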

Part II
.1 Issues in machine learning, supervised machine learning (regression, classification)
Supervised vs unsupervised methods, with a quick presentation of some unsupervised methods (k-means, dendrograms)
.2 Regression and classification methods: linear regression and logistic regression; models, algorithms and solution methods
.3 Internal evaluation of regression and classification: Errors, residuals and prediction evaluation
.4 External Evaluation: Statistical Assumptions and Model Evaluation
.5 Implementation on different datasets
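
As an illustration of items .2 and .3, the sketch below fits a simple linear regression by ordinary least squares and then computes residuals and two common in-sample error measures (RMSE and R²). It covers only the one-variable linear case, not logistic regression; the function names and the toy data are hypothetical and chosen for this example only.

```python
import math


def fit_simple_linear_regression(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def evaluate(x, y, intercept, slope):
    """Residuals and in-sample error measures for a fitted line."""
    predictions = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predictions)]
    sse = sum(r ** 2 for r in residuals)           # residual sum of squares
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)      # total sum of squares
    return {
        "residuals": residuals,
        "rmse": math.sqrt(sse / len(y)),
        "r2": 1 - sse / sst,
    }


# Hypothetical toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_linear_regression(x, y)
report = evaluate(x, y, intercept, slope)
print(f"slope={slope:.3f} intercept={intercept:.3f}")
print(f"rmse={report['rmse']:.3f} r2={report['r2']:.4f}")
```

These measures are computed on the same data used for fitting, so they describe goodness of fit rather than predictive performance on new data.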

Prerequisites

Students must have taken and passed the following courses: Probability and Statistics; Programming with R and Python.

Assessment

This weighting is compatible with the organization of distance learning courses and exams

Session 1
UE_grade = 0.5*Grade1 + 0.5*Grade2
Part I - Data Engineering
Continuous assessment grade: TP (based on at least one grade, TP1)
Individual examination grade: EX
Grade1 = 0.4*TP + 0.6*EX

Part II - Machine Learning
Continuous assessment grade: TP (based on at least one grade, TP1)
Individual examination grade: EX
Grade2 = 0.3*TP + 0.7*EX
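
The Session 1 formulas above can be checked with a short worked example. The sketch below simply transcribes the three weightings; the function name and the sample marks (all out of 20) are hypothetical.

```python
def session1_ue_grade(tp1, ex1, tp2, ex2):
    """Session 1 grade for the course, following the weightings above.

    tp1, ex1: continuous assessment and exam grades for Part I
    tp2, ex2: continuous assessment and exam grades for Part II
    """
    grade1 = 0.4 * tp1 + 0.6 * ex1   # Part I - Data Engineering
    grade2 = 0.3 * tp2 + 0.7 * ex2   # Part II - Machine Learning
    ue_grade = 0.5 * grade1 + 0.5 * grade2
    return grade1, grade2, ue_grade


# Hypothetical marks out of 20, invented for illustration only.
grade1, grade2, ue_grade = session1_ue_grade(tp1=14, ex1=10, tp2=12, ex2=11)
print(f"Grade1={grade1:.2f} Grade2={grade2:.2f} UE_grade={ue_grade:.2f}")
```

With these sample marks, Grade1 = 0.4*14 + 0.6*10 = 11.6, Grade2 = 0.3*12 + 0.7*11 = 11.3, and UE_grade = 0.5*11.6 + 0.5*11.3 = 11.45.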

Session 2
If Grade1 > 10/20 or Grade2 > 10/20, the student may choose either to keep that grade or to take a new test.

UE_grade2 = 0.5*Grade1.2 + 0.5*Grade2.2

Part I - Data Engineering
Individual examination grade: EX (based on written or oral evaluation)
Grade1.2 = EX
Part II - Machine Learning
Individual examination grade: EX (based on written or oral evaluation)
Grade2.2 = EX

Calendar

The course exists in the following branches:

  • Curriculum - Master 1 GI SIE program - Semester 7
  • Curriculum - Master 1 GI GID program - Semester 7
see the course schedule for 2025-2026

Additional Information

Course ID: WGUS2092
Course language(s): FR

Bibliography

Elff (2020), Data Management in R, SAGE Publications.
Nicholas J. Horton and Ken Kleinman (2016), Using R and RStudio for Data Management, Statistical Analysis, and Graphics, second edition.
J.H. McDonald (2009), Handbook of Biological Statistics, Sparky House Publishing.
I.H. Witten and E. Frank (2005), Data Mining: Practical Machine Learning Tools and Techniques, Elsevier.
Stéphane Tufféry (2005), Data mining et statistique décisionnelle: l'intelligence dans les bases de données, Éd. Technip.
Cornillon et al. (2008), Statistiques avec R, Presses Universitaires de Rennes.
Gaël Millot (2011), Comprendre et réaliser les tests statistiques à l'aide de R, 2nd edition, Éditions De Boeck, 767 pages.
Hill, Griffiths and Lim (2011), Principles of Econometrics, fourth edition.