Génie industriel - Rubrique Formation - 2022

UE Data analytics for industrial engineering - WGUS2092

  • Number of hours

    • Lectures 17.25
    • Projects -
    • Tutorials 12.75
    • Internship -
    • Laboratory works -
    • Written tests 1.0

  • ECTS

    • ECTS 3.0

Goal(s)

Part I
Students will learn about data preparation methods for machine learning, knowledge engineering and text mining, and how to integrate them into data science projects.
Students will know how to manage, sort, and organize their data efficiently. They will be able to present relevant visualizations of their data and results. They will have adopted the practices of a responsible and ethical data engineer.

Part II
The course introduces the first tools for processing quantitative and qualitative data using machine learning.
The analysis methods covered support automatic classification, construction of predictive models, evaluation of method performance, and diagnosis of the limits of their application.

Responsible(s)

Pierre LEMAIRE, Iragael JOLY

Content(s)

Part I
.0 Introduction: Data Science Project Management
Steering data science projects, based on CRISP-DM
.1 Data handling & Data Engineer responsibilities (ethics, security, etc.)
.1.1 Technical data management
Data format, variable formats; basic operations (reads, writes; sorts; selections, projections, filters; merges)
.1.2 Technical management of results (visualization)
Types of graphics, principles of good visualization
Make and discuss technical choices and representations
.1.3 Societal management
Legal aspects (GDPR), sustainability aspects (risks to people [customers and staff] as well as environmental costs), security (who holds the data, espionage, etc.).
.1.4 Implementation: Micro-project
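
The basic operations listed under technical data management (reads, writes, sorts, selections, projections, filters, merges) can be illustrated with a minimal sketch. It uses only the Python standard library; the two tables, their column names, and the helper functions `read_table` and `write_table` are hypothetical and invented for illustration. They are not course material.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
ORDERS_CSV = """order_id,machine_id,quantity
1,M2,40
2,M1,15
3,M2,5
"""
MACHINES_CSV = """machine_id,workshop
M1,Assembly
M2,Painting
"""


def read_table(text):
    """Read CSV text into a list of dictionaries (one per row)."""
    return list(csv.DictReader(io.StringIO(text)))


def write_table(rows):
    """Write a non-empty list of dictionaries back to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


orders = read_table(ORDERS_CSV)
machines = read_table(MACHINES_CSV)

# Variable formats: CSV values are read as strings, so convert explicitly.
for row in orders:
    row["quantity"] = int(row["quantity"])

# Filter (selection of rows): keep orders of at least 10 units.
large_orders = [row for row in orders if row["quantity"] >= 10]

# Sort: largest quantity first.
large_orders.sort(key=lambda row: row["quantity"], reverse=True)

# Merge (inner join on machine_id): attach the workshop to each order.
workshop_by_machine = {m["machine_id"]: m["workshop"] for m in machines}
merged = [
    {**row, "workshop": workshop_by_machine[row["machine_id"]]}
    for row in large_orders
    if row["machine_id"] in workshop_by_machine
]

# Projection (selection of columns): keep only two variables.
result = [{"order_id": r["order_id"], "workshop": r["workshop"]} for r in merged]

print(write_table(result))
```

In practice a data-frame library would likely replace these hand-written steps; the sketch only shows what each operation does to the rows and columns.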

Part II
.1 Issues in machine learning, supervised machine learning (regression, classification)
Supervised vs. unsupervised methods, with a quick presentation of some unsupervised methods (k-means, dendrograms).
.2 Regression and classification methods: linear regression and logistic regression; models, algorithms and resolutions
.3 Internal evaluation of regression and classification: Errors, residuals and prediction evaluation
.4 External Evaluation: Statistical Assumptions and Model Evaluation
.5 Implementation on different databases
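
As a rough illustration of items .2 and .3 (linear regression, then errors, residuals and prediction evaluation), the sketch below fits a one-variable linear regression by ordinary least squares and computes in-sample residuals, RMSE and R². The data values and the function names `fit_simple_linear_regression` and `evaluate` are assumptions made for this example; the course may use other tools and other metrics.

```python
from statistics import fmean

# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]


def fit_simple_linear_regression(x, y):
    """Ordinary least squares for the model y = intercept + slope * x."""
    x_mean, y_mean = fmean(x), fmean(y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


def evaluate(x, y, intercept, slope):
    """Return in-sample residuals, root mean squared error and R^2."""
    predictions = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predictions)]
    ss_res = sum(r ** 2 for r in residuals)
    rmse = (ss_res / len(residuals)) ** 0.5
    y_mean = fmean(y)
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return residuals, rmse, r_squared


intercept, slope = fit_simple_linear_regression(x, y)
residuals, rmse, r_squared = evaluate(x, y, intercept, slope)

print(f"intercept={intercept:.3f} slope={slope:.3f}")
print(f"rmse={rmse:.3f} r_squared={r_squared:.4f}")
```

These measures are computed on the same data used for fitting, so they describe goodness of fit only; judging predictive performance would require held-out data.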

Prerequisites

Students must have taken and passed the following courses: Probability and Statistics; Programming with R and Python.

Test

This weighting is compatible with the organization of distance learning courses and exams

Part I
Continuous assessment grade: TP (practical work, based on at least 2 grades, TP1 and TP2)
Individual examination grade: EX
Grade = 0.4*TP + 0.6*EX

Part II
Continuous assessment grade: TP (practical work, based on at least 2 grades, TP1 and TP2)
Individual examination grade: EX
Grade = 0.4*TP + 0.6*EX

Continuous assessment grades (at least 2 practical-work grades: TP1 and TP2)
One individual examination grade: E1

Grade = 0.4*((TP1+TP2)/2) + 0.6*E1
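
The weighting above can be checked with a short sketch. The function name `course_grade` and the sample grades are hypothetical. Averaging all practical-work grades when there are more than two is an assumption, since the formula above is only stated for TP1 and TP2.

```python
def course_grade(tp_grades, exam_grade):
    """Combine practical-work grades and the exam grade with 0.4 / 0.6 weights.

    Assumption: the continuous assessment grade is the plain average of all
    practical-work grades, which matches (TP1 + TP2) / 2 when there are two.
    """
    if len(tp_grades) < 2:
        raise ValueError("at least 2 practical-work grades are required")
    tp_average = sum(tp_grades) / len(tp_grades)
    return 0.4 * tp_average + 0.6 * exam_grade


# Hypothetical grades, invented for illustration only.
grade = course_grade([12.0, 16.0], 10.0)
print(round(grade, 2))
```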

Calendar

The course exists in the following branches:

  • Curriculum - - Semester 7
  • Curriculum - Master 1 GI SIE program - Semester 7
see the course schedule for 2023-2024

Additional Information

Course ID: WGUS2092
Course language(s): FR


Bibliography

Elff, (2020), Data Management in R, SAGE Publications.
Nicholas J. Horton and Ken Kleinman, (2016), Using R and RStudio for Data Management, Statistical Analysis, and Graphics, second edition.
J.H. McDonald, (2009), Handbook of Biological Statistics, Sparky House Publishing.
I.H. Witten and E. Frank, (2005), Data Mining: Practical Machine Learning Tools and Techniques, Elsevier.
Stéphane Tufféry, (2005), Data mining et statistique décisionnelle : l'intelligence dans les bases de données, Ed. Technip.
Cornillon et al., (2008), Statistiques avec R, Presses Universitaires de Rennes.
Gaël Millot, (2011), Comprendre et réaliser les tests statistiques à l'aide de R, 2nd edition, Editions De Boeck, 767 pages.
Hill, Griffiths and Lim, (2011), Principles of Econometrics, fourth edition.