Number of hours
- Lectures 15.0
- Projects -
- Tutorials 15.0
- Internship -
- Laboratory works -
- Written tests 2.0
To understand the challenges of data analysis
To be able to structure the information for an adapted analysis
To be able to choose an analysis methodology adapted to the case study
To be able to implement a professional analysis on concrete data sets
To be able to interpret, understand and produce statistical results
To understand the limits of these approaches and to consider alternatives and extensions
The course is structured around case studies, which are treated with a rigorous scientific approach and reveal different facets of data analysis in an industrial engineering context.
The course addresses different forms of data analysis:
- Data mining (e.g., analysis of variance, principal component analysis)
- Data segmentation (e.g., clustering, decision rules)
- Supervised learning (e.g., regression, classification, survival analysis)
In doing so, we will develop different aspects that are essential for a good analysis:
- structuring and manipulating the information contained in multidimensional data for a suitable analysis, including the handling of errors and missing data
- validation of the results obtained: validation method, indicators used, interpretation of results
- understanding the limitations of these approaches and their alternatives
Most of the course consists of practical exercises and case studies using R and RStudio.
- Statistics (descriptive and summary statistics; estimation by the method of moments and maximum likelihood; confidence intervals; tests of means and proportions)
- Data manipulation (2A)
- Introduction to R software
CC: continuous assessment (case studies, to be completed alone or in a group)
EX: individual examination at the end of the course
UE: final grade for the course
The jury may decide to allow students to progress to the next year, subject to deferred validation of this UE. This decision is exceptional; the jury has sole authority over each student's case.
UE = 0.5*CC + 0.5*EX
This weighting is compatible with remote teaching and remote examinations.
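The formula UE = 0.5*CC + 0.5*EX can be checked with a short worked example. The sketch below is only an illustration: the function name `final_grade` and the marks are hypothetical, and both marks are assumed to be on the same scale.

```python
def final_grade(cc: float, ex: float) -> float:
    """Return the final grade UE = 0.5*CC + 0.5*EX."""
    return 0.5 * cc + 0.5 * ex


# Hypothetical marks, assumed to share one scale (not taken from the course).
cc_mark = 12.0  # continuous assessment (CC)
ex_mark = 15.0  # individual examination (EX)

print(final_grade(cc_mark, ex_mark))  # 13.5
```

Because the two weights are equal, the final grade is simply the average of the continuous assessment and the examination.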
The course exists in the following branches:
- Curriculum - Engineer student Master SCM - Semester 8
- Curriculum - Engineer student Master PD - Semester 8
Course ID: 4GUL10A5
I.H. Witten and E. Frank, (2005), Data Mining – Practical Machine Learning Tools and Techniques, Elsevier.
Stéphane Tufféry, (2005), Datamining et statistique Décisionnelle – L’intelligence dans les bases de données, Ed. Technip.
Cornillon et al., (2008), Statistiques avec R, Presses Universitaires de Rennes.
Gaël Millot, (2011), Comprendre et réaliser les tests statistiques à l'aide de R, 2ème édition, Editions De Boeck, 767 pages
J.H. McDonald, (2009), Handbook of Biological Statistics, Sparky House Publishing.