Génie industriel - Rubrique Formation - 2022

Data Engineering techniques and responsibilities - 4GMC1411

  • Number of hours

    • Lectures 9.0
    • Projects -
    • Tutorials 6.0
    • Internship -
    • Laboratory works -
    • Written tests 1.0


    ECTS 1.5


Students will learn about data preparation methods for machine learning, knowledge engineering and text mining, and how to integrate them into data science projects.
Students will know how to manage, sort, and organize their data efficiently. They will be able to present relevant visualizations of their data and results. They will have adopted the practices of a responsible and ethical data engineer.




B0 Introduction: Data Science Project Management
Steering data science projects, based on CRISP-DM
B1 Data handling & Data Engineer responsibilities (ethics, security, etc.)
B1.1 Technical data management
Data format, variable formats; basic operations (reads, writes; sorts; selections, projections, filters; merges)
B1.2 Technical management of results (visualization)
Types of graphics, principles of good visualization
Make and discuss technical choices and representations
B1.3 Societal management
Legal aspects (GDPR), sustainability aspects (risks to people [customers and staff] as well as environmental costs), security (who holds the data, spying...).
B1.4 Implementation: Micro-project
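
The basic operations listed under B1.1 (reads, writes, sorts, selections, projections, filters, merges) can be illustrated with a minimal sketch. The course itself is taught with R; the Python standard-library version below is only a language-neutral illustration, and the table contents, column names, and the threshold of 10 are hypothetical.

```python
import csv
import io

# Hypothetical sample data, inlined so the sketch is self-contained.
ORDERS_CSV = """order_id,customer_id,amount
1,c2,40.0
2,c1,15.5
3,c2,7.25
"""
CUSTOMERS = {"c1": "Alice", "c2": "Bob"}  # hypothetical lookup table

# Read: parse CSV text into dict rows and fix the variable format of "amount".
rows = list(csv.DictReader(io.StringIO(ORDERS_CSV)))
for row in rows:
    row["amount"] = float(row["amount"])

# Filter (selection of rows): keep orders with an amount of at least 10.
large = [r for r in rows if r["amount"] >= 10]

# Sort: order the remaining rows by amount, largest first.
large.sort(key=lambda r: r["amount"], reverse=True)

# Projection (selection of columns) combined with a merge on customer_id.
result = [
    {
        "order_id": r["order_id"],
        "customer": CUSTOMERS[r["customer_id"]],
        "amount": r["amount"],
    }
    for r in large
]

# Write: serialise the result back to CSV text.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=["order_id", "customer", "amount"], lineterminator="\n"
)
writer.writeheader()
writer.writerows(result)
csv_text = out.getvalue()
```

Keeping each operation as a separate step makes the technical choices easy to discuss one at a time, which is the kind of justification B1.2 asks for.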


Students must have taken and passed the following courses: Probability and Statistics; Programming with R


This weighting is compatible with the organization of distance learning courses and exams

Continuous assessment marks (at least 2 practical work (TP) marks: TP1 and TP2)
An individual examination grade: E1

Grade = 0.4*((TP1+TP2)/2) + 0.6*E1
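
As a worked example of the weighting above, the following sketch computes the grade for one student. The marks used (14, 16, and 12) are hypothetical and chosen only to show the arithmetic.

```python
def final_grade(tp1: float, tp2: float, e1: float) -> float:
    """Apply Grade = 0.4*((TP1+TP2)/2) + 0.6*E1."""
    continuous = (tp1 + tp2) / 2  # mean of the two TP marks
    return 0.4 * continuous + 0.6 * e1


# Hypothetical marks: TP1 = 14, TP2 = 16, E1 = 12.
grade = final_grade(tp1=14, tp2=16, e1=12)
print(round(grade, 2))
```

With these marks the continuous assessment mean is 15, so the grade is 0.4*15 + 0.6*12, which is about 13.2.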

This weighting is compatible with the organization of distance learning courses and exams

Continuous assessment mark: TP
Individual examination mark: EX
Grade = 0.4*TP + 0.6*EX


The course exists in the following branches:

  • Curriculum - Engineer student Master SCM - Semester 7
  • Curriculum - Engineer student Master PD - Semester 7

Additional Information

Course ID: 4GMC1411
Course language(s): FR



Elff (2020), Data Management in R, SAGE Publications
Nicholas J. Horton and Ken Kleinman (2016), Using R and RStudio for Data Management, Statistical Analysis, and Graphics (second edition)