BDA 552 Model Building and Validation
MEF University
Big Data Analytics (English) (Non-Thesis)
Master Length of the Programme: 1.5 Number of Credits: 90 TR-NQF-HE: Level 7 QF-EHEA: Second Cycle EQF: Level 7

General Course Information

School/Faculty/Institute Graduate School of Science and Engineering
Course Code BDA 552
Course Title in English Model Building and Validation
Course Title in Turkish Model Kurma ve Doğrulama
Language of Instruction EN
Type of Course Flipped Classroom
Level of Course Intermediate
Semester Summer School
Contact Hours per Week
Lecture: 3 Recitation: Lab: Other:
Estimated Student Workload 171 hours per semester
Number of Credits 7.5 ECTS
Grading Mode Standard Letter Grade
Pre-requisites None
Expected Prior Knowledge None
Co-requisites None
Registration Restrictions Only Graduate Students
Overall Educational Objective
Course Description This course teaches students how to start from scratch in answering questions about the real world using data. The model building process involves setting up ways of collecting data, understanding and paying attention to what is important in the data to answer relevant business questions, and finding a statistical, mathematical, or simulation model to gain understanding and make predictions. This process involves asking questions, gathering and manipulating data, building models, and ultimately testing and evaluating them. R is the primary programming tool for reading datasets and for model building and validation.
Course Description in Turkish (translated) This course will teach students how to start from scratch in answering questions about the real world using data. The model building process involves setting up ways of collecting data, understanding and paying attention to what is important in the data to answer relevant business questions, and finding a statistical, mathematical, or simulation model to gain understanding and make predictions.

Course Learning Outcomes and Competences

Upon successful completion of the course, the learner is expected to be able to:
1) Understand the QMV process
2) Investigate the questioning phase
3) Build models in the modeling phase
4) Understand the validation phase
Program Learning Outcomes/Course Learning Outcomes 1 2 3 4
1) Building on the skills acquired during the undergraduate degree, an improved and deepened level of expertise in the field of big data analytics related to machine learning.
2) Applied in-depth theoretical and practical knowledge in the fields of statistics, computation and computer science related to machine learning.
3) Extensive knowledge about the analysis and modeling methods used in machine learning and their limitations.
4) Ability to design and perform exploratory research based on analytics, modeling and experimentation; to generate solutions to complex situations encountered in this process and to interpret the results.
5) Ability to describe the analytics process and its results both verbally and in writing on national and international platforms within or outside of the field of machine learning.
6) Awareness of social, scientific and ethical values regarding the machine learning, processing, usage, interpretation and dissemination stages and in all related professional activities.
7) Professional awareness of new and emerging applications in the machine learning field and an ability to demonstrate their uses.
8) Competence to act as a leader in multi-disciplinary teams, to develop big data-driven solutions to complex situations; to take responsibility.
9) Ability to communicate in English both verbally and in writing at European Language Portfolio General Level B2.
10) Understanding of social and environmental aspects of machine learning applications.

Relation to Program Outcomes and Competences

N None S Supportive H Highly Related
     
Program Outcomes and Competences Level Assessed by
1) Building on the skills acquired during the undergraduate degree, an improved and deepened level of expertise in the field of big data analytics related to machine learning. N
2) Applied in-depth theoretical and practical knowledge in the fields of statistics, computation and computer science related to machine learning. N
3) Extensive knowledge about the analysis and modeling methods used in machine learning and their limitations. N
4) Ability to design and perform exploratory research based on analytics, modeling and experimentation; to generate solutions to complex situations encountered in this process and to interpret the results. N
5) Ability to describe the analytics process and its results both verbally and in writing on national and international platforms within or outside of the field of machine learning. N
6) Awareness of social, scientific and ethical values regarding the machine learning, processing, usage, interpretation and dissemination stages and in all related professional activities. N
7) Professional awareness of new and emerging applications in the machine learning field and an ability to demonstrate their uses. N
8) Competence to act as a leader in multi-disciplinary teams, to develop big data-driven solutions to complex situations; to take responsibility. N
9) Ability to communicate in English both verbally and in writing at European Language Portfolio General Level B2. N
10) Understanding of social and environmental aspects of machine learning applications. N
Prepared by and Date ,
Course Coordinator ÖZGÜR ÖZLÜK
Semester Summer School
Name of Instructor Asst. Prof. Dr. ŞİRİN ÖZLEM

Course Contents

Week Subject
1) Basic concepts (a simple example with basic EDA and model validation)
2) Basic concepts (a simple example with basic EDA and model validation)
3) Building a Linear Regression model with Python (a more detailed example with EDA, feature engineering and cross-validation)
4) Building a Linear Regression model with Python (a more detailed example with EDA, feature engineering and cross-validation)
5) Building a Tree Based Regression model with Python (an example with model tuning, model evaluation, cross-validating time series data)
6) Building a Tree Based Regression model with Python (an example with model tuning, model evaluation, cross-validating time series data)
7) Building a Boosted Tree Classifier model with Python (an example with roc_auc, metrics, confusion matrix)
8) Building a Boosted Tree Classifier model with Python (an example with roc_auc, metrics, confusion matrix)
9) Building an MLP model with Python (a basic implementation example with neural networks)
10) Building an MLP model with Python (a basic implementation example with neural networks)
11) Model Evaluation, Interpretation and Explainability (an example with feature importances, SHAP, comparison of different model outputs)
12) Model Evaluation, Interpretation and Explainability (an example with feature importances, SHAP, comparison of different model outputs)
13) Building an ensemble of models (diversity check, ensembles, stacks)
14) Building an ensemble of models (diversity check, ensembles, stacks)
15) Final Examination Period
16) Final Examination Period
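
The cross-validation step named in weeks 3 and 4 can be illustrated with a small sketch. This is not course material: it is a minimal pure-Python example, assuming a single-feature least-squares line and synthetic data invented for the illustration. The helper names `fit_line` and `k_fold_rmse` are hypothetical, and in practice a library such as scikit-learn would normally be used instead.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one feature)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return the RMSE on each held-out fold, fitting on the other k-1 folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so folds are disjoint.
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(mean(squared_errors) ** 0.5)
    return scores


# Hypothetical synthetic data: y = 2 + 3x plus Gaussian noise (sd = 1).
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

fold_scores = k_fold_rmse(xs, ys, k=5)
print("RMSE per fold:", [round(s, 3) for s in fold_scores])
print("Mean RMSE:", round(mean(fold_scores), 3))
```

Because each point is scored only by a model that never saw it during fitting, the mean RMSE across folds estimates out-of-sample error rather than training error. For the time series data mentioned in weeks 5 and 6, a shuffled split like this one would leak future information, so an ordered split would be needed instead.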
Required/Recommended Readings
● James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning (Vol. 6). New York: Springer. The free PDF of the book, programs, and datasets are available at http://www-bcf.usc.edu/~gareth/ISL/
● Stanford University Professor Andrew Ng's Class Notes: http://www.holehouse.org/mlclass/
● Udacity Online Course: Model Building and Validation https://classroom.udacity.com/courses/ud919
Teaching Methods The lectures will be formatted as readings and PowerPoint presentations; these will be available on Blackboard or Google Drive.
Homework and Projects Homework and Projects
Laboratory Work None
Computer Use Required
Other Activities None
Assessment Methods
Assessment Tools Count Weight
Quiz(zes) 1 % 25
Homework Assignments 1 % 25
Project 1 % 50
TOTAL % 100
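
As a rough illustration of how the weights above might combine into a single course score, the sketch below computes a weighted average. The weights come from the table; the individual scores are hypothetical values invented for the example, and the actual conversion to a letter grade is not specified here.

```python
# Weights (percent) taken from the assessment table above.
weights = {"Quiz": 25, "Homework Assignments": 25, "Project": 50}

# Hypothetical scores out of 100, invented only for this illustration.
scores = {"Quiz": 80, "Homework Assignments": 90, "Project": 70}

# The weights should account for the whole grade.
assert sum(weights.values()) == 100

# Weighted average: each score contributes in proportion to its weight.
final_score = sum(weights[name] * scores[name] for name in weights) / 100
print(final_score)  # 77.5
```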
Course Administration
02123953600

ECTS Student Workload Estimation

Activity No/Weeks Hours Calculation
No/Weeks per Semester Preparing for the Activity Spent in the Activity Itself Completing the Activity Requirements
Course Hours 14 2 1.5 49
Laboratory 14 2 1.5 49
Project 1 30 3 33
Homework Assignments 9 2 4 54
Total Workload 185
Total Workload/25 7.4
ECTS 7.5
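
The arithmetic in the workload table can be checked with a short sketch. It assumes that each row total is the number of occurrences multiplied by the sum of the two per-occurrence hour figures listed in that row, which reproduces every total in the table. Rounding to the nearest half credit is also an assumption, chosen because it maps 7.4 to the stated 7.5 ECTS.

```python
# Rows from the workload table: (activity, occurrences, hours figure 1, hours figure 2).
rows = [
    ("Course Hours", 14, 2, 1.5),
    ("Laboratory", 14, 2, 1.5),
    ("Project", 1, 30, 3),
    ("Homework Assignments", 9, 2, 4),
]

# Assumed rule: total hours = occurrences * (sum of per-occurrence hours).
totals = {name: count * (h1 + h2) for name, count, h1, h2 in rows}
total_workload = sum(totals.values())

# The table divides the total workload by 25 hours per credit.
raw_ects = total_workload / 25

# Assumption: credits are rounded to the nearest 0.5.
ects = round(raw_ects * 2) / 2

for name, hours in totals.items():
    print(f"{name}: {hours:g} hours")
print(f"Total workload: {total_workload:g} hours")
print(f"Total workload / 25: {raw_ects:g}")
print(f"ECTS (rounded to nearest 0.5): {ects:g}")
```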