BDA 505 Big Data Management
MEF University
Big Data Analytics (English) (Non-Thesis)
Master; Length of the Programme: 1.5; Number of Credits: 90; TR-NQF-HE: Level 7; QF-EHEA: Second Cycle; EQF: Level 7

General Course Information

School/Faculty/Institute Graduate School of Science and Engineering
Course Code BDA 505
Course Title in English Big Data Management
Course Title in Turkish Büyük Veri Yönetimi
Language of Instruction EN
Type of Course Flipped Classroom
Level of Course Introductory
Semester Fall
Contact Hours per Week
Lecture: 3 Recitation: Lab: Other:
Estimated Student Workload 174 hours per semester
Number of Credits 7.5 ECTS
Grading Mode Standard Letter Grade
Pre-requisites None
Expected Prior Knowledge Querying on RDBMS systems
Co-requisites None
Registration Restrictions Only Graduate Students
Overall Educational Objective To learn the basics of designing relational database and big data systems.
Course Description Big data analysis is one of the most in-demand fields today, but it also poses significant challenges. This course covers how to query RDBMS and big data ecosystem products, how to design modern data warehouses, and how to manage massively parallel processing (MPP) data warehouse technologies on cloud platforms. We will start with a few questions: What are the problems of the data world? Which technologies address them? Why does each technology exist, and why do I need it? How can I get the best out of it using something familiar like SQL? How can I design and query RDBMS systems, Hadoop ecosystem products such as Pig Latin, Hive, and Spark, and MPP products such as Azure SQL DW, AWS Redshift, Azure Stream Analytics, and Big Data Lake Analytics?
Course Description in Turkish (translated) This course is designed as an introduction to big data analysis. It aims to teach the design and use of big data structures, which are steadily gaining importance today and, at the same time, confront us as a problem. We will begin the course with a few questions: What are the real-world problems of the data world, and which technologies exist to address them? Which technologies, and which of their equivalents, should we use for which problem? The course covers learning the programming languages needed to design and query products in the database and big data ecosystem, installing these products, and using them to present solutions to real-world problems.

Course Learning Outcomes and Competences

Upon successful completion of the course, the learner is expected to be able to:
1) Design and query OLTP systems, selected NoSQL products, and Hadoop products.
2) Select the right product for a given problem by comparing it with alternative products.
3) Design a modern data warehouse architecture.
4) Apply these skills to real-world problems and data systems.
Program Learning Outcomes/Course Learning Outcomes 1 2 3 4
1) Building on the skills acquired during the undergraduate degree, an improved and deepened level of expertise in the field of big data analytics related to machine learning.
2) Applied in-depth theoretical and practical knowledge in the fields of statistics, computation and computer science related to machine learning.
3) Extensive knowledge about the analysis and modeling methods used in machine learning and their limitations.
4) Ability to design and perform exploratory research based on analytics, modeling and experimentation; to generate solutions to complex situations encountered in this process and to interpret the results.
5) Ability to describe the analytics process and its results both verbally and in writing on national and international platforms within or outside of the field of machine learning.
6) Awareness of social, scientific and ethical values regarding the machine learning, processing, usage, interpretation and dissemination stages and in all related professional activities.
7) Professional awareness of new and emerging applications in the machine learning field and an ability to demonstrate their uses.
8) Competence to act as a leader in multi-disciplinary teams, to develop big data-driven solutions to complex situations; to take responsibility.
9) Ability to communicate in English both verbally and in writing at European Language Portfolio General Level B2.
10) Understanding of social and environmental aspects of machine learning applications.

Relation to Program Outcomes and Competences

N: None, S: Supportive, H: Highly Related
Program Outcomes and Competences Level Assessed by
1) Building on the skills acquired during the undergraduate degree, an improved and deepened level of expertise in the field of big data analytics related to machine learning. H
2) Applied in-depth theoretical and practical knowledge in the fields of statistics, computation and computer science related to machine learning. H
3) Extensive knowledge about the analysis and modeling methods used in machine learning and their limitations. S
4) Ability to design and perform exploratory research based on analytics, modeling and experimentation; to generate solutions to complex situations encountered in this process and to interpret the results. S
5) Ability to describe the analytics process and its results both verbally and in writing on national and international platforms within or outside of the field of machine learning. H
6) Awareness of social, scientific and ethical values regarding the machine learning, processing, usage, interpretation and dissemination stages and in all related professional activities. H
7) Professional awareness of new and emerging applications in the machine learning field and an ability to demonstrate their uses. H
8) Competence to act as a leader in multi-disciplinary teams, to develop big data-driven solutions to complex situations; to take responsibility. N
9) Ability to communicate in English both verbally and in writing at European Language Portfolio General Level B2. S
10) Understanding of social and environmental aspects of machine learning applications. N
Prepared by and Date ,
Course Coordinator ÖZGÜR ÖZLÜK
Semester Fall
Name of Instructor Lecturer SERHAT ÇEVİKEL

Course Contents

Week Subject
1) 1. History of Data 1.1 Definition of OLTP Systems 1.2 Why we need data warehouse systems 1.3 Real-world problems (volume, velocity, variety, variability) 2. Vendors of Big Data 2.1 Data Transformation Products 2.2 Data Visualization Products 2.3 Data Analytics Products 2.4 Cloud-oriented MPP Systems 2.5 Volume Problem Oriented Vendors (Cloud Products & Open Source) 2.6 Velocity Problem Oriented Vendors (Cloud Products & Open Source) 2.7 Variety Problem Oriented Vendors (Cloud Products & Open Source) 2.8 Variability Problem Oriented Vendors (Cloud Products & Open Source) 2.9 Data Mining Vendors (Cloud Products & Open Source) 2.10 Automated Reporting Products
2) 2. Data Warehousing and Business Intelligence Insights 2.1 Designing and implementing a data warehouse 2.3 Developing Data Access and Transformation Layer (ETL) 2.4 Reporting Layer 2.5 Analytics Layer 2.6 Building Data Quality Solutions 2.7 Scenarios & solutions for data warehouses
3) 3. Designing & Querying on OLTP Systems 3.1 Database Design 3.2 Learning SQL Query (CRUD Operations) 3.3 Logical Query Processing 3.4 Programmable SQL Objects 3.5 Aggregates and Analysis 3.6 Query Optimization and Understanding In-Memory Tables 3.7 T-SQL for Business Intelligence Practitioners
4) 4. Designing ETL Layer 4.1 Introduction to Integration Services 4.2 Learning item list of Control Flow and Data Flow Tasks 4.3 Using Variables, Parameters and Expressions 4.4 Error and Event Handling 4.5 Data Cleansing Demos with Real World Dirty Data 4.6 ETL package monitoring and optimization 4.7 Special Design Scenarios
5) 5. SQL on Hadoop 5.1 Introduction to Hadoop Ecosystems and Azure HDInsight 5.2 Hive Architecture and Principles 5.3 Data Definition, Description and Selection using Hive Query Language 5.4 Advanced Data Analysis using Hive
6) 3. Real-Time Analytics 3.1 Definition of Real-Time Analytics 3.2 Ingesting Data into Event Hubs 3.3 Benefits and use cases of Azure Event Hub & Stream Analytics 3.4 Querying on Azure Stream Analytics 3.5 Case Study: Real-Time Social Media Analytics 3.6 Data Visualization for Streaming Data on Power BI
7) 4. Working with Unstructured Data 4.1 Understanding the rationale of Pig 4.2 Writing Evaluation and Filter Functions 4.3 Developing and Testing Pig Latin Scripts 4.4 Real World Scenarios: Analyzing TV Series
8) 5. Massively Parallel Processing Products 5.1 Introduction to MPP Systems 5.2 Amazon Data Warehouse and Amazon Redshift Integration Projects 5.3 Azure Data Warehouse Overview
9) 5. Massively Parallel Processing Products 5.4 Designing & Querying Data on MPP Systems 5.5 Scalability & elasticity for Amazon Redshift and Azure SQL Data Warehouse
10) 6. Data Visualization 6.1 Creating Visualization and Dashboard Architecture 6.2 Visual Analytics with Microsoft Power BI 6.3 In-Memory Analytics using QlikView
11) 6. Advanced Analytics 6.4 Getting started with Azure Machine Learning 6.5 Using Azure ML Studio
12) 6. Advanced Analytics 6.6 Getting Data in and out of ML Studio 6.7 AWS Machine Learning vs. Azure Machine Learning 6.8 Advanced Analytics on SQL Server 2016 using R Script
13) 8. Big Data Lake Analytics 8.1 The Need for Data Lake 8.2 ADLA complements Hadoop systems
14) 8. Big Data Lake Analytics 8.3 Using C# with U-SQL
15) Final Examination Period
16) Final Examination Period
Required/Recommended Readings None
Teaching Methods Flipped Classroom/Exercise/Laboratory/Project
Homework and Projects Students are required to complete a portfolio to be able to enter the final exam.
Laboratory Work There will be laboratory sessions.
Computer Use Required
Other Activities None
Assessment Methods
Assessment Tools Count Weight
Quiz(zes) 8 20%
Homework Assignments 1 30%
Project 1 10%
Final Examination 1 40%
TOTAL 100%
Course Administration
02123953600

ECTS Student Workload Estimation

Activity No/Weeks Hours Calculation
No/Weeks per Semester Preparing for the Activity Spent in the Activity Itself Completing the Activity Requirements
Course Hours 14 2 1.5 49
Laboratory 14 2 1.5 49
Homework Assignments 9 2 1 27
Midterm(s) 1 30 30
Final Examination 1 30 3 33
Total Workload 188
Total Workload/25 7.5
ECTS 7.5
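
The arithmetic behind the workload table above can be checked with a short sketch. It assumes that each row's total is the number of occurrences multiplied by the sum of the hour columns in that row, and that the ECTS value is the total workload divided by 25 and rounded to the nearest half credit. The rounding rule and the names ROWS, row_total, total_workload, and ects are illustrative assumptions, not part of the syllabus.

```python
# Rows copied from the workload table: (activity, occurrences, hours per occurrence).
# "Hours per occurrence" is the sum of the hour columns listed in each row.
ROWS = [
    ("Course Hours", 14, 2 + 1.5),
    ("Laboratory", 14, 2 + 1.5),
    ("Homework Assignments", 9, 2 + 1),
    ("Midterm(s)", 1, 30),
    ("Final Examination", 1, 30 + 3),
]

HOURS_PER_ECTS = 25  # divisor shown in the "Total Workload/25" row


def row_total(count, hours_each):
    """Workload of one activity: occurrences times hours per occurrence."""
    return count * hours_each


def total_workload(rows):
    """Sum of all row totals, in hours per semester."""
    return sum(row_total(count, hours) for _, count, hours in rows)


def ects(rows, hours_per_credit=HOURS_PER_ECTS):
    """Total workload converted to credits.

    Rounding to the nearest half credit is an assumed rule for illustration.
    """
    return round(total_workload(rows) / hours_per_credit * 2) / 2


if __name__ == "__main__":
    for name, count, hours in ROWS:
        print(f"{name}: {row_total(count, hours):g} hours")
    print(f"Total Workload: {total_workload(ROWS):g} hours")
    print(f"ECTS: {ects(ROWS):g}")
```

Under these assumptions the row totals are 49, 49, 27, 30, and 33 hours, which sum to the 188 hours shown in the Total Workload row and convert to 7.5 ECTS.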