Course detail

Knowledge Discovery in Databases

FIT-ZZN, Acad. year: 2023/2024

Data warehouses. Data mining techniques: association rules, classification and prediction, clustering. Mining unconventional data: data streams, time series and sequences, graphs, spatial and spatio-temporal data, multimedia. Text and web mining. Working out a data mining project by means of an available data mining tool.

Language of instruction

Czech

Number of ECTS credits

5

Mode of study

Not applicable.

Entry knowledge

  • Knowledge of the basic steps of the data mining process and of the methods of data preparation for the data modelling step (covered in the course UPA - Data Storage and Preparation).
  • Basic knowledge of probability and statistics.
  • Knowledge of database technology at the level of a bachelor's course.

Rules for evaluation and completion of the course

  • A mid-term test - 15 points
  • Formulation of a data mining task - 5 points
  • Presentation of the project - 29 points
  • Final exam - 51 points
  • To be admitted to the written examination, a student must present and defend the project outcomes by the due dates and earn at least 24 points during the semester.
  • The minimum number of points for the final examination is 20.

  • Mid-term written exam: there is no resit; excused absences are handled by the guarantor.
  • Formulation of the data mining task in the prescribed term: excused absences are handled by the assistant.
  • Presentation of the project results in the prescribed term: excused absences are handled by the assistant.
  • Final exam: at least 20 points must be obtained from the final exam; otherwise, no points are assigned to the student. Excused absences are handled by the guarantor.
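The point rules above can be sketched as a short calculation. This is only an illustrative sketch, not an official grading tool. The names (Scores, semester_points, admitted_to_exam, total_points) are hypothetical. It assumes that "points during the semester" means the sum of the mid-term test, the task formulation and the project presentation, and that a student who is not admitted to the exam keeps only the semester points.

```python
from dataclasses import dataclass

# Thresholds taken from the evaluation rules above.
MIN_SEMESTER = 24  # minimum semester points needed to sit the final exam
MIN_EXAM = 20      # final exam scores below this are counted as 0


@dataclass
class Scores:
    midterm: float          # mid-term test, at most 15 points
    task: float             # formulation of the data mining task, at most 5 points
    project: float          # presentation of the project, at most 29 points
    project_defended: bool  # project presented and defended by the due dates
    exam: float = 0.0       # raw final exam score, at most 51 points


def semester_points(s: Scores) -> float:
    # Assumption: all three in-semester components count toward the 24-point limit.
    return s.midterm + s.task + s.project


def admitted_to_exam(s: Scores) -> bool:
    return s.project_defended and semester_points(s) >= MIN_SEMESTER


def total_points(s: Scores) -> float:
    total = semester_points(s)
    # Exam points are added only if the student was admitted and reached the minimum.
    if admitted_to_exam(s) and s.exam >= MIN_EXAM:
        total += s.exam
    return total


print(total_points(Scores(10, 4, 20, True, 30)))  # 64
```

In this sketch, an exam score of 19 with the same semester results would give a total of 34, because the exam contributes no points below the 20-point minimum.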

Aims

To familiarize students with the methods and algorithms of data modelling used for knowledge discovery from data.
  • Students get a broad, yet in-depth overview of the field of data mining and knowledge discovery.
  • They are able both to use and to develop knowledge discovery tools.
  • Students learn the terminology in Czech and English.
  • Students gain experience in solving projects in a small team.
  • Students improve their ability to present and defend the results of projects.

Study aids

Not applicable.

Prerequisites and corequisites

Not applicable.

Basic literature

Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Morgan Kaufmann Publishers, 2012, 703 p., ISBN 978-0-12-381479-1.
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p., ISBN 1-55860-901-3.

Recommended reading

Bishop, C.M: Pattern Recognition and Machine Learning. Springer, 2006, 738 p. ISBN 0387310738.
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Second Edition. Elsevier Inc., 2006, 770 p., ISBN 1-55860-901-3.
Han, J., Kamber, M.: Data Mining: Concepts and Techniques. Third Edition. Morgan Kaufmann Publishers, 2012, 703 p., ISBN 978-0-12-381479-1.
Skiena, S.S.: The Data Science Design Manual. Springer, 2017, 445 p. ISBN 978-3-319-55443-3.
Zendulka, J. et al.: Získávání znalostí z databází. FIT VUT v Brně, 2009, 160 p. (electronic, in Czech)

Classification of course in study plans

  • Programme IT-MSC-2 Master's

    branch MBS, 0 year of study, winter semester, compulsory-optional
    branch MPV, 0 year of study, winter semester, compulsory-optional
    branch MIS, 2 year of study, winter semester, compulsory-optional
    branch MSK, 2 year of study, winter semester, compulsory-optional
    branch MIN, 2 year of study, winter semester, compulsory
    branch MGM, 2 year of study, winter semester, elective
    branch MBI, 2 year of study, winter semester, compulsory
    branch MMM, 0 year of study, winter semester, elective

  • Programme MITAI Master's

    specialization NSPE, 0 year of study, winter semester, elective
    specialization NBIO, 0 year of study, winter semester, compulsory
    specialization NSEN, 0 year of study, winter semester, elective
    specialization NVIZ, 0 year of study, winter semester, elective
    specialization NGRI, 0 year of study, winter semester, elective
    specialization NADE, 0 year of study, winter semester, elective
    specialization NISD, 2 year of study, winter semester, compulsory
    specialization NMAT, 0 year of study, winter semester, elective
    specialization NSEC, 0 year of study, winter semester, elective
    specialization NISY up to 2020/21, 0 year of study, winter semester, compulsory
    specialization NCPS, 0 year of study, winter semester, elective
    specialization NHPC, 0 year of study, winter semester, elective
    specialization NNET, 0 year of study, winter semester, elective
    specialization NMAL, 0 year of study, winter semester, elective
    specialization NVER, 0 year of study, winter semester, elective
    specialization NIDE, 0 year of study, winter semester, elective
    specialization NEMB, 0 year of study, winter semester, elective
    specialization NISY, 0 year of study, winter semester, compulsory

  • Programme RRTES_P Master's

    specialization RRTS, 2 year of study, winter semester, compulsory-optional

  • Programme MITAI Master's

    specialization NEMB up to 2021/22, 0 year of study, winter semester, elective

Type of course unit

 

Lecture

39 hours, optional

Teacher / Lecturer

Syllabus

  1. Data Warehouse and OLAP Technology for knowledge discovery.
  2. Mining frequent patterns and associations - basic concepts, efficient and scalable frequent itemset mining methods.
  3. Multi-level association rules, association mining and correlation analysis, constraint-based association rules.
  4. Predictive modelling - basic concepts, classification methods - decision tree, Bayesian classification, rule-based classification.
  5. Classification by means of neural networks, SVM classifier, Random forests.
  6. Other classification and regression methods. Evaluation of quality of classification and regression. Cluster analysis - basic concepts, types of data in cluster analysis.
  7. Partitioning-based and hierarchical clustering. Other clustering methods. Evaluation of quality of clustering.
  8. Outlier analysis. Mining in biological data.
  9. Introduction to mining data streams and time series.
  10. Introduction to mining in sequences, graphs, spatio-temporal data, moving object data and multimedia data.
  11. Text mining.
  12. Mining the Web. Process mining.
  13. Other selected topics (Process Mining, Recommender Systems, Big data analytics).

Project

13 hours, compulsory

Teacher / Lecturer

Syllabus

  • Working out a data mining project by means of an available data mining tool.
