DSBA/ITCS 6162/8162

Knowledge Discovery in Databases - KDD


Prerequisites: ITCS 6160, full graduate standing, or consent of the department.
Textbook (not required): "Introduction to Data Mining", by Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Addison-Wesley.


Course Syllabus



Lectures (Class Videos available on LINK)

August 24

[1] Data Preprocessing

[2] Classification Trees, PDF

[3] Classification Trees (Video by L. Powell)

[4] Association Rules, PDF

[5] Association Rules (Video by L. Powell)

August 31
[1] LERS, PDF

[2] Granular Computing, PDF

[3] Reducts and Discretization, PDF

[4] Reducts (Video by L. Powell)

[5] Discretization (Video by L. Powell)

September 7
[1] Sample Problems

[2] Rough Set Exploration System (RSES)

September 14
[1] Mining Incomplete Data

[2] Support Vector Machine

[3] Bratko's ORANGE or WEKA

[4] Sample Problems

September 21
[1] Action Rules and Meta-Actions

[2] Action Rules Extraction Using Action Reducts

September 28
[1] Chase Algorithms

[2] Data Sanitization

[3] Exercise

October 5
[1] Clustering Methods

[2] Clustering (Video)

[3] TV Trees

[4] Clustering - Sample problems with solutions

October 19

[1] Clustering - Sample problems

[2] Sample Problems (Midterm Exam), Part I

October 26
1:00pm-2:30pm (Lecture by Zbyszek Ras)
[1] Sample Problems (Midterm Exam), Part II & Project

2:45-3:45pm (Lecture by Sapna Pareek)
[2] LISp-Miner

November 2
Midterm Exam (on CANVAS)

November 9
[1] Solutions to Midterm Exam
[2] Project Questions & Answering Session

November 16
[2] Evaluation Methods

[3] Distributed Data and Big Data

[4] Sample Problems

November 23
(NO CLASS - VIDEOS ARE AVAILABLE IN A CLASS FOLDER ON GOOGLE DRIVE)
[1] Health Analytics Procedure Graph

[2] Business Analytics

November 30
[1] Art Analytics I

[2] Art Analytics II (Music)

December 7
Sample Problems (Final Exam)


December 14
Final Exam (on CANVAS)


Project
Project and LISp-Miner
Upload the project report and the dataset you created to Canvas, or send them to
Sapna Pareek at spareek@uncc.edu,
no later than Friday, December 3, 2021.
Project Rubric


Midterm (on Canvas): November 2
Final (on Canvas): December 14
Points: Midterm - 30 points, Final - 30 points, Project - 40 points

Grades: A [90-100], B [80-89], C [65-79].


Class Location - https://uncc.zoom.us/j/94554407742
Meeting Time: Tuesday, 1:00pm-3:45pm



Instructor: Zbigniew W. Ras

Office: Woodward Hall 430C
Telephone: 704-687-8574
e-mail: ras@uncc.edu
Office Hours on ZOOM: https://uncc.zoom.us/j/98763508031
Tuesday: 11:30am-12:45pm


GTA: Sapna Pareek

Office Hours on ZOOM:
Monday: 5:00-7:00pm, Friday: 2:30-4:30pm
e-mail: spareek@uncc.edu

[1] LISp-Miner (by Jan Rauch)

[2] LISp-Miner Manual (by Jan Rauch's student)

[3] Rough Set Exploration System (RSES)

[4] Software for data mining

[5] Repository of large datasets

[6] LERS vs ERID

[7] Extracting Rules from Incomplete Table

[8] Lance & Williams Distance

[9] Sample Problems for Midterm Exam