With large quantities of data being collected from websites or from catalogues of previous client orders, data mining provides insights into the data which may not previously have been seen. Although it is possible to draw conclusions from small samples of data, as the collection grows and more variables are introduced, deducing even a simple insight manually becomes an impossible task. Data mining makes it possible to work with data sets of any size and to draw new perspectives from them. This course in data analytics is designed to provide the learner with the skills needed to collect data from any data source and extract useful insights which have previously been unseen, providing a unique view on the problems faced during the business decision-making process.
In this programme, the learner will become familiar with a suite of leading tools, predominantly RapidMiner, for gathering information from different sources and applying commonly used algorithms to deduce answers to common business questions from data. Data mining aids the decision-making process by informing key stakeholders using the most reliable source available: the data. This approach allows informed decisions to be made before key changes or alterations are made to any business process.
This course is aimed at learners who have no previous experience with data mining or data analysis and who wish to begin understanding how data is aggregated, cleaned and utilised in data mining processes.
This module aims to introduce the learner to the area of data mining and analytics by providing real-world examples of business questions that can be encountered in day-to-day life, and showing how they can be solved using freely available data mining software packages.
On completion of this course, the learner will have acquired the skills to:
- Assess the needs of a customer and how they can be met with one or more developed data mining solutions
- Assess and aggregate available data sources to utilise during a data mining process
- Utilise industry standard methodologies for data mining, ensuring a robust process is created
- Develop a data mining process that identifies anomalies, then cleans and extracts quality data on which to run the identified algorithms
- Run leading data mining software packages on available data to identify patterns and predict outcomes
- Document and visualise the findings to inform business decision-making processes
The programme is delivered through tutor-led classes, concentrating on labs and hands-on skills, giving the learner first-hand experience with each of the approaches and technologies described during the classes. Topics covered during the programme include:
- What is Predictive Analytics?
- The business case for data analytics
- The Data Mining Life Cycle
- KDD and CRISP-DM
- Overview of Supervised and Unsupervised Learning
- Overview of Classification and Regression
Data Preparation & Pre-Processing
- Data Types
- Exploratory Data Analysis
- Handling Missing Data – Removal vs Imputation
- Outlier Detection
- Noise Filtering
- Feature Selection
- k-Means & k-Medoids
- Association Rule Mining
- Self-Organising Maps
- Decision Trees
- k-Nearest Neighbours
- Naive Bayes
- Linear & Polynomial Regression
- Regression Trees
Validation and Testing
- Hold-out and Cross Validation
- Evaluating the performance of classifiers
- Evaluating the performance of regression models
- Tokenisation & N-Grams
- The Term Document Matrix
- Term Frequency – Inverse Document Frequency
- Sentiment Analysis
- Exploratory vs Explanatory Data Visualisation
- Quantitative Data Visualisation
- Qualitative Data Visualisation
Continuous assessment will be used to assess student progression on this programme, ensuring a high level of proficiency is achieved. All assessments for this programme are directly mapped to the practical tasks which will be explored during lectures and lab time.
This programme provides a strong foundation in Data Mining and Data Analytics. It is envisioned that graduates will be able to fulfil a wide range of entry-level roles within data mining industries, and/or engage in further study in a wide range of areas within Computing and Information Technology, and specifically programming, to further develop their careers.