Data Science – Advanced

Duration: 65+ hours

Types of Analytics
Analytics Project Life Cycle

Introduction to R and Python
• Data Collection
• Surveys and Design

Statistical Analysis
• Data Types 1 – Continuous, discrete, categorical, count
• Data Types 2 – Qualitative & Quantitative
• Data Types 3 – Nominal, Ordinal, Interval, Ratio
• Random Variable
• Probability Distribution
• Balanced and Imbalanced Data Sets
• Central Tendency – Mean, Median, Average
• Dispersion – Variance, Range, Standard Deviation
• Normal Distribution and Standard Normal Distribution
• Quantile-Quantile Plot
• Sampling Variation
• T-Distribution
• Central Limit Theorem
• Confidence Interval

Data Science with Applied Machine Learning
• Python and R for Data Representation – Visual & Graphical
• Measure of Skewness
• Measure of Kurtosis
• Graphical Techniques – Bar and Box plot, Histogram
• R Programming Language
• Python Programming language
• RStudio and IDE
• Anaconda and Spyder

Linear Regression
• Principles of regression
• Scatter Diagram & Correlation
• R Shiny
• Python Flask
• Linear and Multiple regression
• Data Sets categories
• Quality Metrics and Diagnosis

Multinomial Regression
• Logit and Log Likelihood
• Category Baselining
• Nominal categorical data

Regularization Techniques

Data Mining
• Supervised Learning
• Unsupervised Learning
• Numeric – Euclidean, Manhattan, Mahalanobis
• Binary Euclidean
• Simple matching coefficient, Jaccard’s coefficient
• Mixed coefficient
• Linkages – Single, Complete and Average
• Hierarchical Clustering
• Agglomerative clustering

Non-Hierarchical Clustering – K-means and Measurements
• K-Modes and K-Medians
• Clustering Large Applications – CLARA
• Density Based Clustering

Dimension Reduction
• 2D Visualization
• Matrix Algebra
• Decomposition of Matrix data

Data Mining Unsupervised
• Network Analytics
• Node strength
• Degree and Closeness
• Eigenvector, Adjacency
• Clustering coefficient
• Market Basket Analysis
• Affinity Analysis
• Association Rules
• Apriori Algorithm
• Sequential Pattern mining
• Collaborative filtering
• Computation reduction techniques

Text Mining
• Sources of Data
• Pre-processing
• Corpus Matrix
• Term Matrix
• Corpus Word Clouds
• Sentiment Analysis
• Positive Word Clouds
• Negative Word Clouds
• Unigram
• Bigram
• Trigram
• Semantic Network
• Clustering
• Intro to Shell Scripting

Natural Language Processing
• LDA
• Topic Modelling
• Sentiment Extraction
• Lexicons
• Emotion Analysis

Machine Learning for Data Analysis
• K Value
• KNN Model
• Generalization Techniques
• Regularization Techniques

Classifier
• Probability
• Bayes Rule
• Naïve Bayes Classification
• Text Classification

Decision Tree and Random Forest
• Root Node
• Child Node
• Leaf Node
• Greedy Algorithm
• Entropy
• Ensemble Techniques
• Decision Tree
• Random Forest

Forecasting
• Introduction to time series data
• Steps of forecasting
• Components of time series data
• Scatter plot and Time Plot
• Lag Plot
• ACF – Auto-Correlation Function
• Correlogram
• Visualization principles
• Naive forecast methods
• Errors in forecast
• Metrics
• Model Based approaches
• Linear & Exponential Model
• Quadratic Model
• Additive & Multiplicative Seasonality
• Model-Based approaches
• AR (Auto-Regressive) model for errors
• Random walk
• ARMA (Auto-Regressive Moving Average)
• Order p and q
• Data-driven approach to forecasting
• Smoothing techniques
• Moving Average
• Exponential Smoothing
• Holt’s / Double Exponential Smoothing
• De-seasonalizing and de-trending
• Econometric Models
• Forecasting Best Practices with Python and R

R – Programming
• Variable in R
• R-Overview
• Vector
• Matrix
• Array
• List
• Data-Frame
• Operators in R
• Arithmetic
• Relational
• Logical
• Assignment
• Miscellaneous
• Conditional Statement
• Decision Making
• IF Statement
• IF-Else Statement
• Nested IF-Else Statement
• Switch Statement
• Loops
• While Loop
• Repeat Loop
• For Loop
• Strings
• Functions
• User-defined Function
• Calling a Function
• Calling a Function without an Argument
• Calling a Function with an Argument
• Programming Statistical –
• Box Plots
• Bar Charts
• Histogram
• Pareto Chart
• Pie Chart
• Line Chart
• Scatterplot
• Importing –
• Read CSV Files
• Read Excel Files
• Read SAS Files
• Read STATA Files
• Read SPSS Files
• Read JSON Files
• Read Text Files

Tableau
• What is Data Visualization?
• Why Did Visualization Come into the Picture?
• Importance of Visualizing Data
• Poor Visualizations Vs. Perfect Visualizations
• Principles of Visualizations
• Tufte’s Graphical Integrity Rule
• Tufte’s Principles for Analytical Design
• Visual Rhetoric
• Goal of Data Visualization

Data Visualization Tool
• Introduction to Tableau
• What is Tableau? Different Products and their functioning
• Architecture of Tableau
• Pivot Tables
• Split Tables
• Hiding
• Rename and Aliases
• Data Interpretation

Basic Chart types
• Text Tables, Highlight Tables, Heat Map
• Pie Chart, Tree Chart
• Bar Charts, Circle Charts
• Intermediate Chart
• Time Series Charts
• Time Series Hands-On
• Dual Lines
• Dual Combination
• Advanced Charts
• Bullet Chart
• Scatter Plot
• Introduction to Correlation Analysis
• Introduction to Regression Analysis
• Trendlines
• Histograms
• Bin Sizes in Tableau
• Data Connectivity in-depth understanding
• Joins
• Unions
• Data Blending
• Cross Database Joins
• Sets
• Groups
• Parameters
• Creating Calculated Fields
• Logical Functions
• Case-If Function
• ZN Function
• Else-If Function
• Ad-Hoc Calculations
• Quick Table Calculations
• Level of Detail (LoD)
• Fixed LoD
• Include LoD
• Exclude LoD
