Data Science – Regular

Duration: 45+ hours

Types of Analytics
Analytics Project Life Cycle

Introduction to R and Python
• Data Collection
• Surveys and Design

Statistical Analysis
• Data Types 1 – Continuous, Discrete, Categorical, Count
• Data Types 2 – Qualitative & Quantitative
• Data Types 3 – Nominal, Ordinal, Interval, Ratio
• Random Variable
• Probability Distribution
• Balanced and Imbalanced Data Sets
• Central Tendency – Mean (Average), Median
• Dispersion – Variance, Range, Standard Deviation
• Normal Distribution and Standard Normal Distribution
• Quantile-Quantile Plot
• Sampling Variation
• T-Distribution
• Central Limit Theorem
• Confidence Interval

Data Science with Applied Machine Learning
• Python and R for Data Representation – Visual & Graphical
• Measure of Skewness
• Measure of Kurtosis
• Graphical Techniques – Bar Plot, Box Plot, Histogram
• R Programming Language
• Python Programming language
• RStudio and IDEs
• Anaconda and Spyder

Linear Regression
• Principles of regression
• Scatter Diagram & Correlation
• R Shiny
• Python Flask
• Linear and Multiple regression
• Data Set Categories
• Quality Metrics and Diagnosis

Multinomial Regression
• Logit and Log Likelihood
• Category Baselining
• Nominal categorical data

Regularization Techniques

Data Mining
• Supervised Learning
• Unsupervised Learning
• Numeric Distances – Euclidean, Manhattan, Mahalanobis
• Binary Euclidean
• Simple matching coefficient, Jaccard coefficient
• Mixed coefficient
• Linkages – Single, Complete and Average

• Hierarchical Clustering
• Agglomerative clustering

Non-Hierarchical Clustering – K-means and Measurements
• K-Modes and K-Medians
• Clustering Large Applications – CLARA
• Density Based Clustering

Dimension Reduction
• 2D Visualization
• Matrix Algebra
• Decomposition of Matrix data

Data Mining Unsupervised
• Network Analytics
• Node strength
• Degree and Closeness
• Eigenvector, Adjacency
• Clustering coefficient
• Market Basket Analysis
• Affinity Analysis
• Association Rules
• Apriori Algorithm
• Sequential Pattern mining
• Collaborative filtering
• Computation reduction techniques

Text Mining
• Sources of Data
• Pre-processing
• Corpus Matrix
• Term Matrix
• Corpus Word Clouds
• Sentiment Analysis
• Positive Word Clouds
• Negative Word Clouds
• Unigram
• Bigram
• Trigram
• Semantic Network
• Clustering
• Intro to Shell Scripting

Natural Language Processing
• LDA
• Topic Modelling
• Sentiment Extraction
• Lexicons
• Emotion Analysis

Machine Learning for Data Analysis
• K Value
• KNN Model
• Generalization Techniques
• Regularization Techniques

Classifier
• Probability
• Bayes Rule
• Naïve Bayes Classification
• Text Classification

Decision Tree and Random Forest
• Root Node
• Child Node
• Leaf Node
• Greedy Algorithm
• Entropy
• Ensemble Techniques
• Decision Tree
• Random Forest

R Programming
• Variables in R
• R Overview
• Vector
• Matrix
• Array
• List
• Data Frame
• Operators in R
• Arithmetic
• Relational
• Logical
• Assignment
• Miscellaneous
• Conditional Statement
• Decision Making
• IF Statement
• IF-Else Statement
• Nested IF-Else Statement
• Switch Statement
• Loops
• While Loop
• Repeat Loop
• For Loop
• Strings
• Functions
• User-defined Function
• Calling a Function
• Calling a Function without an Argument
• Calling a Function with an Argument
• Programming Statistical –
• Box Plots
• Bar Charts
• Histogram
• Pareto Chart
• Pie Chart
• Line Chart
• Scatterplot
• Importing –
• Read CSV Files
• Read Excel Files
• Read SAS Files
• Read STATA Files
• Read SPSS Files
• Read JSON Files
• Read Text Files

• Visual Rhetoric
• Goal of Data Visualization

Data Visualization Tool
• Introduction to Tableau
• What is Tableau? Different Products and How They Function
• Architecture of Tableau
• Pivot Tables
• Bin Sizes in Tableau

 
