Data Science Training by Experts
Data Science - Syllabus, Fees & Duration
MODULE 1
- The Data Science Process
- Apply the CRISP-DM process to business applications
- Wrangle, explore, and analyze a dataset
- Apply machine learning for prediction
- Apply statistics for descriptive and inferential understanding
- Draw conclusions that motivate others to act on your results
MODULE 2
- Communicating with Stakeholders
- Implement best practices in sharing your code and written summaries
- Learn what makes a great data science blog
- Learn how to share your ideas with the data science community
MODULE 3
- Software Engineering Practices
- Write clean, modular, and well-documented code
- Refactor code for efficiency
- Create unit tests to test programs
- Write useful programs in multiple scripts
- Track actions and results of processes with logging
- Conduct and receive code reviews
MODULE 4
- Object-Oriented Programming
- Understand when to use object-oriented programming
- Build and use classes
- Understand magic methods
- Write programs that include multiple classes and follow good code structure
- Learn how large, modular Python packages, such as pandas and scikit-learn, use object-oriented programming
- Portfolio Exercise: Build your own Python package
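
As an illustration of the class and magic-method topics in this module, here is a minimal sketch. The `Vector2D` class is a hypothetical example and is not taken from the course materials; it only shows how `__repr__`, `__add__`, and `__eq__` let a custom object work with `print`, `+`, and `==`.

```python
class Vector2D:
    """A tiny 2-D vector, used only to illustrate magic methods (hypothetical class)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Called by repr(), and by print() because no __str__ is defined.
        return f"Vector2D({self.x}, {self.y})"

    def __add__(self, other):
        # Called for `self + other`.
        if not isinstance(other, Vector2D):
            return NotImplemented
        return Vector2D(self.x + other.x, self.y + other.y)

    def __eq__(self, other):
        # Called for `self == other`.
        if not isinstance(other, Vector2D):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


total = Vector2D(1, 2) + Vector2D(3, 4)
print(total)  # Vector2D(4, 6)
```

Returning `NotImplemented` for unsupported operand types lets Python fall back to its default behavior instead of raising from inside the method.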
MODULE 5
- Web Development
- Learn about the components of a web app
- Build a web application that uses Flask, Plotly, and the Bootstrap framework
- Portfolio Exercise: Build a data dashboard using a dataset of your choice and deploy it to a web application
MODULE 6
- ETL Pipelines
- Understand what ETL pipelines are
- Access and combine data from CSV, JSON, logs, APIs, and databases
- Standardize encodings and columns
- Normalize data and create dummy variables
- Handle outliers, missing values, and duplicated data
- Engineer new features by running calculations
- Build a SQLite database to store cleaned data
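
The extract, transform, and load steps listed in this module can be sketched with the Python standard library alone. The sample data, the `products` table, and the function names below are assumptions made for illustration; a real pipeline would typically read from files, APIs, or databases and might use pandas instead.

```python
import csv
import io
import json
import sqlite3

# Hypothetical raw inputs standing in for a CSV file and a JSON API response.
CSV_TEXT = "id,name,price\n1,apple,1.50\n2,pear,\n2,pear,\n"
JSON_TEXT = '[{"id": 3, "name": "Kiwi ", "price": 0.8}]'


def extract():
    """Combine records from the CSV and JSON sources."""
    rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
    rows.extend(json.loads(JSON_TEXT))
    return rows


def transform(rows):
    """Deduplicate, standardize names, fill missing prices, add a feature."""
    cleaned, seen = [], set()
    for row in rows:
        row_id = int(row["id"])
        if row_id in seen:  # drop duplicated records
            continue
        seen.add(row_id)
        price = row["price"]
        price = float(price) if price not in ("", None) else None
        name = str(row["name"]).strip().lower()  # standardize text values
        cleaned.append({"id": row_id, "name": name, "price": price})

    # Fill missing prices with the mean of the known prices.
    known = [r["price"] for r in cleaned if r["price"] is not None]
    mean_price = sum(known) / len(known)
    for r in cleaned:
        if r["price"] is None:
            r["price"] = mean_price
        r["price_cents"] = round(r["price"] * 100)  # engineered feature
    return cleaned


def load(rows, conn):
    """Store the cleaned rows in a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS products "
        "(id INTEGER PRIMARY KEY, name TEXT, price REAL, price_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO products VALUES (:id, :name, :price, :price_cents)", rows
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database for the sketch
load(transform(extract()), conn)
result = conn.execute(
    "SELECT id, name, price_cents FROM products ORDER BY id"
).fetchall()
print(result)
```

Mean imputation is only one of several ways to handle missing values; it is used here because it keeps the example short.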
MODULE 7
- Natural Language Processing
- Prepare text data for analysis with tokenization, lemmatization, and removing stop words
- Use scikit-learn to transform and vectorize text data
- Build features with bag of words and tf-idf
- Extract features with tools such as named entity recognition and part of speech tagging
- Build an NLP model to perform sentiment analysis
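
To make the bag-of-words and tf-idf ideas in this module concrete, here is a small pure-Python sketch. It deliberately does not use scikit-learn: the stop-word list, the regular-expression tokenizer, and the unsmoothed weight `tf * log(N / df)` are simplifying assumptions, and scikit-learn's vectorizers use a smoothed variant by default. Lemmatization is omitted.

```python
import math
import re
from collections import Counter

STOP_WORDS = {"the", "a", "is", "and", "was"}  # tiny illustrative list


def tokenize(text):
    """Lowercase, split into word tokens, and remove stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


def bag_of_words(docs):
    """Return one term-count mapping per document."""
    return [Counter(tokenize(d)) for d in docs]


def tf_idf(docs):
    """Weight each term by term frequency times log inverse document frequency."""
    bags = bag_of_words(docs)
    n_docs = len(bags)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for bag in bags for term in bag)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in bag.items()}
        )
    return weights


docs = ["The movie was great", "The movie was terrible", "A great great cast"]
weights = tf_idf(docs)
print(weights[1])  # "terrible" is rarer than "movie", so it gets a higher weight
```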
MODULE 8
- Machine Learning Pipelines
- Understand the advantages of using machine learning pipelines to streamline the data preparation and modeling process
- Chain data transformations and an estimator with scikit-learn's Pipeline
- Use feature unions to perform steps in parallel and create more complex workflows
- Grid search over a pipeline to optimize parameters for the entire workflow
- Complete a case study to build a full machine learning pipeline that prepares data and creates a model for a dataset
MODULE 9
- Experiment Design
- Understand how to set up an experiment, and the ideas associated with experiments vs. observational studies
- Define control and test conditions
- Choose control and test groups
MODULE 10
- Statistical Concerns of Experimentation
- Applications of statistics in the real world
- Establishing key metrics
- SMART experiments: Specific, Measurable, Actionable, Realistic, Timely
MODULE 11
- A/B Testing
- How it works and its limitations
- Sources of Bias: Novelty and Recency Effects
- Multiple Comparison Techniques (FDR, Bonferroni, Tukey)
- Portfolio Exercise: Use a technical screener from Starbucks to analyze the results of an experiment and write up your findings
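
Two of the multiple comparison techniques named in this module can be sketched in a few lines: the Bonferroni correction and the Benjamini-Hochberg procedure for controlling the false discovery rate (FDR). The function names and the p-values below are made up for illustration; Tukey's method is not covered here.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for FDR control."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    reject = [False] * m
    for idx in order[:cutoff]:
        reject[idx] = True
    return reject


p_values = [0.001, 0.012, 0.025, 0.045, 0.20]  # hypothetical test results
print(bonferroni(p_values))          # [True, False, False, False, False]
print(benjamini_hochberg(p_values))  # [True, True, True, False, False]
```

Bonferroni is the more conservative of the two, which is why it rejects fewer hypotheses on the same p-values.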
MODULE 12
- Introduction to Recommendation Engines
- Distinguish between common techniques for creating recommendation engines, including knowledge-based, content-based, and collaborative-filtering-based methods
- Implement each of these techniques in Python
- List business goals associated with recommendation engines, and recognize which of these goals are most easily met with existing recommendation techniques
MODULE 13
- Matrix Factorization for Recommendations
- Understand the pitfalls of traditional methods, and of measuring the influence of recommendation engines with traditional regression and classification techniques
- Create recommendation engines using matrix factorization and FunkSVD
- Interpret the results of matrix factorization to better understand latent features of customer data
- Identify common pitfalls of recommendation engines, such as the cold start problem and the difficulty of assessing their effectiveness with the usual techniques, along with potential solutions
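
A rough sketch of the FunkSVD idea from this module: learn user and item latent factors by stochastic gradient descent on the known ratings only, then use their dot product to predict missing ratings. The ratings matrix, hyperparameters, and function names are assumptions chosen for illustration, and the sketch omits regularization and bias terms that practical implementations often add.

```python
import random


def predict(user_vec, item_vec):
    """Predicted rating: the dot product of user and item latent factors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


def sse(ratings, user_mat, item_mat):
    """Sum of squared errors over the known (non-None) ratings."""
    return sum(
        (r - predict(user_mat[i], item_mat[j])) ** 2
        for i, row in enumerate(ratings)
        for j, r in enumerate(row)
        if r is not None
    )


def funk_svd(ratings, latent_features=2, learning_rate=0.01, iters=200, seed=0):
    """Fit user and item factor matrices with SGD on the known ratings."""
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    user_mat = [[rng.random() for _ in range(latent_features)] for _ in range(n_users)]
    item_mat = [[rng.random() for _ in range(latent_features)] for _ in range(n_items)]
    for _ in range(iters):
        for i in range(n_users):
            for j in range(n_items):
                if ratings[i][j] is None:  # skip missing ratings
                    continue
                diff = ratings[i][j] - predict(user_mat[i], item_mat[j])
                for k in range(latent_features):
                    u, v = user_mat[i][k], item_mat[j][k]
                    # Gradient step on the squared error for this rating.
                    user_mat[i][k] += learning_rate * 2 * diff * v
                    item_mat[j][k] += learning_rate * 2 * diff * u
    return user_mat, item_mat


# Hypothetical user-by-item ratings; None marks a rating that was never given.
ratings = [
    [5, 4, None, 1],
    [4, None, 1, 1],
    [1, 1, 5, None],
    [None, 1, 4, 5],
]
user_mat, item_mat = funk_svd(ratings)
print(sse(ratings, user_mat, item_mat))     # training error on known ratings
print(predict(user_mat[0], item_mat[2]))    # estimate for a missing rating
```

A user or item with no known ratings never receives a gradient update, so its factors stay at their random initial values. This is one way the cold start problem shows up in this approach.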