Duration: 30 mins

Revision Questions:

  1. What is a PDF (probability density function)?
  2. What is a CDF (cumulative distribution function)?
  3. Explain the 1-std-dev, 2-std-dev, and 3-std-dev ranges.
  4. What are symmetric distributions, skewness, and kurtosis?
  5. How do you compute the standard normal variate (z) and standardize data?
  6. What is kernel density estimation?
  7. Why are sampling distributions and the Central Limit Theorem important?
  8. Why is the Q-Q plot important: how do you check whether a given random variable is Gaussian distributed?
  9. What is the uniform distribution, and how does it relate to random number generators?
  10. What are the discrete and continuous uniform distributions?
  11. How do you randomly sample data points?
  12. Explain the Bernoulli and binomial distributions.
  13. What are the log-normal and power-law distributions?
  14. What are power-law and Pareto distributions? Give the PDF and examples.
  15. Explain the Box-Cox (power) transform.
  16. What is covariance?
  17. Why is the Pearson correlation coefficient important?
  18. Why is the Spearman rank correlation coefficient important?
  19. How does correlation differ from causation?
  20. What is a confidence interval?
  21. How does a confidence interval differ from a point estimate?
  22. Explain hypothesis testing.
  23. Define the hypothesis testing methodology, null hypothesis, test statistic, and p-value.
  24. How do you use the K-S test to check the similarity of two distributions?

Self Learning:

  1. You are given a data set whose missing values are spread within 1 standard deviation of the median. What percentage of the data would remain unaffected, and why?
  2. You are given a data set in which some variables have more than 30% missing values. Say that, out of 50 variables, 8 have more than 30% of their values missing. How will you deal with them?
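
A quick way to check your reasoning on the first self-learning question is to compute how much of a distribution lies within 1, 2, and 3 standard deviations of its center. The sketch below is only a starting point and assumes the data is roughly normally distributed, so that the median coincides with the mean; the question itself does not state this. The helper name `fraction_within` is made up for illustration.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of its mean.

    Assumes normality; for skewed data the median and mean differ
    and these fractions no longer apply.
    """
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    inside = fraction_within(k)
    # "outside" is the share of the data beyond k standard deviations
    print(f"within {k} std-dev: {inside:.4f}, outside: {1 - inside:.4f}")
```

If the data is not close to normal, only weaker distribution-free bounds (such as Chebyshev's inequality) apply, so state the assumption when you answer.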

