Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.
This key competency area covers statistical errors, the bias-variance tradeoff, noise, regularization, and statistical hypothesis testing, among other topics.
- Commonly used error metrics - A statistical error is the (unobservable) difference between an observed value and the true value. Understand common error metrics such as Mean Squared Error (MSE), Root Mean Square Error (RMSE), and Mean Absolute Scaled Error (MASE).
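As a minimal sketch, the three metrics above can be computed from scratch; the function names and the toy data here are illustrative, not from a particular library:

```python
import math

def mse(actual, predicted):
    """Mean Squared Error: average of the squared residuals."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: square root of the MSE, in the data's units."""
    return math.sqrt(mse(actual, predicted))

def mase(actual, predicted):
    """Mean Absolute Scaled Error: MAE scaled by the in-sample MAE of a
    naive one-step-ahead forecast (predict the previous observation)."""
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
    naive_mae = sum(abs(actual[i] - actual[i - 1])
                    for i in range(1, len(actual))) / (len(actual) - 1)
    return mae / naive_mae

actual = [3.0, 5.0, 4.0, 6.0]
predicted = [2.5, 5.5, 4.0, 5.0]
print(mse(actual, predicted))   # 0.375
print(mase(actual, predicted))  # 0.3
```

MASE below 1 indicates the forecast beats the naive baseline on average, which makes it useful for comparing models across series with different scales.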
- Bias/Variance - In statistics and machine learning, the bias-variance tradeoff is the property of a model whereby the variance of the parameter estimates across samples can be reduced by accepting increased bias in those estimates.
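A small simulation can make the tradeoff concrete. The sketch below (all names hypothetical) compares the plain sample mean with a deliberately shrunk estimator across many repeated samples: shrinking toward zero introduces bias but lowers the variance of the estimate.

```python
import random

random.seed(0)
TRUE_MEAN = 5.0
SHRINK = 0.8  # shrinking toward 0 adds bias but reduces variance

plain, shrunk = [], []
for _ in range(2000):
    sample = [random.gauss(TRUE_MEAN, 2.0) for _ in range(10)]
    xbar = sum(sample) / len(sample)
    plain.append(xbar)          # unbiased estimator of TRUE_MEAN
    shrunk.append(SHRINK * xbar)  # biased, lower-variance estimator

def var(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

bias_plain = sum(plain) / len(plain) - TRUE_MEAN
bias_shrunk = sum(shrunk) / len(shrunk) - TRUE_MEAN
print(abs(bias_plain), var(plain))    # bias near 0, larger variance
print(abs(bias_shrunk), var(shrunk))  # bias near 1.0, smaller variance
```

The shrunk estimator's variance is SHRINK squared times the plain one's, so whether the tradeoff pays off depends on whether the squared bias it adds is smaller than the variance it removes.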
- Type I / Type II Errors - In statistical hypothesis testing, a type I error is the rejection of a true null hypothesis (a false positive), while a type II error is the failure to reject a false null hypothesis (a false negative).
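A quick simulation shows why the significance level α is exactly the type I error rate: when the null hypothesis is true, a two-sided z-test at α = 0.05 should falsely reject about 5% of the time. The setup below is an illustrative sketch, not a fixed recipe.

```python
import math
import random

random.seed(42)
ALPHA = 0.05
N, SIGMA, MU0 = 30, 1.0, 0.0
trials, false_rejections = 5000, 0

for _ in range(trials):
    # H0 is true by construction: the data really have mean MU0
    sample = [random.gauss(MU0, SIGMA) for _ in range(N)]
    z = (sum(sample) / N - MU0) / (SIGMA / math.sqrt(N))
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value for a z-test
    if p < ALPHA:
        false_rejections += 1  # a type I error

rate = false_rejections / trials
print(rate)  # close to ALPHA = 0.05
```

Estimating the type II error rate works the same way, except the data are generated under a specific alternative and one counts the failures to reject.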
- Noise - Statistical noise refers to variability within a sample, stochastic disturbance in a regression equation, or estimation error. This noise is often represented as a random variable: in a regression equation Y = m(X; θ) + ε with E(ε) = 0, the random variable ε is called a disturbance or error term and represents statistical noise.
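The regression form above can be demonstrated by generating data: a linear m(X; θ) (the parameters below are made up for illustration) plus Gaussian noise whose sample mean is close to its expectation of zero.

```python
import random

random.seed(1)
theta = (2.0, 0.5)  # hypothetical intercept and slope
xs = [i / 10 for i in range(200)]
eps = [random.gauss(0.0, 0.3) for _ in xs]  # disturbance term with E(eps) = 0
ys = [theta[0] + theta[1] * x + e for x, e in zip(xs, eps)]

# The sample average of the noise is near zero, as E(eps) = 0 implies
print(sum(eps) / len(eps))
```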
- Regularization - Regularization refers to a broad family of techniques that impose additional structure on statistical models to cope with data size, complexity, and sparsity. It allows models to fit such data usefully without overfitting.
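One common instance is L2 (ridge) regularization, which adds a penalty on coefficient size to the least-squares objective. For a single slope through the origin the penalized solution has a closed form, sketched below with illustrative names and data:

```python
def fit_slope(xs, ys, lam=0.0):
    """Slope through the origin minimizing sum((y - w*x)^2) + lam * w^2.
    The closed-form solution is w = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.9]
w_ols = fit_slope(xs, ys)             # lam = 0: ordinary least squares
w_ridge = fit_slope(xs, ys, lam=5.0)  # penalty shrinks the coefficient
print(w_ols, w_ridge)
```

The penalty term lam appears in the denominator, so the ridge coefficient is always pulled toward zero relative to the unpenalized fit, trading a little bias for stability.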
- Hypothesis Testing - A statistical hypothesis is a hypothesis that is testable on the basis of observed data, modelled as the realised values taken by a collection of random variables. Be familiar with the p-value and its applications, the z-test, the t-test (one- and two-sample), and the chi-squared test.
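As a minimal worked example of the z-test mentioned above (which assumes a known population standard deviation; the function name and data are illustrative):

```python
import math

def one_sample_z_test(sample, mu0, sigma):
    """Two-sided one-sample z-test with known population sigma.
    Returns the z statistic and its two-sided p-value."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    p = math.erfc(abs(z) / math.sqrt(2))  # P(|Z| >= |z|) under H0
    return z, p

sample = [2.1, 1.9, 2.0, 2.2, 1.8]  # sample mean is exactly 2.0
z, p = one_sample_z_test(sample, mu0=2.0, sigma=0.2)
print(z, p)  # z = 0.0, p = 1.0: no evidence against H0
```

When sigma is unknown and estimated from the sample, the t-test replaces the normal reference distribution with Student's t, which matters most at small sample sizes.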