Tensor Decomposition
These 2 types of methods distinguish themselves based on their answer to the following question: "Will I use the same amount of memory to store the model trained on $100$ examples than to store a model trained on $10 000$ of them ? " If yes then you are using a parametric model. If not, you are using a non-parametric model.
-
Parametric:
- The memory used to store a model trained on $100$ observations is the same as for a model trained on $10 000$ of them .
- I.e: The number of parameters is fixed.
- Computationally less expensive to store and predict.
- Less variance.
- More bias.
- Makes more assumption on the data to fit less parameters.
- Example : K-Means clustering, Linear Regression, Neural Networks:
-
Non Parametric:
- I will use less memory to store a model trained on $100$ observation than for a model trained on $10 000$ of them .
- I.e: The number of parameters is grows with the training set.
- More flexible / general.
- Makes less assumptions.
- Less bias.
- More variance.
- Bad if test set is relatively different than train set.
- Computationally more expensive as it has to store and compute over a higher number of "parameters" (unbounded).
- Example : K-Nearest Neighbors clustering, RBF Regression, Gaussian Processes:
Practical : Start with a parametric model. It's often worth trying a non-parametric model if: you are doing clustering, or the training data is not too big but the problem is very hard.
Side Note : Strictly speaking any non-parametric model could be seen as a infinite-parametric model. So if you want to be picky: next time you hear a colleague talking about non-parametric models, tell him it's in fact parametric. I decline any liability for the consequence on your relationship with him/her .