Machine Learning for Big Data
Question 1
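An appropriate machine learning technique for exploring handwritten characters [0 to 9] is t-distributed Stochastic Neighbor Embedding (t-SNE), which embeds the high-dimensional pixel data in two or three dimensions so that images of the same digit tend to cluster together. Because t-SNE is a dimensionality-reduction technique rather than a classifier, assigning a label to a new character still requires a classifier trained on labelled examples.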
Question 2
Naïve Bayes classification is useful for large data sets because training only requires counting how often each feature value occurs in each class. The method assumes that the predictors are conditionally independent given the class. Thus, once the class is known, the value of one feature is assumed not to influence the value of any other feature.
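The independence assumption can be illustrated with a minimal sketch of a categorical Naïve Bayes classifier. The function names (train_naive_bayes, predict), the add-one smoothing, and the toy weather data are all assumptions made for illustration; they do not come from the question.

```python
from collections import Counter, defaultdict
import math


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    vocab = defaultdict(set)             # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            vocab[i].add(value)
    return class_counts, value_counts, vocab


def predict(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(row):
            # Independence assumption: each feature contributes its own
            # factor. Add-one smoothing avoids log(0) for unseen values.
            count = value_counts[(i, label)][value]
            score += math.log((count + 1) / (n + len(vocab[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (outlook, windy) -> play outside?
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("rainy", "no"), ("sunny", "no")]
labels = ["yes", "yes", "no", "no", "yes"]

model = train_naive_bayes(rows, labels)
prediction = predict(model, ("rainy", "yes"))
print(prediction)
```

Because each feature is scored separately, training is a single counting pass over the data, which is why the method scales to large data sets.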
Question 3
- Regression is a type of supervised machine learning in which the outcome to be predicted is a continuous variable. The technique assumes a relationship between the predictor (input) variables and the outcome variable.
- A residual is the difference between an observed value and the corresponding predicted value.
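- A kernel function is a mathematical function that computes the similarity (inner product) of two data points in a higher-dimensional feature space without constructing that space explicitly. This lets linear methods model non-linear relationships in the original data.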
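The three terms above can be made concrete with a short sketch. It fits a straight line by ordinary least squares, computes the residuals, and then checks that a degree-2 polynomial kernel equals an inner product in an explicit feature space. The data points and the names fit_line, poly_kernel, and phi are hypothetical and chosen only for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations of a continuous outcome.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
# Residual = observed value minus predicted value.
residuals = [y - p for y, p in zip(ys, predicted)]


def poly_kernel(a, b):
    """Degree-2 polynomial kernel: k(a, b) = (a . b) ** 2."""
    return sum(ai * bi for ai, bi in zip(a, b)) ** 2


def phi(v):
    """Explicit feature map for 2-D inputs that matches poly_kernel."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


a, b = (1.0, 2.0), (3.0, 4.0)
kernel_value = poly_kernel(a, b)
# The same number, computed as an inner product in the mapped space.
explicit_value = sum(p * q for p, q in zip(phi(a), phi(b)))
```

The kernel gives the same result as the explicit mapping while working only with the original two-dimensional points, which is the reason kernels are attractive when the mapped space is very large.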
Question 4
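Entropy is a measure of the level of impurity or uncertainty in a set of data, for example the class labels that remain after splitting on a particular attribute; the more random the data, the higher the entropy. Mutual information between a feature and a positive/negative class label measures how much knowing the feature reduces the uncertainty (entropy) of the class. It is related to, but not the same as, the correlation coefficient: correlation captures only linear association, whereas mutual information captures any statistical dependence between the two variables.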
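Both quantities can be computed directly from label counts. The sketch below uses the standard definitions H(Y) = -sum p log2 p and I(X;Y) = H(Y) - H(Y|X); the function names and the toy feature columns are assumptions made for illustration.

```python
from collections import Counter
import math


def entropy(values):
    """Shannon entropy in bits of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(feature, labels):
    """I(X;Y) = H(Y) - H(Y|X) for discrete feature X and class label Y."""
    n = len(labels)
    conditional = 0.0
    for value, count in Counter(feature).items():
        # Class labels of the rows where the feature takes this value.
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += (count / n) * entropy(subset)
    return entropy(labels) - conditional


# Hypothetical binary classification labels and two candidate features.
labels = ["pos", "pos", "neg", "neg"]
informative = ["a", "a", "b", "b"]    # determines the class exactly
uninformative = ["a", "b", "a", "b"]  # tells us nothing about the class

label_entropy = entropy(labels)
mi_informative = mutual_information(informative, labels)
mi_uninformative = mutual_information(uninformative, labels)
```

A feature that fully determines the class removes all of the label entropy, while a feature that is independent of the class removes none of it.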
Question 5
Face recognition in machine learning uses various nodal points of the human face to identify a person. When a picture of a face is fed into the software, machine learning algorithms extract measurements from these points. The software then tries to identify any new face by comparing its measurements with the information learned and stored in the trained model.
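The matching step can be sketched as a nearest-neighbour lookup over stored measurement vectors. Everything in this example is hypothetical: the identify function, the three-number vectors standing in for nodal-point measurements, the enrolled names, and the distance threshold. Real systems use far richer features.

```python
import math


def identify(face, enrolled, threshold=0.05):
    """Return the enrolled name closest to `face`, or None if no match.

    `face` and every enrolled reference are equal-length tuples of
    measurements (assumed here to be normalized nodal-point distances).
    """
    best_name, best_distance = None, math.inf
    for name, reference in enrolled.items():
        distance = math.dist(face, reference)  # Euclidean distance
        if distance < best_distance:
            best_name, best_distance = name, distance
    # Reject the best candidate if it is still too far away.
    return best_name if best_distance <= threshold else None


# Hypothetical stored measurements for two known people.
enrolled = {
    "alice": (0.42, 0.31, 0.58),
    "bob": (0.50, 0.27, 0.49),
}

known = identify((0.43, 0.30, 0.57), enrolled)
unknown = identify((0.90, 0.90, 0.90), enrolled)
```

The threshold is the design choice that lets the system report an unknown face instead of forcing every input onto the closest stored identity.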