Question 1
The presence of many zeros in a data set makes it harder to account for that data: the more zeros there are, the greater the likelihood of errors and of an incorrect outcome. Displaying such data in a graph is clearer and more presentable when the zeros are excluded; including them can make the graph look vague and cluttered (Tan et al., 2016). More importantly, considering only the presence of non-zero values may not be worthwhile, especially when performing a clustering analysis. If the actual magnitudes of those values are considered instead, the results will reveal the valid number of clusters in the data set. Market basket analysis is an example of a situation where the presence of non-zeros can be exploited: by looking at which entries are non-zero, one can examine the relationships among a small list of items instead of assessing the entire inventory list.
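The point above can be illustrated with a small sketch (the customer vectors are made-up data): comparing sparse market-basket style vectors by the mere presence of non-zeros can make them look identical, while using the actual magnitudes reveals a real difference.

```python
# Sketch (assumed data): presence-only vs. magnitude-aware similarity
# on sparse, market-basket style purchase vectors.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Quantities of items bought by two customers (mostly zeros).
cust_a = [0, 0, 5, 0, 1, 0, 0, 0]
cust_b = [0, 0, 1, 0, 5, 0, 0, 0]

# Presence-only view: both bought the same two items, so they look identical.
pres_a = [1 if x else 0 for x in cust_a]
pres_b = [1 if x else 0 for x in cust_b]

print(cosine(pres_a, pres_b))   # presence-only: 1.0 (identical)
print(cosine(cust_a, cust_b))   # magnitude-aware: clearly less similar
```

A clustering run on the presence-only vectors would place these two customers in the same cluster, whereas the magnitudes show their buying patterns are quite different.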
Question 2
K-means has a well-defined time complexity. Each iteration involves distance calculations, distance comparisons, and centroid updates, and the number of these operations grows with the number of clusters. Consequently, the time complexity of K-means rises linearly as the number of clusters rises: the two quantities are directly proportional.
The time requirement can be expressed as:
O(I × K × m × n)
where I is the number of iterations, K is the number of clusters, m is the number of points, and n is the number of attributes.
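A minimal sketch (with made-up sizes) makes the bound concrete: one assignment pass of K-means performs K × m distance computations of n multiply-adds each, so over I iterations the operation count grows as I × K × m × n.

```python
# Sketch (made-up sizes): count the distance operations in one K-means
# assignment pass to illustrate the O(I * K * m * n) behaviour.
import random

def assign_pass(points, centroids):
    """One assignment pass; returns (assignments, operation count)."""
    ops = 0
    assign = []
    for p in points:
        dists = []
        for c in centroids:
            # Squared Euclidean distance: n multiply-adds per point/centroid pair.
            d = sum((pi - ci) ** 2 for pi, ci in zip(p, c))
            ops += len(p)
            dists.append(d)
        assign.append(dists.index(min(dists)))
    return assign, ops

random.seed(0)
m, n, K = 100, 4, 3                       # points, attributes, clusters
pts = [[random.random() for _ in range(n)] for _ in range(m)]
cents = pts[:K]                           # naive initial centroids
_, ops = assign_pass(pts, cents)
print(ops)  # K * m * n = 3 * 100 * 4 = 1200 operations per iteration
```

Doubling K (or m, or n) doubles the count, which is exactly the linear dependence described above.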
Question 3
Treating clustering as an optimization problem means framing it as the search for the grouping that optimizes an objective function; such problems may admit many different solutions. The main disadvantage of this treatment is that it can be computationally inefficient. The advantage is that it can be applied to problems that are too difficult and complex to solve manually (Tan et al., 2016). Examples of optimization-based algorithms include EM clustering, K-means, and BIRCH. The optimization-based approach captures essentially every type of clustering; framing clustering this way allows issues such as non-determinism and efficiency to be taken into account, which makes it an effective and suitable technique for approaching clustering. When the number of attributes and the number of clusters are held constant, the K-means algorithm has a time complexity of O(m), linear in the number of points.
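The optimization view can be sketched concretely for K-means, whose objective is the sum of squared errors (SSE): of two candidate assignments, the optimizer prefers the one with the lower objective value. The points and centroids below are illustrative toy data.

```python
# Sketch: clustering as optimization. K-means minimizes the sum of
# squared errors (SSE); a better assignment gives a lower objective.
def sse(points, centroids, assign):
    """Sum of squared distances from each point to its assigned centroid."""
    total = 0.0
    for p, a in zip(points, assign):
        total += sum((pi - ci) ** 2 for pi, ci in zip(p, centroids[a]))
    return total

points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
centroids = [(0.1, 0.0), (5.1, 5.0)]

good = sse(points, centroids, [0, 0, 1, 1])  # natural grouping
bad = sse(points, centroids, [1, 1, 0, 0])   # swapped grouping
print(good, bad)  # the optimizer keeps whichever assignment is lower
```

Because many local minima of such an objective exist, different runs can return different solutions, which is the non-determinism mentioned above.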
Question 4
The time required by fuzzy c-means is essentially the same as for K-means, although the constant factor is greater. SOM's time complexity is likewise similar to that of K-means, since it also consists of multiple passes in which objects are assigned to centres. Like K-means, fuzzy c-means attempts to decrease an objective function, but its objective differs from that of K-means through the fuzzifier and the membership values (Iliyasu et al., 2016). The fuzzifier determines the degree of cluster fuzziness: a large fuzzifier results in smaller membership values and, thus, fuzzier clusters, while in the limit of a fuzzifier close to one, the memberships converge to either zero or one, which implies a crisp partitioning.
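The effect of the fuzzifier can be sketched with the standard fuzzy c-means membership formula for a single point and two cluster centres (the distances below are assumed values): a fuzzifier near one gives a nearly crisp membership, a large fuzzifier pushes memberships toward equality.

```python
# Sketch: fuzzy c-means membership of one point w.r.t. the first of
# several cluster centres, u = 1 / sum_k (d_1 / d_k)^(2/(fuzzifier-1)).
def membership(dists, fuzzifier):
    """FCM membership to the first centre; requires fuzzifier > 1."""
    p = 2.0 / (fuzzifier - 1.0)
    return 1.0 / sum((dists[0] / d) ** p for d in dists)

dists = [1.0, 3.0]             # assumed distances to two centres
print(membership(dists, 1.1))  # fuzzifier near 1: nearly crisp (close to 1)
print(membership(dists, 5.0))  # large fuzzifier: fuzzier (closer to 0.5)
```

This matches the text: the larger the fuzzifier, the smaller the winning membership value and the fuzzier the clusters.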
Question 5
Probability and likelihood differ in that the former is normalized. Probability expresses how likely something is to happen; it concerns future events, while likelihood refers to past occurrences with known outcomes. Probability is used to describe a function of the outcome given a fixed parameter value. The difference rests on the interpretation of what varies and what is fixed (Tan et al., 2016). In the conditional probability P(D | H), the hypothesis H is fixed and the data D are free to vary. The likelihood L(H | D) is conditioned on the data: the data are treated as fixed while the hypothesis can vary. To sum up, in probability the hypothesis is treated as given and the data are free to vary; in likelihood, a particular set of data is treated as given and the hypothesis varies.
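A worked coin-flip sketch (the counts are assumed values) makes the distinction concrete: fixing the hypothesis and summing over all possible data gives 1 (probability is normalized), while fixing the data and varying the hypothesis gives a likelihood function that need not sum to 1.

```python
# Sketch: probability vs. likelihood with a binomial coin model.
# P(D | H): the bias theta is fixed, the data D varies.
# L(H | D): the data (7 heads in 10 flips) are fixed, theta varies.
from math import comb

def binom(k, n, theta):
    """P(k heads in n flips | bias theta)."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)

# Probability: fix theta = 0.5; summing over all possible data gives 1.
total = sum(binom(k, 10, 0.5) for k in range(11))
print(total)  # 1.0 -- probability is normalized

# Likelihood: fix the data (k = 7 heads), vary the hypothesis theta.
lik = {theta: binom(7, 10, theta) for theta in (0.3, 0.5, 0.7)}
print(max(lik, key=lik.get))  # theta = 0.7 maximizes the likelihood
```

Note that the likelihood values over the candidate hypotheses do not sum to 1, which is exactly the "not normalized" point made above.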
Question 6
Clustering is a data mining technique and a discovery process used to group sets of data so that intracluster similarity increases while intercluster similarity decreases (Saxena et al., 2017). An example of merging based on the proximity of clusters, which yields a more natural set of clusters than merging based on the strength of their interconnectedness, is the following: when the centres of two clusters are close to each other, they are merged even if the direct connections between their members are weaker than those between other cluster pairs. Market product purchase patterns are an excellent example of such cluster sets.
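The intracluster/intercluster criterion can be sketched on toy data: a good grouping keeps the largest within-cluster distance well below the smallest between-cluster distance, and a proximity-based merge decision simply compares centre-to-centre distances.

```python
# Sketch (toy data): good clustering = small within-cluster distances
# (high intracluster similarity) and large between-cluster distances
# (low intercluster similarity).
def dist(a, b):
    """Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

cluster1 = [(0.0, 0.0), (0.0, 1.0)]
cluster2 = [(9.0, 9.0), (9.0, 10.0)]

within = max(dist(a, b) for a in cluster1 for b in cluster1)
between = min(dist(a, b) for a in cluster1 for b in cluster2)
print(within, between)  # within-cluster spread << between-cluster gap

# Proximity-based merging would compare cluster centres, not links:
centre1 = tuple(sum(c) / len(cluster1) for c in zip(*cluster1))
centre2 = tuple(sum(c) / len(cluster2) for c in zip(*cluster2))
print(dist(centre1, centre2))  # large gap, so the clusters stay separate
```

With purchase-pattern data, the same comparison would keep customers with similar baskets together while separating groups whose buying habits differ.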