Data visualization
Task 1
Storytelling with data draws on several techniques, including defining and understanding the audience, information architecture, and a call to action. Defining and understanding the audience is a critical first step because it shapes the story and helps ensure that the content is well understood. It dictates the choice of words, figures, and graphics. For instance, when the audience belongs to a particular profession, jargon from that field may be used: a presentation to accountants can use terms such as turnover ratio without explanation. A general audience with no shared professional background calls for simpler terms. The advantage of this technique is that it makes the story more relevant to the people receiving it.
The second technique is information architecture (Kirk, 2019). Here, the storyteller minimizes the number of choices the audience has to make by providing a clear outline that guides them through the whole story. An example of such an outline is: definition of the problem > enabling the audience to relate > quantification of the problem > call to action. The advantage of this kind of flow is that the audience can follow along easily without getting lost along the way. The last storytelling technique is a call to action. Here, the audience is given recommendations to explore, depending on the story at hand. For example, the call to action may be that accounting standards should be applied to prevent liability. The advantage of a call to action is that it offers a solution when the story is built around a problem.
Task 2
- Define the following
Well-separated clusters
In a well-separated cluster, every point is closer (or more similar) to every other point in the same cluster than to any point outside the cluster.
Centre-based clusters
In a centre-based cluster, every object is closer to the centre of its own cluster than to the centre of any other cluster. The centre is usually a centroid (the average of the points) or a medoid (the most representative point).
Contiguous clusters
In a contiguous cluster, every point is closer to at least one other point in the same cluster than to any point outside the cluster.
Density-based clusters
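A density-based cluster is a dense region of points that is separated from other dense regions by regions of low density. This definition is used when the clusters are irregular or intertwined and when noise and outliers are present.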
Property or Conceptual
A conceptual cluster is a set of objects that share some common property or represent a particular concept.
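The well-separated definition can be checked directly on a small data set. The sketch below is only an illustration: the function name `is_well_separated`, the use of Euclidean distance, and the sample points are assumptions made for this example. It tests whether every point is closer to all points in its own cluster than to any point in another cluster.

```python
from math import dist  # Euclidean distance, Python 3.8+


def is_well_separated(clusters):
    """Return True if every point is closer to every point in its own
    cluster than to any point that belongs to a different cluster."""
    for i, cluster in enumerate(clusters):
        # All points that are NOT in the current cluster.
        outside = [r for j, c in enumerate(clusters) if j != i for r in c]
        for p in cluster:
            farthest_inside = max(dist(p, q) for q in cluster)
            nearest_outside = min((dist(p, r) for r in outside),
                                  default=float("inf"))
            if farthest_inside >= nearest_outside:
                return False
    return True


# Hypothetical sample data: two tight groups that lie far apart.
tight = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11)]]
# Hypothetical sample data: (5, 0) is nearer to (6, 0) than to (0, 0).
stretched = [[(0, 0), (5, 0)], [(6, 0), (11, 0)]]

print(is_well_separated(tight))      # True
print(is_well_separated(stretched))  # False
```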
- Define the strengths of Hierarchical Clustering and then explain the two main types of Hierarchical Clustering.
One of the strengths of hierarchical clustering is that it is easy to implement.
Another strength is that the number of clusters does not have to be specified in advance; any desired number of clusters can be obtained by cutting the dendrogram at the appropriate level.
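The two main types of hierarchical clustering are agglomerative and divisive. Agglomerative clustering takes a bottom-up approach: each point starts as its own cluster, and the closest pair of clusters is merged at each step until only one cluster remains. Divisive clustering takes a top-down approach: all points start in a single cluster, which is split at each step until every cluster contains a single point.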
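The bottom-up (agglomerative) approach can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function name `agglomerative_levels`, the choice of single-link (nearest-pair) distance, and the sample points are assumptions made for this example. The function records the clustering after every merge, so any number of clusters can be read off afterwards without being fixed in advance.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative_levels(points):
    """Single-link agglomerative clustering.

    Returns a list of clusterings: the first entry has every point in its
    own cluster, the last entry has all points in one cluster.
    """
    clusters = [[p] for p in points]  # bottom-up: start with singletons
    levels = [[c[:] for c in clusters]]
    while len(clusters) > 1:
        # Find the pair of clusters whose closest members are nearest.
        _, i, j = min(
            (min(dist(p, q) for p in clusters[a] for q in clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]  # merge j into i
        del clusters[j]
        levels.append([c[:] for c in clusters])
    return levels


# Hypothetical sample data.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
levels = agglomerative_levels(points)

# "Cutting" the hierarchy at k clusters is just an index lookup.
k = 3
print(levels[len(points) - k])
# [[(0, 0), (0, 1)], [(5, 5), (5, 6)], [(20, 20)]]
```

A divisive version would run in the opposite direction, starting from the single cluster in `levels[-1]` and splitting it step by step.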
- DBSCAN is a density-based algorithm. Explain the characteristics of DBSCAN.
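DBSCAN is resistant to noise: points that lie in low-density regions are labelled as noise instead of being forced into a cluster.
DBSCAN can find clusters of arbitrary shape and size, because a cluster is grown by linking neighbouring dense regions and not by measuring the distance to a centre.
DBSCAN classifies each point as a core, border, or noise point, based on a radius (Eps) and a minimum number of points within that radius (MinPts).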
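The behaviour of DBSCAN can be illustrated with a small, unoptimised sketch. The function name `dbscan`, the parameter names `eps` and `min_pts`, the `NOISE` constant, and the sample points are assumptions made for this example; a real project would more likely use a library implementation. The sketch grows a cluster outwards from each core point and leaves isolated points labelled as noise.

```python
from math import dist  # Euclidean distance, Python 3.8+

NOISE = -1  # label used here for points that belong to no cluster


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned or marked as noise
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster_id += 1  # point i is a core point: start a new cluster
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: joins, does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is also a core point
                queue.extend(reachable)
    return labels


# Hypothetical sample data: two dense squares and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
# [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The isolated point at (50, 50) has too few neighbours to start or join a cluster, so it keeps the noise label.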
- List and Explain the three types of measures associated with Cluster Validity.
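External index: measures how well the cluster labels match externally supplied class labels, for example entropy.
Internal index: measures the goodness of a clustering structure without reference to external information, for example the sum of squared error (SSE).
Relative index: compares two different clusterings or clusters, often by using an external or internal index for the comparison.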
- With regard to Internal Measures in Clustering, explain Cohesion and Separation.
Cohesion measures how closely related the objects within a cluster are, while separation measures how distinct or well separated a cluster is from the other clusters.
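One common way to put numbers on these two ideas is to measure cohesion as the within-cluster sum of squares (WSS) and separation as the between-cluster sum of squares (BSS). The sketch below assumes that centroid-based formulation; the function names and the one-dimensional sample points are chosen only for illustration.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cohesion_wss(clusters):
    """Within-cluster sum of squares: smaller means tighter clusters."""
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)


def separation_bss(clusters):
    """Between-cluster sum of squares: larger means clusters lie further apart."""
    overall = centroid([p for c in clusters for p in c])
    return sum(len(c) * sq_dist(centroid(c), overall) for c in clusters)


# Hypothetical one-dimensional sample data: the values 1, 2, 4 and 5.
two_clusters = [[(1,), (2,)], [(4,), (5,)]]
one_cluster = [[(1,), (2,), (4,), (5,)]]

print(cohesion_wss(two_clusters), separation_bss(two_clusters))  # 1.0 9.0
print(cohesion_wss(one_cluster), separation_bss(one_cluster))    # 10.0 0.0
```

In both cases WSS and BSS add up to the same total (10), so for this formulation improving cohesion and improving separation go hand in hand.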
Reference
Kirk, A. (2019). Data Visualization: A Handbook for Data-Driven Design (2nd ed.). Thousand Oaks, CA: Sage Publications, Ltd.