Business analytics and intelligence
Abstract
Applying Business Intelligence in any kind of business brings a pool of advantages that drives significant returns on investment. It simplifies converting raw data into meaningful business intelligence, relieving many physically complex tasks by giving organizations the ability to transform data from several sources into accurate, usable information that can be shared securely throughout the organization. In addition, it enables managers to make sound business decisions quickly by providing the query and reporting tools they need, as well as the means to share results. The paper also aims at describing the process of building Business Intelligence (BI) systems.
Introduction
Business intelligence (BI) refers to a collection of applications and software used to analyze various aspects of data sets and present them in forms that enhance decision making. BI has evolved from basic reporting and historical query tools to include components such as forecasting, online analytical processing (OLAP), predictive modeling, data management, data mining, and optimization. When BI is combined with the essential tools, business organizations and companies can identify what is or is not working at present, determine which historical factors made it so, and recognize future trends to maximize their potential.
Clustering algorithms
Clustering refers to grouping data into k subgroups called clusters. During clustering, k should be less than or equal to n (the population size). In multivariate data sets from fields such as marketing, clustering methods are used to identify groups of similar objects. The main types of clustering algorithms are hierarchical clustering algorithms, k-means, the self-organizing maps (SOM) algorithm, and the expectation maximization (EM) clustering algorithm.
K-means algorithm
K-means is a partitioning method in which each object is assigned to one of k groups. The number of clusters k is chosen in advance. The multidimensional mean of each cluster is calculated, and each object is assigned to the group with the nearest centroid. The method reduces the overall within-cluster dispersion by iteratively rearranging cluster members (Abbas, 2008). The algorithm takes a set of objects S and an integer k as input and yields k subsets of S (that is, S1, S2, …, Sk), using the sum-of-squares optimization criterion.
Steps of clustering
Choose k, the number of clusters.
Calculate the multidimensional mean (centroid) of each cluster.
Allocate each object to the nearest centroid, and repeat the previous two steps until the assignments stop changing.
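The steps above can be sketched in a few lines of Python. This is a minimal illustration on made-up one-dimensional data (the sample values and k = 2 are assumptions for the example), not a production implementation:

```python
import random

def k_means(points, k, iterations=20, seed=0):
    """Minimal k-means on 1-D data: pick k initial centroids from the
    data, then alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)           # step 1: choose k initial centroids
    for _ in range(iterations):
        # step 3: allocate each object to the nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # step 2: recompute each cluster's mean (centroid)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two well-separated groups of values
data = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]
centroids, clusters = k_means(data, k=2)
```

With well-separated data like this, the centroids settle on the means of the two groups after a couple of iterations, regardless of which points are drawn as the initial centroids.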
Benefits
Its modest time and space complexity makes it a suitable clustering technique. It is also order-independent: for a given set of cluster centroids, "it produces the same group of data, regardless of the order in which the patterns are presented to the algorithm" (Abbas, 2008). This clustering algorithm is applied in market analysis, where the primary goal is to minimize the total cost.
Hierarchical clustering algorithm
Here, each object initially forms a separate cluster. Existing groups are then iteratively combined or divided, depending on their similarities or differences, to create a hierarchical construction that records the order in which groups are merged or split. The objects initially belong to singleton subsets S1, S2, …, Sn; a cost function ranks candidate pairs (say Si and Sj), and the cheapest pair on the list is merged. After merging, Si and Sj are eliminated from the catalog and replaced by the union of Si and Sj (Abbas, 2008). The iteration continues until all objects are in a single group.
Steps for clustering
Identify the two clusters that are closest to each other.
Merge the two clusters.
Continue with iterations until all objects are merged to form a single group.
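A minimal sketch of this merge loop, assuming single-linkage (closest-pair) distance as the cost function and made-up one-dimensional data; it records each merge so the hierarchy can be read back:

```python
def agglomerative(points):
    """Minimal agglomerative clustering on 1-D data: start with one
    cluster per object, repeatedly merge the cheapest (closest) pair,
    and record the merge order."""
    clusters = [[p] for p in points]            # every object is its own cluster
    merges = []
    while len(clusters) > 1:
        # find the pair Si, Sj with the smallest single-linkage distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merged = clusters[i] + clusters[j]      # union of Si and Sj
        # remove the pair from the list and add their union
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        merges.append((d, sorted(merged)))
    return merges

history = agglomerative([1.0, 1.1, 5.0, 5.2])
```

The recorded merge distances grow monotonically here (0.1, 0.2, then 3.9), which is exactly the dendrogram structure a hierarchical method exposes.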
Benefits
It is efficient for specific tasks because the algorithm is versatile and can handle any form of similarity or distance measure. The hierarchical clustering algorithm is used in environments where the objects exhibit varying levels of resemblance.
Self-organizing maps (SOM) algorithm
The self-organizing map algorithm is a training scheme for a neural network that reduces input dimensionality in order to represent the data's distribution as a map. This type of algorithm is used to obtain solutions for large, complex data. It has a "substantial deposition for visual cluster analysis as it provides the data reduction and spatialization of cluster prototypes, forming a baseline for visualization and interaction with data" (Schreck, 2010). The algorithm can be applied to a broad range of data types. Clusters are obtained from the primary data, and map nodes compete to represent each data sample. The weight vectors are initialized, sample vectors are selected randomly, similar samples are mapped close together, and iterations are performed.
Steps of clustering
The weight of each node is initialized.
A vector is selected randomly from the set of training data.
Each node is examined to find the one whose weight is most similar to the input vector. The winning node is referred to as the Best Matching Unit (BMU).
The neighborhood of the BMU is calculated; its radius shrinks over time, reducing the number of neighbors.
The weights of the BMU and its neighbors are adjusted toward the sample vector, and iterations continue.
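A toy sketch of these steps for scalar inputs and a one-dimensional map. The node count, epoch count, and decay schedules are illustrative assumptions; real SOMs typically use two-dimensional grids and tuned schedules:

```python
import random

def train_som(data, n_nodes=4, epochs=200, seed=1):
    """Minimal one-dimensional SOM on scalar data. Nodes sit on a line;
    the BMU and its neighbours are pulled toward each sample, with the
    learning rate and neighbourhood radius shrinking over time."""
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_nodes)]   # step 1: initialise weights
    for t in range(epochs):
        x = rng.choice(data)                           # step 2: random sample
        # step 3: the node nearest the input is the Best Matching Unit
        bmu = min(range(n_nodes), key=lambda i: abs(x - weights[i]))
        # step 4: the neighbourhood radius shrinks over time
        radius = max(1, n_nodes // 2 - (t * n_nodes) // (2 * epochs))
        rate = 0.5 * (1 - t / epochs)                  # decaying learning rate
        # step 5: pull the BMU and its neighbours toward the sample
        for i in range(n_nodes):
            if abs(i - bmu) <= radius:
                weights[i] += rate * (x - weights[i])
    return weights

# Two clumps of scalar readings; node weights drift toward the data
weights = train_som([0.1, 0.12, 0.9, 0.95])
```

After training, neighbouring nodes hold similar weights, which is the "map" property that makes SOMs useful for visual cluster analysis.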
Benefits
It is useful for vector quantization and speech recognition.
The method utilizes different kinds of distance measures and joining criteria (Abbas, 2008).
Even when each map unit's region is convex, combining several map units allows the creation of non-convex clusters (Abbas, 2008).
However, if the initial weights are not appropriately selected, the SOM generates a sub-optimal partition.
Application
This algorithm is applied in fields that deal with big data, such as business domains and multimedia. Also, "SOM clustering has often been successfully applied in the Text (Nuernberger & Detyniecki (2006); Honkela et al. (1997)) and Multimedia Retrieval (Laaksonen et al. (2000); Pampalk et al. (2002); Bustos et al. (2004)) domains" (Schreck, 2010). The algorithm is therefore suitable for fields that handle large, complex data.
The expectation maximization (EM) clustering algorithm
Expectation maximization (EM) is an algorithm that models a data set as a linear combination (mixture) of multivariate normal distributions. As the name suggests, it alternates between an expectation step and a maximization step. The algorithm determines the parameters that maximize the log-likelihood of the data, a measure of model quality.
Steps of clustering
This technique has only two steps per iteration. First, estimate the clusters' membership values (expectation); second, optimize the model parameters (maximization) (Brownlee, 2019). The two steps are repeated for N iterations, until convergence occurs.
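The two steps can be sketched for the simplest case, a mixture of two one-dimensional normal distributions; the initialization scheme and the sample data are assumptions made for the illustration:

```python
import math

def em_two_gaussians(data, iters=50):
    """Minimal EM for a mixture of two 1-D normal distributions.
    E-step: estimate each point's responsibility for component 1.
    M-step: re-optimise the means, variances, and mixing weight."""
    mu1, mu2 = min(data), max(data)             # crude initialisation
    var1 = var2 = 1.0
    w1 = 0.5                                    # mixing weight of component 1
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        resp = []
        for x in data:
            p1 = w1 * math.exp(-(x - mu1) ** 2 / (2 * var1)) / math.sqrt(2 * math.pi * var1)
            p2 = (1 - w1) * math.exp(-(x - mu2) ** 2 / (2 * var2)) / math.sqrt(2 * math.pi * var2)
            resp.append(p1 / (p1 + p2))
        # M-step: maximise the likelihood given the responsibilities
        n1 = sum(resp)
        n2 = len(data) - n1
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / n2
        var1 = max(1e-6, sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        var2 = max(1e-6, sum((1 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2)
        w1 = n1 / len(data)
    return mu1, mu2

mu1, mu2 = em_two_gaussians([1.0, 1.2, 0.9, 6.0, 6.3, 5.8])
```

On data with two clear groups, the estimated means converge to the group averages within a few iterations; the variance floor (1e-6) is a small numerical safeguard.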
Benefits
The EM algorithm has a strong statistical basis. Given a proper initialization it converges quickly, and its update equations are simple, so the method does not involve many complications. It can also handle high dimensionality, accepts the desired number of clusters as an input, and is robust to noisy data. Considering that noisy data are bound to be encountered once clustering has begun, EM's capacity to deal with big data makes it a more effective tool than other methods such as hierarchical clustering algorithms.
Application
The expectation-maximization clustering algorithm is applicable in areas dealing with big data, as it provides maximum likelihood estimates of the cluster parameters. It is also used by machine learning students to solve problems such as probability density estimation and clustering (Brownlee, 2019).
To conclude this review of clustering algorithms, the application criteria are broadly the same: in each, k is the number of clusters. The hierarchical algorithm differs in that the number of clusters k is not specified in advance. All of these algorithms can be used for big data; however, k-means and the expectation maximization clustering algorithms are less complicated and therefore more convenient for dealing with big data.
However, for an analyst to perform cluster analysis effectively, they must be able to deal with all data types and scale the variables properly to avoid bias. The analyst should also be equipped with techniques for handling noisy data, since noise inevitably appears during clustering. The structure of the data should also be considered when determining which algorithm is suitable, based on the algorithm's flexibility, ability to handle dimensionality, applicability, and accessibility (Abbas, 2008).
Task 2: Bicycle manufacturing company.
The company's goal is to maximize profit on the new bike.
Table 1: Estimated demand for a new bicycle based on a price level.
Price (£) | Demand (in Thousand) |
1100 | 170 |
1125 | 166 |
1150 | 150 |
1175 | 138 |
1200 | 133 |
1225 | 131 |
1250 | 120 |
1275 | 105 |
1300 | 95 |
1325 | 90 |
1350 | 80 |
1375 | 75 |
1400 | 72 |
1425 | 70 |
1450 | 67 |
1475 | 64 |
1500 | 62 |
1525 | 61.5 |
1550 | 60 |
1575 | 59.5 |
Assumption: the unit production and supply cost of each bike is £900.
Profit per unit = sale price − unit production cost
The demand function fitted to the table is linear: D(p) = −0.2447p + 425.77 (demand in thousands)
Total profit, (p − 900) × D(p), is therefore quadratic in price; setting its derivative to zero gives the optimal price p* = (425.77/0.2447 + 900)/2 ≈ £1,320
The optimal demand will be D(p*) ≈ 102.8 (thousand)
The firm's optimal profit will be
optimal profit = (price − unit cost) × demand ≈ (1320 − 900) × 102.77 ≈ £43,162 (in thousands of pounds)
Therefore, the company's optimal profit will be approximately £43,162 thousand.
From the price and demand data for the new bike, the fitted demand line is D(p) = −0.2447p + 425.77. The firm's decision variable is the selling price p; the unit production cost is c = £900. Since the firm's profit is (p − c)D(p), that is, (sales price − unit production cost) times demand, the firm solves the unconstrained maximization problem max (p − c)D(p). The optimal selling price and the resulting optimal demand give the maximum profit. This is the theoretical view behind the solution to finding the firm's optimum (part d).
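This maximization can be checked numerically. The sketch below introduces nothing new: it evaluates the quadratic profit implied by the fitted demand line and locates its vertex in closed form:

```python
# Maximise profit (p - c) * D(p) for the fitted linear demand
# D(p) = -0.2447p + 425.77 (demand in thousands), unit cost c = 900.
# Profit therefore comes out in thousands of pounds.
def demand(p):
    return -0.2447 * p + 425.77

def profit(p, unit_cost=900):
    return (p - unit_cost) * demand(p)

# The profit is quadratic in p, so the vertex gives the optimum:
# d/dp [(p - c)(a - b p)] = 0  =>  p* = (a / b + c) / 2
a, b, c = 425.77, 0.2447, 900
p_star = (a / b + c) / 2
best = profit(p_star)
```

The vertex formula comes straight from differentiating (p − c)(a − bp); a grid search over the table's price range gives the same neighbourhood.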
Optimal prices can also be obtained from the firm's profit and loss by subtracting fixed costs from the contribution margin. If a table is used for the analysis, the optimal price is found in the row with the highest profit. The contribution margin is obtained by subtracting total variable costs from revenue (total sales), and revenue is the product of price and quantity, that is, R = q × p.
Optimal profit is directly affected by the optimal price and optimal demand, and the optimal price in turn depends on the unit cost. If the unit supply cost rises from the initial £900 to £1,000, £1,100, or £1,150, the margin at any given price falls by £100, £200, or £250, respectively. For a linear demand D(p) = a − bp, the profit-maximizing price is p* = (a/b + c)/2, so a higher unit cost pushes the optimal price up while pushing both the optimal demand and the optimal profit down. The lower the cost of production, the more units the firm can profitably sell, which is why higher demand combined with lower cost maximizes profit.
Task 3
Compute the forecasted sale using an alpha value of 0.3 [5 marks]
(The forecasts below use the smoothing recursion F(t+1) = F(t) + α·(A(t) − F(t)), seeded with F(2) = A(1) = 137.)
Period | Sale (in Thousands) | Forecast (α=0.3) | Difference |
1 | 137 | #N/A | |
2 | 138 | 137.0000 | 1.0000 |
3 | 134 | 137.3000 | -3.3000 |
4 | 133 | 136.3100 | -3.3100 |
5 | 136 | 135.3170 | 0.6830 |
6 | 140 | 135.5219 | 4.4781 |
7 | 137 | 136.8653 | 0.1347 |
8 | 141 | 136.9057 | 4.0943 |
9 | 136 | 138.1340 | -2.1340 |
10 | 142 | 137.4938 | 4.5062 |
11 | 145 | 138.8457 | 6.1543 |
12 | 143 | 140.6920 | 2.3080 |
13 | 145 | 141.3844 | 3.6156 |
14 | 142 | 142.4691 | -0.4691 |
15 | 150 | 142.3283 | 7.6717 |
Plot the data for the actual demand and the forecasted figures. Describe the main features of the series. [5 marks]
The differences between the actual values and the forecast values are small, which supports the accuracy of this prediction model. The series shows short irregular up-and-down swings around a gradually rising level, so the sales exhibit a mild upward trend rather than strong seasonality.
Calculate the error, Mean Absolute Deviation (MAD) error, Mean Square Error (MSE) and Mean Absolute Percentage Error (MAPE). [10 marks]
The following output is a summary of what is expected in this section.
α | 0.3 |
Period | Sale (in Thousands) | Exponential forecast (0.3) | error | ABS(error) | error^2 | %error | ABS%error |
1 | 137 | | | | | | |
2 | 138 | 137.00 | 1.00 | 1.00 | 1.00 | 0.7% | 0.7% |
3 | 134 | 137.30 | -3.30 | 3.30 | 10.89 | -2.5% | 2.5% |
4 | 133 | 136.31 | -3.31 | 3.31 | 10.96 | -2.5% | 2.5% |
5 | 136 | 135.32 | 0.68 | 0.68 | 0.47 | 0.5% | 0.5% |
6 | 140 | 135.52 | 4.48 | 4.48 | 20.05 | 3.2% | 3.2% |
7 | 137 | 136.87 | 0.13 | 0.13 | 0.02 | 0.1% | 0.1% |
8 | 141 | 136.91 | 4.09 | 4.09 | 16.76 | 2.9% | 2.9% |
9 | 136 | 138.13 | -2.13 | 2.13 | 4.55 | -1.6% | 1.6% |
10 | 142 | 137.49 | 4.51 | 4.51 | 20.31 | 3.2% | 3.2% |
11 | 145 | 138.85 | 6.15 | 6.15 | 37.88 | 4.2% | 4.2% |
12 | 143 | 140.69 | 2.31 | 2.31 | 5.33 | 1.6% | 1.6% |
13 | 145 | 141.38 | 3.62 | 3.62 | 13.07 | 2.5% | 2.5% |
14 | 142 | 142.47 | -0.47 | 0.47 | 0.22 | -0.3% | 0.3% |
15 | 150 | 142.33 | 7.67 | 7.67 | 58.85 | 5.1% | 5.1% |
Summary | | | | MAD = 3.13 | MSE = 14.31 | | MAPE = 2.2% |
Compute the forecasted values with the smoothing constants of 0.4 and 0.5. [10 marks]
Compute the Error, Mean Absolute Deviation (MAD) error, Mean Square Error (MSE), and Mean Absolute Percentage Error (MAPE) for the smoothing constants of 0.4 and 0.5. [10 marks]
α | 0.4 |
Period | Sale (in Thousands) | Exponential forecast (0.4) | error | ABS(error) | error^2 | %error | ABS%error |
1 | 137 | | | | | | |
2 | 138 | 137.00 | 1.00 | 1.00 | 1.00 | 0.7% | 0.7% |
3 | 134 | 137.40 | -3.40 | 3.40 | 11.56 | -2.5% | 2.5% |
4 | 133 | 136.04 | -3.04 | 3.04 | 9.24 | -2.3% | 2.3% |
5 | 136 | 134.82 | 1.18 | 1.18 | 1.38 | 0.9% | 0.9% |
6 | 140 | 135.29 | 4.71 | 4.71 | 22.14 | 3.4% | 3.4% |
7 | 137 | 137.18 | -0.18 | 0.18 | 0.03 | -0.1% | 0.1% |
8 | 141 | 137.11 | 3.89 | 3.89 | 15.16 | 2.8% | 2.8% |
9 | 136 | 138.66 | -2.66 | 2.66 | 7.09 | -2.0% | 2.0% |
10 | 142 | 137.60 | 4.40 | 4.40 | 19.38 | 3.1% | 3.1% |
11 | 145 | 139.36 | 5.64 | 5.64 | 31.82 | 3.9% | 3.9% |
12 | 143 | 141.62 | 1.38 | 1.38 | 1.92 | 1.0% | 1.0% |
13 | 145 | 142.17 | 2.83 | 2.83 | 8.01 | 2.0% | 2.0% |
14 | 142 | 143.30 | -1.30 | 1.30 | 1.69 | -0.9% | 0.9% |
15 | 150 | 142.78 | 7.22 | 7.22 | 52.12 | 4.8% | 4.8% |
Summary | | | | MAD = 3.06 | MSE = 13.04 | | MAPE = 2.2% |
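The forecasts and all four error measures can be reproduced for any smoothing constant with a short script. It assumes the common convention of seeding the forecast with the first actual value; a different seed changes the figures slightly:

```python
def ses(sales, alpha):
    """Simple exponential smoothing, F(t+1) = F(t) + alpha*(A(t) - F(t)),
    seeded with the first actual value. Returns forecasts for periods
    2 .. n+1; the last entry is the out-of-sample forecast."""
    forecasts = [sales[0]]
    for actual in sales[1:]:
        prev = forecasts[-1]
        forecasts.append(prev + alpha * (actual - prev))
    return forecasts

def error_metrics(sales, forecasts):
    """MAD, MSE and MAPE over the in-sample periods 2..n."""
    pairs = list(zip(sales[1:], forecasts))
    errors = [a - f for a, f in pairs]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mape = 100 * sum(abs(e) / a for e, (a, _) in zip(errors, pairs)) / n
    return mad, mse, mape

sales = [137, 138, 134, 133, 136, 140, 137, 141, 136, 142, 145, 143, 145, 142, 150]
results = {}
for alpha in (0.3, 0.4, 0.5):
    f = ses(sales, alpha)
    results[alpha] = (*error_metrics(sales, f[:-1]), f[-1])  # (MAD, MSE, MAPE, F16)
```

Looping over candidate α values and keeping the one with the lowest MAD or MSE is exactly the selection criterion discussed in the conclusion.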
Compare extracted measures for different smoothing constants and select the proper one. [5 marks]
The extracted measures differ only slightly across the smoothing constants; the constant that yields the lowest MAD, MSE, and MAPE should be selected as the proper one.
Forecast the demand for the 16th period and discuss the outcome of the forecasted sale amount. [5 marks]
Carrying the α = 0.3 recursion one step beyond period 15 gives an expected period-16 demand of about 145 (thousand). The forecast lies below the latest actual sale of 150 because exponential smoothing lags behind a rising series.
Conclusion
The exponential smoothing technique is widely used by organizations to predict future events. A challenge, however, is assigning the value of the exponential smoothing constant. In this BI study, the problem was addressed by determining the optimal value of the constant: the mean square error and the mean absolute deviation are minimized to obtain it, using Excel formulas for the minimum mean square error and mean absolute deviation, respectively. With this criterion, any organization can adopt the optimal value of the exponential smoothing constant to enhance the accuracy of forecasting. The study could be extended to minimize other forecast errors, such as the mean absolute percent error (MAPE) and the cumulative forecast error (CFE).
References
Abbas, A. (2008). Comparisons between data clustering algorithms. The International Arab Journal of Information Technology, 5(3), 320-325.
Ahn, J. (2005). Self-organizing map tutorial system. Interactive system design. http://www.pitt.edu/~is2470pb/Spring05/FinalProjects/Group1a/tutorial/som.html
Brownlee, J. (2019). A gentle introduction to Expectation-maximization (EM) algorithm. Retrieved July 25, 2020, from http://machinelearningmastery.com/expectation-maximization-em-algorithm
Schreck, T. (2010). Visual-interactive Analysis with self-organizing maps- advances and research challenges. Intech-open. http://doi.org/10.5772/9171