The Development of Micro-Based Applications

\begin{abstract}

Container technologies have enabled the development of microservice-based applications and their deployment in the cloud in an agile fashion. Nevertheless, anomalous service behavior may result in the infringement of service level agreements (S.L.A.s), and anomaly detection methods are needed to identify it. Anomalies in microservice-based cloud applications (M.C.A.s) can be detected by analysing response time and resource utilization. In this respect, machine learning was used to predict anomalies in the M.C.A. The framework employs three stages in the detection of anomalous behavior: pre-processing, detection, and localization. The input is the monitoring data of a target application obtained from the monitoring system. First, the pre-processing techniques are applied to the monitoring data. The detection phase then identifies anomalies in the monitoring data. Should there be anomalies, the localization element specifies the fault type and the microservice in which the fault exists, yielding outputs such as a C.P.U. fault in microservice x. To achieve these objectives, machine-learning algorithms were used to train the detection element, and the localization element classifies the anomalies identified in the detection phase. The framework was then implemented using performance indicators obtained from the monitoring system of an M.C.A. whose target application constituted three microservices. For both detection and localization training, several algorithms were deployed: decision tree, linear discriminant analysis, Naïve Bayes, and support vector machine. For the detection task the algorithms were 100% accurate, whereas for the localization task the results of every algorithm degraded. Overall, the most effective algorithm was the decision tree.
\end{abstract}

\section{Introduction}\label{sec:introduction}

Cloud computing builds on the concepts of parallel computing, utility computing, service-oriented architecture, and virtualization. The term cloud refers to the delivery of services through the blending of different networks, interfaces, storage, and hardware \cite{8057500}. Per the National Institute of Standards and Technology (N.I.S.T.), cloud computing is on-demand access to a shared pool of computing resources. Thus, it is an all-inclusive computing solution that utilizes various computing resources such as storage, hardware, networking, and software. Cloud computing services are offered to users according to their demand and promise to bring a significant revolution in how I.T. is integrated into business operations \cite{60336733}. Cloud computing allows organizations to deploy enterprise applications that (if properly designed) can effectively scale computing resources with demand. Organizations can thus run their applications on Infrastructure-as-a-Service (I.a.a.S.) or Platform-as-a-Service (P.a.a.S.) solutions. Alternatively, organizations may consider using Software-as-a-Service, where they deploy their business applications \cite{7333476}.

In a bid to achieve better reliability, scalability, and flexibility, businesses are developing cloud-based applications using the microservice design. Such applications are hosted on cloud platforms such as Microsoft Azure, Amazon A.W.S., and Google Cloud. An application built with the microservice approach consists of a collection of cloud services that are isolated, scalable, and resilient to failure. Not only do such services expose their own endpoints, they are also perceived as applications in their own right. Numerous benefits are obtained from the use of microservice architecture. For instance, it enables much faster software releases, and teams can be smaller and more focused on their work \cite{10.1007/978-3-030-05063-4_42}. The designer may generate sufficiently isolated resources by deploying suitable virtualization techniques. Virtual machines (V.M.s) are the traditional approach to virtualization, with every V.M. running its own operating system (O.S.). The container, on the other hand, is the newer virtualization technology, and it is gradually gaining popularity over V.M.s due to its light weight, better scalability, and higher performance \cite{8229914}.

The inception of virtualization technologies such as containers has contributed to the adoption of the microservice architecture in recent years. For instance, a typical web application can consist of various components such as the user interface (U.I.), the server's back end, and the database. By definition, a microservice is essentially a container that runs a particular element of the application or service. Microservice-based cloud applications are largely hosted in large commercial data centers and colocation facilities, and hyperscale server farms are operated by service providers such as Facebook, Amazon, Google, and Microsoft. Such service providers emphasize the dependability of these microservices. S.L.A.s are agreed between the service providers and the end users according to the specified quality of service \cite{10.1007/978-3-030-05063-4_42}. As such, data centers ought to be consistently monitored to enable stable operation of every hosted application. The performance of cloud services can be effectively analysed and observed by various means such as application logs, audit trails, and monitoring data systems \cite{SAUVANAUD201884}, as shown in \cite{7381796} and \cite{8814569}.

\subsection{Problem Description}

Microservices are trusted to be a reliable approach for developing and deploying cloud-based applications that call for scalability, reliability, and agility. Microservices can be perceived as the disintegration of a monolithic application into autonomous software components, as advocated by the microservice architecture. Because the autonomous microservices can be designed, updated, and deployed independently of one another, runtime performance monitoring becomes sophisticated, which may contribute to considerable management problems \cite{8814569}. As a consequence, end users may experience performance anomalies that result from the distributed nature of the computation over the underlying resources. Such a phenomenon leads to performance deterioration and, ultimately, application failure. It is time-consuming for cloud service providers to gather the information about every individual service that is yielded by the monitoring tool. The monitoring tool analyzes the data and then detects anomalous behavior in the services. Thus, anomaly detection techniques are important for detecting any abnormal service behavior that may result in S.L.A. infringement.

\subsection{Research Objectives}

The project aims to establish feasible techniques for detecting abnormal behavior of microservices in cloud computing. The overarching aim is to design a comparatively simple framework. Further, the project aims to ensure that the designed framework can efficiently determine whether a microservice will experience performance anomalies due to the distributed nature of the computation on the central processing unit and memory. These are the underlying resources that may contribute to performance deterioration and, ultimately, application failure. Application failure is manifested by a slow response rate as well as significantly low throughput.

The objectives of the project are as follows:

  1. To propose and design an autonomous anomaly detection system that deploys machine-learning classification algorithms.
  2. To implement the suggested framework on a dataset obtained from the cloud-based monitoring system.
  3. To evaluate quantitatively the classification algorithms that efficiently localize abnormalities in microservice performance, based on precision and anomaly detection time.

\subsection{Report Outline}

The rest of the document is organized as follows. Section two discusses related research work on anomaly detection in cloud applications. Section three illustrates the design methodology adopted to build the detection and localization model for anomalies using machine-learning algorithms. Section four documents the performance of the automatic detection classifier and the localization of the anomalies. The report ends with a summary, a reflective conclusion, and a discussion of research gaps that can be addressed in future work.

\section{Background and Related Work}

\subsection{Anomaly Detection}

Chandola et al. \cite{chandola2009anomaly} describe anomalies as data patterns that do not conform to expected behavior. Such patterns are indicators of problems in the corresponding measurement domain. The choice of anomaly detection technique depends on the research area, the application domain, and the problem characteristics. Figure 1 below illustrates the correlation between the key elements and the anomaly detection tool; these main elements are attendant to the anomaly detection technique.

Various approaches have been developed for particular application domains. In this respect, principles from different disciplines, such as machine learning, statistics, and data mining, have been used in problem formulation \cite{chandola2009anomaly}. This project mainly draws on concepts from the machine-learning discipline.

Machine learning uses scientific principles that are also commonly used in artificial intelligence. Its overarching purpose is to make a machine learn from a given dataset and adapt its operation to new information. Machine learning therefore employs various algorithms that operate by building a model from input data and a set of features in order to make predictions about the final output \cite{7548905}. The machine-learning concept is applicable in three modes: unsupervised, supervised, and semi-supervised.

Several factors influence the particular features of an anomaly detection problem. The crucial aspects of a typical anomaly detection approach are the nature of the input data and of the output data. Other influential factors include the availability of labelled data and the nature of the anomaly. The input data is the collection of data instances, where each instance is characterized by a specific set of features. The output data defines how an anomaly is reported.

Anomalies are classifiable into three classes: point, contextual, and collective anomalies. An instance is a point anomaly if it is anomalous when compared with the remaining data. If a data instance is anomalous only in a particular context, it is referred to as a contextual anomaly; this type has previously been researched in both spatial and time-series data. The last type, a collective anomaly, is a collection of related data instances that is anomalous with respect to the whole dataset. Depending on the availability of labelled data, anomaly detection techniques operate in three modes: supervised, semi-supervised, and unsupervised. The supervised mode is applicable when labelled instances are available for both the normal and the anomalous classes. Approaches operated in a semi-supervised fashion need labelled instances only for the normal class. The unsupervised approach, which is closer to a clustering technique, requires no labelled instances at all.

Anomaly detection techniques have been researched extensively in different disciplines for many years, for instance in fraud detection, medical diagnostics, and intrusion detection. Since microservice architecture and virtualization technologies are widely adopted, performance monitoring and performance evaluation systems are hotly debated topics in cloud computing research. The detection of anomalous activities is therefore a heavily researched area in the troubleshooting and analysis of microservice-based cloud applications.

\subsection{Related Work}

Performance monitoring and evaluation is a topic in cloud computing that interests many researchers, and anomaly detection is heavily researched to improve the analysis of microservice-based cloud applications. Statistical analysis, adaptive approaches, and machine learning are widely used in the literature.

\subsubsection{Machine-Learning Anomaly Detection Techniques}

In \cite{SAUVANAUD201884}, the authors presented anomaly detection models for cloud services. They employed a cloud application comprising several services deployed in several V.M.s, where every V.M. operated a particular service. The V.M. performance data was collected and processed, and machine-learning techniques were used to produce results exposing the anomalies.

In \cite{10.1007/978-3-030-05063-4_42}, an anomaly detection system (A.D.S.) is proposed to detect and analyze anomalies in the deployed microservices. This is achieved through monitoring and analysis of real-time performance data. The proposed A.D.S. constitutes a monitoring module, which gathers the containers' performance data, and a data-processing module driven by machine-learning models.

Jiang et al. suggested a web application scheme that can be deployed on cloud technology. The scheme applied machine learning to the analysis of the historical distribution of web requests. Machine learning was also used to predict the future workload and to auto-scale the resources as required, in a quest to avoid anomalies that may result from the workload.

\subsubsection{Other Anomaly Detection Methods}

In \cite{7116586}, the authors proposed an automatic fault diagnosis framework abbreviated F.D.4.C. The framework comprises four conventional stages: system monitoring, fault localization, status characterization, and fault detection.

The authors of \cite{7904276} proposed an adaptive anomaly detection approach for cloud infrastructures. Such a model analyzes the key components of the performance metrics.

The report in \cite{8795337} describes an anomaly detection and prediction system based on the hidden Markov model (H.M.M.). In this case, response-time analysis is performed to establish and forecast the container's resource utilization based on the probabilities in the model.

\section{The Anomaly Detection Framework}

The overarching objective of this research is the design of an anomaly detection framework for applications and systems used in cloud computing. The framework will be used to diagnose faults in web applications. In this respect, a container-based web application is deployable on a cluster of multiple containers rather than on a single host. Every container cluster constitutes several hosts, also referred to as nodes. Every node can sustain various containers, and several services can be accommodated in one container. In practice, however, it is usually recommended to have several small containers rather than one big one: if every container has a tight focus, it is much easier to maintain microservices and diagnose issues. For the purpose of this research project, one assumption is upheld: a single container contains only one microservice. It is possible to tell whether a container performs well by processing and analysing the container's performance data; likewise, we can tell whether a microservice is anomalous by processing its performance data.

The proposed framework is designed to automatically single out anomalies in the monitored microservices. This is achieved by locating different kinds of performance degradation. It also maps the dependency between the possible faults and the observed failures that result from the underlying cloud resources. This is attained through the application of a multi-resolution model \cite{7420511}, an approach that analyzes the performance data in phases. The first phase is failure detection, whereas the second phase entails fault localization. The figure demonstrates the proposed framework, and this part discusses the framework in appreciable detail.

\subsection{Data Monitoring}

The metrics that showcase performance are obtained from the microservice-based cloud application. It is critical that monitoring metrics are gathered at each layer: hardware, operating system, virtual-machine layer, application layer, and database layer. The hardware metrics may include, but are not limited to, memory utilization and central processing unit utilization. This project uses three metrics to analyze the anomalies present in microservices, listed below; a minimal computation sketch follows the list.

  1. Response time: the time spent processing a request until the response arrives.
  2. Throughput: the average number of completed interactions per unit time within an interval of a session.
  3. Resource utilization: the quantity of resources used, such as memory and central processing unit, compared with the total resources provided per unit time.
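To make the three metrics concrete, the following is a minimal sketch in Python, assuming a list of request records with hypothetical fields (start, end, cpu\_used, cpu\_total, mem\_used, mem\_total); the field names, values, and aggregation window are illustrative assumptions, not part of the monitoring agent described later.

\begin{verbatim}
# Minimal sketch: response time, throughput, and resource utilization
# from hypothetical monitoring records (field names and values assumed).
from statistics import mean

requests = [  # one record per completed request (illustrative values)
    {"start": 0.00, "end": 0.35, "cpu_used": 0.20, "cpu_total": 0.50,
     "mem_used": 120, "mem_total": 256},
    {"start": 0.10, "end": 0.80, "cpu_used": 0.30, "cpu_total": 0.50,
     "mem_used": 140, "mem_total": 256},
]

window = 1.0  # observation interval in seconds (assumed)

response_times = [r["end"] - r["start"] for r in requests]   # seconds
throughput = len(requests) / window                          # requests per second
cpu_utilization = mean(r["cpu_used"] / r["cpu_total"] for r in requests)
mem_utilization = mean(r["mem_used"] / r["mem_total"] for r in requests)

print(mean(response_times), throughput, cpu_utilization, mem_utilization)
\end{verbatim}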

\subsection{The Data Pre-processing Element}

The pre-processing element processes the gathered data by extracting, from the different layers, the data needed by the failure detection element. It then relays the processed data to the failure detection element for further analysis.

\subsection{Failure Detection Component}

For this research project, response times are used to analyze the application's normal behavior. In order to decide whether the present situation is abnormal, a threshold should be set.

The response time is observed to vary between the normal and abnormal categories as a result of the resource workload. An anomalous response time can be perceived as an observed failure that end users use to determine the workload type that may trigger anomalies. Thus, both response time and throughput can be categorized into two classes. In a poorly performing microservice, the abnormal response time varies between 501 milliseconds and one second, and it impedes throughput. A normal response time, less than or equal to 500 milliseconds, leaves throughput unaffected, and the network administrator should not perceive it as a fault.

To do this, the failure detection component runs lightweight algorithms that categorize the response-time behavior as either normal or abnormal, as sketched below.
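A minimal sketch of this labelling step, assuming response times are expressed in milliseconds and using the 500 ms boundary described above; the function name and the sample values are illustrative.

\begin{verbatim}
# Minimal sketch: label response times as normal (0) or abnormal (1)
# using the 500 ms boundary described above (names and data assumed).
THRESHOLD_MS = 500

def label_response_time(rt_ms):
    """Return 1 (abnormal) if the response time exceeds 500 ms, else 0."""
    return 1 if rt_ms > THRESHOLD_MS else 0

samples_ms = [120, 480, 510, 950, 300]
labels = [label_response_time(rt) for rt in samples_ms]
print(labels)  # [0, 0, 1, 1, 0]
\end{verbatim}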

\subsection{Fault Localization Component}

Faults are detected by tracking status changes of the microservices; fault localization then accurately locates the anomalous microservice when an abnormality happens. Such faults stem from fluctuations in resource utilization. In this project, resource utilization refers to memory and central processing unit usage. The utilization rate refers to the share of the resource's capacity that is in operation, and it helps determine the workload of the resource. For instance, a resource is considered saturated when its usage reaches fifty percent of its capacity.

The resource workload fluctuation is classifiable into overload and normal load. Resource overload occurs when the resource usage rate surpasses fifty percent of the total capacity; once the overload subsides, the system returns to normal load. The localization process uses multi-class classification algorithms to analyze the additional resource metrics of both the central processing unit and the memory, as sketched below.
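The following is a minimal sketch of how the resource-workload labelling could look, assuming utilization is expressed as a fraction of capacity and using the fifty-percent boundary above; the names and values are illustrative, not the project's actual code.

\begin{verbatim}
# Minimal sketch: classify resource workload as normal or overload
# using the fifty-percent boundary described above (names assumed).
OVERLOAD_FRACTION = 0.5

def workload_class(cpu_util, mem_util):
    """Return coarse per-resource workload labels for one microservice."""
    labels = []
    if cpu_util > OVERLOAD_FRACTION:
        labels.append("cpu_overload")
    if mem_util > OVERLOAD_FRACTION:
        labels.append("memory_overload")
    return labels or ["normal_load"]

print(workload_class(0.72, 0.31))  # ['cpu_overload']
print(workload_class(0.40, 0.20))  # ['normal_load']
\end{verbatim}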

\subsection{Output of the Anomaly Detection Module}

As stated earlier, anomaly localization calls for a multi-class classification that determines the resource anomaly classes in every microservice. Pragmatic examples of outputs include an anomaly in the central processing unit of microservice k or an anomaly in the memory of microservice x.

\section{Detection and Localization Using Machine Learning}

Machine learning is among the popular topics in computer science. It explores strategies for implementing automatic computing techniques in which a task is learned without explicit programming. For this reason, machine learning is very relevant to the classification of numerical data \cite{8597302}. Machine learning can be applied to categorize behaviors that correspond to various kinds of anomalies by creating a classification model that operates on the monitoring data. A significant number of machine-learning algorithms exist; this project focuses on supervised-learning algorithms, which are the most widely employed classifiers.

As discussed in \cite{SAUVANAUD201884}, supervised-learning algorithms involve two phases. The first is the training phase, in which the classification models are created offline. The creation of the models requires many samples of labelled monitoring data, each indicating whether there was an anomaly when the sample was collected. These samples are referred to as the training data; they are collected by the monitoring entity, which stores them in a database, during offline executions of the system performed for training purposes. During these executions, the target system is monitored both while it exhibits normal behavior and while anomalies are revealed through a fault injection component. Once a model is trained, it is deployed in the second phase, referred to as the detection phase. In a real deployment, this second stage corresponds to the operational stage of the system: after the offline model-training stage, it becomes feasible to spot anomalies occurring in the system. The observations of every microservice are conveyed directly by the monitoring system to the corresponding data-processing elements and processed online. However, this project only applies the training phase.

During the implementation of this project, four algorithms were tested, all available in the open-source Python machine-learning library scikit-learn. The project chose algorithms that are commonly used in machine learning and in anomaly detection: Naïve Bayes, linear discriminant analysis, decision tree, and support vector machine, as sketched below.
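A minimal sketch of how the four classifiers could be instantiated and compared with scikit-learn; the synthetic data and parameter values are illustrative assumptions, not the project's actual dataset.

\begin{verbatim}
# Minimal sketch: training the four classifiers used in this project
# with scikit-learn on synthetic data (illustrative, not the real dataset).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_classification(n_samples=1280, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, random_state=0)

classifiers = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "LDA": LinearDiscriminantAnalysis(),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    print(name, clf.score(X_test, y_test))  # accuracy on the held-out set
\end{verbatim}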

\subsection{Decision Tree}

A decision tree classifier is characterized by the fact that an unknown sample can be assigned to a specific class by applying one or many decision functions successively. Generally, the decision tree constitutes a root node, several interior nodes, and terminal nodes. The root node and the interior nodes, commonly referred to as the non-terminal nodes, are linked into decision stages, whereas the terminal nodes represent the final classification. Associated with the root node is the set of classes into which a sample can be grouped. A layer is defined as the set of nodes at a particular tree level. Every node constitutes the group of classes to be discriminated, the group of features to be used, and the decision rule that performs the classification \cite{6498972}.

Decision trees are generally preferred for a couple of reasons. For instance, compared with conventional single-stage classifiers, they are more efficient: in a single-stage classifier every data sample must be tested against all classes, which reduces efficiency, whereas in a tree classifier a sample is only tested against a subset of the classes, doing away with unneeded computation \cite{97458}.

Additionally, the flexibility of the decision tree makes it preferred by most designers due to its scalability, which can be used to improve classifier performance. Such an improvement optimizes the accuracy whilst reducing the computation prerequisites. The decision tree also suits specific applications in which multi-level logic is the primary and most pragmatic approach \cite{6498972}.

\subsection{Naïve Bayes}

Naïve Bayes, commonly abbreviated N.B., is derived from the Bayesian network (B.N.), which is commonly employed for data classification. The classifier learns from the training data the conditional probability of every attribute given the class label. Bayes' rule is then used to compute the probability of each class given a particular attribute instance, and the class prediction is made by identifying the class with the highest posterior probability. In-depth research demonstrates that N.B. performs effectively even when the attributes are strongly dependent \cite{4273200}.

N.B. is also preferred on account of several benefits. For instance, implementing N.B. is comparatively easy, and it offers effective performance. Furthermore, the approach is commonly used since it does not need much training data and handles both discrete and continuous data. It can also operate with binary and multi-class classification problems. It is not sensitive to irrelevant characteristics, and it is typically robust and highly accurate \cite{8862451}.

\subsection{Support Vector Machines}

Support vector machines (S.V.M.) are tailored to address the classification problem. In this approach, a hyperplane needs to be defined as the decision boundary: when there exists a set of objects belonging to given classes, the decision plane separates them. The objects may be either linearly or non-linearly separable; in the latter case, sophisticated mathematical functions referred to as kernels are needed to separate the objects belonging to different classes. An S.V.M. is tailored to accurately classify objects based on the examples in the training dataset \cite{SVMTutorial}.

There are advantages associated with the adoption of S.V.M.s. First, they can grapple with sophisticated functions when suitable kernel functions are derived. Moreover, because the S.V.M. generalizes, there is less likelihood of over-fitting \cite{8862451}.

\subsection{Linear Discriminant Analysis}

The representation of L.D.A. is usually straightforward. The model constitutes the statistical properties of the data calculated for every class. These properties are computed, for multiple variables, from the multivariate Gaussian; they are the means and the covariance matrix.

Predictions are made by providing the statistical properties to the L.D.A. equation. Normally, the properties are estimated from the obtained data, and the model values are eventually saved to a file to make up the L.D.A. model.

L.D.A. employs Bayes' theorem to estimate probabilities. It predicts based on the probability that a new input belongs to each class; the class with the highest probability is taken as the output class, and hence L.D.A. makes the prediction.

A prediction is made by employing Bayes' theorem, which estimates the probability of an output class given an input, using the probability of every class and the probability of the data given every class \cite{LDA}.

The benefit of this model is that the L.D.A. representation is simple and straightforward, and it allows both binary and multi-class classification.

\subsection{Hyperparameter Optimization of the ML Algorithms}

The hyperparameter values influence the performance of an algorithm, and the process of choosing them is referred to as hyperparameter optimization. The parameters of the algorithm itself are commonly referred to as hyperparameters, whereas the coefficients computed by the machine-learning algorithm are referred to as parameters \cite{9033691}. The optimization process reflects the search nature of the problem: various search strategies can be used to find good or robust parameter values. This project focuses on grid search, a parameter-tuning technique that systematically builds and evaluates a model for every parameter combination specified in a given grid. A couple of key terms are mainly used in the grid search method \cite{9033691}; a minimal usage sketch follows the definitions below.

Cross validation

Cross-validation is a technique used to resample the data in order to assess machine-learning models.

Estimator

The estimator implements the scikit-learn estimator interface; the classifier to be trained is passed to this parameter.

The parameter grid

This is a dictionary whose keys are parameter names and whose values are lists of parameter settings to try. The combinations of these parameters are tested to determine the best precision.
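A minimal sketch of how these three pieces fit together in scikit-learn's GridSearchCV; the estimator, grid values, and data are illustrative assumptions.

\begin{verbatim}
# Minimal sketch: grid search with cross-validation in scikit-learn
# (estimator, grid values, and data are illustrative assumptions).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=5, random_state=0)

estimator = DecisionTreeClassifier(random_state=0)   # the estimator
param_grid = {                                        # the parameter grid
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 5, 10],
}
search = GridSearchCV(estimator, param_grid, cv=10)   # tenfold cross-validation
search.fit(X, y)
print(search.best_params_, search.best_score_)
\end{verbatim}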

 

Overfitting is a situation in which the classifier fits the training data exactly but fails to generalize to examples not seen during training. In simplified terms, the classifier memorizes the training instances instead of learning from them. It is therefore best to halt training at a suitable time, a technique known as early stopping \cite{PRECHELT1998761}.

Early stopping, combined with the k-fold technique used during cross-validation, was employed to prevent over-training, as espoused in \cite{PRECHELT1998761}:

In k-fold cross-validation, the original training set is randomly partitioned into k equal subsets. One subset, referred to as the validation set, is retained as the data on which the model is tested, and the other (k-1) subsets are employed for training. Evaluation scores are stored for every subset and summarized at the end to determine the model performance.

A couple of hyperparameters are pivotal in determining the output of a decision tree algorithm \cite{6498972}.

Min-samples-leaf: this is the minimum number of samples that must be present at a leaf node, i.e., at the nodes situated at the tree's base.

Max-depth: max-depth is usually the very first parameter to be tuned. It describes the depth of the tree; the deeper the tree, the more splits it contains, and thus the more information it captures about the data.

Max-features: when the designer is looking for the most feasible split, max-features defines the total number of features to be considered when determining the best split.

On the other hand, a couple of hyperparameters are instrumental in determining the output of the S.V.M. algorithm \cite{SVMTutorial}.

Kernel: the kernel function is a similarity measure between input objects. The kernel is aimed at finding the optimal boundary between the likely outputs; in a nutshell, it transforms complex data so that the data can be separated according to its labels.

C: the regularization parameter, used to control the trade-off between achieving a low training error and a low testing error; that is, C governs how well the classifier generalizes to unseen data.

Gamma: a parameter of the R.B.F. kernel, usually perceived as the kernel 'spread', and hence it can also be viewed as defining the decision region. When gamma is low, the decision boundary curve is low and the decision region is quite broad; should gamma be high, the decision boundary curve is high. Example parameter grids for these hyperparameters are sketched below.
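A minimal sketch of parameter grids covering the hyperparameters just listed, usable with the grid search discussed earlier; the candidate values are illustrative assumptions, not the values used in the project.

\begin{verbatim}
# Minimal sketch: candidate grids for the decision-tree and S.V.M.
# hyperparameters described above (candidate values are assumptions).
tree_param_grid = {
    "max_depth": [3, 5, 10, None],
    "min_samples_leaf": [1, 5, 10],
    "max_features": [None, "sqrt", "log2"],
}

svm_param_grid = {
    "kernel": ["linear", "rbf"],
    "C": [0.1, 1, 10],
    "gamma": ["scale", 0.01, 0.1],
}
\end{verbatim}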

Notably, L.D.A. and the N.B. are acknowledged as parametric algorithms and hence do not have any hyper-parameters that should be tuned.

\subsection{Performance Evaluation of the Classification Algorithms}

This project sheds light on the classification activities as well as the standard metrics used to evaluate the performance of ML models. Models are trained on the training set, and predictions are made on a separate test set that constitutes the same features.

In order to evaluate classification performance, different evaluation approaches can be employed. This project uses classification accuracy.

The confusion matrix is considered among the most suitable and precise approaches for analyzing classification accuracy. The method incorporates the different elements categorized with respect to each class \cite{bookDT}. Each prediction can be categorized as a false positive (F.P.), false negative (F.N.), true negative (T.N.), or true positive (T.P.). A T.P. is a predicted anomaly that was indeed an anomaly; a T.N. is a predicted normal event that was a normal event; an F.N. is a predicted normal event that was in fact an anomaly \cite{8625228}.

The confusion matrix associated with a classifier is an n x n matrix, where n is the number of classes. Each row stipulates the real class, whereas each column indicates the class predicted by the classifier. The number of observations in the dataset that belong to class $c_i$ and that the classifier assigns to class $c_j$ is denoted $n(i,j)$. The classification accuracy is the proportion representing the number of accurate predictions divided by the total number of elements in the dataset (N) \cite{bookDM}.
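Assuming the notation $n(i,j)$ introduced above, a formula consistent with this definition is:

\[
\mathrm{accuracy} = \frac{\sum_{i=1}^{n} n(i,i)}{N}
\]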

There are a couple of prevalently used metrics that are derived from the confusion matrix \cite{bookDT}:

The F1 score is the harmonic mean of precision and recall; it assesses how high and how similar the two values are, and such an assessment determines the accuracy of a given model. A minimal computation sketch is given below.
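A minimal sketch computing the confusion matrix, accuracy, and F1 score with scikit-learn; the label vectors are illustrative assumptions.

\begin{verbatim}
# Minimal sketch: confusion matrix, accuracy, and F1 score in scikit-learn
# (label vectors are illustrative assumptions).
from sklearn.metrics import confusion_matrix, accuracy_score, f1_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # 1 = anomaly, 0 = normal event
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))  # rows: real class, columns: predicted
print(accuracy_score(y_true, y_pred))    # correct predictions divided by N
print(f1_score(y_true, y_pred))          # harmonic mean of precision and recall
\end{verbatim}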

\subsection{Fault Injection}

The fault injection method is employed in this research project to allow the gathering of service monitoring data. The latter is used to capture both anomalous and normal behaviors in order to train the detection and localization models, both of which are based on supervised-learning algorithms. Notably, the fault injection model requires a wide coverage range to detect potential anomalies. Two classes of faults are distinguished according to the service resource that they affect:

Central processing unit fault: the C.P.U. fault contributes to the container's anomalous behavior. For instance, if the container does not respond efficiently or keeps hanging when operating under heavy load, the result is low throughput. Such a phenomenon is referred to as a C.P.U. hog and may be the result of increased user requests.

Memory fault: memory usage in a container increases over a short period; if the memory is entirely consumed, an out-of-memory error occurs. Such a phenomenon is referred to as a memory leak, and it contributes to lengthy response times and low throughput.

\subsection{ML Programming Language}

In M.L., it is crucial to select the most suitable programming language for both M.L. and analysis. Python is the language commonly selected for most software environments that apply statistical computing and M.L. Notably, Python is a powerful tool that can be used to analyze voluminous data, and it furnishes users with a broad spectrum of statistical analysis techniques, classification included. Furthermore, there are many reading resources online that can be used to learn the language, such as stackoverflow.com and towardsdatascience.com, both of which were consulted during the programming stage.

\section{The Implementation}

The framework implementation focuses on the detection and prediction of anomalous container workloads. This is done by considering various performance measurements: the metrics considered include response time, memory utilization, and central processing unit utilization. This section describes the design and implementation of the anomaly detection framework.

\subsection{The Monitoring Dataset}

The M3 monitoring agent furnished us with the data. It is designed to collect and store the performance data of a microservice-based cloud application \cite{8814569}. It collected the resource usage and the performance monitoring data of the containers and stored the data in the database as time series. The M3 framework was used to collect real-time performance data for a book shop application. As described earlier, the target application consisted of three microservices: the user interface (U.I.), the book service (B.S.), and the purchase service (P.S.). The user interface is the microservice tailored to process both the H.T.M.L. used in building up the web page content and the JavaScript. To simplify the application code, the user interface service was not designed to accommodate shopping cart features or a database for storing users; such data are stored in memory for the experimentation purposes. The B.S. is the service that stores information about the books and the respective stock and the related processes; it is therefore designed with a MySQL database that provides exclusive storage for the contextual entities of the book domain. Like the B.S., the P.S. is involved in the storage of the purchase data and has a MySQL database that contains the purchase tables.

Apache JMeter was used to generate the HTTP requests in order to test the M3 system's capability (https://jmeter.apache.org/). The operations and requests issued during the experimental assessment were outlined in figure 6.

  1. For request R2, the user interface receives a book-list request, which it redirects to the B.S. The B.S. receives the book-list request R2.1 and returns the book list in JSON format, which is converted into H.T.M.L. in the user interface and then returned to JMeter.
  2. For request R1, JMeter sends an information request on ten random books. The B.S. processes the request using the MySQL bank and returns a JSON-format list encapsulating the needed data.
  3. For request R3, JMeter sends a book-purchase request to the P.S. The latter consults the B.S. (R3.1) to check whether the book is available in stock, after which it saves the purchase in the MySQL bank.

The three simulated requests were sent continually, starting with ten end users and increasing to one hundred and fifty end users over a 48-hour span.

The book shop microservices were implemented using the Docker container technology (version 18.09), facilitated by the Docker Compose tool (version 1.24.1). The microservices of the book shop application were deployed on Amazon EC2 virtual machines running Ubuntu 18.04. The fault injection module and the workload generator functioned by executing bash scripts on an additional virtual machine. The monitoring data collected over forty-eight hours and stored in the time-series database was used as the input to the data processing. To reduce any chance of the performance results being influenced by hardware differences and virtualization, the containers were set with limited resources through the use of cgroup directives; these included memory: 256M and cpus: '0.50'.

\subsection{Fault Injection Scripts}

This is the implementation of a customizable script that regularly performs injections into the targeted microservices. The injections are carried out by executing bash scripts on an additional virtual machine; there exists an injection script for every type of fault.

The scripts are started and stopped to actively simulate the central processing unit faults and the memory faults, as follows:

Central processing unit fault’s script: the anomaly of C.P.U. allocation or consumption stems from the programs that experience challenges in terminating conditions which lead to infinite loops, deadlocks, and busy waits.

Memory fault script: this fault results from an anomaly in memory allocation, in which some sections of memory are not freed even after being used. When the unfreed memory accumulates, it may result in a memory shortage. Minimal sketches of both fault simulations are given below.
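The project itself used bash scripts for the injections; the following Python sketch merely illustrates the two fault behaviors (a busy loop for the C.P.U. hog and an ever-growing allocation for the memory leak). Durations and allocation sizes are illustrative assumptions.

\begin{verbatim}
# Minimal sketch of the two fault behaviours (the project itself used bash
# scripts); durations and allocation sizes are illustrative assumptions.
import time

def cpu_hog(seconds=30):
    """Busy-wait loop that keeps a core saturated for `seconds`."""
    end = time.time() + seconds
    while time.time() < end:
        pass  # never yields, simulating an infinite-loop style C.P.U. fault

def memory_leak(chunk_mb=10, chunks=20, pause=1.0):
    """Keep allocating memory without freeing it, simulating a leak."""
    leaked = []
    for _ in range(chunks):
        leaked.append(bytearray(chunk_mb * 1024 * 1024))  # never released
        time.sleep(pause)
    return leaked

if __name__ == "__main__":
    cpu_hog(5)
    memory_leak(chunk_mb=5, chunks=5)
\end{verbatim}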

Immediately after the completion of the injection procedure, the collected data is used to create the anomaly dataset. Memory utilization and central processing unit utilization are gathered from the web server, whereas the response time is measured on the client's end.

\subsection{Dataset structure and pre-processing}

Two datasets are collected using the M3 monitoring agent. The first dataset stores the response time and throughput of the microservices; the second stores the central processing unit and memory consumption of the microservices. By combining the two, a third dataset is obtained.

Many machine-learning models are algebraic, so the input data should be numerical. To use such models, the names of the microservices in the features are encoded as numerals. The technique widely used for this is one-hot encoding. The basic strategy is to convert every categorical value into a new column and to assign the column a value of one or zero. One-hot encoding is helpful for this dataset since there are few categories, essentially the three microservices. It specifies that a row of data belongs to a particular microservice, which is assigned a value of one, and not to any other microservice (value zero). For this reason, faults can be localized with relative ease. A minimal encoding sketch is given below.
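A minimal sketch of this encoding step using pandas; the column and service names are illustrative assumptions.

\begin{verbatim}
# Minimal sketch: one-hot encoding of the microservice name column
# (column and service names are illustrative assumptions).
import pandas as pd

df = pd.DataFrame({
    "microservice": ["user-interface", "book-service", "purchase-service"],
    "response_time_ms": [230, 640, 180],
})

encoded = pd.get_dummies(df, columns=["microservice"])
print(encoded)
# Each row gets a 1 in the column of its own microservice and 0 elsewhere.
\end{verbatim}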

We did not take the database layer into account during anomaly detection or identification, because the central processing unit and memory data was collected for the whole microservice, which in this case includes its database.

Following pre-processing and labelling, the dataset is given the structure illustrated in table 1.

Data was collected under normal conditions and when each of the two fault types was injected into a microservice. Therefore, two different labels exist in the dataset for the detection component. The dataset constitutes 1280 samples, half of which (630 samples) did not have any anomaly, while the other half (630 samples) had anomalies. As elaborated in the fault injection part, obtaining the anomaly samples requires injecting the two fault types into the three services. As such, there are seven different class labels for the localization component: C.P.U. in the user interface, C.P.U. in the book service, C.P.U. in the purchase service, memory in the book service, memory in the purchase service, C.P.U. and memory in the book service, and C.P.U. and memory in the purchase service. There are ninety samples of every class in the 630 anomaly samples of the dataset.

The anomalous throughput and response time are referred to as the observed failure; they are used to identify the type of resource workload that may contribute to an anomalous operation. The detection component classifies the response time into classes: for instance, the low response time class covers operation between 501 ms and 1 second, which implies low throughput, whereas the normal class, represented as N, reflects a normal operation time and a standard throughput.

For detecting anomalous versus normal response time and throughput, a binary classification model is used. The binary classification model is trained with four algorithms: Naïve Bayes, linear discriminant analysis, support vector machine, and decision tree. The 1280 dataset samples were split into a training set and a test set. The training set constituted 2/3 of the samples, i.e., 840 samples, of which 420 were normal samples. The remaining third constituted the test set of 420 samples, comprising 210 anomaly samples and 210 normal samples. In a quest to optimize the tree algorithm's results, hyperparameter tuning with tenfold cross-validation was used, as sketched below.
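A minimal sketch combining the split and tuning step with scikit-learn, under the same illustrative assumptions as the earlier sketches (the feature matrix is synthetic, not the project's dataset).

\begin{verbatim}
# Minimal sketch: 2/3 training / 1/3 test split followed by tenfold
# grid-search tuning of the decision tree (synthetic data, assumed values).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1280, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1/3, stratify=y, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": [3, 5, None], "min_samples_leaf": [1, 5, 10]},
    cv=10)
search.fit(X_train, y_train)
print(search.best_params_, search.score(X_test, y_test))
\end{verbatim}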

The results are illustrated in the subsequent section.

\subsection{Localizing component}

The classification performance of the classifiers was assessed on the test set and is presented and discussed in this part. The findings of the detection and localization classification are summarized using confusion matrices. Three parameters were established: overall accuracy, sensitivity, and precision.

First, the four popular supervised machine-learning algorithms were assessed and compared with respect to detection efficiency. The algorithms were the decision tree, linear discriminant analysis, support vector machine, and Naïve Bayes. Figure 1 demonstrates the confusion matrix for anomaly detection using the four algorithms on the test set. Because binary classification is a straightforward task and there is a small number of features (response time and throughput), all four algorithms perform efficiently with 100% accuracy.

We then evaluated the four classification algorithms on the fault localization task, again using the test set. Figure 3 illustrates the anomaly-localization confusion matrix for the four algorithms. The results over the seven classes showed that the decision tree had the highest precision, with nearly all classes predicted correctly; S.V.M. was the next best option. As the confusion matrix indicates, both L.D.A. and N.B. lacked the capacity to distinguish between the localization classes.

It was further observed from the confusion matrix that N.B. and L.D.A. produced more F.N.s, whereas S.V.M. yielded more T.P.s and T.N.s compared with F.P. and F.N. values. The D.T. algorithm had a reliable and consistent performance, as it predominantly yielded T.P. values.

To provide insight into the training and detection times of the above-mentioned algorithms, figure 4 presents the detection and localization times required to classify the test set for every algorithm. The times were computed on a 64-bit Intel Core i7 2.1 GHz processor.

It was further established that there was a significant difference in accuracy between S.V.M., N.B., and L.D.A., and that they took considerably long to detect and locate the test-set anomalies. The decision tree algorithm greatly outperformed the other algorithms in classification accuracy, and it also scaled well in terms of the time it took to detect and locate the test-set anomalies.

Figure 5 summarizes the accuracy of every classification algorithm in detecting anomalies; it also shows how the different classification algorithms discriminated between the different fault types and their locations in the various microservices. For anomaly detection, all four algorithms exhibited a superb performance of 100% accuracy. For anomaly localization, the decision tree achieved an impressive 99% classification accuracy, seconded by S.V.M. with a 50% classification score, while L.D.A. and N.B. exhibited low accuracies of 6% and 5% respectively.

When the criterion considered is the performance in error detection together with the training and detection execution rate, it is valid to state that the decision tree is the most efficient algorithm to use online; specifically, the decision tree should be used because of its low detection time and high error-detection efficiency.

Table 5 illustrates the standard classification metrics. It shows that D.T. should be chosen for better precision, and that S.V.M. should be chosen if better recall is the priority.

 

\section{Conclusions and Further Work}

Microservice-based web applications can benefit from the emerging container technology and can be deployed in the cloud in a more agile fashion. Nonetheless, anomaly detection techniques are necessary to determine whether there is abnormal service behavior that could result in infringement of the service level agreements (S.L.A.s) that govern the users. In this project, we presented an anomaly detection framework that assesses response time and resource utilization to determine whether a microservice behaves anomalously. The designed framework constitutes three stages: pre-processing, detection, and localization. The input is the monitoring data of the targeted application from the monitoring system. First, the pre-processing techniques are applied to the monitoring data. The detection phase then determines the anomalies in the monitoring data. If there are anomalies, the localization component determines the fault type and the particular microservice in which the fault exists, producing outputs such as a central processing unit fault in microservice x. To attain the project objectives, machine-learning algorithms were used to train the detection and localization components for classifying anomalies during the detection phase; we then classified the fault type and determined the fault location in the localization phase. To implement the framework, performance metrics were gathered from the monitoring system of a microservice-based cloud application. The target application constituted three microservices: the book service, the user interface, and the purchase service. For training, scripts that regularly performed injections into the targeted microservices were applied to simulate the C.P.U. faults and the memory faults. Two datasets were then gathered with the assistance of the monitoring agent: the first stored the response time, database data, and throughput of the microservices, while the second stored the memory consumption and C.P.U. utilization of the microservices.

First, the monitoring data has to be pre-processed before being used in detection training. The two datasets were combined to obtain a third dataset, which was down-sampled to 1280 samples. Half of the dataset was normal, as it did not have any anomaly (630 normal samples), while the other half contained the various kinds of anomalies. A one-hot encoding technique was used to encode the microservice names into numeric values. During the detection training, anomalous response time and throughput were the observed failures initially obtained. We got 1280 samples, of which 630 were normal and 630 anomalous, and the two classes were labelled zero and one. For the detection training, a binary classification model was trained with 2/3 of the dataset, the training set of 840 samples; the remaining 1/3, comprising 420 samples, was used as the test set to evaluate the model after training. Once anomalies are detected, it is possible to determine the type of anomaly and the anomalous service that yields the error. In this respect, the 630 anomalies belonged to seven classes: C.P.U. in the U.I., C.P.U. in the B.S., C.P.U. in the P.S., memory in the B.S., memory in the P.S., C.P.U. and memory in the B.S., and C.P.U. and memory in the P.S. There were 90 samples of every fault class; thus, the multi-class classification models were trained with sixty samples per class as the training set, while the remaining 30 samples per class formed the test set.

During the detection training and localization training, several algorithms were deployed: D.T., N.B., S.V.M., and L.D.A. Additionally, a method for optimizing the algorithm training was applied: parameter tuning was employed to test parameter combinations and obtain the best accuracy, and cross-validation splitting, a strategy for resampling the available data to assess machine-learning models, was used to avoid overfitting. After completing the training, the models were tested on the test data.

Binary classification worked well because it is a straightforward strategy and involves a small number of features: every algorithm scored 100% accuracy on the detection task. Detection is an easier task than localization, for which the results of every algorithm degraded. Nevertheless, the most effective algorithm was the D.T., which was therefore chosen for anomaly detection. In future work, the framework will need to be tested online, and we will also research possible ways of testing the designed framework with real public datasets.
