
URBAN TRAFFIC PREDICTION

Introduction

Urban traffic prediction is an essential component of transportation systems worldwide. It matters because it links the safety of the people who travel and do business in a city to the performance of the transportation system that serves them. It also allows researchers to develop statistical models that explain the spatial correlations exhibited by transportation networks (Qin, 2013), which means that traffic prediction can help explain the geographical factors that influence wider economic outcomes. Recent technologies make it far easier to collect traffic data, so there is ample material for modelling, analysis and statistical inference. However, predicting urban traffic is tedious and complex, and the data must be cleaned before any further analysis is carried out.

Exploratory Data Analysis of the Urban Traffic

Exploratory data analysis (EDA) is one of the basic analytical techniques applied in research. Its main purpose is to characterize and describe the distribution of the underlying data set so that sound inferences can be drawn later. A typical exploratory step is to compute the mean, standard deviation, variance, kurtosis and skewness of each variable in a data set (Qin, 2013). Exploring the chosen data in this way is equally important for developing a good predictive model. In the current case study, the first steps are locating a suitable data source through a Google search and cleaning the extracted data. The data was extracted from the Census Bureau website, which was considered appropriate because it holds information on the traffic control variables specified for the United States. The following is a preview of the GPS-based data used to track traffic in the major states of the United States.

rideable_type | ended_at | start_station_name | start_station_id | end_station_name | end_station_id | start_lat | start_lng | end_lat | end_lng | member_casual
docked_bike | 5/27/2020 10:16 | Franklin St & Jackson Blvd | 36 | Wabash Ave & Grand Ave | 199 | 41.8777 | -87.6353 | 41.8915 | -87.6268 | member
docked_bike | 5/25/2020 11:05 | Clark St & Wrightwood Ave | 340 | Clark St & Leland Ave | 326 | 41.9295 | -87.6431 | 41.9671 | -87.6674 | casual
docked_bike | 5/2/2020 15:48 | Kedzie Ave & Milwaukee Ave | 260 | Kedzie Ave & Milwaukee Ave | 260 | 41.9296 | -87.7079 | 41.9296 | -87.7079 | casual
docked_bike | 5/2/2020 16:39 | Clarendon Ave & Leland Ave | 251 | Lake Shore Dr & Wellington Ave | 157 | 41.968 | -87.65 | 41.9367 | -87.6368 | casual
docked_bike | 5/29/2020 13:27 | Hermitage Ave & Polk St | 261 | Halsted St & Archer Ave | 206 | 41.8715 | -87.6699 | 41.8472 | -87.6468 | member
docked_bike | 5/29/2020 14:14 | Halsted St & Archer Ave | 206 | May St & Taylor St | 22 | 41.8472 | -87.6468 | 41.8695 | -87.6555 | member
docked_bike | 5/20/2020 13:46 | Hermitage Ave & Polk St | 261 | Hermitage Ave & Polk St | 261 | 41.8715 | -87.6699 | 41.8715 | -87.6699 | member
docked_bike | 5/6/2020 19:07 | Ritchie Ct & Banks St | 180 | Ritchie Ct & Banks St | 180 | 41.9069 | -87.6262 | 41.9069 | -87.6262 | casual
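The summary statistics named in the EDA discussion (mean, variance, standard deviation, skewness and kurtosis) can be sketched with the Python standard library alone. This is a minimal illustration, not the study's actual procedure: the function name `describe` is hypothetical, population (divide-by-n) moments are an assumption, and the input is the seventh column of the preview, read here as the start latitude.

```python
import math
import statistics


def describe(values):
    """Return basic exploratory statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Population variance and standard deviation (divide by n): an assumption.
    variance = statistics.pvariance(values, mu=mean)
    std = math.sqrt(variance)
    # Third and fourth central moments drive skewness and kurtosis.
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / std ** 3 if std > 0 else 0.0
    # Excess kurtosis: 0 for a normal distribution.
    excess_kurtosis = m4 / std ** 4 - 3.0 if std > 0 else 0.0
    return {
        "n": n,
        "mean": mean,
        "variance": variance,
        "std": std,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


# Start latitudes copied from the eight preview rows.
start_lat = [41.8777, 41.9295, 41.9296, 41.968, 41.8715, 41.8472, 41.8715, 41.9069]
summary = describe(start_lat)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With only eight rows the skewness and kurtosis estimates are very noisy, so a run like this is useful mainly as a check that the columns parse as numbers.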

 

Data Cleaning

The data set above required cleaning and the removal of outliers. The following steps were therefore carried out before building a statistical model:

  1. Dividing the data into smaller units that can be quantified easily.
  2. Grouping the units that share the same measurements and removing those that carry no meaning (Ho Yu, 2010).
  3. Identifying possible sources of variation and outliers and eliminating them.

As a result, outside sources of variation and error were eliminated.
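The cleaning steps above might look roughly like the following sketch. The source does not say how outliers were identified, so the 1.5 x IQR rule used here is an assumption, as are the function name `clean_trips`, the `duration_min` field and the sample values, none of which appear in the original data preview.

```python
import statistics


def clean_trips(rows, numeric_key):
    """Drop incomplete rows, then drop rows whose numeric_key is an IQR outlier."""
    # Keep only rows in which every field is present and non-empty.
    complete = [
        r for r in rows
        if all(v is not None and v != "" for v in r.values())
    ]
    if len(complete) < 4:
        return complete  # too few rows to estimate quartiles sensibly
    values = [r[numeric_key] for r in complete]
    # Quartiles via the default "exclusive" method of statistics.quantiles.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Keep rows inside the fences (assumed rule: 1.5 * IQR).
    return [r for r in complete if low <= r[numeric_key] <= high]


# Hypothetical trip records; durations are invented for illustration.
trips = [
    {"start_station_name": "Franklin St & Jackson Blvd", "duration_min": 10},
    {"start_station_name": "Clark St & Wrightwood Ave", "duration_min": 12},
    {"start_station_name": "Kedzie Ave & Milwaukee Ave", "duration_min": 11},
    {"start_station_name": "Clarendon Ave & Leland Ave", "duration_min": 13},
    {"start_station_name": "Hermitage Ave & Polk St", "duration_min": 12},
    {"start_station_name": "Halsted St & Archer Ave", "duration_min": 11},
    {"start_station_name": "Ritchie Ct & Banks St", "duration_min": 500},  # outlier
    {"start_station_name": None, "duration_min": 12},  # missing station name
]
cleaned = clean_trips(trips, "duration_min")
print(len(trips), "rows before cleaning,", len(cleaned), "rows after")
```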

Dimension Reduction

The multidimensional scaling approach applies to short-term urban traffic prediction, which is most common on busy urban roads. In analytical procedures, the dimension reduction approach follows three steps: correlation analysis, qualitative analysis and multivariate regression (Han & Huang, 2020). For the urban traffic prediction in our case study, the following steps will be applied.

  1. Selection of appropriate historical data without outliers based on qualitative analysis

In this step, specific road networks will be selected and the data filtered according to the selection criterion. The traffic points with the highest concentration will also be set as targets.

  2. Grouping the data using the multidimensional method

The selected data will be grouped into units representing different road networks and traffic variables, based on the specified streets, to ease analysis.

  3. Reducing the data completely

This is the last step of the dimension reduction method. Pearson correlation coefficients will be used to filter the data: since the targeted model requires generally low correlation, target areas with high correlation values will be eliminated. This will improve the overall robustness of the model.

The Pearson correlation coefficient is computed with the standard product-moment formula, and filtering on it supports an enhanced prediction model.
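For reference, the standard sample Pearson correlation coefficient between two variables x and y observed n times is given below. The symbols (x_i, y_i, their means, and n) are generic notation rather than variables defined in this study.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,
               \sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Here \bar{x} and \bar{y} are the sample means. A value of r_{xy} close to +1 or -1 marks a pair of series that carry largely the same linear information, which is why highly correlated target areas are candidates for removal.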

Feature Engineering

Feature engineering is a vital step in machine learning and involves transforming the data and coding the given variables to fit the required model. The following steps will be incorporated into the methodology.

  1. Frequency data filtering

In this step, the data will be filtered according to the traffic concentration in the target areas. The main focus will be to eliminate variables that carry uninformative data.

  2. Encoding categorical variables

In this step, the categories will be bin-counted and the categorical variables transformed so that they meet the requirement of low correlation.

  3. Stacking the model based on the transformed data

This step will focus on stacking all the variables incorporated in the model in log-transformed form.

  4. Extracting the required categorical variables

After the chosen variables have been log-transformed, the variables that define the traffic data set will be extracted to meet the minimum requirements of an unbiased model.
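The feature engineering steps above can be sketched as follows. This is one possible reading under stated assumptions: frequency filtering is interpreted as dropping start stations with fewer than `min_count` trips, encoding as a 0/1 indicator for `member_casual`, and the log transform as `log1p` of the per-station trip count. The function name `engineer_features` and the threshold are hypothetical; the field names come from the data preview.

```python
import math
from collections import Counter


def engineer_features(trips, min_count=2):
    """Filter rare stations, encode rider type, and log-transform trip counts."""
    # Frequency filtering: count trips per start station (assumed criterion).
    station_counts = Counter(t["start_station_name"] for t in trips)
    kept = [t for t in trips if station_counts[t["start_station_name"]] >= min_count]
    features = []
    for t in kept:
        count = station_counts[t["start_station_name"]]
        features.append({
            "start_station_name": t["start_station_name"],
            # Categorical encoding: member -> 1, casual -> 0.
            "is_member": 1 if t["member_casual"] == "member" else 0,
            # Log transform; log1p(count) = ln(1 + count) is defined for count 0.
            "log_station_trips": math.log1p(count),
        })
    return features


# Start station and rider type taken from the eight preview rows.
preview = [
    {"start_station_name": "Franklin St & Jackson Blvd", "member_casual": "member"},
    {"start_station_name": "Clark St & Wrightwood Ave", "member_casual": "casual"},
    {"start_station_name": "Kedzie Ave & Milwaukee Ave", "member_casual": "casual"},
    {"start_station_name": "Clarendon Ave & Leland Ave", "member_casual": "casual"},
    {"start_station_name": "Hermitage Ave & Polk St", "member_casual": "member"},
    {"start_station_name": "Halsted St & Archer Ave", "member_casual": "member"},
    {"start_station_name": "Hermitage Ave & Polk St", "member_casual": "member"},
    {"start_station_name": "Ritchie Ct & Banks St", "member_casual": "casual"},
]
features = engineer_features(preview, min_count=2)
for row in features:
    print(row)
```

On the preview rows only one start station occurs twice, so the filter keeps two records; a realistic threshold would be chosen from the full data set.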

Choice of modelling techniques

The accuracy of the urban traffic prediction will depend on the model used in this case study, so the model must be free from outside sources of variation. Meeting this requirement is difficult and calls for thoroughly cleaned data without outliers (Zhao, Ukkusuri & Lu, 2018). The backpropagation neural network approach will centre on bivariate correlation analysis and multiple linear regression, which are the most appropriate models for predicting urban traffic in the target areas. The two models are sufficient and provide a good basis for statistical inference. Other approaches, such as purely descriptive analysis, are not appropriate because they offer limited scope for inference.

Multivariate Analysis

Under multivariate analysis, regression analysis will be carried out. The independent (explanatory) and dependent (response) variables will be specified (Grohman, 2004). The significance level will be set at alpha = 0.05 so that appropriate inferences can be drawn. The model takes the general form of a multiple linear regression.
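Since the essay does not name the specific explanatory and response variables, only the general multiple linear regression form can be written down. In this sketch y_i stands for the response (for example, a traffic measure at observation i), x_{i1}, ..., x_{ip} for p explanatory variables, and \varepsilon_i for the error term; all of these symbols are generic placeholders.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \varepsilon_i,
\qquad i = 1, \dots, n
```

Under this form, each coefficient \beta_j is typically tested against the null hypothesis \beta_j = 0, and the null is rejected when the associated p-value falls below the chosen alpha of 0.05.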

Bivariate Correlation

Under this modelling type, the Pearson correlation coefficients will be determined for the log-transformed variables, giving the strength of association within the urban traffic prediction data set. Note that the Pearson coefficient always lies between -1 and +1: values near either extreme indicate a strong linear association, and values near 0 a weak one. After the modelling, one can infer numerous patterns of urban traffic across many states.
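As a minimal sketch of this step, the Pearson coefficient of two log-transformed series can be computed directly from its definition. The function names and the hourly counts below are hypothetical, and the natural logarithm is an assumption, since the essay does not state the base.

```python
import math


def log_transform(values):
    """Natural-log transform; every value must be strictly positive."""
    return [math.log(v) for v in values]


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical hourly trip counts at two locations (invented for illustration).
flow_a = [120, 150, 180, 240, 300]
flow_b = [60, 75, 90, 120, 150]  # exactly half of flow_a

r = pearson(log_transform(flow_a), log_transform(flow_b))
print(f"Pearson r on log-transformed counts: {r:.4f}")
```

Because `flow_b` is a constant multiple of `flow_a`, the two log series differ only by a constant shift, so the coefficient comes out at the upper bound of the range; real traffic series would fall somewhere inside it.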

Hyperparameter Optimisation and Model Evaluation

The Bayesian optimization technique is the most appropriate for the urban traffic prediction data set, because the number of estimators and explanatory variables is sufficient to optimize the chosen models fully. In addition, the Bayesian technique will allow the researcher to draw appropriate conclusions while keeping the multicollinearity of the variables to a minimum.

For model evaluation, binary logistic regression and the Bayes approach will provide good insight into the entire model (Zhao, Ukkusuri & Lu, 2018). The significance level, denoted alpha, is 0.05 for both the Bayesian and the regression coefficients. Therefore, to test the significance of the entire model, the researcher only needs to compare the p-values of the obtained coefficients with alpha and draw the appropriate inferences.

Scalability Issues

For the chosen model, scalability measures how its effectiveness decreases or increases as the scale of the problem changes. Most transportation systems rely on technology to control traffic. The artificial neural network for the chosen model will therefore be tested first, before any further implementation (Zhao, Ukkusuri & Lu, 2018). This implies that the following assumptions must be met.

  1. The data must be normally distributed with mean zero and variance 1.
  2. The skewness and kurtosis must fall within the required positive or negative range.
  3. There must be no missing values or outliers in the data set.
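The first assumption is usually approached by standardizing each variable, as in the sketch below. Standardization gives a mean of zero and a variance of 1, but it does not by itself make the data normally distributed, so the shape of the distribution still has to be checked separately. The function name `standardize` and the sample values are hypothetical.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and population variance 1 (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values, mu=mean)
    if std == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mean) / std for v in values]


# Invented sample values, chosen so the mean is 5 and the standard deviation 2.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = standardize(raw)
print("mean after standardizing:", statistics.fmean(z))
print("variance after standardizing:", statistics.pvariance(z))
```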

However, the researcher is likely to encounter the following challenges in building a predictive model for urban traffic data.

  1. Extracting data from the required websites is complicated and time-consuming, so the researcher might end up with insufficient data.
  2. Cleaning the data is tedious and time-consuming. Also, if outliers are not eliminated, the results may be of little use, reducing the entire model's effectiveness.
  3. Modelling and interpreting regression and Bayes coefficients is complex and requires excellent knowledge of machine learning.

Ethical considerations

For any research to be effective and yield the required results, it must follow the ethical guidelines stipulated for researchers. This study will request permission to access data from the traffic control website for the chosen locale. Where members of the public are involved, consent forms will be sent to assure them of the confidentiality of any information obtained from them.

 

 

References

Zhao, Y., Ukkusuri, S., & Lu, J. (2018). Multidimensional Scaling-Based Data Dimension Reduction Method for Application in Short-Term Traffic Flow Prediction for Urban Road Network. Journal of Advanced Transportation, 2018, 1-10. doi: 10.1155/2018/3876841

Qin, Z. (2013). The Urban Road Short-Term Traffic Flow Prediction Research. Applied Mechanics and Materials, 423-426, 2954-2956. doi: 10.4028/www.scientific.net/amm.423-426.2954

Han, L., & Huang, Y. (2020). Short-term traffic flow prediction of road network based on deep learning. IET Intelligent Transport Systems, 14(6), 495-503. doi: 10.1049/iet-its.2019.0133

Ho Yu, C. (2010). Exploratory data analysis in the context of data mining and resampling. International Journal of Psychological Research, 3(1), 9-22. doi: 10.21500/20112084.819

Grohman, W. (2004). Using Convex Sets for Exploratory Data Analysis and Visualization. Data Mining and Knowledge Discovery, 9(3), 275-295. doi: 10.1023/b:dami.0000040906.82842.b5

 

 

 

 

 

 

 
