Text Mining
- Data mining, text mining and sentiment analysis
Data mining is the process of extracting information from big data. The data is managed, analyzed, and presented to the user, who then examines it to draw conclusions from the information given (Bhushan, 2018).
Text mining is the process of transforming unstructured text into meaningful information, using artificial intelligence, so that decisions can be made more easily.
Sentiment analysis is the process of identifying people’s opinions and attitudes towards a certain topic or product using various automated tools. It brings researchers and business practitioners together to determine favorable and unfavorable opinions towards specific products and services from customer feedback (Sharda, Delen, & Turban, 2020).
The relationship between the three is that they all seek to extract information from large volumes of data and to identify useful patterns that can be used in decision making. They are also semi-automated.
- Application of text mining
Text mining is mostly applied in risk management, knowledge management, cybercrime prevention, customer care services, business intelligence, and social media data analysis.
- Induced structure
Inducing structure into text-based data refers to the process of converting unstructured data into a format that is easier and more effective to analyze, so that the information obtained can benefit a business. It improves data capture, the time sensitivity of the data, and customer relations (Sharda et al., 2020).
Ways of inducing structure
The first way is to isolate key words so that the analysis focuses on the area of interest. Knowing which words matter most for the analysis helps in determining the role those key words play (Andrew, Samia, & Ednre, 2014).
The second way is to determine the topics so that the subject matter can be categorized once the content is known. The last way is to measure sentiment, to gauge whether the tone of the data is positive or negative (Andrew et al., 2014).
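The three ways described above can be illustrated with a minimal Python sketch. It is not the method of the cited authors. The word lists, the topic map, the sample feedback, and the function names are all hypothetical and chosen only for illustration; real tools rely on much larger lexicons or trained models.

```python
import re
from collections import Counter

# Hypothetical word lists and topic map, for illustration only.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "but", "it", "to", "of"}
POSITIVE = {"great", "fast", "good", "excellent"}
NEGATIVE = {"slow", "poor", "bad", "broken"}
TOPICS = {
    "shipping": {"delivery", "packaging", "courier"},
    "support": {"staff", "agent", "refund"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def key_words(text, top_n=3):
    """Way 1: isolate the most frequent words that are not stopwords."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def main_topic(text):
    """Way 2: pick the topic whose words appear most often in the text."""
    tokens = tokenize(text)
    hits = {name: sum(t in words for t in tokens) for name, words in TOPICS.items()}
    return max(hits, key=hits.get)


def sentiment_score(text):
    """Way 3: positive word count minus negative word count."""
    tokens = tokenize(text)
    return sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)


feedback = ("The delivery was fast and the delivery staff was great, "
            "but the packaging was poor.")
print(key_words(feedback, top_n=1))  # ['delivery']
print(main_topic(feedback))          # shipping
print(sentiment_score(feedback))     # 1
```

A score above zero suggests a mostly positive tone and a score below zero a mostly negative one. Simple counting like this ignores negation and context.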
- Role of NLP
Natural language processing (NLP) helps to turn a collection of text from a bag of words into structured data that can be easily classified, clustered, and associated. It is a subfield of artificial intelligence and computational linguistics (Sharda et al., 2020).
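As a rough illustration of the bag-of-words starting point, the sketch below turns two short texts into a table of word counts, one row per text and one column per word. The function name bag_of_words and the sample texts are assumptions made for this example. The sketch covers only the counting step, not the classification, clustering, or association that follow.

```python
import re
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, matrix) where matrix[i][j] is the number of
    times vocabulary[j] occurs in documents[i]."""
    tokenized = [re.findall(r"[a-z]+", doc.lower()) for doc in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


docs = ["Great phone, great price", "Poor battery, great screen"]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)  # ['battery', 'great', 'phone', 'poor', 'price', 'screen']
print(matrix)      # [[0, 2, 1, 0, 1, 0], [1, 1, 0, 1, 0, 1]]
```

Word order is lost in this representation, which is one reason NLP aims to go beyond it.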
Capabilities
Its goal is to move beyond syntax-driven text manipulation to a true understanding and processing of natural language that considers grammatical and semantic constraints as well as context (Sharda et al., 2020).
Limitations
NLP is faced with the following challenges:
- Part-of-speech tagging
- Text segmentation
- Word sense disambiguation
- Syntactic ambiguity
- Imperfect and irregular input
- Speech acts
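One of the challenges listed above, text segmentation, can be seen in a small Python sketch. The splitting rule and the sample sentence are assumptions made for this example. The rule is deliberately naive and is not how a real NLP tool segments text.

```python
import re


def naive_sentences(text):
    """Split the text after '.', '!' or '?' when whitespace follows."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


text = "Dr. Smith reviewed the order. It shipped on time."
print(naive_sentences(text))
# ['Dr.', 'Smith reviewed the order.', 'It shipped on time.']
```

The text has two sentences, but the rule returns three pieces because it treats the period in the abbreviation "Dr." as the end of a sentence.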
Exercise
- eBay Analytics
Internet exercise
- Three packages for data mining
- AdvancedMiner, which provides a wide range of tools for data transformation, data mining models, data analysis, and reporting.
- BioComp i-Suite, which supports nonlinear predictive modeling, data access, and data cleaning, among other tasks.
- CMSR Data Miner, built for business data with a database focus, incorporating a rule engine, neural network, neural clustering (SOM), decision tree, hotspot drill-down, cross table deviation analysis, cross-sell analysis, visualization/charts, and more.
References
Andrew, S., Samia, T., & Ednre, T. (2014). Inducing Information Structures for Data-driven Text Analysis.
Bhushan, A. (2018). How Big Data Impact Smart Cities.
Eldersveld, D. (2016). Solutions to bring structure to your text data.