COMPARATIVE STUDIES OF BI TECHNIQUES

Abstract

Business intelligence (BI) is a concept that usually involves the integration and delivery of relevant and useful business information within an organization. Companies use BI to detect significant events and to identify and monitor business trends so that they can adapt quickly to a changing environment. Effective business intelligence training can improve decision making at all levels of management, as well as tactical and strategic management processes. As a result, a number of strategies have been developed that have made business intelligence an element of everyday business. This task therefore compares two techniques used for business intelligence: RapidMiner and KNIME.


Introduction

Improvements in technology have made business intelligence an essential aspect of how people conduct everyday business activities. Today, business intelligence is used in all sectors of business to aid decision making. Business intelligence is defined in a number of ways. Simply put, however, it refers to the use of data from past experience to make better decisions about the future by studying the trends evident in the available data. The need for business intelligence within organizations is growing, making the skill a requirement in any business environment. The subject has therefore become an increasingly essential part of everyday learning.

Businesses have limited resources with which to support decision making. Often, companies are forced to work with what they have, since most do not have the capital and resources to invest in making their operations more efficient. The need to get more out of existing resources leads businesses to invest in processes that promote easier and faster decision making. As a requirement for completing the module, students must research and compare business intelligence techniques. At the end of the course, students are expected to be able to:

Evaluate the role and benefits of effective business intelligence in the organisation.
Demonstrate awareness and critical understanding of business intelligence front end tools and techniques.
Demonstrate competence in applying business intelligence concepts and techniques through implementation based on various existing technologies.
Report and present analysis of results obtained.
Use software packages for business intelligence and enterprise management solutions.

In the course of completing the assignment, different technologies and techniques have to be explored so that students can effectively develop the skills that better equip them to handle the challenges that arise when dealing with different aspects of business intelligence and data analytics.

Methodology

The rise of business intelligence as an essential aspect of running any business has led to the development of different software and techniques that aid in the analysis of data. As a requirement of the assignment, two techniques are to be selected for comparison. Of the many data analytics tools available, the assignment focuses on RapidMiner and KNIME.

RapidMiner

RapidMiner is an open source data mining solution that is widely used across the globe. The idea was conceived at the University of Dortmund in 2001 (Land and Fischer, 2012, p.V). Since 2007, development of the application has been continued by Rapid-I GmbH (Land and Fischer, 2012, p.V). Because of its academic background, the application is used both in business and for academic purposes (Land and Fischer, 2012, p.V). It can be applied to various domains such as text, image, audio and time series analysis (Land and Fischer, 2012, p.V). Support is available from the RapidMiner team.

KNIME

KNIME is an open source application developed as a data analytics, reporting and integration platform (Reifer, 2015). The application is developed and supported by KNIME.com AG (Reifer, 2015). Its graphical user interface helps users to create data flows and execute selected analysis steps (Reifer, 2015). The interface also makes it simple to review the results, models and interactive views produced by the analysis (Reifer, 2015). Because it is open source software, the product is available to any interested party under an open license.

KNIME is written in Java and built on Eclipse. Eclipse lets the application leverage plug-ins, which makes it easier to extend its uses (Reifer, 2015). Available plug-ins support integration with methods for text mining, image mining and time series analysis (Reifer, 2015). The software also integrates with numerous other open source projects, including machine learning algorithms from Weka, the R statistics project and the JFreeChart charting library (Reifer, 2015). Nodes are also provided to help users run Java, Python, Perl and other code fragments (Reifer, 2015). In addition, the Eclipse plug-in capability helps the software provide connector and extension nodes for a wide range of systems and platforms (Reifer, 2015). The plug-ins have also made it easier to add support for even more functionality.

Reason for the choice made

Both applications are open source, which means that any interested party can access and use them. Support for both is available from different sources on the internet, and tutorials can be accessed from the developer websites. Because the two applications are free of charge and well supported, choosing them was relatively easy. However, the underlying differences in the functionality the two products offer also make them easier to compare.

Simulations

Introduction of the dataset

The data available for analysis comes from a direct marketing campaign run by a Portuguese banking institution (Moro et al., 2011). The campaign was based on phone calls (Moro et al., 2011). Often, the same client was contacted more than once about the bank's products (Moro et al., 2011). Two datasets are available: bank-full.csv, with all examples ordered by date (from May 2008 to November 2010), and bank.csv, with 10% of the examples (4521), randomly selected from bank-full.csv. The smaller dataset is provided as a benchmark for testing more computationally demanding machine learning algorithms such as SVM (Moro et al., 2011).

Input encoding / input representation

Input encoding defines how the program is told to interpret the content of the input file fed into it. It is essential to find an encoding strategy that ensures that the two software programs being compared give the right feedback.

RapidMiner can be useful for descriptive statistics. However, such statistical inputs call for strategies that go beyond the normal approaches, because the process metaphors involved in descriptive statistics are more complex. When the data contains nominal values that need to be transformed into value-specific columns, dummy encoding is used. Dummy encoding represents nominal values numerically.
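The idea behind dummy encoding can be illustrated outside RapidMiner with a short Python sketch. This is not RapidMiner's implementation; the function name `dummy_encode`, the column names and the sample records are all hypothetical and chosen only for illustration. Each distinct nominal value becomes its own 0/1 column.

```python
def dummy_encode(rows, column):
    """Replace one nominal column with 0/1 indicator columns (one per value)."""
    # Collect the distinct nominal values in a stable, sorted order.
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per nominal value.
        for category in categories:
            new_row[f"{column}={category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records; the values are illustrative, not taken from the dataset.
clients = [
    {"age": 30, "marital": "married"},
    {"age": 45, "marital": "single"},
    {"age": 52, "marital": "divorced"},
]

encoded_clients = dummy_encode(clients, "marital")
for row in encoded_clients:
    print(row)
```

Every encoded record carries exactly one indicator set to 1, so the nominal value can still be recovered from the numeric columns.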

KNIME makes it possible to use traditional programming languages such as R, Python, and Java. As a result, the encoding style used in any given case depends on the language one chooses to use with the tool.

Nonetheless, both software programs accept input in the form of spreadsheets, and an analysis can begin with a drag and drop operation. In this case, the datasets are spreadsheet files with a .csv extension, so the data is represented in rows and columns.
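As a rough illustration of the row and column representation, the sketch below parses a small spreadsheet-style sample with Python's standard csv module. The semicolon delimiter, the column names and the sample values are assumptions made for this example, and the helper `load_rows` is hypothetical; a real run would read the downloaded file instead of the in-memory sample.

```python
import csv
import io

# Hypothetical sample; column names, values and the ';' delimiter are assumptions.
SAMPLE_CSV = """age;job;marital;y
30;technician;married;no
45;management;single;yes
52;retired;divorced;no
"""


def load_rows(text, delimiter=";"):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return list(reader)


rows = load_rows(SAMPLE_CSV)
columns = list(rows[0].keys())

print(columns)            # the header line supplies the column names
print(len(rows), "rows")  # every following line is one record
```

Note that the csv module returns every value as a string, so numeric columns such as age would need to be converted before any calculation.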

Procedures

First, the software has to be downloaded. Both programs can be found on the websites of the companies that develop them. The two products are free, so no fees are paid for their use. However, one is required to create a free account with an email address and a name. For RapidMiner, one must also specify the type of account being created. For a business account, the user must provide the name of the business and its address.

KNIME

KNIME provides an analytical tool for manipulating, analyzing and modeling data. The program uses intuitive visual programming to do so. A number of components are integrated in the program so that data mining and machine learning can be carried out through data pipelining.
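Data pipelining means that each processing step receives the output of the previous step. The Python sketch below mimics that idea with plain functions standing in for nodes. It is not KNIME's API; the node functions, the `run_pipeline` helper and the sample records are all hypothetical.

```python
from functools import reduce


def drop_missing_age(rows):
    """Filter node: keep only records that have an age."""
    return [row for row in rows if row["age"] is not None]


def add_age_group(rows):
    """Transformation node: derive a nominal age group from the numeric age."""
    return [
        {**row, "age_group": "under 40" if row["age"] < 40 else "40 and over"}
        for row in rows
    ]


def count_by_group(rows):
    """Aggregation node: count the records in each age group."""
    counts = {}
    for row in rows:
        counts[row["age_group"]] = counts.get(row["age_group"], 0) + 1
    return counts


def run_pipeline(data, nodes):
    """Feed the output of each node into the next one, in order."""
    return reduce(lambda current, node: node(current), nodes, data)


# Hypothetical records, for illustration only.
records = [{"age": 30}, {"age": 45}, {"age": None}, {"age": 52}]

result = run_pipeline(records, [drop_missing_age, add_age_group, count_by_group])
print(result)
```

Because every node only depends on the data handed to it, steps can be reordered, replaced or inserted without rewriting the rest of the flow, which is the property a visual workflow editor exposes through drag and drop.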

Figure 1: general appearance of KNIME

Figure 2: description of the different components of KNIME

Figure 3: an example of a histogram generated from the data

RapidMiner

The application provides an integrated environment where users can perform business analytics, predictive analysis, text mining, data mining, and machine learning. When used in conjunction with other commercial and business applications, RapidMiner can be used to perform other functions such as application development, rapid prototyping, training, education, and research.

Nonetheless, to use the two programs one has to know how they work. Having found the necessary information online, I loaded the dataset, generated graphs from it and was also able to manipulate some of the data.

Figure 4: initial screen

Figure 5:  inserting data

Figure 6: a preview of inserted data

Figure 7: example of graph obtained

Reasons why the software were chosen

Information on how the two programs work is readily available on the internet. They offer two different approaches to analyzing data. However, there are similarities in the way data can be represented in both. For instance, in both cases data can be represented as graphs.

Results and Analysis

Results

The aim of the Portuguese banking institution from which the data was obtained was to determine whether a client had subscribed to a term deposit, based on the information provided when signing up to the bank's promotion. Each customer's response was recorded as yes or no.

Differences in the different software used

The main apparent difference between the two applications was their user interfaces. Apart from that, the two applications have similar functionality. Both applications indicated that bank customers with more than one contact had signed up for the promotional program offered by the bank.
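A comparison of this kind can be reproduced in a few lines of Python as a cross-check on either tool. The sketch below is a minimal illustration and not the procedure used in the assignment: the column names `campaign` (number of contacts) and `y` (yes/no response) are assumptions, the function `subscription_rate_by_contacts` is hypothetical, and the toy records are invented, so the printed rates say nothing about the real dataset.

```python
from collections import defaultdict

# Invented toy records for illustration only; not taken from the dataset.
records = [
    {"campaign": 1, "y": "no"},
    {"campaign": 1, "y": "no"},
    {"campaign": 1, "y": "yes"},
    {"campaign": 2, "y": "yes"},
    {"campaign": 3, "y": "no"},
    {"campaign": 4, "y": "yes"},
]


def subscription_rate_by_contacts(rows):
    """Return the share of 'yes' responses for one contact vs. more than one."""
    totals = defaultdict(int)
    subscribed = defaultdict(int)
    for row in rows:
        group = "one contact" if row["campaign"] == 1 else "more than one contact"
        totals[group] += 1
        if row["y"] == "yes":
            subscribed[group] += 1
    return {group: subscribed[group] / totals[group] for group in totals}


rates = subscription_rate_by_contacts(records)
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} subscribed")
```

Running the same calculation on the full file would show whether the pattern seen in the graphs holds once the group sizes are taken into account.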

Problems encountered and solutions

The only problem encountered was that, being new to both technologies, I spent too much time trying to find the material needed to learn them. However, the school library and the World Wide Web proved essential in working through the problems that arose. Investing time in trying out different options helped develop an even better understanding of the different technologies available for data analytics and business intelligence.

Conclusion

It is evident from the research that business intelligence is becoming an increasingly important aspect of everyday business decision making. The importance attached to business intelligence has seen the emergence of different technologies that provide business intelligence and data analytics capabilities to organizations. However, the use of such technologies depends heavily on one's own interest in learning them: although many can be taught, there are simply too many for college and university curricula to cover fully.

References

Land, S. and Fischer, S. (2012). RapidMiner 5. Rapid-I GmbH.

Moro, S., Laureano, R. and Cortez, P. (2011). Using Data Mining for Bank Direct Marketing: An Application of the CRISP-DM Methodology.

Reifer, A. (2015). Examining the KNIME open source data analytics platform. Available from http://searchbusinessanalytics.techtarget.com/feature/Examining-the-KNIME-open-source-data-analytics-platform (Accessed 22 April 2017).

 

Posted on February 1, 2018 at 3:27 pm.
