Final Year Practical
This report describes the practical I carried out today using two tools, Orange and Power BI. I was given a COVID-19 dataset (https://www.kaggle.com/gpreda/coronavirus-2019ncov) and had to preprocess it and compare the classification accuracy before and after preprocessing.
The dataset records patient deaths and recoveries across different states and countries. It has 8 columns: Country, State, Latitude, Longitude, Deaths, Recovery, Total Cases, Date.
After preprocessing, I also had to build a dashboard in Power BI to draw out as many insights as possible from the data using a bar chart, box plot, pie chart, and stacked plot.
I imported the dataset file into the Orange workflow. From there the workflow splits into two branches: the first runs without preprocessing and the second runs with preprocessing. In each branch the file is connected to a Data Table, which lets me inspect the data that the rest of the branch will use. The Data Table feeds a Data Sampler, and the sampled data then goes to Test and Score together with the kNN and Logistic Regression learners. Test and Score reports the classification accuracy for that branch, so the two branches can be compared.
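The two-branch comparison can also be sketched in plain Python to show the idea behind it. This is only a rough illustration and not what Orange runs internally: the data is synthetic, the class label is hypothetical (the report does not say which column was the target), min-max scaling stands in for whatever preprocessing was chosen, and the helper names min_max_scale, knn_predict and holdout_accuracy are my own.

```python
import math
import random


def min_max_scale(rows):
    """Rescale every column to [0, 1]; a constant column becomes all zeros."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training rows."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    labels = [label for _, label in ranked[:k]]
    # sorted() keeps the result deterministic if two labels ever tie
    return max(sorted(set(labels)), key=labels.count)


def holdout_accuracy(features, labels, train_fraction=0.7, k=3, seed=0):
    """Split the rows once (like a data sampler) and score kNN on the rest."""
    order = list(range(len(features)))
    random.Random(seed).shuffle(order)
    cut = int(len(order) * train_fraction)
    train, test = order[:cut], order[cut:]
    train_x = [features[i] for i in train]
    train_y = [labels[i] for i in train]
    hits = sum(knn_predict(train_x, train_y, features[i], k) == labels[i]
               for i in test)
    return hits / len(test)


# Synthetic data (an assumption, not the COVID-19 dataset): the first
# feature decides the label, the second is large-scale noise.
rng = random.Random(42)
features, labels = [], []
for _ in range(200):
    signal = rng.random()
    noise = rng.uniform(0, 10000)
    features.append([signal, noise])
    labels.append(1 if signal > 0.5 else 0)

raw_acc = holdout_accuracy(features, labels)                    # branch 1
scaled_acc = holdout_accuracy(min_max_scale(features), labels)  # branch 2
print(f"kNN accuracy without preprocessing: {raw_acc:.2f}")
print(f"kNN accuracy with min-max scaling:  {scaled_acc:.2f}")
```

Both branches use the same seed, so they are scored on the same held-out rows and only the preprocessing differs. kNN is a useful learner to watch here because it relies on distances, which a column with a much larger range can dominate.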
At the end of the workflow, a Save Data step writes out the preprocessed data, which I then used to plot graphs in Power BI Desktop. The resulting graphs are shown below.
The dashboard above contains four charts of 3 different types:
- Bar Graph showing death data of different states.
- Bar Graph showing recovery data of different states.
- Pie Chart showing death data of different states.
- Stacked Bar Graph showing the sum of death and recovered data of different states.
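The numbers behind these charts are per-state totals, which can be sketched in a few lines of Python. This is only an illustration of the aggregation, not what Power BI does internally. The column names come from the dataset description, but the rows below are made-up placeholders, the helper totals_by_state is my own, and I am assuming each row holds a separate count. If the counts are cumulative by date, the latest row per state should be used instead of a sum.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows, NOT real figures from the dataset.
SAMPLE_CSV = """Country,State,Deaths,Recovery
X,Alpha,10,40
X,Alpha,5,20
X,Beta,3,30
Y,Gamma,2,8
"""


def totals_by_state(csv_file):
    """Sum the Deaths and Recovery columns for every state."""
    totals = defaultdict(lambda: {"Deaths": 0, "Recovery": 0})
    for row in csv.DictReader(csv_file):
        for column in ("Deaths", "Recovery"):
            # A blank cell is counted as zero.
            totals[row["State"]][column] += int(float(row[column] or 0))
    return dict(totals)


totals = totals_by_state(io.StringIO(SAMPLE_CSV))

# Bar charts: one bar per state for deaths and for recoveries.
deaths = {state: t["Deaths"] for state, t in totals.items()}
recoveries = {state: t["Recovery"] for state, t in totals.items()}

# Pie chart: each state's share of all deaths.
all_deaths = sum(deaths.values())
death_share = {state: d / all_deaths for state, d in deaths.items()}

# Stacked bar: deaths plus recoveries per state.
stacked = {state: deaths[state] + recoveries[state] for state in totals}

for state in totals:
    print(state, deaths[state], recoveries[state],
          f"{death_share[state]:.0%}", stacked[state])
```

To try this on the file saved from Orange, the io.StringIO(SAMPLE_CSV) argument would be replaced with an opened file object, assuming the saved file keeps the same column names.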
That completes this practical. There was a lot to learn from it: how to analyse data, how to preprocess it, how to plot graphs in Power BI Desktop, and how the Orange tool helps with tasks like these.