Prereqs: You will be given a model as a baseline. You will also be given two datasets, a train dataset and a test dataset, with which you will build new models.
Using the test dataset, which contains the true class in its last column, test the performance of the "baseline model". Comment on the results: the number of classes identified and the accuracy of the model. Then build a logistic regression model: fit it on the train data and determine the accuracy of this "new model" on the test data.
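The evaluation steps above can be sketched as follows. This is a minimal illustration, not the assignment's solution: it assumes Python with scikit-learn, uses synthetic data from make_classification in place of the provided train and test files, and uses a DummyClassifier as a hypothetical stand-in for the supplied baseline model. The helper name evaluate is also illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic data stands in for the provided train/test datasets.
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)


def evaluate(model, X_eval, y_true):
    """Return (accuracy, sorted list of the classes the model actually predicted)."""
    y_pred = model.predict(X_eval)
    return accuracy_score(y_true, y_pred), sorted(np.unique(y_pred).tolist())


# Assumption: hypothetical baseline that always predicts the most frequent class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
baseline_acc, baseline_classes = evaluate(baseline, X_test, y_test)

# "New model": logistic regression fitted on the train data only.
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)
logreg_acc, logreg_classes = evaluate(logreg, X_test, y_test)

print(f"baseline: accuracy={baseline_acc:.3f}, classes predicted={baseline_classes}")
print(f"logistic: accuracy={logreg_acc:.3f}, classes predicted={logreg_classes}")
```

With the real data, the last column of each file would be split off as y and the remaining columns used as X. Comparing the list of predicted classes with the classes present in the true labels shows whether a model fails to identify some classes at all.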
Next, use the random forest technique to classify the data. Calculate the accuracy of each individual learner. Then combine the learners into an ensemble and determine its classification accuracy. Make a table presenting the accuracy of each learner.
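One possible way to obtain per-learner and ensemble accuracies is sketched below. It assumes scikit-learn's RandomForestClassifier, treats each fitted tree as one learner, and again uses synthetic data as a stand-in for the provided datasets. The choice of 10 trees is arbitrary and only keeps the table short.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic data stands in for the provided train/test datasets.
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

forest = RandomForestClassifier(n_estimators=10, random_state=0)
forest.fit(X_train, y_train)

# Inside the forest each tree is trained on class indices, so its predictions
# are mapped back to the original labels through forest.classes_.
tree_accuracies = []
for tree in forest.estimators_:
    tree_pred = forest.classes_[tree.predict(X_test).astype(int)]
    tree_accuracies.append(accuracy_score(y_test, tree_pred))

# The ensemble prediction averages the class probabilities of all trees.
ensemble_accuracy = accuracy_score(y_test, forest.predict(X_test))

# Plain-text table: one row per learner, then the ensemble.
print(f"{'learner':<10}{'accuracy':>10}")
for i, acc in enumerate(tree_accuracies, start=1):
    print(f"{'tree ' + str(i):<10}{acc:>10.3f}")
print(f"{'ensemble':<10}{ensemble_accuracy:>10.3f}")
```

Individual trees are trained on bootstrap samples with random feature subsets, so their accuracies vary; the ensemble row shows what combining them yields on the same test data.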
Next, use the gradient boosting technique to classify the data. Repeat the steps requested above for random forest: the accuracy of each learner, the accuracy of the ensemble, and a table of the results.
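For gradient boosting, the individual trees are fitted to the errors of the earlier stages and are not standalone classifiers, so "accuracy of each learner" needs an interpretation. The sketch below assumes one reasonable reading: report the accuracy of the ensemble after each boosting stage, using scikit-learn's GradientBoostingClassifier and its staged_predict method. The synthetic data and the choice of 10 stages are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: synthetic data stands in for the provided train/test datasets.
X, y = make_classification(n_samples=600, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

booster = GradientBoostingClassifier(n_estimators=10, random_state=0)
booster.fit(X_train, y_train)

# staged_predict yields the test predictions after 1, 2, ..., n_estimators stages.
stage_accuracies = [accuracy_score(y_test, y_pred)
                    for y_pred in booster.staged_predict(X_test)]

# Accuracy of the full ensemble (all stages combined).
final_accuracy = accuracy_score(y_test, booster.predict(X_test))

# Plain-text table: one row per boosting stage, then the full ensemble.
print(f"{'stages':<10}{'accuracy':>10}")
for i, acc in enumerate(stage_accuracies, start=1):
    print(f"{i:<10}{acc:>10.3f}")
print(f"{'ensemble':<10}{final_accuracy:>10.3f}")
```

Unlike the random forest table, the rows here are cumulative: each row includes all earlier stages, and the last stage corresponds to the full ensemble.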
Last, state which learning method has the highest accuracy and which has the lowest, and discuss the advantages and disadvantages of the methods tested.