Project 1 in the R language: Predicting whether a user will download an app after clicking a mobile app advertisement

Diogo F. dos Santos

August 9th, 2020

The objective of this project is to predict whether a user will download an app after clicking a mobile app advertisement. The datasets come from Kaggle. This project is part of the Data Science training program of Data Science Academy, Brazil.

Pipeline of the given solution

The solution to this problem was divided into four parts. The first part covers the data munging and the comparison of several machine learning models, which were trained on the train_sample.csv file and evaluated on 1E+07 rows taken from train.csv. Rows from train.csv served as the evaluation set because the provided test dataset does not include the target variable. The second part reuses the main tidying steps of part one to tidy the full training dataset, train.csv. In the third part, the best model found in part one was trained on the tidied training dataset, but the number of trees in the random forest was reduced because of the limited capacity of my notebook computer. In the fourth part, the trained model was applied to the provided test dataset, test.csv. Afterward, the predicted results were matched with the click_id to produce the submission file.
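
As a rough illustration of how the four parts fit together, here is a minimal sketch in base R. It is not the project's script. It runs on small synthetic data instead of the Kaggle files, and it swaps the random forest for a logistic regression (glm) so that no extra package is needed. The helpers make_clicks and tidy_clicks, the feature columns app, channel and click_time, and the target name is_attributed are assumptions made for this example. Only click_id and the file names come from the text above.

```r
# Sketch only: synthetic data stands in for train_sample.csv, train.csv and test.csv.
set.seed(42)

# Hypothetical helper: build a toy click log. Column names are assumptions.
make_clicks <- function(n, labeled = TRUE) {
  start <- as.POSIXct("2017-11-06 00:00:00", tz = "UTC")
  df <- data.frame(
    app        = sample(1:5, n, replace = TRUE),
    channel    = sample(1:4, n, replace = TRUE),
    click_time = format(start + sample(0:(3 * 86400 - 1), n, replace = TRUE),
                        "%Y-%m-%d %H:%M:%S"),
    stringsAsFactors = FALSE
  )
  if (labeled) {
    # Simulated target: downloads are more likely for app 3 (arbitrary choice).
    df$is_attributed <- rbinom(n, 1, ifelse(df$app == 3, 0.4, 0.1))
  }
  df
}

# Hypothetical helper: one tidying function, reused for every dataset (parts 1, 2 and 4).
tidy_clicks <- function(df) {
  ts <- as.POSIXct(df$click_time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  df$hour <- as.integer(format(ts, "%H"))  # derive the hour of the click
  df$click_time <- NULL                    # drop the raw timestamp
  df
}

model_formula <- is_attributed ~ app + channel + hour

# Part 1: fit on the small sample, evaluate on labeled rows held out from the training data.
train_sample <- tidy_clicks(make_clicks(500))
holdout      <- tidy_clicks(make_clicks(200))
fit_sample   <- glm(model_formula, data = train_sample, family = binomial())
holdout_prob <- predict(fit_sample, newdata = holdout, type = "response")
holdout_acc  <- mean((holdout_prob > 0.5) == holdout$is_attributed)

# Parts 2 and 3: apply the same tidying to the full training data, then refit the chosen model.
train_full <- tidy_clicks(make_clicks(2000))
fit_full   <- glm(model_formula, data = train_full, family = binomial())

# Part 4: score the unlabeled test set and match each prediction with its click_id.
test_raw <- make_clicks(100, labeled = FALSE)
test     <- cbind(click_id = seq_len(nrow(test_raw)) - 1L, tidy_clicks(test_raw))
submission <- data.frame(
  click_id      = test$click_id,
  is_attributed = predict(fit_full, newdata = test, type = "response")
)
submission_path <- file.path(tempdir(), "submission.csv")
write.csv(submission, submission_path, row.names = FALSE)
```

Keeping the tidying in a single function is what lets part two reuse the steps of part one unchanged, and it ensures the test data in part four receives the same transformations as the training data.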

The parts of the script are below: