Popularity prediction on Instagram using machine learning
Document type: Bachelor's thesis
Access conditions: Open access
In the last year, research into new ways of applying machine learning for human benefit has grown rapidly. At the same time, our society is moving towards new forms of communication and social interaction, and Instagram is one of the faces of this change. Prediction systems and artificial intelligence are attracting intense interest: companies are investing in challenges and experts to predict which product a user will want, which music they will like, or which film they will want to watch. Social media opens a new and challenging branch of this work: predicting the popularity of pictures, which one will be the most popular of the day, or which crop makes a selfie more popular.

This project tries to bring these two realities together. The goal is to predict how many likes a post is going to get before it is posted on Instagram. This is only feasible today thanks to machine learning, the idea that computers can be trained to identify patterns in data and then use those patterns to make predictions on new data. We give sample posts to our system so that it can find patterns and then, given a new input, estimate a result based on the patterns it has learned.

The system consists of three stages. First, the input picture is classified into one category according to its theme, using a retrained deep learning model. This project considers six categories: Animals, Food, Friends, Landscape, Quote and Selfie. Second, the input picture is compared, using another retrained deep learning model, with a set of 200 pictures from the selected category that already have a score between 0 and 1. An algorithm performs the comparison and produces a histogram; the maximum point of the histogram is the computed score of the input picture.
Third, the score computed in the second stage is used as a variable in a regression to obtain the final prediction of likes. Other variables are taken into account in this regression as well. As the description of the system shows, without machine learning it would be impossible, even for a human being, to identify all the necessary patterns and make accurate predictions on new data. Humans can typically create one or two good models a week; machine learning can create thousands of models a week.
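The second and third stages can be illustrated with a minimal sketch. The abstract does not say how the comparison produces the histogram or which other variables enter the regression, so everything here is an assumption. The sketch assumes the comparison gives one similarity value per reference picture, that those similarities weight a histogram of the reference scores, and that the regression is ordinary least squares. The function names, the number of bins and the `followers` variable mentioned in the comments are hypothetical.

```python
import numpy as np


def histogram_score(similarities, ref_scores, bins=10):
    """Return the center of the fullest bin of a similarity-weighted histogram.

    Assumption: `similarities` holds one non-negative value per reference
    picture, and `ref_scores` holds that picture's known score in [0, 1].
    """
    counts, edges = np.histogram(
        ref_scores, bins=bins, range=(0.0, 1.0), weights=similarities
    )
    peak = int(np.argmax(counts))  # index of the maximum point of the histogram
    return float((edges[peak] + edges[peak + 1]) / 2)


def fit_likes_regression(features, likes):
    """Fit ordinary least squares with an intercept.

    Assumption: each row of `features` is (computed score, other variables...),
    for example a hypothetical `followers` count.
    """
    features = np.asarray(features, dtype=float)
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, np.asarray(likes, dtype=float), rcond=None)
    return coef  # coef[0] is the intercept, coef[1:] are the slopes


def predict_likes(coef, feature_row):
    """Predict likes for one post from fitted coefficients."""
    return float(coef[0] + np.dot(coef[1:], feature_row))
```

In use, `histogram_score` would be called with the 200 similarities and reference scores of the selected category, and its result would be passed, together with the other variables, to `predict_likes`. Taking the bin center is one possible reading of "the maximum point of the histogram"; a finer binning or a smoothed density estimate would be equally plausible.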