Cosmetics Recommendation System

Content-Based Recommendation Approach

Business Problem
The idea comes from the fact that cosmetic ingredients are chemicals, and these chemicals act differently on each skin type or skin problem. Unlike with products such as clothes and shoes, consumers look for cosmetics that will work best for their skin type or problem, and products with similar ingredients tend to produce similar results. By recommending products that have similar ingredients, a company is recommending products that will most likely work for the consumer. This can increase customer satisfaction and, as a result, incremental sales. In this project I build a content-based recommendation system for cosmetics based on the products' ingredients.
Techniques
The recommendation is content-based. To match ingredients, I transformed the ingredient text into tokens and then into a bag-of-words matrix. I used a linear kernel to compute cosine similarities over the ingredients matrix. I wrote a custom algorithm that sorts and keeps the top 100 most similar products for each product. It then generates the top 5 products by similarity score, with the final recommendations based on score, price, and rank.
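The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the toy data, the column names (name, ingredients, price, rank), the helper names split_ingredients and recommend, and the final sort order are all assumptions. One detail worth noting is that a linear kernel only equals cosine similarity when the row vectors have unit length, so the sketch L2-normalizes the bag-of-words counts first.

```python
import numpy as np
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import linear_kernel
from sklearn.preprocessing import normalize

# Hypothetical toy catalog; the real dataset and its schema are not shown in the text.
products = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "ingredients": [
        "water, glycerin, niacinamide",
        "water, glycerin, niacinamide, fragrance",
        "water, glycerin, shea butter",
        "alcohol denat, fragrance, limonene",
    ],
    "price": [30.0, 12.0, 18.0, 25.0],
    "rank": [4.1, 4.5, 3.9, 4.0],
})

def split_ingredients(text):
    """Treat each comma-separated ingredient as one token (an assumption)."""
    return [tok.strip() for tok in text.split(",") if tok.strip()]

# Bag-of-words matrix: one row per product, one column per ingredient.
vectorizer = CountVectorizer(tokenizer=split_ingredients, token_pattern=None)
counts = vectorizer.fit_transform(products["ingredients"])

# L2-normalize rows so the linear kernel (dot product) equals cosine similarity.
unit_rows = normalize(counts)
similarity = linear_kernel(unit_rows, unit_rows)

def recommend(idx, top_n=100, k=5):
    """Shortlist the top_n most similar products, then return the best k."""
    scores = similarity[idx].copy()
    scores[idx] = -1.0  # exclude the query product itself
    candidates = np.argsort(-scores)[:top_n]
    shortlist = products.iloc[candidates].assign(similarity=scores[candidates])
    # Assumed ordering: higher similarity, then lower price, then higher rank.
    return shortlist.head(k).sort_values(
        ["similarity", "price", "rank"], ascending=[False, True, False]
    )

print(recommend(0, k=2)[["name", "similarity", "price", "rank"]])
```

Calling recommend(0) looks up recommendations for the first product in the table. How similarity, price, and rank are weighted in the final ordering is a design choice; the lexicographic sort here is only one plausible reading of the description.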
Future Applications
This model could be combined with ingredient safety data to recommend products based on a safety score. Currently, there is no dataset that shows the safety of ingredients. The FDA and cosmetic agency databases contain reports on ingredient safety, but these are text documents and need pre-processing to create a dataset. There are already online tools that take a product's ingredients as input and recommend similar products based on the safety and similarity of the ingredients.
Duration
This project lasted approximately 1-2 weeks. Most of the time was spent on text cleaning, researching content-based recommendation, and writing the recommendation algorithm.
Key Skills
Textual Data Cleaning, Textual Data Transformation, Data Visualization, Unsupervised Learning, Similarity Analysis, Natural Language Processing
Tools
Python, Pandas, NumPy, Scikit-learn, NLTK, Matplotlib, Seaborn, Bokeh

Copyright © 2023 By Taniya D. Adhikari
All Rights Reserved