Portfolio

Movie Recommendation System

Developed a content-based movie recommendation system. Each movie's overview text is converted into a word-count vector (CountVectorizer), and cosine similarity between those vectors is used to suggest thematically similar movies.

Python, Scikit-learn, NLP, Cosine Similarity, CountVectorizer, Pandas
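
A minimal sketch of the idea behind this project, not the project's actual code. The `movies` table, its titles and overviews, and the `recommend` helper are hypothetical and exist only for illustration; the real system works on a full movie dataset.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy data standing in for the real movie dataset.
movies = pd.DataFrame({
    "title": ["Space Quest", "Galaxy Run", "Love in Paris", "Paris Romance"],
    "overview": [
        "astronauts explore a distant planet in deep space",
        "a pilot races across the galaxy to save a distant planet",
        "two strangers fall in love during a summer in paris",
        "a romance blossoms between two artists in paris",
    ],
})

# Turn each overview into a bag-of-words count vector.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(movies["overview"])

# Pairwise cosine similarity between all overviews.
similarity = cosine_similarity(counts)


def recommend(title, top_n=2):
    """Return the top_n titles whose overviews are most similar to `title`."""
    idx = movies.index[movies["title"] == title][0]
    scores = pd.Series(similarity[idx], index=movies["title"]).drop(title)
    return scores.sort_values(ascending=False).head(top_n).index.tolist()
```

Calling `recommend("Space Quest", top_n=1)` ranks the other movies by how many overview terms they share with the query, weighted by vector length.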

Business Intelligence Solution for Activity Analysis

Developed a complete business intelligence solution to analyze a company's activity. The project included data integration via an ETL pipeline (Talend), Data Warehouse modeling, and the creation of a Power BI dashboard to track KPIs such as the conversion funnel and overall performance.

Power BI, Talend, ETL, Data Warehouse, Data Modeling

Customer Segmentation and Scoring

To optimize marketing strategies, I segmented a database of 225 customers into three profiles (loyal, occasional, distant) using multiple correspondence analysis (MCA) followed by clustering. I then developed a logistic regression model (AUC = 0.951) to predict loyalty program enrollment, enabling more effective targeting.

R, FactoMineR, factoextra, GLM, Excel
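
A rough sketch of the scoring step, written in Python with scikit-learn as a stand-in for the R GLM used in the project. The data is synthetic, and the two features and their coefficients are assumptions made up for the example; only the sample size of 225 comes from the project, so the AUC computed here is unrelated to the reported 0.951.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the customer table: 225 rows, 2 hypothetical features.
n = 225
X = rng.normal(size=(n, 2))

# Simulate enrollment (1) or not (0) from an assumed logistic relationship.
logits = 2.0 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Hold out part of the data so the AUC is measured on unseen customers.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

model = LogisticRegression().fit(X_train, y_train)

# Predicted enrollment probability for each held-out customer.
scores = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, scores)
```

Ranking customers by `scores` is what makes targeting possible: the AUC measures how often an enrolled customer is ranked above a non-enrolled one.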

Vehicle Registration Forecasting

To assist with strategic planning, I designed a linear regression model to forecast vehicle registrations. The project involved time series analysis (trend, seasonality) and rigorous statistical validation to ensure the reliability of the forecasts.

SAS, Excel
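
To illustrate how a linear regression can capture trend and seasonality, here is a small sketch in Python with NumPy (the project itself was done in SAS). The monthly series is simulated, and the `design_matrix` helper, the monthly frequency, and every numeric value are assumptions chosen for the example.

```python
import numpy as np


def design_matrix(t):
    """Intercept, linear trend, and 11 monthly dummies (month 0 is the baseline)."""
    t = np.asarray(t)
    month = t % 12
    dummies = (month[:, None] == np.arange(1, 12)).astype(float)
    return np.column_stack([np.ones(len(t)), t, dummies])


rng = np.random.default_rng(42)

# Simulated history: 4 years of monthly data with trend, seasonality, and noise.
t_hist = np.arange(48)
seasonal = 10.0 * np.sin(2.0 * np.pi * (t_hist % 12) / 12.0)
y_hist = 100.0 + 0.5 * t_hist + seasonal + rng.normal(scale=1.0, size=48)

# Ordinary least squares fit of the trend and seasonal coefficients.
coef, *_ = np.linalg.lstsq(design_matrix(t_hist), y_hist, rcond=None)

# Forecast the next 12 months by extending the time index.
t_future = np.arange(48, 60)
forecast = design_matrix(t_future) @ coef
```

Here `coef[1]` is the estimated monthly trend and the remaining coefficients are seasonal offsets relative to the baseline month.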

Shiny Dashboard: World Cup 2022

Developed a Shiny application from scratch to visualize data from the 2022 World Cup. The project covers the complete pipeline, from data collection and cleaning to the deployment of an interactive interface.

R, Shiny, ggplot2
View application

Analysis of Voter Abstention

To understand the factors behind voter abstention in France, I conducted an econometric analysis using socio-economic data. The study identified the key determinants of this phenomenon, offering insights for public policy.

SAS, Excel

TruthLens: News Veracity Analysis

To combat misinformation, I developed TruthLens, a Streamlit application that analyzes the veracity of articles. The project integrates a complete NLP pipeline (NLTK, TF-IDF) and a Random Forest model. To overcome deployment challenges, I containerized the application with Docker and deployed it on Hugging Face Spaces.

Python, Streamlit, Scikit-learn, NLTK, Docker, Hugging Face
View application
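
A minimal sketch of the TF-IDF plus Random Forest core of such a classifier, assuming scikit-learn. The example texts and labels are invented for illustration, and the NLTK preprocessing, Streamlit interface, and Docker setup of the real application are left out.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus; a real model needs a large labeled news dataset.
texts = [
    "scientists publish peer reviewed study on vaccine safety",
    "government releases official unemployment statistics",
    "central bank announces interest rate decision",
    "miracle cure doctors do not want you to know about",
    "shocking secret proves the moon landing was staged",
    "celebrity reveals aliens control the world economy",
]
labels = ["real", "real", "real", "fake", "fake", "fake"]

# TF-IDF features feed directly into the Random Forest classifier.
model = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(texts, labels)

# Class probabilities for a new, unseen headline.
new_article = ["official study reveals shocking statistics"]
proba = model.predict_proba(new_article)
prediction = model.predict(new_article)
```

Wrapping both steps in one pipeline means the fitted vocabulary travels with the classifier, which simplifies packaging the model for a deployed app.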