Text Mining & NLP Final Project
Applied NLP and text mining techniques to extract insights and patterns from unstructured text data.
Unstructured text data is abundant but underutilized. The challenge was to apply systematic NLP techniques to extract meaningful patterns and classifications from raw text corpora.
Implemented a complete text mining pipeline covering preprocessing (tokenization, stopword removal, stemming), feature extraction (TF-IDF, word embeddings), and model training for classification and topic discovery.
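The stages named above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's actual code. The stopword list, the suffix-stripping `stem` helper (a crude stand-in for a real stemmer such as Porter), the smoothed IDF formula, the nearest-neighbor cosine classifier, and the four training sentences are all assumptions chosen for brevity. Word embeddings and topic discovery are left out.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (assumption; real pipelines use a fuller list).
STOPWORDS = {"a", "an", "and", "the", "is", "was", "of", "to", "in", "it", "this"}

def stem(token):
    """Crude suffix stripping; a stand-in for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Lowercase, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [stem(t) for t in tokens if t not in STOPWORDS]

def fit_idf(docs_tokens):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n = len(docs_tokens)
    df = Counter(term for tokens in docs_tokens for term in set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}

def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())

def classify(text, train_vectors, train_labels, idf):
    """Return the label of the most similar training document."""
    query = tfidf_vector(preprocess(text), idf)
    scores = [cosine(query, v) for v in train_vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    return train_labels[best]

# Hypothetical toy corpus, for illustration only.
TRAIN = [
    ("The movie was wonderful and the acting was great", "positive"),
    ("A great film with wonderful music", "positive"),
    ("The plot was boring and the pacing was terrible", "negative"),
    ("Terrible acting in a boring film", "negative"),
]
train_tokens = [preprocess(text) for text, _ in TRAIN]
idf = fit_idf(train_tokens)
train_vectors = [tfidf_vector(tokens, idf) for tokens in train_tokens]
train_labels = [label for _, label in TRAIN]

print(classify("wonderful music", train_vectors, train_labels, idf))
print(classify("boring and terrible", train_vectors, train_labels, idf))
```

Sparse dicts keep the sketch dependency-free. A library such as scikit-learn or NLTK would normally replace the hand-written stemmer, vectorizer, and classifier.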
Demonstrated end-to-end NLP capability, from raw text ingestion to extracted patterns and classifications, applicable to real-world use cases such as sentiment analysis, document classification, and content recommendation.
A graduate-level applied NLP project covering the full text mining pipeline: data collection, preprocessing (tokenization, stopword removal, stemming), feature extraction, topic modeling, and classification. Built as the capstone for an Applied Text Mining course at the University of San Diego.