A web application that detects duplicate questions on StackOverflow using deep learning. It combines a Convolutional Neural Network (CNN) with Natural Language Processing (NLP) to identify semantically similar questions, even when they are worded differently.
- Smart Duplicate Detection: Uses CNN-based NLP models to identify duplicate questions
- Web UI: Interactive Flask-based web interface for easy question submission and analysis
- Batch Processing: Submit and analyze multiple questions at once
- Model Training: Train the model directly from the web interface with progress tracking
- Real-time Predictions: Get instant results for your queries
- Model Statistics: View detailed model performance and training metrics
- Sample Questions: Try the model with pre-loaded sample questions from StackOverflow
- Background Training: Training runs in a background thread, so the UI stays responsive while the model trains
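
The Background Training and progress tracking features can be pictured with a small sketch. This is only an illustration of the general pattern (a worker thread plus a lock-protected progress dictionary), not the project's actual code: `progress`, `train_model`, `start_training`, and `get_progress` are hypothetical names, and the training loop sleeps instead of fitting a CNN.

```python
import threading
import time

# Hypothetical shared state; the project's real names may differ.
progress = {"status": "idle", "epoch": 0, "total_epochs": 0}
progress_lock = threading.Lock()


def train_model(total_epochs):
    """Stand-in for the real training loop; sleeps instead of fitting a CNN."""
    with progress_lock:
        progress.update(status="training", epoch=0, total_epochs=total_epochs)
    for epoch in range(1, total_epochs + 1):
        time.sleep(0.01)  # placeholder for one epoch of training
        with progress_lock:
            progress["epoch"] = epoch
    with progress_lock:
        progress["status"] = "done"


def start_training(total_epochs=3):
    """Start training in a daemon thread and return immediately."""
    worker = threading.Thread(target=train_model, args=(total_epochs,), daemon=True)
    worker.start()
    return worker


def get_progress():
    """Return a snapshot that a status endpoint could serialize as JSON."""
    with progress_lock:
        return dict(progress)
```

In a Flask app, a route handler could call something like `start_training()` and return right away, while a second route returns `get_progress()` for the UI to poll. The sketch does not guard against starting a second run while one is already in progress; a real implementation would likely need that check.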
Before you begin, ensure you have the following installed:
- Python 3.7 or higher
- pip (Python package manager)
- Git
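
If you want to confirm the prerequisites above before cloning, a short script along these lines may help. It is a convenience sketch and not part of the repository; `check_prerequisites` is a hypothetical helper, and it assumes pip is available on your `PATH` as `pip` or `pip3`.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum version listed in the prerequisites


def check_prerequisites():
    """Return a dict mapping each prerequisite to True (found) or False."""
    return {
        "python": sys.version_info >= MIN_PYTHON,
        # Assumption: pip is exposed on PATH as either `pip` or `pip3`.
        "pip": shutil.which("pip") is not None or shutil.which("pip3") is not None,
        "git": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

Anything reported as `MISSING` should be installed before continuing.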
git clone https://github.com/omsudhamsh/duplicate-question-detection-stackoverflow.git
cd duplicate-question-detection-stackoverflow