A Multifaceted Approach to Job Title Analysis
CSE 519 - Data Science Fundamentals
Project Description
Project consists of three parts:
- Salary Prediction
- Job Clustering
- Job Satisfaction Analysis
Installing libraries
pip install -r requirements.txt
File Descriptions
Web Scraping Job titles.ipynb - Code for Web Scraping Job titles from CareerBuilder.com
Salary Prediction.ipynb - Code for Salary Prediction using Machine Learning
Job_Satisfaction.ipynb - Code for Job Satisfaction Analysis and Graphs
run_app.py - Code for running Streamlit app (Salary Prediction and Job Clustering)
Datasets
Job Information.csv - Dataset built by scraping web data from CareerBuilder.com
WA_Fn-UseC_-HR-Employee-Attrition.csv - Dataset download from Kaggle
ML Model
salary_model_30_11.pkl - Weighted Model developed using a combination of Regressors (refer to Salary Prediction.ipynb)
How to run the code
.ipynb files (Jupyter Notebook Files) can be run either using the command jupyter notebook or jupyter lab, or can be run directly on Google Colab (after mounting the Google Drive).
To run the file run_app.py, run the following command in the terminal:
streamlit run run_app.py