CVParser
Python library for parsing resumes using natural language processing and machine learning.
Setup
Installation on Linux and Mac OS
- Follow the guide here on how to clone or fork a repo.
- Follow the guide here on how to create a virtualenv.
- Create a virtualenv (for example, myvenv), activate it, and install the requirements using the commands below.
$ virtualenv --python=python3 myvenv
$ source myvenv/bin/activate
(myvenv) $ pip install -r requirements.txt
Usage
from cvparser.parser import CVParser
CVParser.download_nlk_data()
parser = CVParser(file_path="path/to/file.[pdf|doc|docx|png|jpeg]")
parser.parse()
print(parser.json())
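Because the usage example lists a fixed set of file types, it can help to check the extension before handing a file to the parser. The following is a minimal sketch, not part of the library: the helper names check_cv_path and parse_cv and the SUPPORTED_EXTENSIONS set are hypothetical, the extension list is copied from the example above, and it is assumed that CVParser.download_nlk_data() has already been run once.

```python
from pathlib import Path

# Extensions copied from the usage example; the library may accept others.
SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".png", ".jpeg"}


def check_cv_path(file_path):
    """Return file_path as a Path if its extension is listed, else raise ValueError."""
    path = Path(file_path)
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {path.suffix or '(none)'}")
    return path


def parse_cv(file_path):
    """Hypothetical wrapper: validate the path, then run the documented calls."""
    path = check_cv_path(file_path)
    # Imported lazily so the extension check works without cvparser installed.
    from cvparser.parser import CVParser

    parser = CVParser(file_path=str(path))
    parser.parse()
    return parser.json()
```

Calling parse_cv("path/to/file.pdf") would then behave like the usage example, while an unlisted extension fails early with a ValueError instead of reaching the parser.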
Re-training the Model
- cd into the train folder.
- Delete the folder model and the file train.json.
- Copy your new training data into the train folder. The training data must be in JSON format, which can be generated using the data annotation tool Dataturks. The file containing the training data must be named train.json.
- Start re-training the model by executing the Python script manual_training.py in the train folder.
- Test your new model by following the steps under Usage.
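The file-handling steps above (deleting the old model folder and train.json, then copying in the new data) can be sketched as a small helper. This is an illustration only and is not shipped with the library: the function name prepare_train_dir and its parameters are hypothetical, and it assumes the folder layout described in the steps.

```python
import shutil
from pathlib import Path


def prepare_train_dir(train_dir, new_data):
    """Remove the old model folder and train.json, then install new_data as train.json."""
    train_dir = Path(train_dir)
    new_data = Path(new_data)
    target = train_dir / "train.json"

    # Fail before deleting anything if the new training data is missing.
    if not new_data.is_file():
        raise FileNotFoundError(new_data)

    # Delete the previously trained model, if there is one.
    shutil.rmtree(train_dir / "model", ignore_errors=True)

    # Replace train.json unless the new data is already that exact file.
    if new_data.resolve() != target.resolve():
        if target.exists():
            target.unlink()
        shutil.copyfile(new_data, target)
    return target
```

After a call such as prepare_train_dir("train", "my_annotations.json"), where my_annotations.json is a placeholder name, re-training is started by running manual_training.py from inside the train folder, as described in the steps.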