LeetComp - Background tasks powering the static content at LeetComp

Overview

LeetComp

Code style: black. Checked with mypy.

Analysing compensation figures mentioned on the LeetCode forums (https://kuutsav.github.io/LeetComp).

Note: only posts from India are supported at the moment.


Setup

# Install poetry following the instructions at https://python-poetry.org/docs or use pip
$ pip install poetry

# Set up the project (from the project root directory)
$ poetry install

# To run the commands for updating the data, go into the venv created by poetry
$ poetry shell

# Tested on Python 3.9.7

Updating data (stored in Posts.db under the parent folder)

1. Fetching metadata for compensation posts

>>> from leetcomp.services import get_posts_meta_info
>>> get_posts_meta_info()

2022-02-08 | INFO | get_posts_meta_info:153 - Found 6628 posts(442 pages)
2022-02-08 | INFO | get_posts_meta_info:162 - 32 posts synced, skipping the rest ...
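
The "posts synced, skipping the rest" line suggests an incremental sync that stops once it reaches posts already stored. A minimal sketch of that idea, assuming pages are listed newest first; `sync_new_post_ids` and `fetch_page` are hypothetical names for illustration, not the functions in `leetcomp.services`:

```python
from typing import Callable


def sync_new_post_ids(
    fetch_page: Callable[[int], list[int]],
    n_pages: int,
    known_ids: set[int],
) -> list[int]:
    """Walk pages newest-first and stop at the first post id already stored."""
    new_ids: list[int] = []
    for page_no in range(n_pages):
        for post_id in fetch_page(page_no):
            if post_id in known_ids:
                # Everything after this point was synced in an earlier run.
                return new_ids
            new_ids.append(post_id)
    return new_ids


# Hypothetical in-memory stand-in for the paginated forum listing.
pages = [[105, 104, 103], [102, 101, 100]]
new_ids = sync_new_post_ids(
    lambda page_no: pages[page_no],
    n_pages=len(pages),
    known_ids={100, 101, 102},
)
print(f"{len(new_ids)} posts synced, skipping the rest ...")
```

Stopping at the first known id keeps repeat runs cheap, at the cost of assuming the listing order never changes.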

2. Updating Posts with the user content

>>> from leetcomp.services import update_posts_content_info
>>> update_posts_content_info()

2022-02-09 | INFO | update_posts_content_info:177 - Found 32 post ids without content, syncing ...
2022-02-09 | INFO | update_posts_content_info:189 - PostID 1757667;   0/32 posts done
2022-02-09 | INFO | update_posts_content_info:189 - PostID 1757212;  10/32 posts done
2022-02-09 | INFO | update_posts_content_info:189 - PostID 1755933;  20/32 posts done
2022-02-09 | INFO | update_posts_content_info:189 - PostID 1754969;  30/32 posts done
2022-02-09 | INFO | update_posts_content_info:190 - All post contents synced
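
The step above can be pictured as "find rows without content, fetch each one, write it back". The sketch below uses the standard `sqlite3` module with a hypothetical `posts(id, content)` table and a hypothetical `fetch_content` callable; the actual schema of Posts.db and the project's database layer may differ:

```python
import sqlite3
from typing import Callable


def sync_missing_content(
    conn: sqlite3.Connection, fetch_content: Callable[[int], str]
) -> int:
    """Fill in content for every post row that does not have any yet."""
    rows = conn.execute(
        "SELECT id FROM posts WHERE content IS NULL ORDER BY id DESC"
    ).fetchall()
    post_ids = [row[0] for row in rows]
    for n_done, post_id in enumerate(post_ids):
        conn.execute(
            "UPDATE posts SET content = ? WHERE id = ?",
            (fetch_content(post_id), post_id),
        )
        if n_done % 10 == 0:  # report progress every 10 posts
            print(f"PostID {post_id}; {n_done:>3}/{len(post_ids)} posts done")
    conn.commit()
    return len(post_ids)


# Hypothetical in-memory database standing in for Posts.db.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO posts (id, content) VALUES (?, ?)",
    [(1, "old post"), (2, None), (3, None)],
)
n_synced = sync_missing_content(conn, lambda post_id: f"content of post {post_id}")
remaining = conn.execute(
    "SELECT COUNT(*) FROM posts WHERE content IS NULL"
).fetchone()[0]
```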

3. Parsing results for the UI

>>> from leetcomp.ner_heuristic import parse_posts_and_save_tagged_info
>>> parse_posts_and_save_tagged_info()

2022-02-09 | INFO | parse_posts_and_save_tagged_info:191 - Total posts: 6663
2022-02-09 | INFO | parse_posts_and_save_tagged_info:192 - N posts dropped (missing data): 1380
2022-02-09 | INFO | _report:125 - Posts with all the info: 5294
2022-02-09 | INFO | _report:126 - Posts with Location: 4981
2022-02-09 | INFO | _report:127 - Posts with YOE: 5204
2022-02-09 | INFO | _report:128 - Posts from India: 3764
2022-02-09 | INFO | _filter_invalid_salaries:154 - Dropped 221/3764 records due to invalid pay
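
The log above reports posts dropped for missing data. One way such heuristic tagging could work is sketched below, assuming posts follow a "Label: value" layout; the patterns and the `tag_post` helper are illustrative assumptions, not the rules used in `leetcomp.ner_heuristic`:

```python
import re
from typing import Optional

# Hypothetical label patterns; each captures the value after "Label:".
LABEL_PATTERNS = {
    "company": re.compile(r"company\s*[:\-]\s*(.+)", re.IGNORECASE),
    "location": re.compile(r"location\s*[:\-]\s*(.+)", re.IGNORECASE),
    "yoe": re.compile(
        r"years of experience\s*[:\-]\s*(\d+(?:\.\d+)?)", re.IGNORECASE
    ),
}


def tag_post(text: str) -> Optional[dict[str, str]]:
    """Return the tagged fields, or None if any field is missing."""
    tagged: dict[str, str] = {}
    for label, pattern in LABEL_PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            return None  # drop posts with missing data
        tagged[label] = match.group(1).strip()
    return tagged


posts = [
    "Years of Experience: 3\nCompany: Acme\nLocation: Bangalore",
    "Company: Initech\nLocation: Pune",  # no years of experience
]
tagged = [t for t in map(tag_post, posts) if t is not None]
n_dropped = len(posts) - len(tagged)
print(f"Total posts: {len(posts)}")
print(f"N posts dropped (missing data): {n_dropped}")
```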

4. Updating the inverted index

>>> from leetcomp.inverted_index import build_inverted_index
>>> build_inverted_index()

2022-02-09 | INFO | __main__:build_inverted_index:58 - Keeping 1266/1266 tokens
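
For readers unfamiliar with the term, an inverted index maps each token to the records that contain it, so a search becomes a set intersection. A minimal sketch follows; the tokenizer, the `min_doc_freq` cutoff, and the function names are assumptions for illustration, not the code in `leetcomp.inverted_index`:

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict[int, str], min_doc_freq: int = 1) -> dict[str, list[int]]:
    """Map each token to the sorted ids of the documents containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    kept = {
        token: sorted(ids)
        for token, ids in index.items()
        if len(ids) >= min_doc_freq
    }
    print(f"Keeping {len(kept)}/{len(index)} tokens")
    return kept


def search(index: dict[str, list[int]], query: str) -> list[int]:
    """Return ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    postings = [set(index.get(token, [])) for token in tokens]
    return sorted(set.intersection(*postings))


# Hypothetical records standing in for the parsed posts.
docs = {0: "Amazon SDE 2", 1: "Google SDE", 2: "Amazon Applied Scientist"}
index = build_index(docs)
hits = search(index, "amazon sde")
```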

Roadmap

  • Automate data refresh using AWS Lambda
  • Standardize Company and Role
  • Index Company and Role separately
  • Improve page navigation
  • Global data support

Releases: v1.1.0
Owner: Kumar Utsav (an engineer building scalable ML systems)