This is a project for analysis and estimation of House Prices in King County USA The .csv file contains the data of the house and the .ipynb file contians the analysis and code This project is done on Jupyter notebook The project uses Linear Regression and Pipeline() to fit and predict the prices.
This is an analysis and prediction project for house prices in King County, USA based on certain features of the house
Overview
Data Analytics on Genomes and Genetics
Data Analytics performed on On genomes and Genetics dataset to predict genetic disorder and disorder subclass. DONE by TEAM SIGMA!
In this project, ETL pipeline is build on data warehouse hosted on AWS Redshift.
ETL Pipeline for AWS Project Description In this project, ETL pipeline is build on data warehouse hosted on AWS Redshift. The data is loaded from S3 t
Udacity-api-reporting-pipeline - Udacity api reporting pipeline
udacity-api-reporting-pipeline In this exercise, you'll use portions of each of
This is a repo documenting the best practices in PySpark.
Spark-Syntax This is a public repo documenting all of the "best practices" of writing PySpark code from what I have learnt from working with PySpark f
Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
Data Scientist Learning Plan Demonstrate the breadth and depth of your data science skills by earning all of the Databricks Data Scientist credentials
Python script to automate the plotting and analysis of percentage depth dose and dose profile simulations in TOPAS.
topas-create-graphs A script to automatically plot the results of a topas simulation Works for percentage depth dose (pdd) and dose profiles (dp). Dep
Monitor the stability of a pandas or spark dataframe ⚙︎
Population Shift Monitoring popmon is a package that allows one to check the stability of a dataset. popmon works with both pandas and spark datasets.
Convert monolithic Jupyter notebooks into Ploomber pipelines.
Soorgeon Join our community | Newsletter | Contact us | Blog | Website | YouTube Convert monolithic Jupyter notebooks into Ploomber pipelines. soorgeo
My first Python project is a simple Mad Libs program.
Python CLI Mad Libs Game My first Python project is a simple Mad Libs program. Mad Libs is a phrasal template word game created by Leonard Stern and R
NumPy aware dynamic Python compiler using LLVM
Numba A Just-In-Time Compiler for Numerical Functions in Python Numba is an open source, NumPy-aware optimizing compiler for Python sponsored by Anaco
An extension to pandas dataframes describe function.
pandas_summary An extension to pandas dataframes describe function. The module contains DataFrameSummary object that extend describe() with: propertie
An easy-to-use feature store
A feature store is a data storage system for data science and machine-learning. It can store raw data and also transformed features, which can be fed straight into an ML model or training script.
Improving your data science workflows with
Make Better Defaults Author: Kjell Wooding [email protected] This is the git re
4CAT: Capture and Analysis Toolkit
4CAT: Capture and Analysis Toolkit 4CAT is a research tool that can be used to analyse and process data from online social platforms. Its goal is to m
ToeholdTools is a Python package and desktop app designed to facilitate analyzing and designing toehold switches, created as part of the 2021 iGEM competition.
ToeholdTools Category Status Repository Package Build Quality A library for the analysis of toehold switch riboregulators created by the iGEM team Cit
Senator Trades Monitor
Senator Trades Monitor This monitor will grab the most recent trades by senators and send them as a webhook to discord. Installation To use the monito
Kennedy Institute of Rheumatology University of Oxford Project November 2019
TradingBot6M Kennedy Institute of Rheumatology University of Oxford Project November 2019 Run Change api.txt to binance api key: https://www.binance.c
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
Datashader is a data rasterization pipeline for automating the process of creating meaningful representations of large amounts of data.
Dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
Projects that implement various aspects of Data Engineering.
DATAWAREHOUSE ON AWS The purpose of this project is to build a datawarehouse to accomodate data of active user activity for music streaming applicatio