A crude Hy handle on Pandas library

Related tags

Data AnalysisHyenas
Overview

Quickstart

Hyenas is a curde Hy handle written on top of Pandas API to allow for more elegant access to data-scientist's powerhouse that is Pandas.

Installation

To quickstart with Hyenas you will need to:

  1. Install Hy
  2. Download Hyenas by pulling this repository and executing the following commands:
cd ./hyenas
pip install -r requirements.txt
python -m build
pip install ./dist/hyenas-0.0.1-py3-none-any.whl

Usage

Hyenas - while very humble in its Pandas API coverage as of now - currently boasts an SQL like interface to access dataframe structures.

(import [hyenas[*]])
(import [pandas [read_csv]])

(
    select  :cols [(count_agg "long_name") (mean_agg "weight_kg")] 
            :from_df (read_csv "/some/data/players_21.csv") 
            :group_by ["height_cm"]
)

Project Status

Hyenas is currently in its infancy. Collaboratos, Pull Requests and Issues are welcome.

Owner
Peter Výboch
Developer by profession and at heart. Close to data and/or close to metal is where I thrive.
Peter Výboch
In this tutorial, raster models of soil depth and soil water holding capacity for the United States will be sampled at random geographic coordinates within the state of Colorado.

Raster_Sampling_Demo (Resulting graph of this demo) Background Sampling values of a raster at specific geographic coordinates can be done with a numbe

2 Dec 13, 2022
Python reader for Linked Data in HDF5 files

Linked Data are becoming more popular for user-created metadata in HDF5 files.

The HDF Group 8 May 17, 2022
small package with utility functions for analyzing (fly) calcium imaging data

fly2p Tools for analyzing two-photon (2p) imaging data collected with Vidrio Scanimage software and micromanger. Loading scanimage data relies on scan

Hannah Haberkern 3 Dec 14, 2022
Import, connect and transform data into Excel

xlwings_query Import, connect and transform data into Excel. Description The concept is to apply data transformations to a main query object. When the

George Karakostas 1 Jan 19, 2022
4CAT: Capture and Analysis Toolkit

4CAT: Capture and Analysis Toolkit 4CAT is a research tool that can be used to analyse and process data from online social platforms. Its goal is to m

Digital Methods Initiative 147 Dec 20, 2022
Accurately separate the TLD from the registered domain and subdomains of a URL, using the Public Suffix List.

tldextract Python Module tldextract accurately separates the gTLD or ccTLD (generic or country code top-level domain) from the registered domain and s

John Kurkowski 1.6k Jan 03, 2023
BAyesian Model-Building Interface (Bambi) in Python.

Bambi BAyesian Model-Building Interface in Python Overview Bambi is a high-level Bayesian model-building interface written in Python. It's built on to

861 Dec 29, 2022
pipeline for migrating lichess data into postgresql

How Long Does It Take Ordinary People To "Get Good" At Chess? TL;DR: According to 5.5 years of data from 2.3 million players and 450 million games, mo

Joseph Wong 182 Nov 11, 2022
A Big Data ETL project in PySpark on the historical NYC Taxi Rides data

Processing NYC Taxi Data using PySpark ETL pipeline Description This is an project to extract, transform, and load large amount of data from NYC Taxi

Unnikrishnan 2 Dec 12, 2021
PyClustering is a Python, C++ data mining library.

pyclustering is a Python, C++ data mining library (clustering algorithm, oscillatory networks, neural networks). The library provides Python and C++ implementations (C++ pyclustering library) of each

Andrei Novikov 1k Jan 05, 2023
cLoops2: full stack analysis tool for chromatin interactions

cLoops2: full stack analysis tool for chromatin interactions Introduction cLoops2 is an extension of our previous work, cLoops. From loop-calling base

YaqiangCao 25 Dec 14, 2022
Mining the Stack Overflow Developer Survey

Mining the Stack Overflow Developer Survey A prototype data mining application to compare the accuracy of decision tree and random forest regression m

1 Nov 16, 2021
Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.

Analysis of a dataset of 10000 passwords to find common trends and mistakes people generally make while setting up a password.

Aryan Raj 7 Sep 04, 2022
Two phase pipeline + StreamlitTwo phase pipeline + Streamlit

Two phase pipeline + Streamlit This is an example project that demonstrates how to create a pipeline that consists of two phases of execution. In betw

Rick Lamers 1 Nov 17, 2021
Tkinter Izhikevich Neuron Model With Python

TKINTER IZHIKEVICH NEURON MODEL WITH PYTHON Hodgkin-Huxley Model It is a mathematical model for the generation and transmission of action potentials i

Rabia KOÇ 8 Jul 16, 2022
Vaex library for Big Data Analytics of an Airline dataset

Vaex-Big-Data-Analytics-for-Airline-data A Python notebook (ipynb) created in Jupyter Notebook, which utilizes the Vaex library for Big Data Analytics

Nikolas Petrou 1 Feb 13, 2022
Implementation in Python of the reliability measures such as Omega.

reliabiliPy Summary Simple implementation in Python of the [reliability](https://en.wikipedia.org/wiki/Reliability_(statistics) measures for surveys:

Rafael Valero Fernández 2 Apr 27, 2022
PandaPy has the speed of NumPy and the usability of Pandas 10x to 50x faster (by @firmai)

PandaPy "I came across PandaPy last week and have already used it in my current project. It is a fascinating Python library with a lot of potential to

Derek Snow 527 Jan 02, 2023
Repository created with LinkedIn profile analysis project done

EN/en Repository created with LinkedIn profile analysis project done. The datase

Mayara Canaver 4 Aug 06, 2022
Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production

Numerics Numerical Analysis toolkit centred around PDEs, for demonstration and understanding purposes not production Use procedure: Initialise a new i

George Whittle 1 Nov 13, 2021