Python & Julia port of codes in excellent R books

Overview

X4DS

This repo is a collection of

Python & Julia port of codes in the following excellent R books:

Python Stack Julia Stack
Language
Version
v3.9 v1.7
Data
Processing
  • Pandas
  • DataFrames
  • Visualization
  • Matplotlib
  • Seaborn
  • MakiE
  • AlgebraOfGraphics
  • Machine
    Learning
  • Scikit-Learn
  • MLJ
  • Probablistic
    Programming
  • PyMC
  • Turing
  • Code Styles

    2.1. Basics

    • prefer enumerate() over range(len())
    xs = range(3)
    
    # good
    for ind, x in enumerate(xs):
      print(f'{ind}: {x}')
    
    # bad
    for i in range(len(xs)):
      print(f'{i}: {xs[i]}')

    2.2. Matplotlib

    including seaborn

    • prefer Axes object over Figure object
    • use constrained_layout=True when draw subplots
    # good
    _, axes = plt.subplots(1, 2, constrained_layout=True)
    axes[0].plot(x1, y1)
    axes[1].hist(x2, y2)
    
    # bad
    plt.subplot(121)
    plt.plot(x1, y1)
    plt.subplot(122)
    plt.hist(x2, y2)
    • prefer axes.flatten() over plt.subplot() in cases where subplots' data is iterable
    • prefer zip() or enumerate() over range() for iterable objects
    # good
    _, ax = plt.subplots(2, 2, figsize=[12,8],constrained_layout=True)
    
    for ax, x, y in zip(axes.flatten(), xs, ys):
      ax.plot(x, y)
    
    # bad
    for i in range(4):
      ax = plt.subplot(2, 2, i+1)
      ax.plot(x[i], y[i])
    • prefer set() method over set_*() method
    # good
    ax.set(xlabel='x', ylabel='y')
    
    # bad
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    • Prefer despine() over ax.spines[*].set_visible()
    # good
    sns.despine()
    
    # bad
    ax.spines["top"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)

    2.3. Pandas

    • prefer df['col'] over df.col
    # good
    movies['duration']
    
    # bad
    movies.duration
    • prefer df.query over df[] or df.loc[] in simple-selection
    # good
    movies.query('duration >= 200')
    
    # bad
    movies[movies['duration'] >= 200]
    movies.loc[movies['duration'] >= 200, :]
    • prefer df.loc and df.iloc over df[] in multiple-selection
    # good
    movies.loc[movies['duration'] >= 200, 'genre']
    movies.iloc[0:2, :]
    
    # bad
    movies[movies['duration'] >= 200].genre
    movies[0:2]

    LaTeX Styles

    Multiple lines

    Reduce the use of begin{array}...end{array}

    • equations: begin{aligned}...end{aligned}
    $$
    \begin{aligned}
    y_1 = x^2 + 2*x \\
    y_2 = x^3 + x
    \end{aligned}
    $$
    • equations with conditions: begin{cases}...end{cases}
    $$
    \begin{cases}
    y = x^2 + 2*x & x > 0 \\
    y = x^3 + x & x ≤ 0
    \end{cases}
    $$
    • matrix: begin{matrix}...end{matrix}
    $$
    \begin{vmatrix}
      a + a^′ & b + b^′ \\ c & d
      \end{vmatrix}= \begin{vmatrix}
      a & b \\ c & d
      \end{vmatrix} + \begin{vmatrix}
      a^′ & b^′ \\ c & d
    \end{vmatrix}
    $$

    Brackets

    • prefer \Bigg...\Bigg over \left...\right
    $$
    A\Bigg[v_1\ v_2\ \ v_r\Bigg]
    $$
    • prefer \underset{}{} over \underset{}
    $$
    \underset{θ}{\mathrm{argmax}}\ p(x_i|θ)
    $$

    Expressions

    • prefer ^{\top} over ^T for transpose

    $$ 𝐀^⊤ $$

    $$
    𝐀^{\top}
    $$
    • prefer \to over \rightarrow for limit

    $$ \lim_{n → ∞} $$

    $$
    \lim_{n\to \infty}
    $$
    • prefer underset{}{} over \limits_

    $$ \underset{w}{\rm argmin}\ (wx +b) $$

    $$
    \underset{w}{\rm argmin}\ (wx +b)
    $$

    Fonts

    • prefer \mathrm over \mathop or \operatorname
    $$
    θ_{\mathrm{MLE}}=\underset{θ}{\mathrm{argmax}}\ ∑_{i = 1}^{N}\log p(x_i|θ)
    $$

    ISLR

    References

    style <style> table { border-collapse: collapse; text-align: center; } </style>
    Owner
    Gitony
    Gitony
    Pyan3 - Offline call graph generator for Python 3

    Pyan takes one or more Python source files, performs a (rather superficial) static analysis, and constructs a directed graph of the objects in the combined source, and how they define or use each oth

    Juha Jeronen 235 Jan 02, 2023
    JupyterHub extension for ContainDS Dashboards

    ContainDS Dashboards for JupyterHub A Dashboard publishing solution for Data Science teams to share results with decision makers. Run a private on-pre

    Ideonate 179 Nov 29, 2022
    Generate a roam research like Network Graph view from your Notion pages.

    Notion Graph View Export Notion pages to a Roam Research like graph view.

    Steve Sun 214 Jan 07, 2023
    A toolkit to generate MR sequence diagrams

    mrsd: a toolkit to generate MR sequence diagrams mrsd is a Python toolkit to generate MR sequence diagrams, as shown below for the basic FLASH sequenc

    Julien Lamy 3 Dec 25, 2021
    A program that analyzes data from inertia measurement units installed in aircraft and generates g-exceedance curves.

    A program that analyzes data from inertia measurement units installed in aircraft and generates g-exceedance curves.

    Pooya 1 Dec 02, 2021
    A command line tool for visualizing CSV/spreadsheet-like data

    PerfPlotter Read data from CSV files using pandas and generate interactive plots using bokeh, which can then be embedded into HTML pages and served by

    Gino Mempin 0 Jun 25, 2022
    It's an application to calculate I from v and r. It can also plot a graph between V vs I.

    Ohm-s-Law-Visualizer It's an application to calculate I from v and r using Ohm's Law. It can also plot a graph between V vs I. Story I'm doing my Unde

    Sihab Sahariar 1 Nov 20, 2021
    Missing data visualization module for Python.

    missingno Messy datasets? Missing values? missingno provides a small toolset of flexible and easy-to-use missing data visualizations and utilities tha

    Aleksey Bilogur 3.4k Dec 29, 2022
    The repository is my code for various types of data visualization cases based on the Matplotlib library.

    ScienceGallery The repository is my code for various types of data visualization cases based on the Matplotlib library. It summarizes the code and cas

    Warrick Xu 2 Apr 20, 2022
    Use Perspective to create the chart for the trader’s dashboard

    Task Overview | Installation Instructions | Link to Module 3 Introduction Experience Technology at JP Morgan Chase Try out what real work is like in t

    Abdulazeez Jimoh 1 Jan 22, 2022
    📊 Charts with pure python

    A zero-dependency python package that prints basic charts to a Jupyter output Charts supported: Bar graphs Scatter plots Histograms 🍑 📊 👏 Examples

    Max Humber 54 Oct 04, 2022
    Easily configurable, chart dashboards from any arbitrary API endpoint. JSON config only

    Flask JSONDash Easily configurable, chart dashboards from any arbitrary API endpoint. JSON config only. Ready to go. This project is a flask blueprint

    Chris Tabor 3.3k Dec 31, 2022
    A python script to visualise explain plans as a graph using graphviz

    README Needs to be improved Prerequisites Need to have graphiz installed on the machine. Refer to https://graphviz.readthedocs.io/en/stable/manual.htm

    Edward Mallia 1 Sep 28, 2021
    Python Data. Leaflet.js Maps.

    folium Python Data, Leaflet.js Maps folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js

    6k Jan 02, 2023
    Bcc2telegraf: An integration that sends ebpf-based bcc histogram metrics to telegraf daemon

    bcc2telegraf bcc2telegraf is an integration that sends ebpf-based bcc histogram

    Peter Bobrov 2 Feb 17, 2022
    Cryptocurrency Centralized Exchange Visualization

    This is a simple one that uses Grafina to visualize cryptocurrency from the Bitkub exchange. This service will make a request to the Bitkub API from your wallet and save the response to Postgresql. G

    Popboon Mahachanawong 1 Nov 24, 2021
    Python package to Create, Read, Write, Edit, and Visualize GSFLOW models

    pygsflow pyGSFLOW is a python package to Create, Read, Write, Edit, and Visualize GSFLOW models API Documentation pyGSFLOW API documentation can be fo

    pyGSFLOW 21 Dec 14, 2022
    Simple Inkscape Scripting

    Simple Inkscape Scripting Description In the Inkscape vector-drawing program, how would you go about drawing 100 diamonds, each with a random color an

    Scott Pakin 140 Dec 27, 2022
    Small project demonstrating the use of Grafana and InfluxDB for monitoring the speed of an internet connection

    Speedtest monitor for Grafana A small project that allows internet speed monitoring using Grafana, InfluxDB 2 and Speedtest. Demo Requirements Docker

    Joshua Ghali 3 Aug 06, 2021
    Create HTML profiling reports from pandas DataFrame objects

    Pandas Profiling Documentation | Slack | Stack Overflow Generates profile reports from a pandas DataFrame. The pandas df.describe() function is great

    10k Jan 01, 2023