Python & Julia ports of code from excellent R books

Overview

X4DS

This repo is a collection of Python & Julia ports of code from the following excellent R books:

                           Python Stack          Julia Stack
Language Version           v3.9                  v1.7
Data Processing            Pandas                DataFrames
Visualization              Matplotlib, Seaborn   Makie, AlgebraOfGraphics
Machine Learning           Scikit-Learn          MLJ
Probabilistic Programming  PyMC                  Turing

Code Styles

    2.1. Basics

    • prefer enumerate() over range(len())
    xs = range(3)
    
    # good
    for ind, x in enumerate(xs):
      print(f'{ind}: {x}')
    
    # bad
    for i in range(len(xs)):
      print(f'{i}: {xs[i]}')
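Two related idioms follow the same idea. The sketch below uses made-up lists `names` and `scores` (assumptions, not from the original): `enumerate()` accepts a `start` argument for 1-based numbering, and `zip()` pairs parallel sequences without any index variable.

```python
names = ['ada', 'bob', 'cyd']   # hypothetical sample data
scores = [90, 85, 77]           # hypothetical sample data

# enumerate() with start=1 gives 1-based positions without manual arithmetic
ranked = [f'{rank}: {name}' for rank, name in enumerate(names, start=1)]

# zip() walks two sequences in parallel, so no index variable is needed
paired = {name: score for name, score in zip(names, scores)}

print(ranked)
print(paired)
```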

    2.2. Matplotlib

    including seaborn

    • prefer Axes object over Figure object
    • use constrained_layout=True when drawing subplots
    # good
    _, axes = plt.subplots(1, 2, constrained_layout=True)
    axes[0].plot(x1, y1)
    axes[1].hist(x2)
    
    # bad
    plt.subplot(121)
    plt.plot(x1, y1)
    plt.subplot(122)
    plt.hist(x2)
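For reference, a self-contained version of the Axes-based pattern might look like the sketch below. The data (`x1`, `y1`, `x2`) is invented for illustration, and the `Agg` backend is an assumption chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample data standing in for the placeholders used above
rng = np.random.default_rng(0)
x1 = np.linspace(0, 2 * np.pi, 50)
y1 = np.sin(x1)
x2 = rng.normal(size=200)

# One call creates the Figure and both Axes; constrained_layout handles spacing
fig, axes = plt.subplots(1, 2, constrained_layout=True)
axes[0].plot(x1, y1)       # line plot on the left Axes
axes[1].hist(x2, bins=20)  # histogram on the right Axes
```

Because every call names the Axes it draws on, the order of the plotting statements no longer matters, which is the main practical gain over `plt.subplot()`.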
    • prefer axes.flatten() over plt.subplot() when the subplots' data is iterable
    • prefer zip() or enumerate() over range() for iterable objects
    # good
    _, axes = plt.subplots(2, 2, figsize=[12, 8], constrained_layout=True)
    
    for ax, x, y in zip(axes.flatten(), xs, ys):
      ax.plot(x, y)
    
    # bad
    for i in range(4):
      ax = plt.subplot(2, 2, i+1)
      ax.plot(xs[i], ys[i])
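A runnable sketch of the grid pattern, assuming four invented curves stored in lists named `xs` and `ys` as in the snippet above (the data and the `Agg` backend are assumptions for illustration):

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: four curves, one per subplot
x = np.linspace(0, 1, 30)
xs = [x, x, x, x]
ys = [x, x**2, x**3, np.sqrt(x)]

fig, axes = plt.subplots(2, 2, figsize=[12, 8], constrained_layout=True)

# flatten() turns the 2x2 grid into a 1-D sequence that zip() can pair with the data
for ax, x_i, y_i in zip(axes.flatten(), xs, ys):
    ax.plot(x_i, y_i)
```

Note that `zip()` stops at the shortest input, so if there are fewer datasets than Axes, the remaining Axes simply stay empty.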
    • prefer the set() method over individual set_*() methods
    # good
    ax.set(xlabel='x', ylabel='y')
    
    # bad
    ax.set_xlabel('x')
    ax.set_ylabel('y')
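The same call scales to more properties, since each keyword passed to `ax.set()` is forwarded to the matching `set_<name>()` method. A small sketch with invented data and labels:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])  # hypothetical data

# Each keyword maps to the matching setter, e.g. title -> set_title()
ax.set(xlabel='x', ylabel='y', title='y = x^2', xlim=(0, 2), ylim=(0, 4))

print(ax.get_xlabel(), ax.get_ylabel(), ax.get_title())
```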
    • prefer despine() over ax.spines[*].set_visible()
    # good
    sns.despine(left=True, bottom=True)  # top and right are removed by default
    
    # bad
    ax.spines["top"].set_visible(False)
    ax.spines["bottom"].set_visible(False)
    ax.spines["right"].set_visible(False)
    ax.spines["left"].set_visible(False)

    2.3. Pandas

    • prefer df['col'] over df.col
    # good
    movies['duration']
    
    # bad
    movies.duration
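One reason bracket access is safer: attribute access breaks for column names that are not valid identifiers or that collide with DataFrame methods. The sketch below uses an invented `movies` frame with hypothetical columns `star rating` and `count` to show both cases.

```python
import pandas as pd

# Hypothetical frame: one column name has a space, another shadows a DataFrame method
movies = pd.DataFrame({
    'duration': [142, 201],
    'star rating': [9.3, 9.2],
    'count': [1, 2],
})

ratings = movies['star rating']  # works; attribute access cannot express this name
counts = movies['count']         # the column, as a Series
method = movies.count            # the DataFrame.count method, not the column

print(type(counts).__name__, callable(method))
```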
    • prefer df.query over df[] or df.loc[] for simple selections
    # good
    movies.query('duration >= 200')
    
    # bad
    movies[movies['duration'] >= 200]
    movies.loc[movies['duration'] >= 200, :]
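The readability gain grows with compound conditions. A minimal sketch, assuming an invented `movies` frame and a hypothetical local variable `min_duration`, which `query()` can reference through the `@` prefix:

```python
import pandas as pd

# Hypothetical sample rows
movies = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'duration': [142, 201, 238],
    'genre': ['Crime', 'Drama', 'Drama'],
})

min_duration = 200  # local variable, referenced inside query() with the @ prefix

by_query = movies.query('duration >= @min_duration and genre == "Drama"')
by_mask = movies[(movies['duration'] >= min_duration) & (movies['genre'] == 'Drama')]

# Both forms select the same rows
print(by_query.equals(by_mask))
```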
    • prefer df.loc and df.iloc over df[] for selections involving both rows and columns
    # good
    movies.loc[movies['duration'] >= 200, 'genre']
    movies.iloc[0:2, :]
    
    # bad
    movies[movies['duration'] >= 200].genre
    movies[0:2]
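Being explicit about `.loc` versus `.iloc` also makes slice semantics unambiguous: label slices include the end label, position slices exclude the end position. The sketch below uses an invented `movies` frame with hypothetical string labels to make the difference visible.

```python
import pandas as pd

movies = pd.DataFrame(
    {'duration': [142, 201, 238], 'genre': ['Crime', 'Drama', 'Action']},
    index=['a', 'b', 'c'],  # hypothetical string labels, to tell labels from positions
)

# .loc is label-based: a boolean mask for rows plus a column label
long_genres = movies.loc[movies['duration'] >= 200, 'genre']

# .loc slices by label and includes the end label
by_label = movies.loc['a':'b', :]

# .iloc slices by position and excludes the end position
by_position = movies.iloc[0:2, :]

print(long_genres.tolist())
print(by_label.equals(by_position))
```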

    LaTeX Styles

    Multiple lines

    Reduce the use of \begin{array}...\end{array}

    • equations: \begin{aligned}...\end{aligned}
    $$
    \begin{aligned}
    y_1 &= x^2 + 2x \\
    y_2 &= x^3 + x
    \end{aligned}
    $$
    • equations with conditions: \begin{cases}...\end{cases}
    $$
    \begin{cases}
    y = x^2 + 2x & x > 0 \\
    y = x^3 + x & x \le 0
    \end{cases}
    $$
    • matrix: \begin{matrix}...\end{matrix}
    $$
    \begin{vmatrix}
      a + a' & b + b' \\ c & d
    \end{vmatrix} = \begin{vmatrix}
      a & b \\ c & d
    \end{vmatrix} + \begin{vmatrix}
      a' & b' \\ c & d
    \end{vmatrix}
    $$

    Brackets

    • prefer \Bigg...\Bigg over \left...\right
    $$
    A\Bigg[v_1\ v_2\ \cdots\ v_r\Bigg]
    $$
    • prefer \underset{}{} over \underset{}
    $$
    \underset{\theta}{\mathrm{argmax}}\ p(x_i|\theta)
    $$

    Expressions

    • prefer ^{\top} over ^T for transpose

    $$
    \mathbf{A}^{\top}
    $$
    • prefer \to over \rightarrow for limit

    $$
    \lim_{n\to \infty}
    $$
    • prefer \underset{}{} over \limits_

    $$
    \underset{w}{\mathrm{argmin}}\ (wx + b)
    $$

    Fonts

    • prefer \mathrm over \mathop or \operatorname
    $$
    \theta_{\mathrm{MLE}}=\underset{\theta}{\mathrm{argmax}}\ \sum_{i = 1}^{N}\log p(x_i|\theta)
    $$

    ISLR

    References
