Useful tool for inserting DataFrames into the Excel sheet.

Overview

PyCellFrame

Insert Pandas DataFrames into the Excel sheet with a bunch of conditions

Install

pip install pycellframe

Usage

Examples

Let's suppose that we have an Excel file named "numbers.xlsx" with the sheet named "Dictionary" in which we would like to insert the pandas.DataFrame.

Import pandas and create an example DataFrame (which will be inserted into the Excel sheet):

import pandas as pd


ex = {
    'Num': [1, 2, 3, 4],
    'AfterFirstBlankCol': 'AfterFirstBlank',
    'Descr': ['One', 'Two', 'Three', 'Four'],
    'AfterSecondBlankCol': 'AfterSecondBlank.',
    'Squared': [1, 4, 9, 16],
    'Binary:': ['1', '10', '11', '100']
}

df = pd.DataFrame(ex)
  • Import openpyxl.load_workbook and open numbers.xlsx - Our Excel workbook;
  • Get - Dictionary our desired sheet:
from openpyxl import load_workbook


workbook = load_workbook('numbers.xlsx')
worksheet = workbook['Dictionary']

Functions

1. incell_style(cell_src, cell_dst)
  • Let's say, we have a cell in Excel Dictionary sheet that we would like to copy the style from, and it is O3;
  • Let O4 be our destination cell:

NOTE: If we wanted to copy that style to more than one cell, we would simply use the loop depending on the locations of the destination cells.

from pycellframe import incell_style


incell_style(cell_src=worksheet['O3'], cell_dst=worksheet['O4'])
2. sheet_to_sheet(filename_sheetname_src, filename_sheetname_dst, calculated)
  • Let's say that we have two Excel files, and we need specific sheet from one file to be completely copied to another file's specific sheet;
  • filename_sheetname_src is the parameter for one file -> sheet the data to be copied from (tuple(['FILENAME_SRC', 'SHEETNAME_SRC']));
  • worksheet_dst is the parameter for the destination Worksheet the data to be copied to (openpyxl.worksheet.worksheet.Worksheet);
  • Let's assume that we have file_src.xlsx as src file and for worksheet_src we can use its CopyThisSheet sheet.
  • We can use output.xlsx -> CopyToThisSheet sheet as the destination worksheet, for which we already declared the Workbook object above.

NOTE: We are assuming that we need all the formulas (where available) from the source sheet, not calculated data, so we set calculated parameter to False

from pycellframe import sheet_to_sheet


worksheet_to = workbook['CopyToThisSheet']

sheet_to_sheet(filename_sheetname_src=('file_src.xlsx', 'CopyThisSheet'),
               worksheet_dst=worksheet_to,
               calculated=False)
3. incell_frame(worksheet, dataframe, col_range, row_range, num_str_cols, skip_cols, headers)
  • From our package pycellframe import function incell_frame;
  • Insert ex - DataFrame into our sheet twice - with and without conditions:
from pycellframe import incell_frame


# 1 - Simple insertion
incell_frame(worksheet=worksheet, dataframe=df)

# 2 - Insertion with some conditions
incell_frame(worksheet=worksheet,
             dataframe=df,
             col_range=(3, 0),
             row_range=(6, 8),
             num_str_cols=['I'],
             skip_cols=['D', 'F'],
             headers=True)

In the first insertion, we did not give our function any arguments, which means the DataFrame ex will be inserted into the Dictionary sheet in the area A1:F4 (without the headers).

However, with the second insertion we define some conditions:

  • col_range=(3, 0) - This means that insertion will be started at the Excel column with the index 3 (column C) and will not be stopped until the very end, since we gave 0 as the second element of the tuple

  • row_range=(6, 8) - Only in between these rows (in Excel) will the DataFrame data be inserted, which means that only the first row (since the headers is set to True) from ex will be inserted into the sheet

  • num_str_cols=['F'] - Another condition here is to not convert Binary column values to int. If we count, this column will be inserted in the Excel column F, so we tell the function to leave the values in it as string

  • skip_cols=['D', 'F'] - D and F columns in Excel will be skipped and since our worksheet was blank in the beginning, these columns will be blank (that is why I named the columns in the DataFrame related names)

  • headers=True - This time, the DataFrame columns will be inserted, too, so the overall insertion area would be C6:J8

For really detailed description of the parameters, please see:
  1. incell_frame.__docs__
  2. sheet_to_sheet.__docs__
  3. incell_style.__docs__
  • Finally, let's save our changes to a new Excel file:
workbook.save('output.xlsx')

Full Code

import pandas as pd
from openpyxl import load_workbook
from pycellframe import incell_style, \
                        incell_frame, \
                        sheet_to_sheet


ex = {
    'Num': [1, 2, 3, 4],
    'AfterFirstBlankCol': 'AfterFirstBlank',
    'Descr': ['One', 'Two', 'Three', 'Four'],
    'AfterSecondBlankCol': 'AfterSecondBlank.',
    'Squared': [1, 4, 9, 16],
    'Binary:': ['1', '10', '11', '100']
}

df = pd.DataFrame(ex)

workbook = load_workbook('numbers.xlsx')
worksheet = workbook['Dictionary']


# Copy the cell style
incell_style(cell_src=worksheet['O3'], cell_dst=worksheet['O4'])


# Copy the entire sheet
worksheet_to = workbook['CopyToThisSheet']

sheet_to_sheet(filename_sheetname_src=('file_src.xlsx', 'CopyThisSheet'),
               worksheet_dst=worksheet_to,
               calculated=False)


# Insert DataFrame into the sheet

## 1 - Simple insertion
incell_frame(worksheet=worksheet, dataframe=df)

## 2 - Insertion with some conditions
incell_frame(worksheet=worksheet,
             dataframe=df,
             col_range=(3, 0),
             row_range=(6, 8),
             num_str_cols=['I'],
             skip_cols=['D', 'F'],
             headers=True)

workbook.save('output.xlsx')
Owner
Luka Sosiashvili
Luka Sosiashvili
Stream-Kafka-ELK-Stack - Weather data streaming using Apache Kafka and Elastic Stack.

Streaming Data Pipeline - Kafka + ELK Stack Streaming weather data using Apache Kafka and Elastic Stack. Data source: https://openweathermap.org/api O

Felipe Demenech Vasconcelos 2 Jan 20, 2022
Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python

Driver Analysis with Factors and Forests: An Automated Data Science Tool using Python 📊

Thomas 2 May 26, 2022
Gaussian processes in TensorFlow

Website | Documentation (release) | Documentation (develop) | Glossary Table of Contents What does GPflow do? Installation Getting Started with GPflow

GPflow 1.7k Jan 06, 2023
Time ranges with python

timeranges Time ranges. Read the Docs Installation pip timeranges is available on pip: pip install timeranges GitHub You can also install the latest v

Micael Jarniac 2 Sep 01, 2022
PyNHD is a part of HyRiver software stack that is designed to aid in watershed analysis through web services.

A part of HyRiver software stack that provides access to NHD+ V2 data through NLDI and WaterData web services

Taher Chegini 23 Dec 14, 2022
Tablexplore is an application for data analysis and plotting built in Python using the PySide2/Qt toolkit.

Tablexplore is an application for data analysis and plotting built in Python using the PySide2/Qt toolkit.

Damien Farrell 81 Dec 26, 2022
An interactive grid for sorting, filtering, and editing DataFrames in Jupyter notebooks

qgrid Qgrid is a Jupyter notebook widget which uses SlickGrid to render pandas DataFrames within a Jupyter notebook. This allows you to explore your D

Quantopian, Inc. 2.9k Jan 08, 2023
Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code

Tuplex is a parallel big data processing framework that runs data science pipelines written in Python at the speed of compiled code. Tuplex has similar Python APIs to Apache Spark or Dask, but rather

Tuplex 791 Jan 04, 2023
A lightweight, hub-and-spoke dashboard for multi-account Data Science projects

A lightweight, hub-and-spoke dashboard for cross-account Data Science Projects Introduction Modern Data Science environments often involve many indepe

AWS Samples 3 Oct 30, 2021
Collections of pydantic models

pydantic-collections The pydantic-collections package provides BaseCollectionModel class that allows you to manipulate collections of pydantic models

Roman Snegirev 20 Dec 26, 2022
Full automated data pipeline using docker images

Create postgres tables from CSV files This first section is only relate to creating tables from CSV files using postgres container alone. Just one of

1 Nov 21, 2021
Option Pricing Calculator using the Binomial Pricing Method (No Libraries Required)

Binomial Option Pricing Calculator Option Pricing Calculator using the Binomial Pricing Method (No Libraries Required) Background A derivative is a fi

sammuhrai 1 Nov 29, 2021
statDistros is a Python library for dealing with various statistical distributions

StatisticalDistributions statDistros statDistros is a Python library for dealing with various statistical distributions. Now it provides various stati

1 Oct 03, 2021
Sensitivity Analysis Library in Python (Numpy). Contains Sobol, Morris, Fractional Factorial and FAST methods.

Sensitivity Analysis Library (SALib) Python implementations of commonly used sensitivity analysis methods. Useful in systems modeling to calculate the

SALib 663 Jan 05, 2023
Statistical Rethinking course winter 2022

Statistical Rethinking (2022 Edition) Instructor: Richard McElreath Lectures: Uploaded Playlist and pre-recorded, two per week Discussion: Online, F

Richard McElreath 3.9k Dec 31, 2022
Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data.

Hatchet Hatchet is a Python-based library that allows Pandas dataframes to be indexed by structured tree and graph data. It is intended for analyzing

Lawrence Livermore National Laboratory 14 Aug 19, 2022
peptides.py is a pure-Python package to compute common descriptors for protein sequences

peptides.py Physicochemical properties and indices for amino-acid sequences. 🗺️ Overview peptides.py is a pure-Python package to compute common descr

Martin Larralde 32 Dec 31, 2022
Data pipelines built with polars

valves Warning: the project is very much work in progress. Valves is a collection of functions for your data .pipe()-lines. This project aimes to host

14 Jan 03, 2023
Statistical Rethinking: A Bayesian Course Using CmdStanPy and Plotnine

Statistical Rethinking: A Bayesian Course Using CmdStanPy and Plotnine Intro This repo contains the python/stan version of the Statistical Rethinking

Andrés Suárez 3 Nov 08, 2022
This is a python script to navigate and extract the FSD50K dataset

FSD50K navigator This is a script I use to navigate the sound dataset from FSK50K.

sweemeng 2 Nov 23, 2021