Python-based Space Physics Environment Data Analysis Software

Overview

pySPEDAS

pySPEDAS is an implementation of the SPEDAS framework for Python.

The Space Physics Environment Data Analysis Software (SPEDAS) framework is written in IDL and contains data loading, data analysis and data plotting tools for various scientific missions (NASA, NOAA, etc.) and ground magnetometers.

Please see our documentation at:

https://pyspedas.readthedocs.io/

Projects Supported

pySPEDAS includes load routines for many missions and data sources; those referenced in this document include THEMIS, MMS, Parker Solar Probe, STEREO, OMNI, POES, the Van Allen Probes (RBSP), Arase (ERG), and Solar Orbiter. See the documentation for the full list of supported projects.

Requirements

Python 3.7+ is required.

We recommend Anaconda, which comes with a suite of packages useful for scientific data analysis. Step-by-step instructions for installing Anaconda can be found at: Windows, macOS, Linux

Installation

Setup your Virtual Environment

To avoid potential dependency issues with other Python packages, we suggest creating a virtual environment for pySPEDAS; you can create one in your terminal with:

python -m venv pyspedas

To enter your virtual environment, run the 'activate' script:

Windows

.\pyspedas\Scripts\activate

macOS and Linux

source pyspedas/bin/activate

Using Jupyter notebooks with your virtual environment

To get virtual environments working with Jupyter, in the virtual environment, type:

pip install ipykernel
python -m ipykernel install --user --name pyspedas --display-name "Python (pySPEDAS)"

(note: "pyspedas" is the name of your virtual environment)

Then, once you open the notebook, go to "Kernel", then "Change kernel", and select the kernel named "Python (pySPEDAS)".

Install

pySPEDAS supports Windows, macOS and Linux. To get started, install the pyspedas package from PyPI:

pip install pyspedas

Upgrade

To upgrade to the latest version of pySPEDAS:

pip install pyspedas --upgrade

Local Data Directories

The recommended way of setting your local data directory is to set the SPEDAS_DATA_DIR environment variable. SPEDAS_DATA_DIR acts as a root data directory for all missions, and will also be used by IDL (if you’re running a recent copy of the bleeding edge).

Mission-specific data directories (e.g., MMS_DATA_DIR for MMS, THM_DATA_DIR for THEMIS) can also be set, and these will override SPEDAS_DATA_DIR.
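
For example, a minimal sketch of setting the root data directory in Python before importing pyspedas (the path below is illustrative, not a required location):

import os

# set the root data directory before pyspedas reads its configuration
os.environ['SPEDAS_DATA_DIR'] = '/path/to/spedas_data'

import pyspedas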

Usage

To get started, import pyspedas and pytplot:

import pyspedas
from pytplot import tplot

You can load data into tplot variables by calling pyspedas.mission.instrument(), e.g.,

To load and plot 1 day of THEMIS FGM data for probe 'd':

thm_fgm = pyspedas.themis.fgm(trange=['2015-10-16', '2015-10-17'], probe='d')

tplot(['thd_fgs_gse', 'thd_fgs_gsm'])

To load and plot 2 minutes of MMS burst mode FGM data:

mms_fgm = pyspedas.mms.fgm(trange=['2015-10-16/13:05:30', '2015-10-16/13:07:30'], data_rate='brst')

tplot(['mms1_fgm_b_gse_brst_l2', 'mms1_fgm_b_gsm_brst_l2'])

Note: by default, pySPEDAS loads all data contained in the CDFs found within the requested time range; this can potentially include data outside of your requested trange. To remove data outside the requested interval, set the time_clip keyword to True.

To load and plot 6 hours of PSP SWEAP/SPAN-i data:

spi_vars = pyspedas.psp.spi(trange=['2018-11-5', '2018-11-5/06:00'], time_clip=True)

tplot(['DENS', 'VEL', 'T_TENSOR', 'TEMP'])

To download 5 days of STEREO magnetometer data (but not load them into tplot variables):

stereo_files = pyspedas.stereo.mag(trange=['2013-11-1', '2013-11-6'], downloadonly=True)

Standard Options

  • trange: two-element list specifying the time range of interest. This keyword accepts a wide range of formats
  • time_clip: if set, clip the variables to the exact time range specified by the trange keyword
  • suffix: string specifying a suffix to append to the loaded variables
  • varformat: string specifying which CDF variables to load; accepts the wildcards * and ?
  • varnames: string specifying which CDF variables to load (exact names)
  • get_support_data: if set, load the support variables from the CDFs
  • downloadonly: if set, download the files but do not load them into tplot
  • no_update: if set, only load the data from the local cache
  • notplot: if set, load the variables into dictionaries containing numpy arrays (instead of creating the tplot variables)
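
As an illustrative sketch (the varformat pattern and suffix value below are assumptions, not required settings), several of these options can be combined in a single call:

fgm_vars = pyspedas.themis.fgm(
    trange=['2015-10-16', '2015-10-17'],  # accepts several time formats
    probe='d',
    varformat='*fgs*',       # only load CDF variables matching this pattern
    suffix='_raw',           # append '_raw' to the loaded variable names
    get_support_data=True,   # also load support variables from the CDFs
    time_clip=True)          # clip to the exact requested trange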

Getting Help

To find the options supported, call help on the instrument function you're interested in:

help(pyspedas.themis.fgm)

You can ask questions by creating an issue or by joining the SPEDAS mailing list.

Contributing

We welcome contributions to pySPEDAS; to learn how you can contribute, please see our Contributing Guide.

Code of Conduct

In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation. To learn more, please see our Code of Conduct.

Additional Information

For examples of pyspedas, see: https://github.com/spedas/pyspedas_examples

For MMS examples, see: https://github.com/spedas/mms-examples

For pytplot, see: https://github.com/MAVENSDC/PyTplot

For cdflib, see: https://github.com/MAVENSDC/cdflib

For SPEDAS, see http://spedas.org/

Comments
  • doesn't load in STATIC energy / phi / theta / etc data

    I was trying to use pyspedas to pull in STATIC data, but I noticed large discrepancies between the data pulled by pySPEDAS and the data pulled in IDL. Looking at the CDF file, it seems that pyspedas is ignoring the file's metadata and just pulling variables/support variables. Unfortunately, for STATIC, important parameters (like energy, phi, theta, etc.) are given in metadata rather than as variables/support variables.

    Please enable reading in of the metadata parameters so STATIC data can be used to the full extent in python as well as in IDL.

    Thank you!

    opened by NeeshaRS 25
  • RBSP Hope file loading UnicodeDecodeError

    Trying to call pyspedas.rbsp.hope(trange=[date1, date2], probe="a", datatype="moments", level="l3", notplot=True) throws a <class 'UnicodeDecodeError'> for SOME dates, but not all. For example, 2018-11-6 and 2018-11-9 fail but 2018-11-7 does not, with: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128).

    I was using notplot = True to get a dictionary out, if that matters.

    opened by kaltensi 6
  • For poes data

    Hello all,

    I am trying to use POES MEPED data with pyspedas. I ran pyspedas.poes.load(trange), but found that pyspedas does not load the POES flux data, for example 'mep_ele_flux'. When I read the individual CDF files, the 'mep_ele_flux' variable is there.

    Could you check all the POES data is properly loaded?

    Best regards, Inchun

    opened by chondrite1230 5
  • ylim does not work for a log axis?

    Although this should be asked on the pytplot page, let me ask here as I found (and confirmed) it when plotting THEMIS data. Recently my colleagues and I realized that pytplot.ylim() does not work properly for a tplot variable with ylog=1. Looking into the problem by testing with THEMIS/ESA data, we found:

    • spectrum-type data with a 2-D "v" array (e.g., (times, 32)) ----> ylim works
    • spectrum-type data with a 1-D "v" array (e.g., (32,)) ----> ylim does NOT work as expected

    Isn't this a bug in pytplot or pyspedas?

    Copied below are the commands that reproduce the latter case in my environment (macOS 10.14.6, Python 3.9.6, pytplot 1.7.28, pyspedas 1.2.8).

    import pyspedas
    import pytplot

    pyspedas.themis.spacecraft.particles.esa.esa(varformat='peir')
    pytplot.options('thc_peir_en_eflux', 'zlog', 1)
    pytplot.options('thc_peir_en_eflux', 'ylog', 1)
    element_thc_peir_en_eflux = pytplot.get_data('thc_peir_en_eflux')
    thc_peir_flux_test_meta_data = pytplot.get_data('thc_peir_en_eflux', metadata=True)
    pytplot.store_data('thc_peir_flux_test',
                       data={'x': element_thc_peir_en_eflux[0],
                             'y': element_thc_peir_en_eflux[1],
                             'v': element_thc_peir_en_eflux[2][0]},
                       attr_dict=thc_peir_flux_test_meta_data)
    pytplot.options('thc_peir_flux_test', 'zlog', 1)
    pytplot.zlim('thc_peir_flux_test', 0.00005 * 1.e+6, 500. * 1.e+6)
    pytplot.options('thc_peir_flux_test', 'ytitle', 'thc_peir_flux_test')
    pytplot.tplot('thc_peir_flux_test')  # plot

    pytplot.ylim('thc_peir_flux_test', 1000, 24000)
    pytplot.tplot('thc_peir_flux_test')  # plot with a different yrange setting

    opened by horit 5
  • How to specify data directory

    I am loading OMNI data and wish to set local_data_dir to D:/data/omni. In pyspedas/omni/config.py, it seems that local_data_dir can be set using os.environ['OMNI_DATA_DIR']. However, the following code still downloads OMNI data into the current directory instead of D:/data/omni, because config.py reads os.environ at import time, before the variable is set. The following code reproduces the problem. Thanks for your attention.

    import pyspedas
    import os

    os.environ['OMNI_DATA_DIR'] = "C:/data/omni/"
    omni_vars = pyspedas.omni.data(trange=['2013-11-5', '2013-11-6'])

    opened by xnchu 5
  • CDF time conversion to unix time using astropy instead of cdflib

    This improves performance of the CDF time to unix time conversion, originally performed using cdflib.cdfepoch.unixtime, which uses Python's intrinsic for loops and is far too slow. Simple testing finds that conversion via astropy is more than ten times faster. Although this introduces an additional module dependency, it will be useful from the users' perspective.

    Note: cdflib developers seem to be considering the integration of astropy's time module in cdflib. https://github.com/MAVENSDC/cdflib/issues/14

    opened by amanotk 5
  • failure in http digest authentication

    Could you modify lines 61 and 185 in utilities/download.py from session.auth = (username, password) to session.auth = requests.auth.HTTPDigestAuth(username, password)? This fixes a bug causing failures in HTTP digest authentication.
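
    That is, the change requested above, shown here as a sketch (session, username, and password come from the surrounding download.py code):

    import requests

    # before: a plain tuple sends HTTP basic auth, so digest-protected
    # servers reject the request
    session.auth = (username, password)

    # after: the fix requested in this report
    session.auth = requests.auth.HTTPDigestAuth(username, password)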

    opened by horit 4
  • Problem using cdflib in cdf_to_tplot

    The epoch is converted to unix time at line 195 in cdf_to_tplot.py. If cdflib is used, the conversion happens in the function unixtime at line 192 in epochs.py. The problem is line 222: unixtime.append(datetime.datetime(*date).timestamp()) assumes local time instead of UTC, so the resulting times are offset by your local timezone. The code to reproduce the error is attached below; the result should be 2010-01-01/00:00:00.

    import pyspedas
    import numpy as np
    import pandas as pd

    trange = ['2010-01-01/00:00:00', '2010-01-02/00:00:00']
    varname = 'BX_GSE'
    data_omni = pyspedas.omni.data(trange=trange, notplot=True,
                                   varformat=varname, time_clip=True)
    data = np.array(data_omni[varname]['y'])
    unix_time = np.array(data_omni[varname]['x'])
    date_time = pd.to_datetime(data_omni[varname]['x'], unit='s')
    print(date_time[0])
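
    For reference, a minimal illustration of the underlying pitfall (the date is arbitrary):

    import datetime

    naive = datetime.datetime(2010, 1, 1)

    # timestamp() interprets a naive datetime in the *local* timezone
    local_based = naive.timestamp()

    # attaching UTC tzinfo explicitly gives the intended epoch value
    utc_based = naive.replace(tzinfo=datetime.timezone.utc).timestamp()

    print(utc_based - local_based)  # the local UTC offset, in seconds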

    opened by xnchu 4
  • Could not import pyspedas in google colab

    I was trying to use pyspedas in Google Colab; the kernel crashed with the warning: qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.

    opened by donglai96 4
  • pyspedas/pyspedas/mms/feeps/mms_feeps_remove_bad_data.py line 50

    Are October 2017 and October 2018 the starting times for the bad-eyes tables? If so, you should not take the closest table, but the table corresponding to the time period.

    opened by PluckZK 3
  • Some updates for ERG

    Could you merge ERG-related scripts from the "erg" branch? The updates include:

    • Minor bug fixes for load routines for some particle data
    • Initial import of load routines for ground observation data
    • An experimental version of part_products routines for ERG satellite data

    And could you discard the "ergsc-devel" branch? At first I tried to merge the above updates into the old branch, but failed due to an unknown error with analysis/time_domain_filter.py. That is why I instead created a new branch "erg" from master and put all the updates there. Any future updates for the ERG module will be delivered through this new branch.

    opened by horit 3
  • MMS FPI DIS moms omni spin avg doesn't seem to be averaged

    The FPI DIS moments energy spectra do not appear to be spin-averaged when the omni product is loaded:

    dis_data = pyspedas.mms.fpi(trange=[date_start_str, date_end_str],
                                datatype=['dis-moms'], center_measurement=True,
                                data_rate='fast')
    t, y, z = get_data('mms1_dis_energyspectr_omni_fast')

    Differencing the time stamps gives a 4.5 second interval, which is the native (non-spin-averaged) sampling interval.
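
    A sketch of the kind of cadence check that shows this (using the t array loaded above):

    import numpy as np

    # spacing of consecutive time stamps; ~4.5 s here, the native
    # fast-survey cadence rather than a spin-averaged one
    print(np.median(np.diff(t)))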

    opened by kileyy10 1
  • conda install pyspedas [enhancement]

    It would be very useful to be able to install pyspedas using conda, since it would make using this framework much more convenient in conda environments.

    opened by johncoxon 5
  • In the case of unavailable data...

    Consider the following code:

    import pyspedas as spd
    import pytplot as ptt
    
    trange = ['2017-09-01 09:58:30', '2017-09-01 09:59:30']
    mms_fpi_varnames = ['mms3_dis_bulkv_gse_brst']  # variable list used below
    _ = spd.mms.fpi(trange=trange, probe=3, data_rate='brst', level='l2',
                    datatype='dis-moms', varnames=mms_fpi_varnames, time_clip=True,
                    latest_version=True)
    fpi_time_unix = ptt.get_data(mms_fpi_varnames[0])[0]
    fpi_v_gse = ptt.get_data(mms_fpi_varnames[0])[1:][0]
    
    fpi_time_utc = spd.time_datetime(fpi_time_unix)
    
    print(f"Start time: {fpi_time_utc[0].strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"End time: {fpi_time_utc[-1].strftime('%Y-%m-%d %H:%M:%S')}")
    

    Given that trange = ['2017-09-01 09:58:30', '2017-09-01 09:59:30'], one would expect the code to print those times. Or, in the case that 'brst' data is not available, the code should either throw an error or return NaNs.

    However, in this case the dates are: Start time: 2017-09-01 09:17:03, End time: 2017-09-01 09:18:02.

    From what I can surmise, when burst data is not available for the specified time range, the function looks for the data closest to that range and outputs those times and corresponding data. This was unexpected for me, especially because I specified the time_clip parameter.

    This made me wonder whether this is how the code is supposed to work, or whether this is a bug.

    opened by qudsiramiz 4
  • Please use logging instead of prints

    There was some conversation in the PyHC Element chat this morning about suppressing all the printouts pySPEDAS generates when you load data (Time clip was applied to..., Loading variables: ..., etc).

    The OP ended up using a hack like this to redirect stdout: https://gist.github.com/vikjam/755930297430091d8d8df70ac89ea9e2

    But it was brought up that if pySPEDAS used logging (https://docs.python.org/3/library/logging.html) instead of standard print(), it would allow messages to be printed by default but users would have some control over what is shown if they wanted. A few people immediately agreed this would be a good change.
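
    For instance, a minimal sketch of the control users would gain (assuming pySPEDAS routed its messages through Python's root logger):

    import logging

    # hypothetical: silence informational load messages while keeping warnings
    logging.getLogger().setLevel(logging.WARNING)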

    Please consider this a vote for this feature request!

    opened by sapols 1
  • Can I download latest version of PSP data?

    I am trying to download PSP data, for example SPC data from the first encounter.

    On the SWEAP website: http://sweap.cfa.harvard.edu/data/sci/sweap/spc/L3/2018/11/

    I can see that there is a version 26 of the CDF file. However, when using:

            spcdata = pyspedas.psp.spc(trange=[t0, t1], datatype='l3i', level='l3', 
                                    varnames = [
                                        'np_moment',
                                        'wp_moment',
                                        'vp_moment_RTN',
                                        'vp_moment_SC',
                                        'sc_pos_HCI',
                                        'sc_vel_HCI',
                                        'carr_latitude',
                                        'carr_longitude'
                                    ], 
                                    time_clip=True)
    

    I am getting the first version of the CDF file (V01). Is there functionality that would allow me to download the latest version that I am not aware of?

    Thanks!

    opened by nsioulas 5
Releases
  • 1.3 (Jan 26, 2022)

    • First version to include PyTplot with a matplotlib backend
    • Added geopack wrappers for T89, T96, T01, TS04
    • Large updates to the MMS plug-in, including new tools for calculating energy and angular spectrograms, as well as moments from the FPI and HPCA plasma distribution data
    • Added the initial (experimental) version of the ERG plug-in from the Arase team in Japan
    • Added new tools for working with PyTplot variables, e.g., tkm2re, cross products, dot products, normalizing vectors
    • Added routines for wave polarization calculations
    • Added routines for field-aligned coordinate transformations
    • Added plug-in for Spherical Elementary Currents (SECS) and Equivalent Ionospheric Currents (EICS) from Xin Cao and Xiangning Chu at the University of Colorado Boulder
    • Added initial load routine for Heliophysics Application Programmer's Interface (HAPI) data
    • Added initial load routine for Kyoto Dst data
    • Added initial load routine for THEMIS All Sky Imager data
    • Added THEMIS FIT L1 calibration routine
    • Large updates to the documentation at: https://pyspedas.readthedocs.io/
    • Numerous other bug fixes and updates
  • v1.2 (Mar 25, 2021)

    Significant v1.2 updates:

    • Dropped support for Python 3.6; we now support only Python 3.7 and later
    • Added support for Python 3.9
    • Implemented performance enhancements for coordinate transformation routines
    • Made numerous updates and bug fixes to the MMS plug-in
    • Added initial support for Solar Orbiter data
  • 1.1 (Dec 7, 2020)

  • 1.0 (Jun 16, 2020)

Owner

SPEDAS (Space Physics Environment Data Analysis Software)