Performance analysis of predictive (alpha) stock factors

Overview

https://media.quantopian.com/logos/open_source/alphalens-logo-03.png

Alphalens

Alphalens is a Python library for performance analysis of predictive (alpha) stock factors. Alphalens works well with the Zipline open source backtesting library and with Pyfolio, which provides performance and risk analysis of financial portfolios. You can try Alphalens at Quantopian, a free, community-centered, hosted platform for researching and testing alpha ideas. Quantopian also offers a fully managed service for professionals that includes Zipline, Alphalens, Pyfolio, FactSet data, and more.

The main function of Alphalens is to surface the most relevant statistics and plots about an alpha factor, including:

  • Returns Analysis
  • Information Coefficient Analysis
  • Turnover Analysis
  • Grouped Analysis
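
As an illustration of the Information Coefficient item above, the sketch below computes a Spearman rank correlation between factor values and forward returns for a single date. It uses only the standard library and made-up numbers; the helper names (`ranks`, `pearson`, `information_coefficient`) are hypothetical and are not part of the Alphalens API.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Plain Pearson correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def information_coefficient(factor_values, forward_returns):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(factor_values), ranks(forward_returns))


# Illustrative data for four assets on one date (not real market data).
factor = [0.5, -1.2, 2.0, 0.1]
fwd = [0.01, -0.03, 0.04, 0.00]
ic = information_coefficient(factor, fwd)
ic_reversed = information_coefficient(factor, [-r for r in fwd])
```

Here the factor orders the assets exactly as their forward returns do, so the coefficient is at its maximum; flipping the sign of the returns flips the sign of the coefficient.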

Getting started

With a signal and pricing data, creating a factor "tear sheet" is a two-step process:

import alphalens

# Ingest and format data
factor_data = alphalens.utils.get_clean_factor_and_forward_returns(my_factor,
                                                                   pricing,
                                                                   quantiles=5,
                                                                   groupby=ticker_sector,
                                                                   groupby_labels=sector_names)

# Run analysis
alphalens.tears.create_full_tear_sheet(factor_data)
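
To make the data-formatting step more concrete, here is a conceptual sketch of two things it involves: forward returns computed from prices, and bucketing factor values into quantiles. This is a plain-Python illustration with invented numbers, not the library's implementation; `forward_returns` and `quantile_labels` are hypothetical helpers.

```python
def forward_returns(prices, period):
    """Return p[t + period] / p[t] - 1 for every t that has a future price."""
    return [prices[t + period] / prices[t] - 1
            for t in range(len(prices) - period)]


def quantile_labels(values, n_quantiles):
    """Assign each value a bucket 1..n_quantiles by rank (equal-count bins)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_quantiles // len(values) + 1
    return labels


# One asset's price history (illustrative values).
prices = [100.0, 102.0, 101.0, 105.0]
one_day = forward_returns(prices, period=1)

# Factor values for six assets on a single date, split into three buckets.
factor_on_one_date = [0.3, -0.1, 0.8, 0.5, -0.4, 0.0]
buckets = quantile_labels(factor_on_one_date, n_quantiles=3)
```

The last price has no forward return, which is one reason a formatting step may drop some rows.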

Learn more

Check out the example notebooks for more on how to read and use the factor tear sheet. A good starting point could be this

Installation

Install with pip:

pip install alphalens

Install with conda:

conda install -c conda-forge alphalens

Install from the master branch of Alphalens repository (development code):

pip install git+https://github.com/quantopian/alphalens

Alphalens depends on:

Usage

A good way to get started is to run the examples in a Jupyter notebook.

To get set up with an example, you can:

Run a Jupyter notebook server via:

jupyter notebook

From the notebook list page (usually found at http://localhost:8888/), navigate to the examples directory and open any file with a .ipynb extension.

Execute the code in a notebook cell by clicking on it and hitting Shift+Enter.

Questions?

If you find a bug, feel free to open an issue on our GitHub tracker.

Contribute

If you want to contribute, a great place to start would be the help-wanted issues.

Credits

For a full list of contributors see the contributors page.

Example Tear Sheet

Example factor courtesy of ExtractAlpha

https://github.com/quantopian/alphalens/raw/master/alphalens/examples/table_tear.png

https://github.com/quantopian/alphalens/raw/master/alphalens/examples/returns_tear.png

https://github.com/quantopian/alphalens/raw/master/alphalens/examples/ic_tear.png

Comments
  • Feature request: Use alphalens with returns instead of prices

    Is there a way to run Alphalens using my own custom input matrix of returns (i.e. custom factor-adjusted returns) rather than inputting prices (i.e. the "pricing" table in the examples) for each period?
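
One possible workaround, sketched here under the assumption that the custom returns are simple per-period returns, is to compound them into a synthetic price series and pass that series wherever prices are expected. The helper `returns_to_prices` is hypothetical and not an Alphalens function.

```python
def returns_to_prices(period_returns, start_price=1.0):
    """Compound simple per-period returns into a synthetic price path."""
    prices = [start_price]
    for r in period_returns:
        prices.append(prices[-1] * (1.0 + r))
    return prices


# Illustrative custom returns for one asset.
custom_returns = [0.01, -0.02, 0.03]
synthetic = returns_to_prices(custom_returns)

# Round trip: one-period returns of the synthetic prices match the input.
recovered = [synthetic[i + 1] / synthetic[i] - 1
             for i in range(len(custom_returns))]
```

The starting price is arbitrary, because only price ratios matter when returns are recomputed from the series.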

    enhancement help wanted 
    opened by neman018 57
  • Integration with pyfolio and Quantopian's new Risk Model

    I have been thinking about the nice "Alpha decomposition" #180 feature. The information it provides is actually a small part of what is already available in pyfolio and Quantopian's new Risk Model. On the one hand we cannot replicate all the information provided by those two tools, but on the other hand it would be great to have all that analysis without having to build an algorithm and run a backtest, something that could be integrated into Alphalens.

    So why don't we create a function in Alphalens that builds the input required by pyfolio and Quantopian's new Risk Model? Alphalens already simulates the cumulative returns of a portfolio weighted by factor values, so we only need to format that information in a way that is compatible with the other two tools. That would be a purely theoretical analysis, but except for commissions and slippage the results would be realistic. It would also serve as a benchmark for the algorithm: users can compare the algorithm results, after setting commission and slippage to 0, with these theoretical results and check whether they implemented the algorithm correctly.

    I haven't looked at pyfolio in detail, so I don't know the details of the input, but if @twiecki can help me with those details I can work on this feature, and likewise for Quantopian's new Risk Model (I don't know if that is part of pyfolio or a separate project).

    enhancement api 
    opened by luca-s 32
  •  ENH: added positions computation in 'performance.create_pyfolio_input'

    'performance.create_pyfolio_input' now computes positions too. It is also now possible to select the 'period' used in the benchmark computation and, for factor returns/positions, to select equal weighting instead of factor weighting.
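
The difference between factor weighting and equal weighting can be sketched as follows. This is a simplified, standard-library illustration for a single date; `long_short_weights` is a hypothetical helper and may differ from how the library computes weights.

```python
def long_short_weights(factor_values, equal_weight=False):
    """Demean the factor, optionally keep only its sign, then scale so
    that the absolute weights sum to 1."""
    avg = sum(factor_values) / len(factor_values)
    demeaned = [v - avg for v in factor_values]
    if equal_weight:
        # Keep only the side (long or short), not the magnitude.
        demeaned = [float((d > 0) - (d < 0)) for d in demeaned]
    gross = sum(abs(d) for d in demeaned)
    if gross == 0:
        return [0.0] * len(factor_values)
    return [d / gross for d in demeaned]


# Illustrative factor values for four assets.
factor = [1.0, 2.0, 4.0, 5.0]
by_factor = long_short_weights(factor)
by_equal = long_short_weights(factor, equal_weight=True)
```

With factor weighting, the assets with the most extreme values carry the largest positions; with equal weighting, every long and every short position has the same size.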

    opened by luca-s 30
  • new api

    Here is where I'm headed with the new API thoughts from https://github.com/quantopian/alphalens/pull/110. To review, creating a tear sheet would be a two-step process:

    1. format the data
    2. call for the tear sheet

    where all of the tear sheets take a factor_data dataframe (shown as a screenshot in the original post).

    Right now the main blocking issue is the event-study-esque plots as that function requires raw prices. I think that the number of plots and the uniqueness of what they are saying probably merits them getting a separate tear sheet (which would be able to take raw prices).

    opened by jameschristopher 23
  • API: run Alphalens with returns instead of prices (utils.get_clean_factor)

    For issue #253:

    • refactor compute_forward_returns
    • add get_clean_factor API
    • refactor get_clean_factor_and_forward_returns as a composition of compute_forward_returns and get_clean_factor

    opened by HereticSK 22
  • Create sub tearsheets

    This breaks down the full tear sheet into multiple smaller ones covering: returns, information, and turnover analysis.

    This PR is aimed mainly at issue #106 so Thomas your thoughts would be great!

    This is a first start so things are pretty rough, and it definitely won't pass tests but I think there will be a lot of discussion so I'm not too worried.

    opened by jameschristopher 21
  • Two Factor Interaction Development: Initial Data Structure, Test Module, and Plot

    This begins the development of the "factors_interaction_tearsheet" from issue #219. The goal of this pull request is to get feedback on whether this branch seems to be going in the right direction.

    Description of Changes

    1. Create join_factor_with_factor_and_forward_returns function
      • Creates a function complementary to get_clean_factor_and_forward_returns that joins an additional factor to the factor_data dataframe returned by get_clean_factor_and_forward_returns.
      • This new dataframe returned, call it "multi_factor_data", will be the core source/data structure providing the necessary data for the factors_interaction_tearsheet computations.
    2. Create an associated test module.
    3. Modify perf.mean_return_by_quantile to take an additional parameter so that it can group by multiple factor quantiles.
    4. Add first plotting function, plot_multi_factor_quantile_returns, to create an annotated heatmap of mean returns by two-factor quantile bins.
    5. Create the tears.create_factors_interaction_tear_sheet as the entry point to the multi-factor tearsheet.
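
The data behind a heatmap of mean returns by two-factor quantile bins can be sketched in plain Python as a grouped average. The function name and the row layout below are illustrative assumptions, not the code in this pull request.

```python
from collections import defaultdict


def mean_return_by_two_quantiles(rows):
    """rows: iterable of (quantile_a, quantile_b, forward_return).
    Returns {(quantile_a, quantile_b): mean forward return}."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for qa, qb, ret in rows:
        sums[(qa, qb)] += ret
        counts[(qa, qb)] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Illustrative observations: two quantiles per factor.
rows = [
    (1, 1, -0.02), (1, 1, 0.00),
    (1, 2, 0.01),
    (2, 1, 0.02),
    (2, 2, 0.03), (2, 2, 0.05),
]
grid = mean_return_by_two_quantiles(rows)
```

Each key of `grid` corresponds to one cell of the heatmap.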

    Requesting Feedback

    1. Comments and suggestions on the utils.join_factor_with_factor_and_forward_returns function
      1. Should there be a wrapper that builds the multi_factor_data dataframe in one step (i.e. wrap this function with get_clean_factor_and_forward_returns)?
    2. I'm not too familiar with creating effective unit tests, so any feedback on this module is appreciated.
    3. In regards to Change 3 above:
      1. My first thought, following suggestion of @luca-s, was to create a separate performance module which would contain all functions for this sort of computation.
      2. Since the existing performance module already contains a lot of the needed functionality, I thought maybe I would create a wrapper function in this new module that added the necessary functionality.
      3. However, in perf.mean_return_by_quantile, I needed to add a parameter to this function to make it work in a clean manner. Not sure how I could have done that with a wrapper.
      4. So I guess my question is, what are the community's thoughts on how I dealt with this particular issue, and also what are thoughts on related considerations going forward?
    4. Any other comments/guidance on path of development going forward is greatly appreciated.
    5. Also, let me know if there are too many changes in this pull request for efficient/easy review.
    opened by MichaelJMath 19
  • Added support for group neutral factor analysis

    I am working on a sector neutral factor and I discovered that Alphalens analysis on a group neutral factor is currently limited. With group neutral factor I mean a factor intended to rank stocks across the same group, so that it makes sense to compare performance of top vs bottom stocks in the same group but it doesn't make sense to compare performance of a stock in one group with performance of another group.

    The main shortcoming is the return analysis: as there is no way to demean the returns by group, the statistics shown are of little use. Also when the plots are broken down by group, the results are not useful either as there are 2 bugs:

    • API documentation claims that perf.mean_return_by_quantile performs the demeaning at group level when splitting the plots by group, but that is not true
    • The same goes for perf.mean_information_coefficient

    Also changed the API arguments names in a consistent way:

    • the tears.* functions use 'long_short' and 'group_neutral' throughout, as those functions are intended to be the top-level ones. 'by_group' is used if the function supports multiple plots as output, one for each group
    • the remaining API (mostly performance.*) uses 'demeaned' and 'group_adjust' as before
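
Demeaning returns by group, as described above, can be sketched for a single date as follows. This is a conceptual illustration with invented sector labels, not the code from this pull request; `demean_by_group` is a hypothetical helper.

```python
from collections import defaultdict


def demean_by_group(returns, groups):
    """Subtract each group's mean return from the returns of its members."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r, g in zip(returns, groups):
        totals[g] += r
        counts[g] += 1
    return [r - totals[g] / counts[g] for r, g in zip(returns, groups)]


# Illustrative returns for four assets on one date.
returns = [0.04, 0.02, -0.01, -0.03]
groups = ["tech", "tech", "energy", "energy"]
adjusted = demean_by_group(returns, groups)
```

After the adjustment, the top asset in each group shows the same excess return even though the two groups performed very differently, which is the comparison a group neutral factor is meant to support.
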
    enhancement 
    opened by luca-s 19
  • ENH Pyfolio integration

    Issue #225

    Added performance.create_pyfolio_input, which creates input suitable for pyfolio. Intended use:

    factor_data = alphalens.utils.get_clean_factor_and_forward_returns(factor, prices)

    pf_returns = alphalens.performance.create_pyfolio_input(factor_data, '1D',
                         long_short=True,
                         group_neutral=False,
                         quantiles=None,
                         groups=None)

    pyfolio.tears.create_returns_tear_sheet(pf_returns)
    

    Also, I greatly improved the function that computes the cumulative returns as that is the base on which the pyfolio integration feature is built on. Mainly I removed the legacy assumptions which required the factor to be computed at specific frequency (daily, weekly etc) and also the periods had to be multiple of this frequency (2 days, 4 days etc). The function is now able to properly compute returns when there are gaps in the factor data or when we analyze an intraday factor.

    opened by luca-s 18
  • BUG: Calculate mean returns by date before mean return by quantile

    Compute mean return by date. If by_date flag is false, then compute and return the average of the daily mean returns (and also the standard error of these daily mean returns).
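
The order of averaging matters when the number of assets in a quantile varies from date to date. The sketch below contrasts a pooled mean with a mean of daily means on invented data; both helper names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pooled_mean(observations):
    """Average all (date, return) observations together."""
    return mean([r for _, r in observations])


def mean_of_daily_means(observations):
    """Average within each date first, then average the daily means."""
    by_date = defaultdict(list)
    for date, r in observations:
        by_date[date].append(r)
    return mean([mean(rs) for rs in by_date.values()])


# One asset in the quantile on the first date, nine on the second.
observations = [("2020-01-02", 0.10)] + [("2020-01-03", 0.0)] * 9
pooled = pooled_mean(observations)
daily = mean_of_daily_means(observations)
```

The pooled mean lets the crowded date dominate, while the mean of daily means gives each date equal weight.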

    Resolves: Issue #309

    opened by MichaelJMath 17
  • Will alphalens support multi-factor models in the future?

    alphalens is awesome! I have been using alphalens for some time to pick out several effective factors from the many factors available in the stock market. However, that step only gives me several effective factors evaluated independently. In practice, the more common scenario is to build a multi-factor linear model (for example, the Fama-French 3-factor model), run a regression on it rather than on single-factor models, and do hypothesis testing on the multi-factor model. Will support for multi-factor models be considered for alphalens in the future?

    enhancement question 
    opened by huaiweicheng 17
  • fix 'Index' object has no attribute 'get_values' bug

    Invoking alphalens.tears.create_turnover_tear_sheet() without passing the turnover_period parameter may raise AttributeError: 'Index' object has no attribute 'get_values'. This is because utils.get_forward_returns_columns() returns the columns as an Index object instead of a pd.Series, and the Index object has no get_values() method.

    opened by MiaoMiaoKuangFei 0
  • importing alphalens 0.3.6 gets error "No module named 'pandas.util._decorators'"

    Problem Description

    I've tried importing alphalens but hit the error "No module named 'pandas.util._decorators'" at the very first command, "import alphalens".

    Please provide a minimal, self-contained, and reproducible example:

    import alphalens
    

    Please provide the full traceback:

    ModuleNotFoundError                       Traceback (most recent call last)
    <ipython-input-21-6e4fa055c088> in <module>
    ----> 1 import alphalens
    
    e:\temp\Python36\lib\site-packages\alphalens\__init__.py in <module>
    ----> 1 from . import performance
          2 from . import plotting
          3 from . import tears
          4 from . import utils
          5 
    
    e:\temp\Python36\lib\site-packages\alphalens\performance.py in <module>
         20 from pandas.tseries.offsets import BDay
         21 from scipy import stats
    ---> 22 from statsmodels.regression.linear_model import OLS
         23 from statsmodels.tools.tools import add_constant
         24 from . import utils
    
    e:\temp\Python36\lib\site-packages\statsmodels\regression\__init__.py in <module>
    ----> 1 from .linear_model import yule_walker
          2 
          3 from statsmodels.tools._testing import PytestTester
          4 
          5 __all__ = ['yule_walker', 'test']
    
    e:\temp\Python36\lib\site-packages\statsmodels\regression\linear_model.py in <module>
         34 
         35 from statsmodels.compat.python import lrange, lzip
    ---> 36 from statsmodels.compat.pandas import Appender
         37 
         38 import numpy as np
    
    e:\temp\Python36\lib\site-packages\statsmodels\compat\__init__.py in <module>
    ----> 1 from statsmodels.tools._testing import PytestTester
          2 
          3 from .python import (
          4     PY37,
          5     asunicode, asbytes, asstr,
    
    e:\temp\Python36\lib\site-packages\statsmodels\tools\__init__.py in <module>
          1 from .tools import add_constant, categorical
    ----> 2 from statsmodels.tools._testing import PytestTester
          3 
          4 __all__ = ['test', 'add_constant', 'categorical']
          5 
    
    e:\temp\Python36\lib\site-packages\statsmodels\tools\_testing.py in <module>
          9 
         10 """
    ---> 11 from statsmodels.compat.pandas import assert_equal
         12 
         13 import os
    
    e:\temp\Python36\lib\site-packages\statsmodels\compat\pandas.py in <module>
          3 import numpy as np
          4 import pandas as pd
    ----> 5 from pandas.util._decorators import deprecate_kwarg, Appender, Substitution
          6 
          7 __all__ = ['assert_frame_equal', 'assert_index_equal', 'assert_series_equal',
    
    ModuleNotFoundError: No module named 'pandas.util._decorators'
    

    Please provide any additional information below:

    I'm using VS Code on Windows 10. I tried updating pip, pandas, numpy, and alphalens to newer versions, without success. Please give me a hand with this. Thanks in advance.

    Versions

    • Alphalens version: 0.3.6
    • Python version: 3.6.8
    • Pandas version: 0.18.1
    • Matplotlib version: 3.3.4
    • Numpy: 1.17.0
    • Scipy: 1.0.0
    • Statsmodels: 0.12.2
    • Zipline: 1.2.0
    opened by BinhKieu82 1
  • MissingDataError: exog contains inf or nans

    I am getting the MissingDataError: exog contains inf or nans when I am trying to get the returns tearsheet. The input data from get_clean_factor_and_forward_returns does not have any nans or infs in it. I saw mentions of this error on the Quantopian forum (https://quantopian-archive.netlify.app/forum/threads/alphalens-giving-exog-contains-inf-or-nans.html) and a couple of other places, but no solution. Any thoughts? Thank you!

    opened by sokol11 1
  • Alphalens

    Problem Description

    UnboundLocalError: local variable 'period_len' referenced before assignment, line 319

    days_diffs = []
    for i in range(30):
        if i >= len(forward_returns.index):
            break
        p_idx = prices.index.get_loc(forward_returns.index[i])
        if p_idx is None or p_idx < 0 or (
                p_idx + period) >= len(prices.index):
            continue
        start = prices.index[p_idx]
        end = prices.index[p_idx + period]
        period_len = diff_custom_calendar_timedeltas(start, end, freq)
        days_diffs.append(period_len.components.days)

    delta_days = period_len.components.days - mode(days_diffs).mode[0]
    period_len -= pd.Timedelta(days=delta_days)
    label = timedelta_to_string(period_len)
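
In the snippet quoted above, period_len is assigned only inside the loop, so if every iteration ends in break or continue (for example, when the prices do not extend `period` rows past the factor dates), the name is never bound and reading it after the loop raises UnboundLocalError. The following standard-library sketch reproduces that pattern and shows one possible guard; the function names are hypothetical and the guard is an illustration, not the library's fix.

```python
def last_valid_length(lengths, limit):
    """Same shape as the quoted loop: assign inside, read after."""
    for value in lengths:
        if value > limit:
            continue  # every iteration skipped -> period_len never bound
        period_len = value
    return period_len


def last_valid_length_guarded(lengths, limit):
    """Variant that fails with an explicit message instead."""
    period_len = None
    for value in lengths:
        if value > limit:
            continue
        period_len = value
    if period_len is None:
        raise ValueError("no usable period found in the input data")
    return period_len


try:
    last_valid_length([5, 7], limit=1)
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

guarded_ok = last_valid_length_guarded([1, 5], limit=2)
```

Initializing the name before the loop and checking it afterwards turns an obscure error into one that points at the input data.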
    

    Versions

    • Alphalens version:
    • Python version: 3.7
    • Pandas version:
    • Matplotlib version:
    opened by Swordsman-T 0
  • Problem:create_event_returns_tear_sheet

    Problem Description

    When I use alphalens.tears.create_event_returns_tear_sheet, it raises an error: unsupported operand type(s) for -: 'slice' and 'int'. The other functions work well. I would be grateful if someone could help me. Thank you.

    Please provide a minimal, self-contained, and reproducible example:

    alphalens.tears.create_event_returns_tear_sheet(data,
                                                    returns,
                                                    avgretplot=(5, 15),
                                                    long_short=True,
                                                    group_neutral=False,
                                                    std_bar=True,
                                                    by_group=False)

    Please provide the full traceback:

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    D:\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
        734             try:
    --> 735                 result = self._python_apply_general(f)
        736             except TypeError:
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
        750     def _python_apply_general(self, f):
    --> 751         keys, values, mutated = self.grouper.apply(f, self._selected_obj, self.axis)
        752 
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in apply(self, f, data, axis)
        205             group_axes = group.axes
    --> 206             res = f(group)
        207             if not _is_indexed_like(res, group_axes):
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in f(g)
        718                     with np.errstate(all="ignore"):
    --> 719                         return func(g, *args, **kwargs)
        720 
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in average_cumulative_return(q_fact, demean_by)
        802     def average_cumulative_return(q_fact, demean_by):
    --> 803         q_returns = cumulative_return_around_event(q_fact, demean_by)
        804         q_returns.replace([np.inf, -np.inf], np.nan, inplace=True)
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in cumulative_return_around_event(q_fact, demean_by)
        798             mean_by_date=True,
    --> 799             demean_by=demean_by,
        800         )
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in common_start_returns(factor, returns, before, after, cumulative, mean_by_date, demean_by)
        701 
    --> 702         starting_index = max(day_zero_index - before, 0)
        703         ending_index = min(day_zero_index + after + 1,
    
    TypeError: unsupported operand type(s) for -: 'slice' and 'int'
    
    During handling of the above exception, another exception occurred:
    
    TypeError                                 Traceback (most recent call last)
    <ipython-input-46-6fc348201897> in <module>
          5                                                 group_neutral=False,
          6                                                 std_bar=True,
    ----> 7                                                 by_group=False)
    
    D:\Anaconda\lib\site-packages\alphalens\plotting.py in call_w_context(*args, **kwargs)
         43             with plotting_context(), axes_style(), color_palette:
         44                 sns.despine(left=True)
    ---> 45                 return func(*args, **kwargs)
         46         else:
         47             return func(*args, **kwargs)
    
    D:\Anaconda\lib\site-packages\alphalens\tears.py in create_event_returns_tear_sheet(factor_data, returns, avgretplot, long_short, group_neutral, std_bar, by_group)
        573         periods_after=after,
        574         demeaned=long_short,
    --> 575         group_adjust=group_neutral,
        576     )
        577 
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in average_cumulative_return_by_quantile(factor_data, returns, periods_before, periods_after, demeaned, group_adjust, by_group)
        858         elif demeaned:
        859             fq = factor_data['factor_quantile']
    --> 860             return fq.groupby(fq).apply(average_cumulative_return, fq)
        861         else:
        862             fq = factor_data['factor_quantile']
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\generic.py in apply(self, func, *args, **kwargs)
        222     )
        223     def apply(self, func, *args, **kwargs):
    --> 224         return super().apply(func, *args, **kwargs)
        225 
        226     @Substitution(
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in apply(self, func, *args, **kwargs)
        744 
        745                 with _group_selection_context(self):
    --> 746                     return self._python_apply_general(f)
        747 
        748         return result
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in _python_apply_general(self, f)
        749 
        750     def _python_apply_general(self, f):
    --> 751         keys, values, mutated = self.grouper.apply(f, self._selected_obj, self.axis)
        752 
        753         return self._wrap_applied_output(
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\ops.py in apply(self, f, data, axis)
        204             # group might be modified
        205             group_axes = group.axes
    --> 206             res = f(group)
        207             if not _is_indexed_like(res, group_axes):
        208                 mutated = True
    
    D:\Anaconda\lib\site-packages\pandas\core\groupby\groupby.py in f(g)
        717                 def f(g):
        718                     with np.errstate(all="ignore"):
    --> 719                         return func(g, *args, **kwargs)
        720 
        721             elif hasattr(nanops, "nan" + func):
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in average_cumulative_return(q_fact, demean_by)
        801 
        802     def average_cumulative_return(q_fact, demean_by):
    --> 803         q_returns = cumulative_return_around_event(q_fact, demean_by)
        804         q_returns.replace([np.inf, -np.inf], np.nan, inplace=True)
        805 
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in cumulative_return_around_event(q_fact, demean_by)
        797             cumulative=True,
        798             mean_by_date=True,
    --> 799             demean_by=demean_by,
        800         )
        801 
    
    D:\Anaconda\lib\site-packages\alphalens\performance.py in common_start_returns(factor, returns, before, after, cumulative, mean_by_date, demean_by)
        700             continue
        701 
    --> 702         starting_index = max(day_zero_index - before, 0)
        703         ending_index = min(day_zero_index + after + 1,
        704                            len(returns.index))
    
    TypeError: unsupported operand type(s) for -: 'slice' and 'int'
    
    <Figure size 432x288 with 0 Axes>
    
    Please provide any additional information below:

    Versions

    • Alphalens version: 0.4.0
    • Python version: 3.7.6
    • Pandas version: 1.0.1
    • Matplotlib version: 3.1.3
    opened by RemnantLi 0
Releases(v0.4.0)
  • v0.4.0(Apr 30, 2020)

    This is a minor release from 0.3.6 that includes bugfixes, performance improvements, and build changes.

    Bug Fixes

    • https://github.com/quantopian/alphalens/pull/334
    • Pandas 1.0 fix: https://github.com/quantopian/alphalens/pull/364

    New Features

    • Turnover tearsheet improvement: https://github.com/quantopian/alphalens/pull/354
    • CI builds now run on GitHub Actions: https://github.com/quantopian/alphalens/pull/363

    Performance + API Change

    https://github.com/quantopian/alphalens/pull/361 simplified the cumulative returns calculation, making it much faster. Other function signatures and behaviors were modified as well.

    Docs + Miscellaneous

    • https://github.com/quantopian/alphalens/pull/332
    • README updates: https://github.com/quantopian/alphalens/pull/345, https://github.com/quantopian/alphalens/pull/337
    • A new tutorial notebook.

    Credits

    The following people contributed to this release: @eigenfoo, @luca-s, @twiecki, @ivigamberdiev, @fawce, @jbredeche, @jimportico, @gerrymanoim, @jmccorriston, @altquant

  • v0.3.6(Jan 7, 2019)

  • v0.3.5(Dec 17, 2018)

    This is a minor release from 0.3.4 that includes bugfixes, speed enhancement and compatibility with more recent pandas versions. We recommend that all users upgrade to this version.

    Bugfixes

    • Issue 323 factor_rank_autocorrelation infers turnover period in calendar space while periods could have different time space
    • PR 324 avoid crashing Alphalens when autocorrelation or turnover data contains only NaNs

    Performance

    • PR 327 Speed up compute_forward_returns and get_clean_factor

    Compatibility with new pandas versions

    • PR 328 improved compatibility with pandas 0.23.4
  • v0.3.4(Oct 11, 2018)

    This is a minor release from 0.3.3 that includes bugfixes, small enhancements and backward compatibility breakages. We recommend that all users upgrade to this version.

    Bugfixes

    • PR 317 Fix date conversion in newer versions of pandas
    • Issue 309 Biased Mean Quantile Returns for Non-Equal Bins

    New features

    • PR 306 added zero aware quantiles option

    API change

    • PR 268 All functions deprecated in version v0.2.0 are no longer available

    Credits

    The following people contributed to this release:

    • @eigenfoo - George Ho
    • @MichaelJMath - Mike Matthews
    • @freddiev4 - Freddie Vargus
    • @vikram-narayan - Vikram Narayan
    • @twiecki - Thomas Wiecki
    • @luca-s - Luca Scarabello

  • v0.3.2(May 14, 2018)

    This is a minor release from 0.3.1 that includes bugfixes and small enhancements. We recommend that all users upgrade to this version.

    Bugfixes

    • PR297 BUG: create_pyfolio_input doesn't work with frequency higher than 1 day
    • PR302 BUG: compute_mean_return_spread returns error if no std_err argument

    New features

    • PR298 ENH: added rate of return option in 'create_event_study_tear_sheet'
    • PR300 ENH: added n_bars option in 'create_event_study_tear_sheet'
  • v0.3.1(Apr 24, 2018)

    This is a minor release from 0.3.0 that includes bugfixes and performance improvement. We recommend that all users upgrade to this version.

    Bugfixes

    • PR 287 utils.get_clean_factor crashes with malformed 'groupby' data
    • PR 287 perf.average_cumulative_return_by_quantile crashes in certain scenarios
    • PR 288 monthly IC heatmap plot has inverted colors (red for positive and blue for negative IC)
    • PR 295 Issue 292 utils.compute_forward_returns fails to detect the correct period length

    Performance

    • PR 294 computation of cumulative returns is very slow
  • v0.3.0(Mar 14, 2018)

    This is a major release from 0.2.1. We recommend that all users upgrade to this version.

    New features

    Bugfixes

    • Alphalens now works with both tz-aware and tz-naive data (but not mixed)
    • "Cumulative Returns by Quantile" plot used a different color scheme for quantiles than "Average Cumulative Returns by Quantile" plot
    • Many small but useful bug fixes that avoid sporadic crashes and memory leaks. Please see the git history for more details

    Documentation

    Maintenance

    • Removed deprecated pandas.TimeGrouper
    • Migrated tests from deprecated nose-parameterized (#251)
    • Fixed compatibility with matplotlib 2.2.0
    • Alphalens is now available via conda-forge. Install via conda install -c conda-forge alphalens

    Credits

    The following people contributed to this release:

    • @luca-s - Luca Scarabello
    • @twiecki - Thomas Wiecki
    • @mmargenot - Max Margenot
    • @MichaelJMath
    • @HereticSK
    • @TimShawver - Tim Shawver
    • @alen12345 - Alessio Nava

  • v0.2.1(Nov 18, 2017)

    This is a bugfix release from v0.2.0. All users are recommended to upgrade.

    Bugfixes

    • tears.create_information_tear_sheet: argument group_adjust was erroneously removed without a replacement. From this release, argument group_adjust is still deprecated but group_neutral can be used instead
  • v0.2.0(Nov 17, 2017)

    This is a major new release since v0.1.0. It contains small API breakage, several new features and many bug fixes. All users are recommended to upgrade.

    New since v0.1.0

    New features

    • Added event study analysis: an event study is a statistical method for assessing the impact of a particular event on the value of equities. This analysis is now available through the API alphalens.tears.create_event_study_tear_sheet. Check out the related notebook in the example folder.

    • Added support for group neutral factor analysis (group_neutral argument): this affects the returns analysis, which can now compute return statistics for each group independently and aggregate them, assuming a portfolio in which each group has equal weight.

    • utils.get_clean_factor_and_forward_returns has a new parameter max_loss that controls how much data the function is allowed to drop due to not having enough price data or due to binning errors (pandas.qcut). This gives users more control over what is happening and also prevents the function from raising an exception if the binning fails on some values.

    • Greatly improved API documentation
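To make the idea behind max_loss concrete, the sketch below measures the fraction of rows lost after binning with pandas.qcut and compares it to a threshold. It illustrates the concept only and is not the Alphalens implementation; the sample factor values and the threshold of 0.35 are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical factor values; two of the ten observations are missing.
factor = pd.Series([0.1, 0.4, None, 0.9, 0.3, None, 0.7, 0.2, 0.8, 0.5])

# Assumed tolerance: allow at most 35% of the rows to be dropped.
max_loss = 0.35

# Bin into two quantiles; missing values cannot be binned and stay NaN.
binned = pd.qcut(factor, q=2, labels=False)
kept = binned.dropna()

# Fraction of the original rows that did not survive binning.
loss = 1.0 - len(kept) / len(factor)

if loss > max_loss:
    raise ValueError(f"dropped {loss:.0%} of rows, above the {max_loss:.0%} limit")
```

In this toy case two of ten rows are lost, which stays under the assumed limit, so no exception is raised.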

    Bugfixes

    API change

    • Removed deprecated alphalens.tears.create_factor_tear_sheet
    • tears.create_summary_tear_sheet: added argument group_neutral.
    • tears.create_returns_tear_sheet: added argument group_neutral. Please consider using keyword arguments to avoid API breakage
    • tears.create_information_tear_sheet: group_adjust is now deprecated and group_neutral should be used instead
    • tears.create_full_tear_sheet: group_adjust is now deprecated and group_neutral should be used instead
    • tears.create_event_returns_tear_sheet: added argument group_neutral. Please consider using keyword arguments to avoid API breakage
    • Several small changes to lower level API (alphalens.performance)
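The group_neutral argument referenced in the entries above is described in these notes as computing statistics per group and then aggregating with equal group weights. The sketch below shows that aggregation on made-up data with plain pandas; the column names and values are hypothetical, and the code is an illustration of the description rather than the Alphalens code path.

```python
import pandas as pd

# Hypothetical forward returns for four assets in two groups of unequal size.
data = pd.DataFrame(
    {
        "group": ["tech", "tech", "tech", "energy"],
        "forward_return": [0.04, 0.02, 0.03, -0.01],
    }
)

# Pooled mean: every asset counts equally, so the larger group dominates.
pooled_mean = data["forward_return"].mean()

# Group-equal mean: average within each group first, then across groups.
per_group_mean = data.groupby("group")["forward_return"].mean()
group_equal_mean = per_group_mean.mean()
```

With these numbers the pooled mean is pulled toward the larger group, while the group-equal mean gives both groups the same influence.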

    Maintenance

    • Depends on pandas>=0.18.0
    • Changed deprecated pd.rolling_mean() to use the new *.rolling().mean() API
    • Changed deprecated pd.rolling_apply() to use the new *.rolling().apply() API
    • Use versioneer to pull version from git tag
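The move from pd.rolling_mean() and pd.rolling_apply() to the .rolling() API noted above looks roughly as follows in user code. This is a small sketch on made-up prices, not an excerpt from Alphalens; the variable names are assumptions.

```python
import pandas as pd

# Hypothetical price series.
prices = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

# Old style (removed from pandas): pd.rolling_mean(prices, 3)
rolling_mean = prices.rolling(window=3).mean()

# Old style (removed from pandas): pd.rolling_apply(prices, 3, func)
# raw=True passes each window to the function as a NumPy array.
rolling_range = prices.rolling(window=3).apply(
    lambda window: window.max() - window.min(), raw=True
)
```

The first window - 1 entries of each result are NaN because a full window is not yet available.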
  • v0.1.2(Oct 3, 2017)

    New release v0.1.2

    • Removed deprecated API 'alphalens.tears.create_factor_tear_sheet'

    • Added event study API 'alphalens.tears.create_event_study_tear_sheet' and a related example notebook

    • Added Long only option to 'alphalens.performance.factor_alpha_beta'

    • Improved docstrings all around

    • Small bug fixes

Owner
Quantopian, Inc.
Quantopian builds software tools and libraries for quantitative finance.