Simple, fast, and parallelized symbolic regression in Python/Julia via regularized evolution and simulated annealing

Overview

PySR

(pronounced like py as in python, and then sur as in surface)


Parallelized symbolic regression built on Julia, and interfaced by Python. Uses regularized evolution, simulated annealing, and gradient-free optimization.

Cite this software

Documentation

Check out SymbolicRegression.jl for the pure-Julia backend of this package.

Symbolic regression is a very interpretable machine learning algorithm for low-dimensional problems: these tools search equation space to find algebraic relations that approximate a dataset.

One can also extend these approaches to higher-dimensional spaces by using a neural network as a proxy, as explained in arXiv:2006.11287, where we apply it to N-body problems. Here, one essentially uses symbolic regression to convert a neural net to an analytic equation. Thus, these tools simultaneously present an explicit and powerful way to interpret deep models.

Backstory:

Previously, we used eureqa, which is a very efficient and user-friendly tool. However, eureqa is GUI-only, doesn't allow for user-defined operators, has no distributed capabilities, and has become proprietary (and was recently merged into an online service). Thus, the goal of this package is to have an open-source symbolic regression tool as efficient as eureqa, while also exposing a configurable Python interface.

Installation

PySR uses both Julia and Python, so you need to have both installed.

Install Julia - see downloads, and then instructions for mac and linux. (Don't use the conda-forge version; it doesn't seem to work properly.)

You can install PySR with:

pip install pysr

The first launch will automatically install the Julia packages required.

Quickstart

Here is some demo code (also found in example.py):

import numpy as np
from pysr import pysr, best

# Dataset
X = 2*np.random.randn(100, 5)
y = 2*np.cos(X[:, 3]) + X[:, 0]**2 - 2

# Learn equations
equations = pysr(
    X,
    y,
    niterations=5,
    binary_operators=["plus", "mult"],
    unary_operators=[
        # Pre-defined library of operators
        # (see https://pysr.readthedocs.io/en/latest/docs/operators/)
        "cos", "exp", "sin",
        # Define your own operator! (Julia syntax)
        "inv(x) = 1/x",
    ],
)
# (You can press Ctrl-C to exit the search early.)

print(best(equations))

which gives:

x0**2 + 2.000016*cos(x3) - 1.9999845

One can also use best_tex to get the LaTeX form, or best_callable to get a function you can call. This uses a score which balances complexity and error; however, one can see the full list of equations with:

print(equations)

This is a pandas table, with additional columns:

  • MSE - the mean square error of the formula
  • score - a metric akin to Occam's razor; you should use this to help select the "true" equation.
  • sympy_format - sympy equation.
  • lambda_format - a lambda function for that equation, that you can pass values through.

Comments

  • Add Support for Arbitrary Precision Arithmetic with BigFloat


    Is your feature request related to a problem? Please describe.

    I tried running 'pysr' on a 1,000 row array with 4 integer input variables and one integer output variable - a Goedel Number.

    From Mathematica:

    GoedelNumber[l_List] := Times @@ MapIndexed[Prime[First[#2]]^#1 &, l]
    

    E.g.

    Data file:
    # 7	1	5	8	6917761200000
    
    julia> 2^7*3^1*5^5*7^8
    6917761200000
    

    The model returned:

    Complexity  Loss       Score     Equation
    1           Inf       NaN       0.22984365
    
    

    I am just learning 'pysr' and maybe it's just 'user error'. However, Inf and NaN suggest that Goedel numbers may exceed Float64.


    Describe the solution you'd like: Not sure what happened, because the largest Goedel number in the input is: 1.6679880978201e+23

    Additional context: I didn't see any parameters to set 'verbose' mode or 'debugging' information.

    GoedelTableFourParameters.txt

    enhancement good first issue 
    opened by dbl001 35
  • [Windows] : Couldn't find equation file!


    Hi Miles,

    I've been installing PySR in parallel to Julia under win10. It runs... till the moment it crashes with the following message:

    File "C:\Users\Matthieu\anaconda3\lib\site-packages\pysr\sr.py", line 774, in get_hof raise RuntimeError("Couldn't find equation file! The equation search likely exited before a single iteration completed.")

    RuntimeError: Couldn't find equation file! The equation search likely exited before a single iteration completed.

    In the last run, it reached 38% progress.

    I should add that sometimes (not often) the process does complete.

    What is the reason for this?

    Also, is there a forum, or did I post in the right place?

    I thank you for your help.

    Regards

    Magaud

    bug 
    opened by Magaud59 27
  • JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions / prior installation with conda


    Describe the bug

    JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions [a40a106e]:
     DynamicExpressions [a40a106e] log:
     ├─DynamicExpressions [a40a106e] has no known versions!
     └─restricted to versions 0.4 by SymbolicRegression [8254be44] — no versions left
       └─SymbolicRegression [8254be44] log:
         ├─possible versions are: 0.14.4 or uninstalled
         └─SymbolicRegression [8254be44] is fixed to version 0.14.4' occurred while calling julia code:
    Pkg.add([sr_spec, clustermanagers_spec], io=stderr)
    

    Version (please include the following information): macOS Ventura 13.0.1 (22A400)

    • Julia version [Run julia --version in the terminal]: julia version 1.8.3
    • Python version [Run python --version in the terminal]: Python 3.8.13
    • Did you install with pip or conda? pip

    $ conda list pysr
    # packages in environment at /Users/davidlaxer/anaconda3/envs/ai:
    #
    # Name                    Version                   Build  Channel
    pysr                      0.11.11                  pypi_0    pypi
    
    % pip show pysr
    Name: pysr
    Version: 0.11.11
    Summary: Simple and efficient symbolic regression
    Home-page: https://github.com/MilesCranmer/pysr
    Author: Miles Cranmer
    Author-email: [email protected]
    License: 
    Location: /Users/davidlaxer/anaconda3/envs/ai/lib/python3.8/site-packages
    Requires: julia, numpy, pandas, scikit-learn, sympy
    Required-by: 
    
    • PySR version [Run python -c 'import pysr; print(pysr.__version__)']
    • 0.9.1
    • Does the bug still appear with the latest version of PySR?

    Configuration

    • What are your PySR settings?
    • What dataset are you running on?
    • If possible, please share a minimal code example that produces the error.

    Error message Add the error message here, or whatever other information would be useful for debugging.

    If the error is "Couldn't find equation file...", this error indicates something went wrong with the backend. Please scroll up and copy the output of Julia, rather than the output of python.

    Additional context Add any other context about the problem here.

    Julia Version 1.8.3
    Commit 0434deb161e (2022-11-14 20:14 UTC)
    Platform Info:
      OS: macOS (x86_64-apple-darwin21.4.0)
      uname: Darwin 22.1.0 Darwin Kernel Version 22.1.0: Sun Oct  9 20:14:54 PDT 2022; root:xnu-8792.41.9~2/RELEASE_X86_64 x86_64 i386
      CPU: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz: 
                     speed         user         nice          sys         idle          irq
           #1-16  3800 MHz    7543546 s          0 s    3955434 s   72076495 s          0 s
      Memory: 128.0 GB (32470.4921875 MB free)
      Uptime: 951050.0 sec
      Load Avg:  8.20068359375  5.13525390625  4.3212890625
      WORD_SIZE: 64
      LIBM: libopenlibm
      LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
      Threads: 1 on 16 virtual cores
    Environment:
      JULIA_DEPOT_PATH_BACKUP = 
      JULIA_PROJECT_BACKUP = 
      JULIA_LOAD_PATH_BACKUP = 
      JULIA_DEPOT_PATH = /Users/davidlaxer/anaconda3/envs/ai/share/julia:
      JULIA_SSL_CA_ROOTS_PATH_BACKUP = 
      JULIA_SSL_CA_ROOTS_PATH = 
      JULIA_PROJECT = @pysr-0.11.11
      TERM = xterm-color
      PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/bin:/Users/davidlaxer/.juliaup/bin:/Users/davidlaxer/.cabal/bin:/Users/davidlaxer/.ghcup/bin:/Users/davidlaxer/anaconda3/envs/ai/bin:/Users/davidlaxer/anaconda3/condabin:/opt/local/bin:/opt/local/sbin:/usr/local/bin:/System/Cryptexes/App/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Library/Apple/usr/bin:/Users/davidlaxer/.cargo/bin:/Users/jetbrains/.local/bin
      XPC_FLAGS = 0x0
      HOME = /Users/davidlaxer
      JAVA_HOME = :-
      JAVA_LD_LIBRARY_PATH = :-
      CAML_LD_LIBRARY_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/stublibs:/Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/ocaml/stublibs:/Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/ocaml
      OCAML_TOPLEVEL_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/toplevel
      PKG_CONFIG_PATH = /Users/davidlaxer/.opam/_coq-platform_.2021.02.1/lib/pkgconfig:
      CONDA_BACKUP_FFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
      CONDA_BACKUP_FORTRANFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
      CONDA_BACKUP_DEBUG_FFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
      CONDA_BACKUP_DEBUG_FORTRANFLAGS = -march=nocona -mtune=core2 -ftree-vectorize -fPIC -fstack-protector -O2 -pipe
    [ Info: Julia version info
    [ Info: Julia executable: /Users/davidlaxer/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/bin/julia
    [ Info: Trying to import PyCall...
    ┌ Info: PyCall is already installed and compatible with Python executable.
    │ 
    │ PyCall:
    │     python: /Users/davidlaxer/anaconda3/envs/ai/bin/python
    │     libpython: /Users/davidlaxer/anaconda3/envs/ai/lib/libpython3.8.dylib
    │ Python:
    │     python: /Users/davidlaxer/anaconda3/envs/ai/bin/python
    └     libpython: 
       Resolving package versions...
    ---------------------------------------------------------------------------
    JuliaError                                Traceback (most recent call last)
    Input In [5], in <cell line: 4>()
          1 get_ipython().system('export JULIA_SSL_CA_ROOTS_PATH=""')
          2 import pysr
    ----> 4 pysr.install()
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/pysr/julia_helpers.py:87, in install(julia_project, quiet)
         83 io_arg = _get_io_arg(quiet)
         85 if is_shared:
         86     # Install SymbolicRegression.jl:
    ---> 87     _add_sr_to_julia_project(Main, io_arg)
         89 Main.eval("using Pkg")
         90 Main.eval(f"Pkg.instantiate({io_arg})")
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/pysr/julia_helpers.py:240, in _add_sr_to_julia_project(Main, io_arg)
        230 Main.sr_spec = Main.PackageSpec(
        231     name="SymbolicRegression",
        232     url="https://github.com/MilesCranmer/SymbolicRegression.jl",
        233     rev="v" + __symbolic_regression_jl_version__,
        234 )
        235 Main.clustermanagers_spec = Main.PackageSpec(
        236     name="ClusterManagers",
        237     url="https://github.com/JuliaParallel/ClusterManagers.jl",
        238     rev="14e7302f068794099344d5d93f71979aaf4fbeb3",
        239 )
    --> 240 Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:627, in Julia.eval(self, src)
        625 if src is None:
        626     return None
    --> 627 ans = self._call(src)
        628 if not ans:
        629     return None
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:555, in Julia._call(self, src)
        553 # logger.debug("_call(%s)", src)
        554 ans = self.api.jl_eval_string(src.encode('utf-8'))
    --> 555 self.check_exception(src)
        557 return ans
    
    File ~/anaconda3/envs/ai/lib/python3.8/site-packages/julia/core.py:609, in Julia.check_exception(self, src)
        607 else:
        608     exception = sprint(showerror, self._as_pyobj(res))
    --> 609 raise JuliaError(u'Exception \'{}\' occurred while calling julia code:\n{}'
        610                  .format(exception, src))
    
    JuliaError: Exception 'Unsatisfiable requirements detected for package DynamicExpressions [a40a106e]:
     DynamicExpressions [a40a106e] log:
     ├─DynamicExpressions [a40a106e] has no known versions!
     └─restricted to versions 0.4 by SymbolicRegression [8254be44] — no versions left
       └─SymbolicRegression [8254be44] log:
         ├─possible versions are: 0.14.4 or uninstalled
         └─SymbolicRegression [8254be44] is fixed to version 0.14.4' occurred while calling julia code:
    Pkg.add([sr_spec, clustermanagers_spec], io=stderr)
    
     % julia
                   _
       _       _ _(_)_     |  Documentation: https://docs.julialang.org
      (_)     | (_) (_)    |
       _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
      | | | | | | |/ _` |  |
      | | |_| | | | (_| |  |  Version 1.8.3 (2022-11-14)
     _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
    |__/                   |
    
    julia> using Pkg
    
    julia> Pkg.add("DynamicExpressions")
    ERROR: The following package names could not be resolved:
     * DynamicExpressions (not found in project, manifest or registry)
    Stacktrace:
      [1] pkgerror(msg::String)
        @ Pkg.Types ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/Types.jl:67
      [2] ensure_resolved(ctx::Pkg.Types.Context, manifest::Pkg.Types.Manifest, pkgs::Vector{Pkg.Types.PackageSpec}; registry::Bool)
        @ Pkg.Types ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/Types.jl:952
      [3] add(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; preserve::Pkg.Types.PreserveLevel, platform::Base.BinaryPlatforms.Platform, kwargs::Base.Pairs{Symbol, Base.TTY, Tuple{Symbol}, NamedTuple{(:io,), Tuple{Base.TTY}}})
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:264
      [4] add(pkgs::Vector{Pkg.Types.PackageSpec}; io::Base.TTY, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:156
      [5] add(pkgs::Vector{Pkg.Types.PackageSpec})
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:145
      [6] #add#27
        @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
      [7] add
        @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:144 [inlined]
      [8] #add#26
        @ ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:143 [inlined]
      [9] add(pkg::String)
        @ Pkg.API ~/anaconda3/envs/ai/share/julia/juliaup/julia-1.8.3+0.x64/share/julia/stdlib/v1.8/Pkg/src/API.jl:143
     [10] top-level scope
        @ REPL[2]:1
    
    julia> 
    
    

    The code works properly on Google CoLab.

    bug 
    opened by dbl001 26
  • Refactor of PySRRegressor


    Re Issue #143

    Compatibility with scikit-learn should be improved.

    Notable breaking changes for users: PySRRegressor.equations is now called PySRRegressor.equations_

    Tests have been updated to allow compatibility with the refactored code but still assess the same functionality. All tests should pass.

    Please let me know if there are any concerns or if you would like me to document/explain any of the changes in detail.

    opened by tttc3 24
  • [BUG] conda version breaking


    Edit: If you are seeing issues with the conda version, try updating PySR with conda update pysr. The new version fixes an issue related to automatic updating of Julia packages.


    The conda-forge jobs which test conda install -c conda-forge pysr are currently breaking. This is even with repeat attempts: https://github.com/MilesCranmer/PySR/actions/workflows/CI_conda_forge.yml. The error:

    ImportError: 
        Required dependencies are not installed or built.  Run the following code in the Python REPL:
    

    I find this strange, since the underlying feedstock has not changed in the meantime, and it seems like the julia feedstock hasn't been updated recently either.

    FYI @mkitti @ngam. I will try to look into this a bit later today.

    bug 
    opened by MilesCranmer 23
  • [Errno 2] No such file or directory


    I have installed pysr-0.6.12.post1 and have been trying to run example.py, but after working through some previously closed bug reports, a FileNotFoundError occurs. I'm using Windows 10 and Python 3.7; the Julia version is 1.6.2. The error message is the following:

    FileNotFoundError: [Errno 2] No such file or directory: 'hall_of_fame_2021-08-04_230410.180.csv.bkup'

    bug 
    opened by jzsmoreno 21
  • Performance speed-up options?


    Hello Miles! Thank you for open-sourcing this powerful tool! I am working on including PySR in my own research, and running into some performance bottlenecks.

    I found that regressing a simple equation (e.g. the quick-start example) takes roughly 2 minutes. Ideally, I am aiming to reduce that time to ~30 seconds. Could you give me some pointers on this? Meanwhile, I will try to break down the challenge into several pieces:

    1. Activating a new environment at each API call: I noticed that a new Julia (?) environment is created each time I call pysr() api (see terminal output below). Could we keep the environment up so we can skip this process for subsequent calls?
    Running on julia -O3 /tmp/tmpe5qmgemh/runfile.jl
      Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
        Updating registry at `~/.julia/registries/General`
      No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      No Changes to `~/anaconda3/envs/rw/lib/python3.7/site-packages/Manifest.toml`
    Activating environment on workers.
      Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      Activating  Activating  environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
    environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
      Activating  Activating environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml` 
    environment at `~/anaconda3/envs/rw/lib/python3.7/site-packages/Project.toml`
    Importing installed module on workers...Finished!
    Started!
    
    2. If the above wouldn't work, then allowing y to be vector-valued (as mentioned in #35) would be a second-best option! Even better would be a "batched" version of the pysr(X, y) API, pysr_batched(X, y), where X and y are Python lists and the results are returned as a list as well, so that only one Julia script is generated and os.system() is called once, keeping the Julia environment up.

    3. Multi-threading: I noticed that increasing procs from 4 to 8 resulted in a slightly longer running time. I am running on an 8-core, 16-thread CPU. Did I do something dumb?

    4. I went into pysr/sr.py and added the runtests=false flag in lines 438 and 440. That saved ~20 seconds.

    opened by yxie20 20
  • [Feature] LaTeX table generator


    This generates a booktabs-style LaTeX table for a subset of equations. Here is an example:

    import numpy as np
    from pysr import PySRRegressor
    
    X = 2 * np.random.randn(100, 5)
    y = 2.5382 * np.cos(X[:, 3]) + X[:, 0] ** 2 - 0.5
    
    model = PySRRegressor(
        niterations=80,
        binary_operators=["+", "*"],
        unary_operators=["cos"],
        model_selection="best",
        loss="loss(x, y) = (x - y)^2",  # Custom loss function (julia syntax)
        maxsize=11,
    )
    
    model.fit(X, y)
    
    print(model.latex_table(precision=3, include_score=True))
    

    The output of this is:

    \begin{table}[h]
    \begin{center}
    \begin{tabular}{@{}lccc@{}}
    \toprule
    Equation & Complexity & Loss & Score \\
    \midrule
    $3.9$ & 1 & 38.9 & 0 \\
    $x_{0}^{2}$ & 3 & 3.16 & 1.26 \\
    $x_{0}^{2} - 0.257$ & 5 & 3.09 & 0.0105 \\
    $x_{0}^{2} + \cos{\left(x_{3} \right)}$ & 6 & 1.26 & 0.898 \\
    $x_{0}^{2} + 2.44 \cos{\left(x_{3} \right)}$ & 8 & 0.245 & 0.818 \\
    $x_{0}^{2} + 2.54 \cos{\left(x_{3} \right)} - 0.5$ & 10 & 2.28e-13 & 13.9 \\
    \bottomrule
    \end{tabular}
    \end{center}
    \end{table}
    

    which renders as: [image: rendered LaTeX table]

    Leaving include_score set to False will leave out the Score column. Precision can be adjusted to have more or less precise constants.

    One can render only a subset of equations by using latex_table([1, 4]) which only includes the 1st and 4th equation in model.equations_.


    Edit: it now renders the e-13 as \cdot 10^{-13}
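    The Score column in the table above can be approximately reproduced from the Complexity and Loss columns. The sketch below assumes the score is the negative change in log-loss per unit of added complexity; that definition is not stated in this thread, but it matches the printed values. The function name scores is illustrative only. Because the inputs are the rounded losses from the table, small differences from the printed scores are expected (for example in the third row).

```python
import math

# Complexity and Loss columns copied from the LaTeX table above.
complexity = [1, 3, 5, 6, 8, 10]
loss = [38.9, 3.16, 3.09, 1.26, 0.245, 2.28e-13]


def scores(complexity, loss):
    """Assumed definition: -(change in log loss) / (change in complexity)."""
    result = [0.0]  # the first row has no predecessor to compare against
    for i in range(1, len(loss)):
        d_log_loss = math.log(loss[i]) - math.log(loss[i - 1])
        d_complexity = complexity[i] - complexity[i - 1]
        result.append(-d_log_loss / d_complexity)
    return result


table_scores = scores(complexity, loss)
for c, s in zip(complexity, table_scores):
    print(f"complexity {c:>2d}: score {s:.3g}")
```

    Under this reading, a high score marks a row where a small increase in complexity bought a large drop in loss, which is why the last row stands out.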

    opened by MilesCranmer 19
  • Set JULIA_PROJECT, use Pkg.add once


    • Sets JULIA_PROJECT before loading pyjulia so that PyCall.jl can be contained within the pysr environment
    • Also use Pkg.add in a single step to add both SymbolicRegression.jl and ClusterManagers.jl to the environment at the same time

    I likely advised against using the environment variable JULIA_PROJECT in the past. However, I think this may be necessary to avoid interference from other projects if installed within the same environment.
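    The ordering described above can be sketched as follows. This is an illustration, not the code in this PR: the helper name configure_julia_project is hypothetical, and the environment name "@pysr-0.11.11" is taken from a log elsewhere on this page rather than from this change. The point is only that the variable has to be set before Julia is started, since Julia reads it at startup.

```python
import os


def configure_julia_project(project="@pysr-0.11.11"):
    """Set JULIA_PROJECT so that a later Julia startup uses this environment."""
    os.environ["JULIA_PROJECT"] = project
    return os.environ["JULIA_PROJECT"]


active_project = configure_julia_project()
print(active_project)

# Only after this point would pyjulia be imported, for example:
# from julia import Main
```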

    opened by mkitti 15
  • Windows support


    Hi Miles,

    first of all, this is awesome. Thanks so much for making this.

    A student I'm working with is trying to run PySR under Windows. Is that in principle supported?

    PySR's dependencies don't seem to have any issues with Windows, but pysr.pysr throws a FileNotFoundError when accessing /tmp/.hyperparams_{rand_string}.hl. This seems to be because of the different file system structure under Windows. If this is the only issue, how would you feel about using something like tempfile to generate temporary files in a more OS-independent way?
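    A minimal sketch of the tempfile idea, using only the Python standard library. The function name write_hyperparams and the file name are hypothetical and do not reflect how PySR names its files; the point is that tempfile picks a writable directory on each platform instead of assuming /tmp exists.

```python
import os
import tempfile


def write_hyperparams(contents):
    """Write contents to a file in a fresh, OS-appropriate temporary directory."""
    # mkdtemp creates a new private directory under the platform's temp location
    # (for example /tmp on Linux, or the user's Temp folder on Windows).
    tmpdir = tempfile.mkdtemp(prefix="pysr_")
    path = os.path.join(tmpdir, "hyperparams.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write(contents)
    return path


path = write_hyperparams("# placeholder hyperparameters\n")
print(path)
```

    The directory created by mkdtemp is not removed automatically, so the caller would be responsible for cleaning it up (for example with shutil.rmtree) once the run is done.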

    I am happy to try this and open a PR once it works.

    Cheers, Johann

    implemented 
    opened by johannbrehmer 15
  • [Windows] Always returning the same equation?


    I don't know if this is a Windows issue or what (I work on a Linux partition, but I just wanted to play around with this - I haven't actually done serious work on Windows for 7 years or so, so I'm at a loss), but after fitting one equation, it's always returning that equation. Even with different data, in a different notebook.

    I've looked to see if I could find the julia file it creates - nope. And they're different files every time.

    Any ideas?

    opened by JQVeenstra 14
  • [BUG] Pickling error on use of ReLU


    I see this error when I try to use the ReLU operator:

    PicklingError: Can't pickle relu: attribute lookup relu on __main__ failed
    

    seems like it's implemented in a way that can't be pickled. Should be an easy fix.
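    As general background on this kind of error (not the actual PySR code or its fix, neither of which is shown here): pickle stores functions by reference, so it must be able to look the function up again by name. The sketch below reproduces a similar failure with a function defined inside another function, then shows one picklable alternative built from a builtin. The names make_relu and picklable_relu are illustrative.

```python
import pickle
from functools import partial


def make_relu():
    # A function defined inside another function cannot be found again
    # by attribute lookup on its module, so pickle refuses to store it.
    def relu(x):
        return max(0.0, x)

    return relu


local_relu = make_relu()
try:
    pickle.dumps(local_relu)
    local_pickled = True
except (pickle.PicklingError, AttributeError) as err:
    local_pickled = False
    print("cannot pickle:", err)

# partial(max, 0.0) computes max(0.0, x); both partial objects and the
# builtin max can be pickled, so this version survives a round trip.
picklable_relu = partial(max, 0.0)
restored = pickle.loads(pickle.dumps(picklable_relu))
print(restored(-2.0), restored(3.0))
```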

    bug 
    opened by MilesCranmer 1
  • [BUG] Windows SystemError: <PyCall.jlwrap on basic example


    I have done a fresh installation on windows (with pip) and I am running the basic example provided in the Introduction. I am getting a JULIA error. Thanks in advance for any help!

    Version:

    • Julia version [1.8.3]
    • Python version [3.10.6]
    • PySR version [0.11.11]

    Error message

    C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py:1257: UserWarning: Note: it looks like you are running in Jupyter. The progress bar will be turned off.
      warnings.warn(
    Traceback (most recent call last):
      File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
        exec(code, globals, locals)
      File "c:\users\gorth\untitled0.py", line 25, in <module>
        model.fit(X, y)
      File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py", line 1792, in fit
        self._run(X, y, mutated_params, weights=weights, seed=seed)
      File "C:\tools\Anaconda3\envs\env_ai\lib\site-packages\pysr\sr.py", line 1652, in _run
        self.raw_julia_state_ = SymbolicRegression.EquationSearch(
    SystemError: <PyCall.jlwrap (in a Julia function called from Python)
    JULIA: SystemError: opening file "hall_of_fame_2022-12-17_011150.694.csv": Invalid argument
    Stacktrace:
      [1] systemerror(p::String, errno::Int32; extrainfo::Nothing) @ Base .\error.jl:176
      [2] #systemerror#80 @ .\error.jl:175 [inlined]
      [3] systemerror @ .\error.jl:175 [inlined]
      [4] open(fname::String; lock::Bool, read::Nothing, write::Nothing, create::Nothing, truncate::Bool, append::Nothing) @ Base .\iostream.jl:293
      [5] open(fname::String, mode::String; lock::Bool) @ Base .\iostream.jl:356
      [6] open(fname::String, mode::String) @ Base .\iostream.jl:355
      [7] open(::SymbolicRegression.var"#48#77"{Options{typeof(loss), Int64, 0.86, 10}, Vector{PopMember{Float32}}, SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}, ::String, ::Vararg{String}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}) @ Base .\io.jl:382
      [8] open @ .\io.jl:381 [inlined]
      [9] EquationSearch(::SymbolicRegression.CoreModule.ProgramConstantsModule.SRThreaded, datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{typeof(loss), Int64, 0.86, 10}, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing) @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:751
      [10] EquationSearch(datasets::Vector{SymbolicRegression.CoreModule.DatasetModule.Dataset{Float32}}; niterations::Int64, options::Options{typeof(loss), Int64, 0.86, 10}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing) @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:383
      [11] EquationSearch(X::Matrix{Float32}, y::Matrix{Float32}; niterations::Int64, weights::Nothing, varMap::Vector{String}, options::Options{typeof(loss), Int64, 0.86, 10}, parallelism::String, numprocs::Nothing, procs::Nothing, addprocs_function::Nothing, runtests::Bool, saved_state::Nothing, multithreaded::Nothing) @ SymbolicRegression C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:320
      [12] #EquationSearch#21 @ C:\Users\gorth.julia\packages\SymbolicRegression\37l4B\src\SymbolicRegression.jl:345 [inlined]
      [13] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Any, NTuple{8, Symbol}, NamedTuple{(:weights, :niterations, :varMap, :options, :numprocs, :parallelism, :saved_state, :addprocs_function), Tuple{Nothing, Int64, Vector{String}, Options{typeof(loss), Int64, 0.86, 10}, Nothing, String, Nothing, Nothing}}}) @ Base .\essentials.jl:731
      [14] pyjlwrap_call(f::Function, args::Ptr{PyCall.PyObject_struct}, kw::Ptr{PyCall.PyObject_struct}) @ PyCall C:\Users\gorth.julia\packages\PyCall\ygXW2\src\callback.jl:32
      [15] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct}) @ PyCall C:\Users\gorth.julia\packages\PyCall\ygXW2\src\callback.jl:44>

    bug 
    opened by trifinos 13
  • Repeated CI failures on Windows


    Many of the Windows tests are now failing with various segmentation faults, which appear to be randomly triggered:

    • Nightly action: https://github.com/MilesCranmer/PySR/actions/workflows/CI_large_nightly.yml
    • PR action: https://github.com/MilesCranmer/PySR/pull/237

    They seem to occur more frequently on older versions of Julia, and rarely on Julia 1.8.3. Regardless, a segfault anywhere is cause for concern and should be tracked down.

    The errors include:

    1. Early segmentation fault (Julia 1.6.7) at first run, segfault during noise test (Julia 1.6.7 and others), as well as segfaults during warm start test.

    e.g., Windows:

     D:\a\_temp\221410f9-8bf7-4099-901d-eb9813d86c45.sh: line 1:  1098 Segmentation fault      python -m pysr.test main
    Started!
    
    also occurs on Ubuntu sometimes:
    signal (11): Segmentation fault
    in expression starting at none:0
    unknown function (ip: 0x7fd6a19bc215)
    unknown function (ip: 0x7fd6a19947ff)
    macro expansion at /home/runner/.julia/packages/PyCall/ygXW2/src/exception.jl:95 [inlined]
    convert at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:94
    pyjlwrap_getattr at /home/runner/.julia/packages/PyCall/ygXW2/src/pytype.jl:378
    unknown function (ip: 0x7fd68d30b1bd)
    unknown function (ip: 0x7fd6a19babda)
    unknown function (ip: 0x7fd6a198e9d4)
    pyisinstance at /home/runner/.julia/packages/PyCall/ygXW2/src/PyCall.jl:170 [inlined]
    pysequence_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:752
    pytype_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:773
    pytype_query at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:806 [inlined]
    convert at /home/runner/.julia/packages/PyCall/ygXW2/src/conversions.jl:831
    julia_kwarg at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:19 [inlined]
    #57 at ./none:0 [inlined]
    iterate at ./generator.jl:47 [inlined]
    collect_to! at ./array.jl:728
    unknown function (ip: 0x7fd68d341d9a)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    collect_to! at ./array.jl:736
    unknown function (ip: 0x7fd68d33e35a)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    collect_to! at ./array.jl:736
    collect_to_with_first! at ./array.jl:706
    unknown function (ip: 0x7fd68d33d775)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    collect at ./array.jl:687
    unknown function (ip: 0x7fd68d33afb4)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    _pyjlwrap_call at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:31
    unknown function (ip: 0x7fd68d3348d5)
    _jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2237 [inlined]
    jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2419
    pyjlwrap_call at /home/runner/.julia/packages/PyCall/ygXW2/src/callback.jl:44
    unknown function (ip: 0x7fd68d30aeee)
    unknown function (ip: 0x7fd6a19980c7)
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined]
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined]
    PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined]
    call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined]
    _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3537
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a199a1e0)
    unknown function (ip: 0x7fd6a19ed97b)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a19ecdf6)
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a199a1e0)
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a19ecdf6)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a199a28d)
    unknown function (ip: 0x7fd6a19ef9b1)
    unknown function (ip: 0x7fd6a19ebbb7)
    unknown function (ip: 0x7fd6a1997d4c)
    unknown function (ip: 0x7fd6a1998f2b)
    unknown function (ip: 0x7fd6a1a46421)
    unknown function (ip: 0x7fd6a199802f)
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined]
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined]
    PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined]
    call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined]
    _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3520
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a199a28d)
    unknown function (ip: 0x7fd6a19ef9b1)
    unknown function (ip: 0x7fd6a19ebbb7)
    unknown function (ip: 0x7fd6a1997d4c)
    unknown function (ip: 0x7fd6a1998f2b)
    unknown function (ip: 0x7fd6a1a46421)
    unknown function (ip: 0x7fd6a199802f)
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:116 [inlined]
    _PyObject_VectorcallTstate at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:103 [inlined]
    PyObject_Vectorcall at /home/runner/work/_temp/SourceCode/./Include/cpython/abstract.h:127 [inlined]
    call_function at /home/runner/work/_temp/SourceCode/Python/ceval.c:5077 [inlined]
    _PyEval_EvalFrameDefault at /home/runner/work/_temp/SourceCode/Python/ceval.c:3520
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a19ecdf6)
    unknown function (ip: 0x7fd6a1998972)
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyEval_EvalCodeWithName at /home/runner/work/_temp/SourceCode/Python/ceval.c:4361
    unknown function (ip: 0x7fd6a19eb876)
    PyEval_EvalCode at /home/runner/work/_temp/SourceCode/Python/ceval.c:828
    unknown function (ip: 0x7fd6a1a6399f)
    cfunction_vectorcall_FASTCALL at /home/runner/work/_temp/SourceCode/Objects/methodobject.c:430
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a19ecb12)
    unknown function (ip: 0x7fd6a19ebbb7)
    _PyFunction_Vectorcall at /home/runner/work/_temp/SourceCode/Objects/call.c:396
    unknown function (ip: 0x7fd6a1a7fdd6)
    unknown function (ip: 0x7fd6a1a7faae)
    Py_BytesMain at /home/runner/work/_temp/SourceCode/Modules/main.c:731
    unknown function (ip: 0x7fd6a1642d8f)
    __libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
    _start at python (unknown line)
    Allocations: 185387713 (Pool: 185351460; Big: 36253); GC: 470
    /home/runner/work/_temp/bdd49862-48fd-4e82-bed8-685329606248.sh: line 1:  2324 Segmentation fault      (core dumped) python -m pysr.test main
    
    1. Git errors: (Julia 1.8.2)
    PyCall is installed and built successfully.
         Cloning git-repo `https://github.com/MilesCranmer/SymbolicRegression.jl`
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/Users/runner/work/PySR/PySR/pysr/julia_helpers.py", line 87, in install
        _add_sr_to_julia_project(Main, io_arg)
      File "/Users/runner/work/PySR/PySR/pysr/julia_helpers.py", line 240, in _add_sr_to_julia_project
        Main.eval(f"Pkg.add([sr_spec, clustermanagers_spec], {io_arg})")
      File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 627, in eval
        ans = self._call(src)
      File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 555, in _call
        self.check_exception(src)
      File "/Users/runner/hostedtoolcache/Python/3.9.14/x64/lib/python3.9/site-packages/julia/core.py", line 609, in check_exception
        raise JuliaError(u'Exception \'{}\' occurred while calling julia code:\n{}'
    julia.core.JuliaError: Exception 'failed to clone from https://github.com/MilesCranmer/SymbolicRegression.jl, error: GitError(Code:ERROR, Class:Net, SecureTransport error: connection closed via error)' occurred while calling julia code:
    Pkg.add([sr_spec, clustermanagers_spec], io=stderr)
    
    1. Access errors during scikit-learn tests (these ones don't even fail the CI, which is a bit worrisome)

    e.g.,

    Failed check_fit2d_predict1d with:
        Traceback (most recent call last):
          File "D:\a\PySR\PySR\pysr\test\test.py", line 671, in test_scikit_learn_compatibility
            check(model)
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\sklearn\utils\_testing.py", line 188, in wrapper
            return fn(*args, **kwargs)
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\sklearn\utils\estimator_checks.py", line 1300, in check_fit2d_predict1d
            estimator.fit(X, y)
          File "D:\a\PySR\PySR\pysr\sr.py", line 1792, in fit
            self._run(X, y, mutated_params, weights=weights, seed=seed)
          File "D:\a\PySR\PySR\pysr\sr.py", line 1493, in _run
            Main = init_julia(self.julia_project, julia_kwargs=julia_kwargs)
          File "D:\a\PySR\PySR\pysr\julia_helpers.py", line 180, in init_julia
            Julia(**julia_kwargs)
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\julia\core.py", line 519, in __init__
            self._call("const PyCall = Base.require({0})".format(PYCALL_PKGID))
          File "C:\hostedtoolcache\windows\Python\3.9.13\x64\lib\site-packages\julia\core.py", line 554, in _call
            ans = self.api.jl_eval_string(src.encode('utf-8'))
        OSError: exception: access violation reading 0x000001BC1C501000
    
    1. Torch errors.

    Curiously, this error is also raised on some Windows tests (https://github.com/MilesCranmer/PySR/actions/runs/3664894286/jobs/6195713513), although it should not occur there:

    Run python -m pysr.test torch
    D:\a\PySR\PySR\pysr\julia_helpers.py:139: UserWarning: `torch` was loaded before the Julia instance started. This may cause a segfault when running `PySRRegressor.fit`. To avoid this, please run `pysr.julia_helpers.init_julia()` *before* importing `torch`. For updates, see https://github.com/pytorch/pytorch/issues/78829
      warnings.warn(
    D:\a\_temp\8727c9f4-d0f6-4345-84e6-e774762771ab.sh: line 1:   258 Segmentation fault      python -m pysr.test torch
    Started!
    
    opened by MilesCranmer 11
  • Raise warning on statically-linked Python binaries

    Time-to-first-search is very slow on statically-linked versions of Python (such as packaged with conda), as precompiled code cannot be used, so things are compiled from scratch. I think this adds some friction to the user experience, so this PR introduces a warning that recommends the user try pyenv if startup time is important.

    When https://github.com/JuliaPy/pyjulia/issues/496 is solved, this warning is no longer needed.

    See https://github.com/conda-forge/python-feedstock/issues/222 for the discussion on the conda page.
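
A rough idea of how such a check could look, using only the standard library. This is a heuristic sketch and not the code from this PR: the function names are hypothetical, and it assumes that a missing `Py_ENABLE_SHARED` config value indicates a statically-linked interpreter.

```python
import os
import sysconfig
import warnings


def python_looks_statically_linked() -> bool:
    """Heuristic: True when this CPython build reports no shared libpython."""
    if os.name == "nt":
        # On Windows the variable is typically unset, so the heuristic is skipped there.
        return False
    # Py_ENABLE_SHARED is 1 when CPython was configured with --enable-shared.
    return not sysconfig.get_config_var("Py_ENABLE_SHARED")


def warn_if_static() -> bool:
    """Emit a UserWarning for static builds; return whether a warning was issued."""
    if python_looks_statically_linked():
        warnings.warn(
            "Python appears to be statically linked; Julia startup may be slow "
            "because precompiled code cannot be reused.",
            UserWarning,
        )
        return True
    return False
```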

    opened by MilesCranmer 3
  • [Feature] Install with CLI

    Right now you install SymbolicRegression.jl using python -c 'import pysr; pysr.install()'. However, this is a bit of spooky action at a distance, because you can't quite be sure which pysr is actually being called. Thus, it would be great if there was a CLI, similar to how testing is done with python -m pysr.test main. For example:

    python -m pysr.install
    

    If anybody wants to add this, I'd be more than happy to accept a PR!
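
One possible shape for such an entry point, sketched with `argparse`. Everything here is an assumption for illustration: the `--project` and `--quiet` flags are hypothetical, and `install` is a local stand-in rather than the real `pysr.install`.

```python
import argparse


def install(julia_project=None, quiet=False):
    """Hypothetical stand-in for the real installer function."""
    print(f"Installing backend (project={julia_project!r}, quiet={quiet})")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="python -m pysr.install")
    # Both flags are illustrative; they are not taken from the issue.
    parser.add_argument("--project", default=None, help="Julia project to install into")
    parser.add_argument("--quiet", action="store_true", help="suppress installer output")
    return parser


def main(argv=None, installer=install) -> int:
    """Parse command-line arguments and forward them to the installer."""
    args = build_parser().parse_args(argv)
    installer(julia_project=args.project, quiet=args.quiet)
    return 0
```

Passing the installer as an argument keeps the command-line parsing testable without a Julia installation.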

    enhancement 
    opened by MilesCranmer 0
Releases(v0.11.11)
  • v0.11.11(Nov 22, 2022)

    What's Changed

    • Make Julia startup options configurable; set optimize=3 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/228

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.10...v0.11.11

  • v0.11.10(Nov 21, 2022)

    What's Changed

    • Clean up dockerfile by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/223
    • Update backend version with improved resource monitoring by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/227

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.9...v0.11.10

  • v0.11.9(Nov 5, 2022)

    What's Changed

    • Refactor testing suite to have CLI by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/221

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.8...v0.11.9

  • v0.11.8(Nov 4, 2022)

    What's Changed

    • Fix PyCall not giving traceback by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/218
    • Fixed safe operators; make progress bar print to stderr by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/219

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.7...v0.11.8

  • v0.11.7(Nov 4, 2022)

    What's Changed

    • Expand nightly conda-forge tests to other Python versions by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/212
    • Clean up parameter groupings in docs by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/214
    • Add optimization-as-mutation, and adaptive parsimony by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/217

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.6...v0.11.7

  • v0.11.6(Oct 31, 2022)

    What's Changed

    • Speed up evaluation with turbo parameter by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/208

    https://user-images.githubusercontent.com/7593028/199054602-7ad19e87-19ff-4440-aa09-da6d7b6175d5.mp4

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.5...v0.11.6

  • v0.11.5(Oct 24, 2022)

    What's Changed

    • 30-50% Faster evaluation, and perform explicit version assertion for backend by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/205

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.4...v0.11.5

  • v0.11.4(Oct 10, 2022)

    What's Changed

    • Fix conda forge installs by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/202

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.3...v0.11.4

  • v0.11.3(Oct 6, 2022)

    What's Changed

    • Faster evaluation for constant sub-expressions (SymbolicRegression.jl#129)
    • Variable names are now checked for spaces and other non-alphanumeric characters, aside from underscores. Previously, this would only raise an error after a search, when trying to pickle the saved data.
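
The variable-name rule above can be sketched as a small validator. This is an illustration of the stated rule (letters, digits, and underscores only), not the code PySR uses; `check_variable_names` is a hypothetical name.

```python
import re

# Letters, digits, and underscores only, as described in the release note.
_VALID_NAME = re.compile(r"[A-Za-z0-9_]+")


def check_variable_names(names):
    """Return the names as a list, or raise ValueError on the first invalid one."""
    for name in names:
        if not _VALID_NAME.fullmatch(name):
            raise ValueError(
                f"Invalid variable name {name!r}: only alphanumeric "
                "characters and underscores are allowed."
            )
    return list(names)
```

Failing before the search starts means a bad name costs nothing, instead of surfacing only when results are saved.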

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.2...v0.11.3

  • v0.11.2(Sep 28, 2022)

  • v0.11.1-1(Sep 26, 2022)

    What's Changed

    • Added Customization page in the docs for tweaking the backend's loss function and constraints.
    • Adding two entries to papers.yml by @JayWadekar in https://github.com/MilesCranmer/PySR/pull/192
    • Explicitly deprecate Julia <= 1.5 by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/194
    • Allow custom shared projects for julia_project by @MilesCranmer @mkitti in https://github.com/MilesCranmer/PySR/pull/197
      • e.g., this would allow you to run with @my-project and it will set up a shared Julia project under my-project (in the environments dir)

    New Contributors

    • @JayWadekar made their first contribution in https://github.com/MilesCranmer/PySR/pull/192

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.11.0...v0.11.1-1

  • v0.11.0(Sep 11, 2022)

    What's Changed

    • Update backend https://github.com/MilesCranmer/PySR/pull/191
      • Includes high-precision constants when precision=64
      • Enables datasets with zero variance (to allow fitting a constant)
      • Changes, e.g., abs(x)^y to x^y, with expressions avoided altogether for invalid input. This is because the former would sometimes give weird functional forms by exploiting the cusp at x=0. Thanks to @johanbluecreek.
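
To make the power-operator change concrete, here is a minimal sketch of a power function that refuses invalid real inputs instead of wrapping the base in `abs`. Returning NaN to mark an invalid expression is an assumption made for illustration; the backend's actual mechanism is not described here, and `safe_pow` in this form is hypothetical.

```python
import math


def safe_pow(x: float, y: float) -> float:
    """Compute x**y over the reals, returning NaN where the result is undefined."""
    if x < 0 and not float(y).is_integer():
        return math.nan  # negative base with a fractional exponent
    if x == 0 and y < 0:
        return math.nan  # would divide by zero
    return x ** y
```

With `abs(x)**y`, a search can exploit the cusp at `x = 0`; flagging the input as invalid lets such candidates be discarded instead.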

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.4...v0.11.0

  • v0.10.4-1(Sep 8, 2022)

    What's Changed

    • Fix install for Julia <=1.6 by @MilesCranmer @mkitti in https://github.com/MilesCranmer/PySR/pull/188
      • PyJulia will now launch directly into the shared pysr-{version} environment, rather than activating it later.

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.3...v0.10.4

  • v0.10.3(Sep 6, 2022)

    What's Changed

    • Displays a warning message when PyTorch is imported before PyJulia starts. See https://github.com/pytorch/pytorch/issues/78829. The only current solution is to start Julia beforehand.
    • New docs, built with Material for MkDocs.
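
The import-order check behind the PyTorch warning can be sketched as follows. This is a simplified illustration, not PySR's implementation: `warn_if_torch_loaded` and its `julia_started` flag are hypothetical, and the only mechanism assumed is that an imported module appears in `sys.modules`.

```python
import sys
import warnings


def warn_if_torch_loaded(julia_started: bool) -> bool:
    """Warn if torch was imported before Julia started; return True if warned."""
    if "torch" in sys.modules and not julia_started:
        warnings.warn(
            "`torch` was loaded before the Julia instance started. "
            "This may cause a segfault; start Julia before importing torch.",
            UserWarning,
        )
        return True
    return False
```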
  • v0.10.2(Sep 6, 2022)

    What's Changed

    • Set JULIA_PROJECT, use Pkg.add once by @mkitti in https://github.com/MilesCranmer/PySR/pull/186

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.10.1...v0.10.2

  • v0.10.1(Sep 6, 2022)

  • v0.10.0(Aug 14, 2022)

    What's Changed

    • Easy loading from auto-generated checkpoint files by @MilesCranmer w/ review @tttc3 @Pablo-Lemos in https://github.com/MilesCranmer/PySR/pull/167
      • Use .from_file to load from the auto-generated .pkl file.
    • LaTeX table generator by @MilesCranmer w/ review @tttc3 @kazewong in https://github.com/MilesCranmer/PySR/pull/156
      • Generate a LaTeX table of discovered equations with .latex_table()
    • Improved default model selection strategy by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/177
      • Old strategy is available as model_selection="score"
    • Add opencontainers image-spec to Dockerfile by @SauravMaheshkar w/ review @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/166
    • Switch to comma-based csv format by @MilesCranmer in https://github.com/MilesCranmer/PySR/pull/176

    Bug fixes

    • Fixed conversions to torch and JAX when a rational number appears in the sympy expression (https://github.com/MilesCranmer/PySR/commit/17c9b1a1762efbd8e021d275491f75cc6dcea8f1, https://github.com/MilesCranmer/PySR/commit/f119733698e4517e34cc902c78dcb95d450c0c80)
    • Fixed pickle saving when trained with multi-output (https://github.com/MilesCranmer/PySR/commit/3da0df512ee295f446ceb0ae6e2c39fb0e380618)
    • Fixed pickle saving when using custom operators with defined sympy -> jax/torch/numpy mappings
    • Backend fix avoids use of Julia's cp which is buggy for some file systems (e.g., EOS)

    New Contributors

    • @SauravMaheshkar made their first contribution in https://github.com/MilesCranmer/PySR/pull/166

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.9.0...v0.10.0

  • v0.9.0(Jun 4, 2022)

    What's Changed

    • Refactor of PySRRegressor by @tttc3 in https://github.com/MilesCranmer/PySR/pull/146
      • PySRRegressor is now completely compatible with scikit-learn.
      • PySRRegressor can be stored in a pickle file, even after fitting, and then be reloaded and used with .predict()
      • PySRRegressor.equations -> PySRRegressor.equations_

    New Contributors

    • @tttc3 made their first contribution in https://github.com/MilesCranmer/PySR/pull/146

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.8.7...v0.9.0

  • v0.8.5(May 20, 2022)

    What's Changed

    • Custom complexities for operators, constants, and variables (https://github.com/MilesCranmer/PySR/pull/138)
    • Early stopping conditions (https://github.com/MilesCranmer/PySR/pull/134)
      • Based on a certain loss value being achieved
      • Max number of evaluations (for theoretical studies of genetic algorithms, rather than anything practical).
    • Work with a specified expression rather than the one given by model_selection, by passing index to the function you wish to use (e.g., model.predict(X, index=5) would use the 5th equation).

    Full Changelog since v0.8.1: https://github.com/MilesCranmer/PySR/compare/v0.8.1...v0.8.5

  • v0.8.1(May 8, 2022)

    What's Changed

    • Enable distributed processing with ClusterManagers.jl from https://github.com/MilesCranmer/PySR/pull/133

    Full Changelog: https://github.com/MilesCranmer/PySR/compare/v0.8.0...v0.8.1

  • v0.8.0(May 8, 2022)

    This new release updates the entire set of default PySR parameters according to the ones presented in https://github.com/MilesCranmer/PySR/discussions/115. These parameters have been tuned over nearly 71,000 trials. See the discussion for further info.

    Additional changes:

    • Nested constraints implemented. For example, you can now prevent sin and cos from being repeatedly nested, by using the argument: nested_constraints={"sin": {"sin": 0, "cos": 0}, "cos": {"sin": 0, "cos": 0}}. This argument states that within a sin operator, you can only have a max depth of 0 for other sin or cos. The same is done for cos. The argument nested_constraints={"^": {"+": 2, "*": 1, "^": 0}} states that within a pow operator, you can only have 2 things added, or 1 use of multiplication (i.e., no double products), and zero other pow operators. This helps a lot with finding interpretable expressions!
    • New parsimony algorithm (backend change). This seems to help searches quite a bit, especially when one is searching for more complex expressions. This is turned on by use_frequency_in_tournament which is now the default.
    • Many backend improvements: speed, bug fixes, etc.
    • Improved stability of multi-processing (backend change). Thanks to @CharFox1.
    • Auto-differentiation implemented (backend change). This isn't used by default in any instances right now, but could be used by optimization later. Thanks to @kazewong.
    • Improved testing coverage of weird edge cases.
    • All parameters to PySRRegressor have been cleaned up to be in snake_case rather than CamelCase. The backend is also now almost entirely snake_case for internal functions, along with other readability improvements. Thanks to @bstollnitz and @patrick-kidger for the suggestions.
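
To illustrate what the nested_constraints argument described above expresses, here is a small pure-Python checker. It is one reading of the rule (a maximum nesting depth of an inner operator inside an outer one), not the backend's implementation; the tuple-based expression format and both function names are hypothetical.

```python
def max_nesting(expr, op):
    """Largest number of `op` nodes on any root-to-leaf path of `expr`."""
    if not isinstance(expr, tuple):
        return 0  # leaf: a variable name or constant
    head, *args = expr
    deepest_child = max((max_nesting(a, op) for a in args), default=0)
    return deepest_child + (1 if head == op else 0)


def satisfies(expr, nested_constraints):
    """Check every operator node in `expr` against its nesting limits."""
    if not isinstance(expr, tuple):
        return True
    head, *args = expr
    for inner, limit in nested_constraints.get(head, {}).items():
        if any(max_nesting(a, inner) > limit for a in args):
            return False
    return all(satisfies(a, nested_constraints) for a in args)
```

For example, with `{"sin": {"sin": 0, "cos": 0}}` the tree `("sin", ("cos", "x"))` is rejected, while `("+", ("sin", "x"), ("cos", "y"))` is accepted because neither operator sits inside the other.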
  • v0.6.0(Jun 1, 2021)

    PySR Version 0.6.0

    Large changes:

    • Exports to JAX, PyTorch, NumPy. All exports have a similar interface. JAX and PyTorch allow the equation parameters to be trained (e.g., as part of some differentiable model). Read https://pysr.readthedocs.io/en/latest/docs/options/#callable-exports-numpy-pytorch-jax for details. Thanks Patrick Kidger for the PyTorch export.
    • Multi-output y input is allowed, and the backend will efficiently batch over each output. A list of dataframes is returned by pysr for these cases. All best_* functions return a list as well.
    • BFGS optimizer introduced, plus a more stable parameter search due to backtracking line search.

    Smaller changes since 0.5.16:

    • Expanded tests, coverage calculation for PySR
    • Improved (pre-processing) feature selection with random forest
    • New default parameters for search:
      • annealing=False (no annealing works better with the new code. This is equivalent to alpha=infinity)
      • useFrequency=True (deals with complexity in a smarter way)
      • npopulations = 20 ~~procs*4~~
      • progress=True (show a progress bar)
      • optimizer_algorithm="BFGS"
      • optimizer_iterations=10
      • optimize_probability=1
      • binary_operators default = ["+", "-", "/", "*"]
      • unary_operators default = []
    • Warnings:
      • Using maxsize > 40 will trigger a warning that the search will be slow and use a lot of memory. The warning suggests turning off useFrequency, and perhaps also using warmupMaxsizeBy.
    • Deprecated nrestarts -> optimizer_nrestarts
    • Printing fixed in Jupyter
  • v0.4.0(Feb 1, 2021)

    With versions v0.4.0/v0.4.0, SymbolicRegression.jl and PySR have now been completely disentangled: PySR is 100% Python code (with some Julia meta-programming), and SymbolicRegression.jl is 100% Julia code.

    PySR now works by activating a Julia env that has SymbolicRegression.jl as a dependency, and making calls to it! By default it will set up a Julia project inside the pip install location, and install requirements at the user's confirmation, though you can pass an arbitrary project directory as well (e.g., if you want to use PySR but also tweak the backend). The nice thing about this is that Python users only need to install a Julia binary somewhere, and they should be good to go. Julia users never need to touch the Python side.

    The SymbolicRegression.jl backend also sets up workers automatically & internally now, so one never needs to call @everywhere when setting things up. The same is true even with locally-defined functions - these get passed to workers!

    With PySR importing the latest Julia code, this also means it gets new simplification routines powered by SymbolicUtils.jl, which seem to help improve the equations discovered.

  • v0.3.8(Sep 27, 2020)

    Populations no longer block each other, which gives a large speedup, especially for large numbers of populations. This was fixed by using RemoteChannel() in Julia.

    Some populations happen to take longer than others, perhaps because they have very complex equations, and could therefore block others that had finished early. This change lets the processor work on whichever population finishes next.
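
The same idea can be sketched in Python. Note that this swaps in `concurrent.futures` for Julia's RemoteChannel, so it shows the scheduling pattern only and is not the backend's code; `evolve` and `run_populations` are hypothetical names, and the sleep stands in for real work.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def evolve(population_id, seconds):
    """Stand-in for evolving one population; sleeps to simulate work."""
    time.sleep(seconds)
    return population_id


def run_populations(durations):
    """Handle populations in completion order rather than submission order."""
    finished = []
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        futures = [pool.submit(evolve, pid, secs) for pid, secs in durations.items()]
        # as_completed yields each future as soon as it is done, so a slow
        # population never delays the handling of a fast one.
        for future in as_completed(futures):
            finished.append(future.result())
    return finished
```

Iterating over the futures in submission order would instead wait on the first population even when later ones are already done, which is the blocking behavior described above.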

  • v0.3.5(Sep 27, 2020)

    Uses the equation from Cranmer et al. (2020) https://arxiv.org/abs/2006.11287 to score equations, and prints this score alongside the MSE. This makes symbolic regression more robust to noise.
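
A sketch of such a score, assuming it is the drop in log-loss per unit of added complexity between consecutive equations. That form is a reading of the cited paper; the exact expression used in this release is not stated here. The `scores` function and the `frontier` format are hypothetical.

```python
import math


def scores(frontier):
    """Score each equation on a frontier of (complexity, mse) pairs.

    `frontier` must be sorted by strictly increasing complexity. The
    simplest equation has nothing to be compared against and gets 0.0.
    """
    result = [0.0]
    for (c_prev, mse_prev), (c, mse) in zip(frontier, frontier[1:]):
        # Negative slope of log(MSE) with respect to complexity.
        result.append(-(math.log(mse) - math.log(mse_prev)) / (c - c_prev))
    return result
```

Under this definition an equation scores highly only when its extra complexity buys a large relative drop in error, so small gains from fitting noise are penalized.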

Owner
Miles Cranmer
Astro PhD candidate @princeton trying to accelerate astrophysics with AI. I build interpretable ML algorithms.