Predictive AI layer for existing databases.

Overview

MindsDB

MindsDB workflow Python supported PyPi Version PyPi Downloads MindsDB Community MindsDB Website

MindsDB is an open-source AI layer for existing databases that allows you to effortlessly develop, train and deploy state-of-the-art machine learning models using SQL queries. Tweet

Predictive AI layer for existing databases
MindsDB

Try it out

Contributing

To contribute to mindsdb, please check out our Contribution guide.

Current contributors

Made with contributors-img.

Report Issues

Please help us by reporting any issues you may have while using MindsDB.

License

Comments
  • [Bug]: Exception

    [Bug]: Exception "Lightgbm mixer not supported for type: tags" when following process plant tutorial

    Is there an existing issue for this?

    • [x] I have searched the existing issues

    Current Behaviour

    I'm following the Manufacturing process quality tutorial and during training I get the error Exception: Lightgbm mixer not supported for type: tags, raised at: /opt/conda/lib/python3.7/site-packages/mindsdb/interfaces/model/learn_process.py#177.

    Expected Behaviour

    I'd expect AutoML to train the model successfully

    Steps To Reproduce

    Follow the tutorial steps until training.
    

    Anything else?

    The tutorials seem to be of mixed quality, some resulting in errors, some in low model performance or "null" predictions (for the bus ride tutorial).

    bug documentation 
    opened by philfuchs 22
  • install issues on windows 10

    install issues on windows 10

    Describe the bug When installing mindsdb, the following error message is output with the following command.

    command: pip install --requirement reqs.txt

    error message: ERROR: Could not find a version that satisfies the requirement torch>=1.0.1.post2 (from lightwood==0.6.4->-r reqs.txt (line 25)) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2) ERROR: No matching distribution found for torch>=1.0.1.post2 (from lightwood==0.6.4->-r reqs.txt (line 25))

    Desktop (please complete the following information):

    • OS: windows 10

    Additional context I think pytorch for Windows is currently not available through PYPI, so using the commands in https://pytorch.org is a better way.

    bug 
    opened by YottaGin 22
  • Integration merlion issue2377

    Integration merlion issue2377

    1. Modify sql_query.py to support to define customized return columns and dtype of ml_handlers;
    2. [issue2377] Merlion integrated, forecaster: default, sarima, prophet, mses, detector: default, isolation forest, windstats, prophet;
    3. Corresponding test cases added, /mindsdb/tests/unit/test_merlion_handler.
    opened by rfsiten 21
  • [Bug]:  metadata-generation-failed

    [Bug]: metadata-generation-failed

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    I keep on trying to pip install mindsdb. Screenshot 2022-05-10 11 13 31

    This is the message that it comes up with.

    Things tried:

    • Installing wheel
    • Upgrading pip
    • Installing sktime
    • googling/StackOverflow

    Expected Behavior

    I anticipated that I would be able to download the package.

    Steps To Reproduce

    pip install mindsdb
    

    Anything else?

    No response

    bug 
    opened by TyanBr 20
  • [Bug]: Not able to preview data due to internal server error HTTP 500

    [Bug]: Not able to preview data due to internal server error HTTP 500

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    I was trying the mindsdb get started tutorial, that time i encountered HTTP 500 error, is it due to URL authorization scope not found ? or anything else. I tried googling why this happens but couldn't get a proper explanation. I can't do further steps it keeps on giving me a HTTP 500 error. Is it a common issue, coz i keep getting it on the website while running the query for the tutorial and can't preview my tables. Says that the query is not proper, i ran the exact code from tutorial. Is it is some issue from my side...anyway do help me overcome it. I really like mindsdb concept and an AI/ML enthusiast :)

    Here is the ss of what im getting while running the queries:

    Screenshot 2022-07-14 012410

    It says query with error but i am running the code from the tutorial the syntax is intact. Is there any changes to be made in the syntax?

    Expected Behavior

    I should be able to preview the home rentals data that i will train in the further steps.

    Steps To Reproduce

    Go to the mindsdb website and then start the home rentals demo tutorial. Run the demo code given in the step 1. Try to preview the data
    

    Anything else?

    nothing

    bug 
    opened by prathikshetty2002 19
  • Could not load module ModelInterface

    Could not load module ModelInterface

    hello can someone help to solve this :

    • ERROR:mindsdb-logger-f7442ec0-574d-11ea-a55e-106530eaf271:c:\users\drpbengrir\appdata\local\programs\python\python37\lib\site-packages\mindsdb\libs\controllers\transaction.py:188 - Could not load module ModelInterface
    bug 
    opened by simofilahi 18
  • FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'

    FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'

    Describe the bug A FileNotFoundError occurs when running the predict.py script described below.

    The full traceback is the following:

    Traceback (most recent call last):
      File "predict.py", line 12, in <module>
        result = Predictor(name='suicide_rates').predict(when={'country':'Greece','year':1981,'sex':'male','age':'35-54','population':300000})
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/predictor.py", line 472, in predict
        transaction = Transaction(session=self, light_transaction_metadata=light_transaction_metadata, heavy_transaction_metadata=heavy_transaction_metadata, breakpoint=breakpoint)
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 53, in __init__
        self.run()
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 259, in run
        self._execute_predict()
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 157, in _execute_predict
        with open(CONFIG.MINDSDB_STORAGE_PATH + '/' + self.lmd['name'] + '_light_model_metadata.pickle', 'rb') as fp:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'
    

    To Reproduce Steps to reproduce the behavior:

    1. Create a train.py script using the dataset: https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016#master.csv. The train.py script is the one below:
    from mindsdb import Predictor
    
    Predictor(name='suicide_rates').learn(
        to_predict='suicides_no', # the column we want to learn to predict given all the data in the file
        from_data="master.csv" # the path to the file where we can learn from, (note: can be url)
    )
    
    1. Run the train.py script.
    2. Create and run the predict.py script:
    from mindsdb import Predictor
    
    # use the model to make predictions
    result = Predictor(name='suicide_rates').predict(when={'country':'Greece','year':1981,'sex':'male','age':'35-54','population':300000})
    
    # you can now print the results
    print(result)
    
    1. See error

    Expected behavior What was expected was to see the results.

    Desktop (please complete the following information):

    • OS: Ubuntu 18.04.2 LTS
    • mindsdb 1.0.5
    • python 3.6.7
    bug 
    opened by mlliarm 18
  • MySQL / Singlestore DB SSL support

    MySQL / Singlestore DB SSL support

    Problem I cannot connect my Singlestore DB (MySQL driver) because Mindsdb doesn't support SSL options.

    Describe the solution you'd like Full support for MySQL SSL (key, cert, ca).

    Describe alternatives you've considered No alternative is possible at the moment, security first.

    enhancement question 
    opened by pierre-b 17
  • Caching historical data for streams

    Caching historical data for streams

    I'll be using an example here and generalize whenever needed, let's say we have the following dataset we train a timeseries predictor on:

    time,gb,target,aux
    1     ,A, 7,        foo
    2     ,A, 10,      foo
    3     ,A,  12,     bar
    4     ,A,  14,     bar
    2     ,B,  5,       foo
    4     ,B,  9,       foo
    

    In this case target is what we are predicting, gb is the column we are grouping on and we are ordering by time. aux is an unrelated column that's not timeseries in nature and just used "normally".

    We train a predictor with a window of n

    Then let's say we have an input stream that looks something like this:

    time, gb, target, aux
    6,      A ,   33,     foo
    7,      A,    54,     foo
    

    Caching

    First, we will need to store, for each value of the column gb n recrods.

    So, for example, if n==1 we would save the last row in the data above, if n==2 we would save both, whem new rows come in, we un-cache the older rows`.

    Infering

    Second, when a new datapoint comes into the input stream we'll need to "infer" that the prediction we have to make is actually for the "next" datapoint. Which is to say that when: 7, A, 54, foo comes in we need to infer that we need to actually make predictions for:

    8, A, <this is what we are predicting>, foo

    The challenge here is how do we infer that the next timestamp is 8, one simple way to do this is to just subtract from the previous record, but that's an issue for the first observation (since we don't have a previous record to substract from, unless we cache part of the training data) or we could add a feature to native, to either:

    a) Provide a delta argument for each group by (representing by how much we increment the order column[s]) b) Have an argument when doing timeseries prediction that tells it to predict for the "next" row and then do the inferences under the cover.

    @paxcema let me know which of these features would be easy to implement in native, since you're now the resident timeseries expert.

    enhancement 
    opened by George3d6 16
  • Data extraction with mindsdb v.2

    Data extraction with mindsdb v.2

    This is a followup to #334, which was about the same use case and dataset, but different version of MindsDB with different errors.

    Your Environment

    Google Colab.

    • Python version: 3.6
    • Pip version: 19.3.1
    • Mindsdb version you tried to install: 2.13.8

    Describe the bug Running .learn() fails.

    [nltk_data]   Package stopwords is already up-to-date!
    
    /usr/local/lib/python3.6/dist-packages/lightwood/mixers/helpers/ranger.py:86: UserWarning: This overload of addcmul_ is deprecated:
    	addcmul_(Number value, Tensor tensor1, Tensor tensor2)
    Consider using one of the following signatures instead:
    	addcmul_(Tensor tensor1, Tensor tensor2, *, Number value) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
      exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
    
    Downloading: 100%
    232k/232k [00:00<00:00, 1.31MB/s]
    
    
    Downloading: 100%
    442/442 [00:06<00:00, 68.0B/s]
    
    
    Downloading: 100%
    268M/268M [00:06<00:00, 43.1MB/s]
    
    
    Token indices sequence length is longer than the specified maximum sequence length for this model (606 > 512). Running this sequence through the model will result in indexing errors
    ERROR:mindsdb-logger-ac470732-3303-11eb-bbe9-0242ac1c0002---eb4b7352-566f-4a1b-aef2-c286163e1a10:/usr/local/lib/python3.6/dist-packages/mindsdb_native/libs/controllers/transaction.py:173 - Could not load module ModelInterface
    
    ERROR:mindsdb-logger-ac470732-3303-11eb-bbe9-0242ac1c0002---eb4b7352-566f-4a1b-aef2-c286163e1a10:/usr/local/lib/python3.6/dist-packages/mindsdb_native/libs/controllers/transaction.py:239 - index out of range in self
    
    ---------------------------------------------------------------------------
    
    IndexError                                Traceback (most recent call last)
    
    <ipython-input-13-a5e3bd095e46> in <module>()
          7 mdb.learn(
          8     from_data=train,
    ----> 9     to_predict='birth_year' # the column we want to learn to predict given all the data in the file
         10 )
    
    20 frames
    
    /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
       1812         # remove once script supports set_grad_enabled
       1813         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
    -> 1814     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
       1815 
       1816 
    
    IndexError: index out of range in self
    

    https://colab.research.google.com/drive/1a6WRSoGK927m3eMkVdlwBhW6-3BtwaAO?usp=sharing

    To Reproduce

    1. Rerun this notebook on Google Colab https://github.com/opendataby/vybary2019/blob/e08c32ac51e181ddce166f8a4fbf968f81bd2339/canal03-parsing-with-mindsdb.ipynb
    bug 
    opened by abitrolly 15
  • Upload new Predictor

    Upload new Predictor

    Your Environment Scout The Scout upload Prediction does not function when a zip file is trying to upload.

    Errror is "Just .zip files are allowed" even though the zip file was selected. 
    
    

    Question on prediction, is there any way to upload a model of Tensorflow, or do we need to Convert TensorFlow model into the MindDB prediction model?

    bug question 
    opened by Winthan 15
  • [Bug]: TS dates not interpreted correctly

    [Bug]: TS dates not interpreted correctly

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    Predictions from a timeseries model (using >LATEST) gives dates starting Dec 1st, however there are Dec dates (e.g. 6th) within the training data.

    Expected Behavior

    Predictions using ">LATEST" should start 1 day after the latest date in the training data.

    Steps To Reproduce

    CREATE MODEL mindsdb.callsdata_time
    FROM files
    (SELECT * from CallData)
    PREDICT CallsOfferedTo5
    ORDER BY DateTime
    GROUP BY PrecisionQueue
    WINDOW 8    
    HORIZON 4;
    
    
    SELECT m.DateTime as date,
      m.CallsOfferedTo5 as forecast
      FROM mindsdb.callsdata_time as m
      JOIN files.CallData as t
      WHERE t.DateTime > LATEST
      AND t.PrecisionQueue = 5000;
    
    
    
    ### Anything else?
    
    **Training data**
    [export (3) (1).csv](https://github.com/mindsdb/mindsdb/files/10335508/export.3.1.csv)
    
    **Example of dates in Dec**
    ![image](https://user-images.githubusercontent.com/34073127/210332080-d2471ece-5c3a-4a3c-942c-ea1e41a2d4cd.png)
    
    **Model output**
    ![image](https://user-images.githubusercontent.com/34073127/210332293-418f2a5a-c6d7-4bba-98a7-1effb8e1bca2.png)
    
    bug 
    opened by tomhuds 1
  • [ENH] Override dtypes - LW handler

    [ENH] Override dtypes - LW handler

    Description

    Enables support for overriding automatically inferred data types when using the Lightwood handler.

    Type of change

    (Please delete options that are not relevant)

    • [ ] πŸ› Bug fix (non-breaking change which fixes an issue)
    • [x] ⚑ New feature (non-breaking change which adds functionality)
    • [ ] πŸ“’ Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] πŸ“„ This change requires a documentation update

    What is the solution?

    If the user provides:

    CREATE ...
    USING 
    dtype_dict = {
       'col1': 'type',
        ...
    };
    

    The handler will extract this mapping and use it to build a modified problem definition, which will trigger the default encoder belonging to each new dtype (note: manually specifying encoders is out of scope for this PR and will be added at a later date).

    Checklist:

    • [x] My code follows the style guidelines(PEP 8) of MindsDB.
    • [x] I have commented my code, particularly in hard-to-understand areas.
    • [ ] I have updated the documentation, or created issues to update them.
    • [x] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [x] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    opened by paxcema 0
  • updated all links

    updated all links

    Description

    Please include a summary of the change and the issue it solves.

    Fixes #4263 Fixes copied modified content of databases.mdx into its copy databases2.mdx Fixes removed https://docs.mindsdb.com from links

    Type of change

    (Please delete options that are not relevant)

    • [ ] πŸ› Bug fix (non-breaking change which fixes an issue)
    • [ ] ⚑ New feature (non-breaking change which adds functionality)
    • [ ] πŸ“’ Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] πŸ“„ Documentation update

    What is the solution?

    (Describe at a high level how the feature was implemented) As in Description.

    Checklist:

    • [ ] My code follows the style guidelines(PEP 8) of MindsDB.
    • [ ] I have commented my code, particularly in hard-to-understand areas.
    • [x] I have updated the documentation.
    • [ ] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [ ] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    documentation 
    opened by martyna-mindsdb 0
  • updated links

    updated links

    Description

    Please include a summary of the change and the issue it solves.

    Fixes #4261

    Type of change

    (Please delete options that are not relevant)

    • [ ] πŸ› Bug fix (non-breaking change which fixes an issue)
    • [ ] ⚑ New feature (non-breaking change which adds functionality)
    • [ ] πŸ“’ Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] πŸ“„ Documentation update

    What is the solution?

    Updated file names in links wherever necessary.

    Checklist:

    • [ ] My code follows the style guidelines(PEP 8) of MindsDB.
    • [ ] I have commented my code, particularly in hard-to-understand areas.
    • [x] I have updated the documentation.
    • [ ] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [ ] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    documentation 
    opened by martyna-mindsdb 0
  • Predictions for ranges/sets of values

    Predictions for ranges/sets of values

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Is your feature request related to a problem? Please describe.

    I'd like to retrieve a prediction filtering over a range of values rather than only individual exact values. Example: select rental_price, rental_price_explain from mindsdb.home_rentals_model where sqft between 1000 and 1200 and neighborhood in ('berkeley_hills', 'westbrae', 'downtown');

    This currently generates an error: Only 'and' and '=' operations allowed in WHERE clause, found: BetweenOperation(op='between', args=( Identifier(parts=['sqft']), Constant(value=1000), Constant(value=1200) )

    Describe the solution you'd like.

    I'd like to have the ability to use filter criteria such as "in", "between", or "or" logic to retrieve a single prediction.

    Describe an alternate solution.

    No response

    Anything else? (Additional Context)

    https://mindsdbcommunity.slack.com/archives/C01S2T35H18/p1672320517307839

    opened by rbkrejci 0
Releases(v22.12.4.3)
A Repository of Community-Driven Natural Instructions

A Repository of Community-Driven Natural Instructions TLDR; this repository maintains a community effort to create a large collection of tasks and the

AI2 244 Jan 04, 2023
My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs (GNN, GAT, GraphSAGE, GCN)

machine-learning-with-graphs My solutions for Stanford University course CS224W: Machine Learning with Graphs Fall 2021 colabs Course materials can be

Marko Njegomir 7 Dec 14, 2022
A Pytorch Implementation of a continuously rate adjustable learned image compression framework.

GainedVAE A Pytorch Implementation of a continuously rate adjustable learned image compression framework, Gained Variational Autoencoder(GainedVAE). N

39 Dec 24, 2022
Supplementary code for SIGGRAPH 2021 paper: Discovering Diverse Athletic Jumping Strategies

SIGGRAPH 2021: Discovering Diverse Athletic Jumping Strategies project page paper demo video Prerequisites Important Notes We suspect there are bugs i

54 Dec 06, 2022
Some simple programs built in Python: webcam with cv2 that detects eyes and face, with grayscale filter

Programas en Python Algunos programas simples creados en Python: πŸ“Ή Webcam con c

Madirex 1 Feb 15, 2022
STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech

STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech Keon Lee, Ky

Keon Lee 114 Dec 12, 2022
Wordle Env: A Daily Word Environment for Reinforcement Learning

Wordle Env: A Daily Word Environment for Reinforcement Learning Setup Steps: git pull [email&#

2 Mar 28, 2022
Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021.

EfficientZero (NeurIPS 2021) Open-source codebase for EfficientZero, from "Mastering Atari Games with Limited Data" at NeurIPS 2021. Thank you for you

Weirui Ye 671 Jan 03, 2023
Benchmark for evaluating open-ended generation

OpenMEVA Contributed by Jian Guan, Zhexin Zhang. Thank Jiaxin Wen for DeBugging. OpenMEVA is a benchmark for evaluating open-ended story generation me

25 Nov 15, 2022
Official PyTorch implementation of "Evolving Search Space for Neural Architecture Search"

Evolving Search Space for Neural Architecture Search Usage Install all required dependencies in requirements.txt and replace all ..path/..to in the co

Yuanzheng Ci 10 Oct 24, 2022
Frigate - NVR With Realtime Object Detection for IP Cameras

A complete and local NVR designed for HomeAssistant with AI object detection. Uses OpenCV and Tensorflow to perform realtime object detection locally for IP cameras.

Blake Blackshear 6.4k Dec 31, 2022
Supporting code for short YouTube series Neural Networks Demystified.

Neural Networks Demystified Supporting iPython notebooks for the YouTube Series Neural Networks Demystified. I've included formulas, code, and the tex

Stephen 1.3k Dec 23, 2022
Pytorch implementation of Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization https://arxiv.org/abs/2008.11646

[TCSVT] Each Part Matters: Local Patterns Facilitate Cross-view Geo-localization LPN [Paper] NEWs Prerequisites Python 3.6 GPU Memory = 8G Numpy 1.

46 Dec 14, 2022
AdamW optimizer for bfloat16 models in pytorch.

Image source AdamW optimizer for bfloat16 models in pytorch. Bfloat16 is currently an optimal tradeoff between range and relative error for deep netwo

Alex Rogozhnikov 8 Nov 20, 2022
Beginner-friendly repository for Hacktober Fest 2021. Start your contribution to open source through baby steps. πŸ’œ

Hacktober Fest 2021 πŸŽ‰ Open source is changing the world – one contribution at a time! πŸŽ‰ This repository is made for beginners who are unfamiliar wit

Abhilash M Nair 32 Dec 11, 2022
Python Wrapper for Embree

pyembree Python Wrapper for Embree Installation You can install pyembree (and embree) via the conda-forge package. $ conda install -c conda-forge pyem

Anthony Scopatz 67 Dec 24, 2022
Dynamical Wasserstein Barycenters for Time Series Modeling

Dynamical Wasserstein Barycenters for Time Series Modeling This is the code related for the Dynamical Wasserstein Barycenter model published in Neurip

8 Sep 09, 2022
Accelerating BERT Inference for Sequence Labeling via Early-Exit

Sequence-Labeling-Early-Exit Code for ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit Requirement: Please refer to re

ζŽε­η”· 23 Oct 14, 2022
ZeroVL - The official implementation of ZeroVL

This repository contains source code necessary to reproduce the results presente

31 Nov 04, 2022
A different spin on dataclasses.

dataklasses Dataklasses is a library that allows you to quickly define data classes using Python type hints. Here's an example of how you use it: from

David Beazley 752 Nov 18, 2022