BitTorrent software for cats

Overview

NyaaV2

Setting up for development

This project uses Python 3.7 and relies on features that do not exist in 3.6, so make sure to use Python 3.7. This guide also assumes that you 1) are using Linux and 2) are reasonably comfortable with the command line.
It's not impossible to run Nyaa on Windows, but this guide doesn't cover that.
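
If you want your own helper scripts to fail early on an older interpreter, a small guard like the one below may help. It is only a sketch: the name python_is_supported is hypothetical and not part of the Nyaa codebase, and it assumes that "3.7" means "3.7 or newer".

```python
import sys

# Minimum interpreter version stated in this guide (assumed to be a lower bound).
MINIMUM_PYTHON = (3, 7)


def python_is_supported(version_info=None):
    """Return True when the given (or running) interpreter is at least 3.7."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor); tuple comparison is lexicographic.
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    print("supported" if python_is_supported() else "Python 3.7+ is required")
```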

Code Quality:

  • Before we get any deeper, remember to follow PEP8 style guidelines and run ./dev.py lint before committing to see a list of warnings/problems.
    • You may also use ./dev.py fix && ./dev.py isort to automatically fix some of the issues reported by the previous command.
  • Beyond PEP8, try to keep your code clean and easy to understand. It's only polite!

Running Tests

The tests folder contains tests for the nyaa module and the webserver. To run the tests:

  • Make sure that you are in the Python virtual environment.
  • Run ./dev.py test while in the repository directory.

Setting up Pyenv

pyenv eases the use of different Python versions, and as not all Linux distros offer 3.7 packages, it's right up our alley.

Setting up MySQL/MariaDB database

You may use SQLite, but support for it in this project is outdated and largely unmaintained.

  • Enable the USE_MYSQL flag in config.py
  • Install the latest MariaDB by following the instructions at https://downloads.mariadb.org/mariadb/repositories/
    • Tested versions: mysql Ver 15.1 Distrib 10.0.30-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
  • Run the following commands logged in as your root db user (substitute for your own config.py values if desired):
    • CREATE USER 'test'@'localhost' IDENTIFIED BY 'test123';
    • GRANT ALL PRIVILEGES ON *.* TO 'test'@'localhost';
    • FLUSH PRIVILEGES;
    • CREATE DATABASE nyaav2 DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;
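
If you substitute your own values, it can be handy to generate the statements instead of editing them by hand. The following is a minimal sketch, not part of the project: setup_statements is a hypothetical helper, its defaults are simply the example values from the list above, and it deliberately rejects anything that would need escaping.

```python
import re

# Conservative whitelist so names can be placed into the statements as-is.
_SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def setup_statements(user="test", password="test123", database="nyaav2"):
    """Build the four setup statements from this guide for the given values."""
    for value in (user, database):
        if not _SAFE_NAME.match(value):
            raise ValueError(f"unsupported characters in {value!r}")
    # Refuse passwords that would need escaping inside a quoted SQL string.
    if "'" in password or "\\" in password:
        raise ValueError("password must not contain quotes or backslashes")
    return [
        f"CREATE USER '{user}'@'localhost' IDENTIFIED BY '{password}';",
        f"GRANT ALL PRIVILEGES ON *.* TO '{user}'@'localhost';",
        "FLUSH PRIVILEGES;",
        f"CREATE DATABASE {database} DEFAULT CHARACTER SET utf8 COLLATE utf8_bin;",
    ]


if __name__ == "__main__":
    print("\n".join(setup_statements()))
```

The printed statements can then be pasted into a session opened as your root database user.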

Finishing up

  • Run python db_create.py to create the database and import categories
    • Follow the advice of db_create.py and run ./db_migrate.py stamp head to mark the database version for Alembic
  • Start the dev server with python run.py
  • When you are finished developing, deactivate your virtualenv with pyenv deactivate or source deactivate (or just close your shell session)

You're now ready for simple testing and development!
Continue below to learn about database migrations and enabling the advanced search engine, Elasticsearch.

Database migrations

  • Database migrations are done with Flask-Migrate, a wrapper around Alembic.
  • If someone has made changes in the database schema and included a new migration script:
    • If your database has never been marked by Alembic (you're on a database from before the migrations), run ./db_migrate.py stamp head before pulling the new migration script(s).
      • If you already have the new scripts, check the output of ./db_migrate.py history instead and choose a hash that matches your current database state, then run ./db_migrate.py stamp <hash>.
    • Update your branch (e.g. git fetch && git rebase origin/master)
    • Run ./db_migrate.py upgrade head to run the migration. Done!
  • If you have made a change in the database schema:
    • Save your changes in models.py and ensure the database schema matches the previous version (i.e. your new tables/columns are not added to the live database)
    • Run ./db_migrate.py migrate -m "Short description of changes" to automatically generate a migration script for the changes
      • Check the script (migrations/versions/...) and make sure it works! Alembic may not be able to notice all changes.
    • Run ./db_migrate.py upgrade to run the migration and verify the upgrade works.
      • (Run ./db_migrate.py downgrade to verify the downgrade works as well, then upgrade again)
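
The schema-change steps above can be sketched as a small script. This is an illustration under assumptions rather than project tooling: migration_roundtrip_commands and run_commands are hypothetical names, and only the ./db_migrate.py subcommands mentioned in this guide are used.

```python
import subprocess


def migration_roundtrip_commands(message):
    """Return the db_migrate.py calls from the steps above, in order."""
    script = "./db_migrate.py"
    return [
        [script, "migrate", "-m", message],  # generate the migration script
        [script, "upgrade"],                 # apply it
        [script, "downgrade"],               # verify it can be reverted
        [script, "upgrade"],                 # apply it again
    ]


def run_commands(commands, runner=subprocess.run):
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        runner(command, check=True)
```

In practice you would run the first command on its own, review the generated script by hand, and only then pass the remaining three to run_commands from the repository directory.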

Setting up and enabling Elasticsearch

Installing Elasticsearch

Enabling MySQL Binlogging

  • Edit your MariaDB/MySQL server configuration and add the following under [mariadb]:
    log-bin
    server_id=1
    log-basename=master1
    binlog-format=row
    
  • Restart MariaDB/MySQL (sudo service mysql restart)
  • Copy the example configuration (es_sync_config.example.json) as es_sync_config.json and adjust options in it to your liking (verify the connection options!)
  • Connect to mysql as root
    • Verify that the result of SHOW VARIABLES LIKE 'binlog_format'; is ROW
    • Execute GRANT REPLICATION SLAVE ON *.* TO 'username'@'localhost'; to allow your configured user access to the binlog
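
To double-check the configuration fragment from the first step before restarting the server, you could parse it with Python's configparser. The sketch below is hypothetical (missing_binlog_settings is not part of the project) and only checks that the four options are present. It expects a fragment like the one shown in this section; a full server configuration file may contain directives that configparser does not understand.

```python
import configparser

# Option names listed in this guide, written with underscores.
REQUIRED = ("log_bin", "server_id", "log_basename", "binlog_format")


def missing_binlog_settings(config_text, section="mariadb"):
    """Return the names of required settings absent from the given section."""
    # allow_no_value lets a bare option such as "log-bin" parse without "=".
    parser = configparser.ConfigParser(allow_no_value=True, strict=False)
    parser.read_string(config_text)
    if not parser.has_section(section):
        return sorted(REQUIRED)
    # MariaDB accepts '-' and '_' interchangeably in option names.
    present = {key.replace("-", "_") for key in parser[section]}
    return sorted(name for name in REQUIRED if name not in present)
```

An empty list means all four options were found under [mariadb]. The SHOW VARIABLES check above remains the authoritative test, since it reflects what the running server actually loaded.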

Setting up ES

  • Run ./create_es.sh to create the indices for the torrents: nyaa and sukebei
    • The output should show acknowledged: true twice
  • Stop the Nyaa app if you haven't already
  • Run python import_to_es.py to import all the torrents (on nyaa and sukebei) into the ES indices.
    • This may take some time to run if you have plenty of torrents in your database.

Enable the USE_ELASTIC_SEARCH flag in config.py and (re)start the application.
Elasticsearch should now be functional! The ES indices won't be updated "live" with the current setup; continue below for instructions on how to hook Elasticsearch up to the MySQL binlog.

However, take note that the binlog is not necessary for simple ES testing and development; you can simply run import_to_es.py from time to time to reindex all the torrents.

Setting up sync_es.py

sync_es.py keeps the Elasticsearch indices updated by reading the binlog and pushing the changes to the ES indices.

  • Make sure es_sync_config.json is configured with the user you granted the REPLICATION permissions to
  • Run import_to_es.py and copy the outputted JSON into the file specified by save_loc in your es_sync_config.json
  • Run sync_es.py as-is or, for actual deployment, set it up as a service and run it, preferably as the system/root user
    • Make sure sync_es.py runs within the venv with the right dependencies!
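
Before copying the JSON output into place, you may want to confirm where save_loc points. Here is a minimal sketch, assuming save_loc is a top-level key in es_sync_config.json (check your copy of the example configuration); read_save_loc is a hypothetical helper, not part of the project.

```python
import json


def read_save_loc(config_path="es_sync_config.json"):
    """Return the save_loc value from the sync configuration file."""
    with open(config_path, encoding="utf-8") as handle:
        config = json.load(handle)
    # Assumption: save_loc sits at the top level of the JSON document.
    if "save_loc" not in config:
        raise KeyError("save_loc is missing from " + config_path)
    return config["save_loc"]
```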

You're done! The script should now be feeding updates from the database to Elasticsearch.
Take note, however, that the specified ES index refresh interval is 30 seconds, which may feel like a long time during local development. Feel free to adjust it or poke Elasticsearch yourself!
