This project is the implementation template for HW 0 and HW 1 for both the programming and non-programming tracks

Overview

S22-W4111-HW-1-0:
W4111 - Intro to Databases HW0 and HW1

Introduction

This project is the implementation template for HW 0 and HW 1 for both the programming and non-programming tracks.

HW 0 - All Students

You have completed the first step, which is cloning the project template.

Note: You are Columbia students. You should be able to install SW and follow instructions.

MySQL:

  • Download the installation files for MySQL Community Server.

    • Make sure you download for the correct operating system.
    • If you are on a Mac, make sure you choose the correct architecture. ARM is for Apple silicon. x86 is for other Apple systems.
    • On Windows, you can download and use the MSI.
  • Follow the installation instructions for MySQL. There are official instructions and many online tutorials.

  • Remember the root user ID and password that you set during installation. Also, choose "Legacy Authentication" when prompted.

    • If you forget your root user or password, you are on your own. The TAs and I will not fix any problems due to forgetting the information.
    • Also, if you say something like, "It did not prompt me for a user ID and password when I installed ...," we will laugh. We will say something like, "Sure. 20 million MySQL installations asked for the information, but it decided not to ask you."
    • If you tell us that you are sure that you are entering the correct user ID and password, we will laugh. We will say something like, "Which is more likely: that a DATABASE forgot something, or that you did?"
  • You only need to install the server. All other SW packages are optional.
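
Once the server is installed, a quick way to confirm that it is running is to check whether something is accepting TCP connections on its port. The sketch below is a minimal check, not part of the course template. It assumes the server runs on your own machine and listens on port 3306, which is MySQL's usual default. The function name `mysql_port_is_open` is made up for this illustration. It does not test your user ID or password.

```python
import socket


def mysql_port_is_open(host="127.0.0.1", port=3306, timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Assumption: 3306 is the port chosen during installation (the usual
    MySQL default). Change it if you picked a different one.
    """
    try:
        # create_connection raises OSError if nothing is listening
        # or the attempt times out.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (uncomment to run):
# print("MySQL port reachable:", mysql_port_is_open())
```

A result of `False` usually means the server is not started or is listening on a different port.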

Anaconda:

  • I strongly recommend uninstalling any existing version of Anaconda. If you choose not to uninstall previous versions, you may hit issues. You are on your own if you hit issues due to conflicting versions of Anaconda during the semester.

  • Download the most recent version of Anaconda.

  • Follow the installation instructions. Choose "Install for me" when prompted. If you hit a problem and I find your Anaconda installation in the wrong directory, you are on your own. If you say something like, "But, it did not give me that option," you can guess what will happen.
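
To see where your Python environment actually lives, you can run a few lines in the Anaconda Python interpreter. This is only a sketch. It assumes that an "Install for me" installation ends up somewhere under your home directory, which is the usual behavior of a per-user install but is not stated in these instructions. The helper name `is_inside` is hypothetical.

```python
import sys
from pathlib import Path


def is_inside(path, parent):
    """Return True if `path` is `parent` itself or located below it."""
    path = Path(path).resolve()
    parent = Path(parent).resolve()
    return path == parent or parent in path.parents


# Example usage (uncomment to run inside the Anaconda interpreter):
# print("Environment location:", sys.prefix)
# print("Under home directory:", is_inside(sys.prefix, Path.home()))
```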

DataGrip:

  • Download DataGrip. Make sure you choose the correct OS and silicon.

  • Follow the installation instructions.

  • Apply for a student license.

  • When you receive confirmation of your student license, set the license information in DataGrip.

HW0: Non-Programming

Step 1: Initial Files

  1. Create a folder in the project of the form <UNI>_src, where <UNI> is your UNI. I created an example, which is dff9_src.

  2. Create a file in the directory <UNI>_HW0.

  3. Copy the Jupyter notebook file from dff9_src/dff9_HW0.ipynb into the directory you created and replace dff9 with your UNI.

  4. Do the same for dff9_HW0.py
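
The copy-and-rename steps above can also be done with a short script. This is a sketch rather than an official part of the template. The function name `copy_template` and its parameters are made up for illustration, and it assumes you run it against the root of the cloned project, where the dff9_src example folder lives. It only renames the files. Any occurrence of dff9 inside the files still has to be replaced by hand.

```python
import shutil
from pathlib import Path


def copy_template(project_root, uni, example="dff9"):
    """Copy the example HW0 notebook and .py file into a folder named after `uni`.

    Assumption: `project_root` contains the example folder `<example>_src`
    with the files `<example>_HW0.ipynb` and `<example>_HW0.py`.
    """
    root = Path(project_root)
    src_dir = root / f"{example}_src"
    dst_dir = root / f"{uni}_src"
    dst_dir.mkdir(exist_ok=True)  # step 1: create the folder

    copied = []
    for suffix in (".ipynb", ".py"):  # steps 3 and 4
        src = src_dir / f"{example}_HW0{suffix}"
        dst = dst_dir / f"{uni}_HW0{suffix}"
        shutil.copyfile(src, dst)
        copied.append(dst)
    return copied


# Example usage, with "abc1234" standing in for your UNI:
# copy_template(".", "abc1234")
```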

Step 2: Jupyter Notebook

  • Start Anaconda.

  • Open Jupyter Notebook in Anaconda.

  • Navigate to the directory where you cloned the repository, and then go into the folder you created.

  • Open the notebook (the file ending in .ipynb).

  • The remaining steps in HW0: Non-Programming are in the notebook that you opened.

HW 0: Programming

  • Complete the steps for HW0: Non-Programming.

  • The programming track is not "harder" than non-programming. The initial set up is a little more work, however.

  • Download and install PyCharm. Download and install the professional edition.

  • Follow the instructions to set the license key using the JetBrains account you used to get the DataGrip license.

  • Start PyCharm, then navigate to and open the project that you cloned from GitHub.

  • Follow the instructions for creating a new virtual Conda environment for the project.

  • Select the root folder in the project, right click, and add a new Python Package named <UNI>_web_src. My example is dff9_web_src.

  • Copy the files from dff9_web_src into the package you created.

  • Follow the instructions for adding a package to your virtual environment. You should add the package flask.

  • Right click on the application.py file that you copied and select Run. A console window will open and show a URL. Copy the URL.

  • Open a browser. Paste the URL and append '/health'. My URL looks like http://172.20.1.14:5000/health. Yours may be a little different.

  • Hit enter. You should see a health message. Take a screenshot of the browser window and add the file to the directory. My example is ""
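
If you want to check the health endpoint without a browser, the same request can be made from Python. The sketch below uses only the standard library. The function names `health_url` and `check_health` are hypothetical. It assumes the application is already running and that you pass in the URL shown in your console. It does not replace the browser screenshot that the step above asks for.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def health_url(base_url):
    """Build the '/health' URL from the base URL shown in the console."""
    parts = urlsplit(base_url)
    return urlunsplit((parts.scheme, parts.netloc, "/health", "", ""))


def check_health(base_url, timeout=5.0):
    """Request the health URL and return (status code, response text).

    Raises an error from urllib if the application is not reachable.
    """
    with urlopen(health_url(base_url), timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


# Example usage, with the URL from the text above (yours may differ):
# print(check_health("http://172.20.1.14:5000"))
```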

Owner
Donald F. Ferguson
Senior Technical Fellow, Chief SW Architect, Ansys, Inc. Adjunct Professor, Dept. of Computer Science, Columbia University. CTO and Co-Founder, Seeka.TV