Data-science-on-gcp - Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

Overview

data-science-on-gcp

Source code accompanying book:

Data Science on the Google Cloud Platform, 2nd Edition
Valliappa Lakshmanan
O'Reilly, Jan 2022
Branch edition2 [being built]
Data Science on the Google Cloud Platform
Valliappa Lakshmanan
O'Reilly, Jan 2017
Branch edition1_tf2 (also: main)

Try out the code on Google Cloud Platform

Open in Cloud Shell

The code on Qwiklabs (see below) is continually tested, and this repo is kept up-to-date. The code should work as-is for you, however, there are three very common problems that readers report:

  • Ch 2: Download data fails. The Bureau of Transportation website to download the airline dataset periodically goes down or changes availability due to government furloughs and the like. Please use the instructions in 02_ingest/README.md to copy the data from my bucket. The rest of the chapters work off the data in the bucket, and will be fine.
  • Ch 3: Permission errors. These typically occur because we expect that you will copy the airline data to your bucket. You don't have write access to gs://cloud-training-demos-ml/. The instructions will tell you to change the bucket name to one that you own. Please do that.
  • Ch 4, 10: Dataflow doesn't do anything.. The real-time simulation requires that you simultaneously run simulate.py and the Dataflow pipeline. If the Dataflow pipeline is not progressing, make sure that the simulate program is still running.

If the code doesn't work for you, I recommend that you try the corresponding Qwiklab lab to see if there is some step that you missed. If you still have problems, please leave feedback in Qwiklabs, or file an issue in this repo.

Try out the code on Qwiklabs

Purchase book

Read on-line or download PDF of book

Buy on Amazon.com

Updates to book

I updated the book in Nov 2019 with TensorFlow 2.0, Cloud Functions, and BigQuery ML.

Owner
Google Cloud Platform
Google Cloud Platform
LotteryBuyPredictionWebApp - Lottery Purchase Prediction Model

Lottery Purchase Prediction Model Objective and Goal Predict the lottery type th

Wanxuan Zhang 2 Feb 14, 2022
Generate modern Python clients from OpenAPI

openapi-python-client Generate modern Python clients from OpenAPI 3.x documents. This generator does not support OpenAPI 2.x FKA Swagger. If you need

555 Jan 02, 2023
This is a tool to make easier brawl stars modding using csv manipulation

Brawler Maker : Modding Tool for Brawl Stars This is a tool to make easier brawl stars modding using csv manipulation if you want to support me, just

6 Nov 16, 2022
advance python series: Data Classes, OOPs, python

Working With Pydantic - Built-in Data Process ========================== Normal way to process data (reading json file): the normal princiople, it's f

Phung Hưng Binh 1 Nov 08, 2021
Proyecto - Desgaste y rendimiento de empleados de IBM HR Analytics

Acceder al código desde Google Colab para poder ver de manera adecuada todas las visualizaciones y poder interactuar con ellas. Links de acceso: Noteb

1 Jan 31, 2022
An awesome Data Science repository to learn and apply for real world problems.

AWESOME DATA SCIENCE An open source Data Science repository to learn and apply towards solving real world problems. This is a shortcut path to start s

Academic.io 20.3k Jan 09, 2023
xeuledoc - Fetch information about a public Google document.

xeuledoc - Fetch information about a public Google document.

Malfrats Industries 651 Dec 27, 2022
Fun interactive program to sort a list :)

LHD-Build-Sort-a-list Fun interactive program to sort a list :) Inspiration LHD Build Write a script to sort a list. What it does It is a menu driven

Ananya Gupta 1 Jan 15, 2022
Xanadu Quantum Codebook is an experimental, exercise-based introduction to quantum computing using PennyLane.

Xanadu Quantum Codebook The Xanadu Quantum Codebook is an experimental, exercise-based introduction to quantum computing using PennyLane. This reposit

Xanadu 43 Dec 09, 2022
VSCode extension that generates docstrings for python files

VSCode Python Docstring Generator Visual Studio Code extension to quickly generate docstrings for python functions. Features Quickly generate a docstr

Nils Werner 506 Jan 03, 2023
Modified fork of CPython's ast module that parses `# type:` comments

Typed AST typed_ast is a Python 3 package that provides a Python 2.7 and Python 3 parser similar to the standard ast library. Unlike ast up to Python

Python 217 Dec 06, 2022
Mkdocs obsidian publish - Publish your obsidian vault through a python script

Mkdocs Obsidian Mkdocs Obsidian is an association between a python script and a

Mara 49 Jan 09, 2023
Convenient tools for using Swagger to define and validate your interfaces in a Pyramid webapp.

Convenient tools for using Swagger to define and validate your interfaces in a Pyramid webapp.

Scott Triglia 64 Sep 18, 2022
A Python library for setting up projects using tabular data.

A Python library for setting up projects using tabular data. It can create project folders, standardize delimiters, and convert files to CSV from either individual files or a directory.

0 Dec 13, 2022
SqlAlchemy Flask-Restful Swagger Json:API OpenAPI

SAFRS: Python OpenAPI & JSON:API Framework Overview Installation JSON:API Interface Resource Objects Relationships Methods Custom Methods Class Method

Thomas Pollet 361 Nov 16, 2022
Code for our SIGIR 2022 accepted paper : P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning

P3 Ranker Implementation for our SIGIR2022 accepted paper: P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-bas

14 Jan 04, 2023
layout-parser 3.4k Dec 30, 2022
Workbench to integrate pyoptools with freecad, that means basically optics ray tracing capabilities for FreeCAD.

freecad-pyoptools Workbench to integrate pyoptools with freecad, that means basically optics ray tracing capabilities for FreeCAD. Requirements It req

Combustión Ingenieros SAS 12 Nov 16, 2022
Build documentation in multiple repos into one site.

mkdocs-multirepo-plugin Build documentation in multiple repos into one site. Setup Install plugin using pip: pip install git+https://github.com/jdoiro

Joseph Doiron 47 Dec 28, 2022
A Python module for creating Excel XLSX files.

XlsxWriter XlsxWriter is a Python module for writing files in the Excel 2007+ XLSX file format. XlsxWriter can be used to write text, numbers, formula

John McNamara 3.1k Dec 29, 2022