Predicting the duration of arrival delays for commercial flights.

Overview

Lifecycle

Flight Delay Prediction

Our objective is to predict arrival delays of commercial flights. According to the US Department of Transportation, about 21% of commercial flights scheduled between June 2003 and October 2021 have experienced some form of delay. It is critical for airlines to estimate flight delays as accurately as possible in order to improve customer satisfaction and optimize the income of airline agencies. This project will be evaluated on the basis of arrival delay prediction accuracy for flights

Contributors

  • Jordan Silke GitHub
  • Jonas Bacareza GitHub

Understanding the problem


In an effort to understand some common causes of commercial flight delays, a number of sources were consulted including government agencies and flight-focused blog posts. A brief overview of findings can be found in the Research directory. These common causes will inform feature selection and engineering decisions.

Data description


Data was sourced from a LHL PostgreSQL database and descriptions were provided for each table. We used a custom script to extract the feature names from these description files and the raw data can be found here. The rationale behind missing value processing can be reviewed and reproduced by reading and executing the data_overview notebook. The data from the flights table included in this repository is a randomly sampled subset of the source table.

Recommended exploration


Task Status
Test the hypothesis that the arrival delay is from Normal distribution and that mean of the delay is 0. Be careful about the outliers.
Is average/median monthly delay different during the year? If so, which months have the biggest delays and what could be the reason?
Does the weather affect the delay? 🧰
How are taxi times changing during the day? Does higher traffic lead to longer taxi times?
What is the average percentage of delays that exist prior to departure (i.e. are arrival delays caused by departure delays)? Are airlines able to lower the delay during the flights?
How many states cover 50% of US air traffic?
Test the hypothesis that planes fly faster when there is a departure delay.
When (which hour) do most 'LONG', 'SHORT', 'MEDIUM' haul flights take off? 🔳
Find the top 10 the bussiest airports. Does the greatest number of flights mean that the majority of passengers went through a given airport? How much traffic do these 10 airports cover? 🔳
Do bigger delays lead to bigger fuel consumption per passenger? 🔳

🔳 - To do.
- Core task 'complete' (at least a first pass).
🧰 - Work in progress.

Exploration task results can be found here

Owner
Jordan Silke
Jordan Silke
Weakly Supervised Scene Text Detection using Deep Reinforcement Learning

Weakly Supervised Scene Text Detection using Deep Reinforcement Learning This repository contains the setup for all experiments performed in our Paper

Emanuel Metzenthin 3 Dec 16, 2022
A model that attempts to learn and benefit from data collected on card counting.

A model that attempts to learn and benefit from data collected on card counting. A decision tree like model is built to win more often than loose and increase the bet of the player appropriately to c

1 Dec 17, 2021
Consensus Learning from Heterogeneous Objectives for One-Class Collaborative Filtering

Consensus Learning from Heterogeneous Objectives for One-Class Collaborative Filtering This repository provides the source code of "Consensus Learning

SeongKu-Kang 6 Apr 29, 2022
Oriented Response Networks, in CVPR 2017

Oriented Response Networks [Home] [Project] [Paper] [Supp] [Poster] Torch Implementation The torch branch contains: the official torch implementation

ZhouYanzhao 217 Dec 12, 2022
pytorch implementation of fast-neural-style

fast-neural-style 🌇 🚀 NOTICE: This codebase is no longer maintained, please use the codebase from pytorch examples repository available at pytorch/e

Abhishek Kadian 405 Dec 15, 2022
Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

Relational Self-Attention: What's Missing in Attention for Video Understanding This repository is the official implementation of "Relational Self-Atte

mandos 43 Dec 07, 2022
Cweqgen - The CW Equation Generator

The CW Equation Generator The cweqgen (pronouced like "Queck-Jen") package provi

2 Jan 15, 2022
TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

凌逆战 16 Dec 30, 2022
Code for Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights

Piggyback: https://arxiv.org/abs/1801.06519 Pretrained masks and backbones are available here: https://uofi.box.com/s/c5kixsvtrghu9yj51yb1oe853ltdfz4q

Arun Mallya 165 Nov 22, 2022
Diagnostic tests for linguistic capacities in language models

LM diagnostics This repository contains the diagnostic datasets and experimental code for What BERT is not: Lessons from a new suite of psycholinguist

61 Jan 02, 2023
This repository is for DSA and CP scripts for reference.

dsa-script-collections This Repo is the collection of DSA and CP scripts for reference. Contents Python Bubble Sort Insertion Sort Merge Sort Quick So

Aditya Kumar Pandey 9 Nov 22, 2022
List of content farm sites like g.penzai.com.

内容农场网站清单 Google 中文搜索结果包含了相当一部分的内容农场式条目,比如「小 X 知识网」「小 X 百科网」。此种链接常会 302 重定向其主站,页面内容为自动生成,大量堆叠关键字,揉杂一些爬取到的内容,完全不具可读性和参考价值。 尤为过分的是,该类网站可能有成千上万个分身域名被 Goog

WDMPA 541 Jan 03, 2023
Fast RFC3339 compliant Python date-time library

udatetime: Fast RFC3339 compliant date-time library Handling date-times is a painful act because of the sheer endless amount of formats used by people

Simon Pirschel 235 Oct 25, 2022
Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D)

Conjugated Discrete Distributions for Distributional Reinforcement Learning (C2D) Code & Data Appendix for Conjugated Discrete Distributions for Distr

1 Jan 11, 2022
This is the repository for Learning to Generate Piano Music With Sustain Pedals

SusPedal-Gen This is the official repository of Learning to Generate Piano Music With Sustain Pedals Demo Page Dataset The dataset used in this projec

Joann Ching 12 Sep 02, 2022
An index of algorithms for learning causality with data

awesome-causality-algorithms An index of algorithms for learning causality with data. Please cite our survey paper if this index is helpful. @article{

Ruocheng Guo 2.3k Jan 08, 2023
Interactive dimensionality reduction for large datasets

BlosSOM 🌼 BlosSOM is a graphical environment for running semi-supervised dimensionality reduction with EmbedSOM. You can use it to explore multidimen

19 Dec 14, 2022
Multi Task Vision and Language

12-in-1: Multi-Task Vision and Language Representation Learning Please cite the following if you use this code. Code and pre-trained models for 12-in-

Facebook Research 712 Dec 19, 2022
Spectralformer: Rethinking hyperspectral image classification with transformers

The code in this toolbox implements the "Spectralformer: Rethinking hyperspectral image classification with transformers". More specifically, it is detailed as follow.

Danfeng Hong 104 Jan 04, 2023
PyTorch Implementation of Unsupervised Depth Completion with Calibrated Backprojection Layers (ORAL, ICCV 2021)

Unsupervised Depth Completion with Calibrated Backprojection Layers PyTorch implementation of Unsupervised Depth Completion with Calibrated Backprojec

80 Dec 13, 2022