Huawei Hackathon 2021 - Sweden (Stockholm)

Last update: Nov 08, 2022

Related tags

Deep Learning huawei-hackathon-2021

Overview

huawei-hackathon-2021

Contributors

DrakeAxelrod

Challenge

Requirements:

python=3.8.10
Standard libraries (no importing)

Important factors:

Data dependencies between the tasks of a Directed Acyclic Graph (DAG).

A task waits until its parent tasks have finished and the data generated by the parents has reached it.

Communication time: The time it takes to send the parents’ data to their children when they are located on different processing nodes; otherwise it can be assumed negligible. As a result, we prefer to assign communicating tasks to the same processing node.

Assign communicating tasks to the same processing node where possible; if not, make data transfers from parent -> children as fast as possible.
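The dependency and communication rules above can be sketched as a small helper that computes when a task becomes ready on a candidate processing node. This is only an illustration: the function name `earliest_start`, the dictionary layouts, and the example numbers are assumptions, not part of the challenge material.

```python
def earliest_start(task, pn, parents, finish, placed_on, comm, release=0):
    """Earliest time `task` can start on processing node `pn`.

    parents:   dict task -> list of parent tasks
    finish:    dict task -> finish time of already scheduled tasks
    placed_on: dict task -> PN id of already scheduled tasks
    comm:      dict (parent, child) -> communication time in microseconds
    release:   release time of the DAG instance the task belongs to
    """
    ready = release
    for parent in parents.get(task, []):
        # A transfer costs time only when parent and child are on different PNs.
        delay = 0 if placed_on[parent] == pn else comm[(parent, task)]
        ready = max(ready, finish[parent] + delay)
    return ready


# Hypothetical example: T2 depends on T0 (placed on PN0) and T1 (placed on PN1).
parents = {"T2": ["T0", "T1"]}
finish = {"T0": 10, "T1": 12}
placed_on = {"T0": 0, "T1": 1}
comm = {("T0", "T2"): 5, ("T1", "T2"): 2}

start_on_pn0 = earliest_start("T2", 0, parents, finish, placed_on, comm)
start_on_pn1 = earliest_start("T2", 1, parents, finish, placed_on, comm)
```

In this example PN0 is the better choice for T2, because only the cheaper of the two transfers has to be paid there.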

Affinity: The affinity of a task to its previous instances that ran on the same processing node, which can reduce the overhead of initializing the task (for example, fewer instruction cache misses). Ideally, a task should run on the same processing node where its previous instance recently ran.

Reuse processing nodes where possible, i.e. run a task on the node where its previous instance recently ran.

Load balancing of processing nodes: The CPU utilization of the processing nodes should be balanced and uniform.

Self-explanatory.

Assumptions

If communicating tasks are assigned to the same processing node, the communication time between them is negligible, i.e., equal to 0.

Using the same node reduces communication time to 0.

If the previous instance of the same task was recently assigned to the same processing node, the estimated execution time of the current instance of the task is reduced by 10%. For example, if T0 is assigned to PN1, the execution time of the second instance of T0 (denoted by T0’) on PN1 is 9µs, rather than 10µs.

Using the same node reduces processing time by 10%. PN1 = Processing Node 1. T0 = Task 0.

"Recently assigned" can be translated to:
- The previous instance of the current task is among the last X tasks run on the PN.
- For this purpose we need to keep a history of the X most recent tasks that ran on each PN.
  
  Log the tasks tracked?
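One way to keep such a history is a bounded queue per processing node. The sketch below is an assumption-laden illustration: the class name `ProcessingNode`, the value `HISTORY_SIZE = 3` for X, and the idea that instances of the same task share a task name are all hypothetical choices, since the text does not fix them.

```python
from collections import deque

HISTORY_SIZE = 3        # X: assumed value, not given in the challenge text
AFFINITY_FACTOR = 0.9   # 10% reduction, taken from the assumption above


class ProcessingNode:
    def __init__(self, pn_id, history_size=HISTORY_SIZE):
        self.pn_id = pn_id
        # deque(maxlen=X) drops the oldest entry once X tasks are stored.
        self.history = deque(maxlen=history_size)

    def execution_time(self, task_name, base_time):
        """Execution time on this PN, reduced if the task ran here recently."""
        if task_name in self.history:
            return base_time * AFFINITY_FACTOR
        return base_time

    def run(self, task_name, base_time):
        """Record the task in the history and return its execution time."""
        duration = self.execution_time(task_name, base_time)
        self.history.append(task_name)
        return duration


pn1 = ProcessingNode(1)
first = pn1.run("T0", 10)     # no previous instance on PN1: full time
second = pn1.run("T0", 10)    # T0 is in the history: reduced time
for other in ("T1", "T2", "T3"):
    pn1.run(other, 10)        # three other tasks push T0 out of the history
third = pn1.run("T0", 10)     # T0 is no longer "recent": full time again
```

Because the deque already holds exactly the tracked tasks, it can also double as the log that the note above asks about.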
A DAG’s deadline, denoted by d_i, is relative to its release time. For example, if the deadline of a DAG is 3 and the release time of its i-th instance is 12, that instance should be completed before 15.

All time units are in microseconds.

The execution of tasks is non-preemptive, in the sense that a task that starts executing on a processor will not be interrupted by other tasks until its execution is completed.

Tasks cannot run concurrently on the same processor.
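The non-preemption and deadline assumptions can be sketched together as follows. The helper names `place` and `meets_deadline` are hypothetical, and treating "completed before 15" as `finish <= 15` is an assumption; a strict comparison may be what the organizers intended.

```python
def place(pn_free_at, pn, ready, exec_time):
    """Schedule one task non-preemptively on `pn` and return (start, finish).

    pn_free_at maps each PN id to the time at which it becomes idle.
    The task starts only when it is ready AND the PN has finished its
    previous task, so two tasks never overlap on the same PN.
    """
    start = max(ready, pn_free_at[pn])
    finish = start + exec_time
    pn_free_at[pn] = finish
    return start, finish


def meets_deadline(release, relative_deadline, finish):
    """Deadlines are relative to the release time of the DAG instance."""
    return finish <= release + relative_deadline


# Example from the text: deadline 3, release time 12, absolute deadline 15.
pn_free_at = {0: 0, 1: 0}
first = place(pn_free_at, 0, ready=12, exec_time=2)    # runs from 12 to 14
second = place(pn_free_at, 0, ready=12, exec_time=2)   # must wait for the first
first_ok = meets_deadline(12, 3, first[1])
second_ok = meets_deadline(12, 3, second[1])
```

Here the second task misses the deadline only because it queues behind the first one on PN0; placing it on the idle PN1 would let it finish at 14.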

Problem Formulation

Consider a real-time application consisting of n DAGs (DAG1, DAG2, ... DAGn), each of which is released periodically with a period P_k. Instances of each DAG are released over the course of the running application. The i^th instance of the k^th DAG is denoted by D_k⁽ⁱ⁾. The application runs on x homogeneous processing nodes (PN1, PN2, ... PNx). The algorithm should find an assignment of the DAGs' tasks to the PNs such that all DAG deadlines are respected and the makespan of the application is minimized.

Makespan: the time at which all instances of all DAGs are completed.
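As a small illustration of the formulation, the release times of a periodic DAG and the makespan of a finished schedule could be computed as below. The assumption that the first instance is released at time 0 and the i-th instance (counting from 0) at i * P_k is mine; the text only states that releases are periodic.

```python
def release_times(period, instances, first_release=0):
    """Release time of every instance of one DAG (assumes strict periodicity)."""
    return [first_release + i * period for i in range(instances)]


def makespan(finish_times):
    """Time at which the last task of the last DAG instance completes."""
    return max(finish_times)


# Hypothetical DAG with period P_k = 10 and 3 instances.
releases = release_times(period=10, instances=3)

# Hypothetical finish times of all tasks across all DAG instances.
total = makespan([8, 17, 26, 24])
```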

Questions:

Propose an algorithm that solves the problem above by maximizing a utility function that combines the total application makespan and the standard deviation of the PN utilizations (i.e., how uniform the assignment is), such that both the task dependency constraints and the DAG deadlines are met.

Utility Function = 1 / (10 * Normalized(Makespan) + STD(PN utilizations))
Normalized(Makespan) = Makespan / Application_worst_case_completion_time
Application_worst_case_completion_time = SUM(execution_times, DAG_communication_times)
Normalized(Makespan) and STD(PN utilizations) are both values in [0..1].

The algorithm should specify the assignment of tasks to PNs that maximizes the utility function. It should also specify the order in which the tasks are scheduled and the execution order of the tasks on each PN.
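The utility function above can be evaluated with a few lines of plain Python. Two details are assumptions on my part, because the text does not define them: the utilization of a PN is taken as its busy time divided by the makespan, and STD is computed as the population standard deviation.

```python
def utility(makespan, exec_times, comm_times, busy_time_per_pn):
    """Utility = 1 / (10 * Normalized(Makespan) + STD(PN utilizations))."""
    # Application_worst_case_completion_time = SUM(execution, communication)
    worst_case = sum(exec_times) + sum(comm_times)
    normalized = makespan / worst_case

    # Assumed definition: utilization = busy time / makespan.
    utilizations = [busy / makespan for busy in busy_time_per_pn]
    mean = sum(utilizations) / len(utilizations)
    # Assumed definition: population standard deviation.
    variance = sum((u - mean) ** 2 for u in utilizations) / len(utilizations)
    std = variance ** 0.5

    return 1 / (10 * normalized + std)


# Hypothetical numbers: worst case = 60 + 10 = 70, makespan = 35.
exec_times = [10, 20, 30]
comm_times = [5, 5]
balanced = utility(35, exec_times, comm_times, busy_time_per_pn=[30, 30])
unbalanced = utility(35, exec_times, comm_times, busy_time_per_pn=[35, 25])
```

With the same makespan, the balanced assignment scores higher because its utilization STD is 0. The factor of 10 means that shortening the makespan usually matters more than evening out the load.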

I/O

Input

Scheduler input: 12 test cases, each consisting of a JSON file that includes:

A set of independent DAGs
The deadlines for the DAGs
Number of instances of each DAG
Period (P_k) of the DAGs
List of tasks for each DAG
Execution times for each DAG
Communication (inter-task) times for each DAG
Number of cores (mentioned in each test case)
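The exact JSON schema of the test cases is not reproduced in this document, so the following is only a sketch of how such a file might be loaded. Every field name in `SAMPLE` ("cores", "dags", "deadline", "period", "instances", "tasks", "edges") and the function `load_case` are hypothetical and would need to be adapted to the real files.

```python
import json

# Hypothetical test case; the real field names may differ.
SAMPLE = """
{
  "cores": 2,
  "dags": {
    "DAG0": {
      "deadline": 30,
      "period": 40,
      "instances": 2,
      "tasks": {"T0": 10, "T1": 5},
      "edges": [["T0", "T1", 3]]
    }
  }
}
"""


def load_case(text):
    """Parse one test case into per-DAG parent lists and communication times."""
    raw = json.loads(text)
    dags = {}
    for name, dag in raw["dags"].items():
        parents = {task: [] for task in dag["tasks"]}
        comm = {}
        for parent, child, cost in dag["edges"]:
            parents[child].append(parent)
            comm[(parent, child)] = cost
        dags[name] = {
            "deadline": dag["deadline"],
            "period": dag["period"],
            "instances": dag["instances"],
            "exec_time": dict(dag["tasks"]),
            "parents": parents,
            "comm": comm,
        }
    return raw["cores"], dags


cores, dags = load_case(SAMPLE)
```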

Output

A CSV file including:

The PN_id to which each task was assigned (0, 1, ... x)
Order of execution of the tasks in their assigned PN
Start and finish time of the task
Application makespan
The STD of the clusters' utilization (PN utilization?)
Value of the utility function
The execution time of the scheduler on our machine.

Note for Python coders: If you code in Python, you need to write your own printer function to create the CSV files in the specified format.
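A printer function of this kind could look roughly like the sketch below. The required column layout is not given in this document, so the header names, the summary rows, and the function name `format_csv` are hypothetical; only the listed output items come from the text above.

```python
def format_csv(rows, makespan, std, utility, scheduler_time):
    """Build the CSV text by hand, without any imports.

    rows: list of dicts with the keys named in `header` (hypothetical layout).
    """
    header = ["task", "pn_id", "order", "start", "finish"]
    lines = [",".join(header)]
    for row in rows:
        lines.append(",".join(str(row[column]) for column in header))
    # Summary values, one per line (layout assumed, not specified here).
    lines.append("makespan," + str(makespan))
    lines.append("std," + str(std))
    lines.append("utility," + str(utility))
    lines.append("scheduler_time," + str(scheduler_time))
    return "\n".join(lines) + "\n"


# Hypothetical schedule with two tasks on PN0.
rows = [
    {"task": "T0", "pn_id": 0, "order": 0, "start": 0, "finish": 10},
    {"task": "T1", "pn_id": 0, "order": 1, "start": 10, "finish": 15},
]
text = format_csv(rows, makespan=15, std=0.5, utility=0.1, scheduler_time=0.002)
```

The returned string can then be written out with `open(path, "w")`. This simple join does no quoting, which is fine as long as task names contain no commas.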


Owner

Drake Axelrod
