The code for 2021 MGTV AI Challenge Anti Stealing Link, and the online result ranks 10th.

Overview

赛题介绍

芒果TV-第二届“马栏山杯”国际音视频算法大赛-防盗链 随着业务的发展,芒果的视频内容也深受网友的喜欢,不少视频网站和应用开始盗播芒果的视频内容,盗链网站不经过芒果TV的前端系统,跳过广告播放,且消耗大量的服务器、带宽资源,直接给公司带来了巨大的经济损失,因此防盗链在日常运营中显得尤为重要。如何从海量日志中识别出盗链行为,并进行有效拦截,也是防盗链工作的核心。 比赛网址:https://challenge.ai.mgtv.com/contest/detail/7

 

模型介绍

  • 对原始特征进行LabelEncoder编码保存为feather文件
  • 构建 group_feature、count_feature、count_time_feature、cumcount_time_feature、cumcount_feature、cumcountratio_feature、diff_feature、category_encoders_feature,对每类特征加原始特征进行LightGBM建模得到特征重要性
  • 汇总各类特征重要性后分别取前50、100、150、200、250个特征进行LightGBM建模,提交200个特征结果B榜得分0.5174

 

代码

python preprocess.py # 原始数据文件放入/data文件夹中,处理原始数据

python feature_engineering.py # 特征工程,单个特征保存为feather文件

python main.py # 模型训练,按日期排序,最后两天数据作为线下验证集数据
Owner
tongji40
tongji40
A calculator to test numbers against the collatz conjecture

The Collatz Calculator This is an algorithm custom built by Kyle Dickey, used to test numbers against the simple rules of the Collatz Conjecture.

Kyle Dickey 2 Jun 14, 2022
Djangoblog - A blogging site where people can make their accout and write blogs and read other author's blogs

This a blogging site where people can make their accout and write blogs and read other author's blogs.

1 Jan 26, 2022
pyinsim is a InSim module for the Python programming language.

PYINSIM pyinsim is a InSim module for the Python programming language. It creates socket connection with LFS and provides many classes, functions and

2 May 12, 2022
Fuzz introspector for python

Fuzz introspector High-level goals: Show fuzzing-relevant data about each function in a given project Show reachability of fuzzer(s) Integrate seamles

14 Mar 25, 2022
A TODO-list tool written in Python

PyTD A TODO-list tool written in Python. Its goal is to provide a stable posibility to get a good view over all your TODOs motivate you to actually fi

1 Feb 12, 2022
A collection of useful functions for writers to analyze text/stories.

AuthorTools AuthorTools provides a multitude of functions for easily analyzing (your?) writing. AuthorTools is made especially for creative writers wi

1 Jan 14, 2022
A modern message based async agent framework

Munggoggo A modern message based async agent framework An asyncio based agent platform written in Python and based on RabbitMQ. Agents are isolated pr

24 Dec 28, 2022
Gba-free-fonts - Free font resources for GBA game development

gba-free-fonts Free font resources for GBA game development This repo contains m

28 Dec 30, 2022
Writeup and scripts for the 2021 malwarebytes crackme

Malwarebytes Crackme 2021 Tools and environment setup We will be doing this analysis in a Windows 10 VM with the flare-vm tools installed. Most of the

Jerome Leow 9 Dec 02, 2022
Chat meetup

FLiP-Meetup-Chat Chat meetup create function bin/pulsar-admin functions create --auto-ack true --jar pulsardjlexample-1.0.jar --classname "dev.pulsarf

Timothy Spann 1 Dec 09, 2021
Tracking stock volatility.

SP500-highlow-tracking Track stock volatility. Being a useful indicator of the stock price volatility, High-Low gap represents the price range of the

Thong Huynh 13 Sep 07, 2022
Combines power of torch, numerical methods to conquer and solve ALL {O,P}DEs

torch_DE_solver Combines power of torch, numerical methods and math overall to conquer and solve ALL {O,P}DEs There are three examples to provide a li

Natural Systems Simulation Lab 28 Dec 12, 2022
Configure request params such as text, color, size etc. And then download the image

Configure request params such as text, color, size etc. And then download the image

6 Aug 18, 2022
Refer'd Resume Scanner

Refer'd Resume Scanner I wanted to share a free resource we built to assist applicants with resume building. Our resume scanner identifies potential s

Refer'd 74 Mar 07, 2022
A simple Python script for generating a variety of hashes from safe urandom entropy.

Hashgen A simple Python script for generating a variety of hashes from safe urandom entropy. For whenever you need a random hash (e.g. generating an a

Xanspie 1 Feb 17, 2022
This is a library to do functional programming in Python.

Fpylib This is a library to do functional programming in Python. Index Fpylib Index Features Intelligents Ranges with irange Lazyness to functions Com

Fabián Vega Alcota 4 Jul 17, 2022
Additional useful operations for Python

Pyteal Extensions Additional useful operations for Python Available Operations MulDiv64: calculate m1*m2/d with no overflow on multiplication (TEAL 3+

Ulam Labs 11 Dec 14, 2022
GA SEI Unit 4 project backend for Bloom.

Grow Your OpportunitiesTM Background Watch the Bloom Intro Video At Bloom, we believe every job seeker deserves an opportunity to find meaningful work

Jonathan Herman 3 Sep 20, 2021
CEI Natural Disaster Tracking Portal

CEI Natural Disaster Tracking Portal (cc) Climatic Eye of ISCI We are an initiative that conducts studies in the field of Space Science, publishes pro

Baris Dincer 7 Dec 24, 2022
🐍 A Python lib for (de)serializing Python objects to/from JSON

Turn Python objects into dicts or (json)strings and back No changes required to your objects Easily customizable and extendable Works with dataclasses

Ramon Hagenaars 253 Dec 14, 2022