The code for 2021 MGTV AI Challenge Anti Stealing Link, and the online result ranks 10th.

Overview

赛题介绍

芒果TV-第二届“马栏山杯”国际音视频算法大赛-防盗链 随着业务的发展,芒果的视频内容也深受网友的喜欢,不少视频网站和应用开始盗播芒果的视频内容,盗链网站不经过芒果TV的前端系统,跳过广告播放,且消耗大量的服务器、带宽资源,直接给公司带来了巨大的经济损失,因此防盗链在日常运营中显得尤为重要。如何从海量日志中识别出盗链行为,并进行有效拦截,也是防盗链工作的核心。 比赛网址:https://challenge.ai.mgtv.com/contest/detail/7

 

模型介绍

  • 对原始特征进行LabelEncoder编码保存为feather文件
  • 构建 group_feature、count_feature、count_time_feature、cumcount_time_feature、cumcount_feature、cumcountratio_feature、diff_feature、category_encoders_feature,对每类特征加原始特征进行LightGBM建模得到特征重要性
  • 汇总各类特征重要性后分别取前50、100、150、200、250个特征进行LightGBM建模,提交200个特征结果B榜得分0.5174

 

代码

python preprocess.py # 原始数据文件放入/data文件夹中,处理原始数据

python feature_engineering.py # 特征工程,单个特征保存为feather文件

python main.py # 模型训练,按日期排序,最后两天数据作为线下验证集数据
Owner
tongji40
tongji40
Synthetik Python Mod - A save editor tool for the game Synthetik written in python

Synthetik_Python_Mod A save editor tool for the game Synthetik written in python

2 Sep 10, 2022
CuraMultiplyByGrid - Cura Плагин для размножения детали сеткой на весь стол автоматически без поворота

CuraMultiplyByGrid Cura Плагин для размножения детали сеткой на весь стол автоматически без поворота. Размножение в куре настолько ужасно реализовано,

3 Dec 02, 2022
A simple python script that print the Mandelbrot set for every power of the formal formula.

Python Mandelbrot A simple python script that print the Mandelbrot set for every power of the formal formula.

Paride Giunta 2 Apr 15, 2022
SimBiber - A tool for simplifying bibtex with official info

SimBiber: A tool for simplifying bibtex with official info. We often need to sim

336 Jan 02, 2023
Um sistema de llogin feito em uma interface grafica.

Interface-para-login Um sistema de login feito com JSON. Utilizando a biblioteca Tkinter, eu criei um sistema de login, onde guarda a informações de l

Mobben 1 Nov 28, 2021
DG - A(n) (unusual) programming language

DG - A(n) (unusual) programming language General structure There are no infix-operators (i.e. 1 + 1) Each operator takes 2 parameters When there are m

1 Mar 05, 2022
Practice10 - Operasi String With Python

Operasi String MY SOSIAL MEDIA : Apa itu Python String ? String adalah urutan si

Maulana Reza Badrudin 1 Jan 05, 2022
Its a simple and fun to use application. You can make your own quizes and send the lik of the quiz to your friends.

Quiz Application Its a simple and fun to use application. You can make your own quizes and send the lik of the quiz to your friends. When they would a

Atharva Parkhe 1 Feb 23, 2022
Data and analysis relating to the 5.8M Melbourne quake of 2021

quake2021 Data and analysis relating to the 5.8M Melbourne quake of 2021 Monash University Woodside Living Lab Building The building is located here T

Colin Caprani 6 May 16, 2022
A simple IDA Pro plugin to show all HexRays decompiler comments written by user

XRaysComments A simple IDA Pro plugin to show all HexRays decompiler comments written by user Installation Copy the file xray_comments.py to the plugi

Nox 20 Dec 27, 2022
Another Provably Rare Gem Miner 💎 (for Raritygems)

Provably Rare Gem Miner Go (for Rarity) Pull Request is strongly welcome as I don't know anything about Golang/Python/Web3. Usage Install Python 3.x i

朱里 6 Apr 22, 2022
Rock 💎 Paper 📝 Scissors ✂️ Lizard 🦎 Spock 🖖

Rock 💎 Paper 📝 Scissors ✂️ Lizard 🦎 Spock 🖖 If you’ve seen The Big Bang Theory, you’ve heard of a game called “Rock, Paper, Scissors, Lizard, Spoc

AmirHossein Mohammadi 16 Jun 19, 2022
Dump Data from FTDI Serial Port to Binary File on MacOS

Dump Data from FTDI Serial Port to Binary File on MacOS

pandy song 1 Nov 24, 2021
BinCat is an innovative login system, with which the account you register will be more secure.

BinCat is an innovative login system, with which the account you register will be more secure. This project is inspired by a conventional token system.

Hipotesi 2 May 22, 2022
An example of python package

An example of python package Why use packages? It is a good practice to not code the same function twice, and to reuse common code from one python scr

10 Oct 18, 2022
A StarkNet project template based on a Pythonic environment

StarkNet Project Template This is an opinionated StarkNet project template. It is based around the Python's ecosystem and best practices. tox to manag

Francesco Ceccon 5 Apr 21, 2022
python's memory-saving dictionary data structure

ConstDict python代替的Dict数据结构 若字典不会增加字段,只读/原字段修改 使用ConstDict可节省内存 Dict()内存主要消耗的地方: 1、Dict扩容机制,预留内存空间 2、Dict也是一个对象,内部会动态维护__dict__,增加slot类属性可以节省内容 节省内存大小

Grenter 1 Nov 03, 2021
A tool that bootstraps your dotfiles ⚡️

Dotbot Dotbot makes installing your dotfiles as easy as git clone $url && cd dotfiles && ./install, even on a freshly installed system! Rationale Gett

Anish Athalye 5.9k Jan 07, 2023
A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

A webdav demo using a virtual filesystem that serves a random status of whether a cat in a box is dead or alive.

Marshall Conover 2 Jan 12, 2022
XlvnsScriptTool - Tool for decompilation and compilation of scripts .SDT from the visual novel's engine xlvns

XlvnsScriptTool English Dual languaged (rus+eng) tool for decompiling and compiling (actually, this tool is more than just (dis)assenbler, but less th

Tester 3 Sep 15, 2022