DOP-Tuning(Domain-Oriented Prefix-tuning model)

Overview

DOP-Tuning

DOP-Tuning(Domain-Oriented Prefix-tuning model)代码基于Prefix-Tuning改进.

Files

├── seq2seq                       # Code for encoder-decoder architecture
│   ├── train_bart.py             # high-level scripts to train.
│   ├── prefixTuning.py           # code that implements prefix-tuning.
│   ├── finetune.py               # training code (contains data loading, model loading, and calls trainer)   
│   ├── lightning_base.py         # helper code
│   ├── utils.py                  # helper code
│   └── callbacks.py              # helper code
└── ...

To run the code for encoder-decoder architecture like BART, the code is in seq2seq. This corresponds to the summarization experiments in the paper.

Setup:

cd transformer; pip install -e .

Train via DOP-tuning

cd seq2seq; 

python train_bart.py --mode xsum --preseqlen 200 --do_train yes --fp16 yes --bsz 16  --epoch 30  --gradient_accumulation_step 3 --learning_rate 0.00005  --mid_dim 800 --use_lowdata_token 'yes' --lowdata_token 'summarize'

其中use_lowdata_token表示是否采用domain word初始化的方式;lowdata_token表示传入的domain word.

Decode

cd seq2seq; 

python train_bart.py --mode xsum --do_train no --prefix_model_path {checkpoint_path} --preseqlen {same as training} --mid_dim {same as training} --use_lowdata_token 'yes' --lowdata_token 'summarize'
Owner
Andrew Zeng
Andrew Zeng
Andrew Zeng
Fast Base64 encoding/decoding in Python

Fast Base64 implementation This project is a wrapper on libbase64. It aims to provide a fast base64 implementation for base64 encoding/decoding. Insta

Matthieu Darbois 96 Dec 26, 2022
An Insurance firm providing tour insurance is facing higher claim frequency

An Insurance firm providing tour insurance is facing higher claim frequency. Data is collected from the past few years. Made a model which predicts the claim status using CART, RF & ANN and compare t

1 Jan 27, 2022
Web app to find your chance of winning at Texas Hold 'Em

poker_mc Web app to find your chance of winning at Texas Hold 'Em A working version of this project is deployed at poker-mc.ue.r.appspot.com. It's run

Aadith Vittala 7 Sep 15, 2021
Random Turkish name generator with realistic probabilities.

trnames Random Turkish name generator with realistic probabilities. Based on Trey Hunner's names package. Installation The package can be installed us

Kaan Öztürk 20 Jan 02, 2023
Fully cross-platform toolkit (and library!) for MachO+Obj-C editing/analysis

fully cross-platform toolkit (and library!) for MachO+Obj-C editing/analysis. Includes a cli kit, a curses GUI, ObjC header dumping, and much more.

cynder 301 Dec 28, 2022
Workshop OOP - Workshop OOP - Discover object-oriented programming

About: This is an open-source bot, the code is open for anyone to see, fork and

Francis Clairicia-Rose-Claire-Joséphine 5 May 02, 2022
Advanced python code - For students in my advanced python class

advanced_python_code For students in my advanced python class Week Topic Recordi

Ariel Avshalom 3 May 27, 2022
kodi addon 115网盘

plugin.video.115 kodi addon 115网盘 插件,需要kodi 18以上版本,原码播放需配合 https://github.com/feelfar/115proxy-for-kodi 使用 安装 HEAD 由于release包尚未释出,可直接下载源代码zip包

109 Dec 29, 2022
Reactjs web app written entirely in python, using transcrypt compiler.

Reactjs web app written entirely in python, using transcrypt compiler.

Dan Shai 22 Nov 27, 2022
STAC in Jupyter Notebooks

stac-nb STAC in Jupyter Notebooks Install pip install stac-nb Usage To use stac-nb in a project, start Jupyter Lab (jupyter lab), create a new noteboo

Darren Wiens 32 Oct 04, 2022
🏆 A ranked list of awesome Python open-source libraries and tools. Updated weekly.

Best-of Python 🏆 A ranked list of awesome Python open-source libraries & tools. Updated weekly. This curated list contains 230 awesome open-source pr

Machine Learning Tooling 2.7k Jan 03, 2023
本仓库整理了腾讯视频、爱奇艺、优酷、哔哩哔哩等视频网站中,能够观看的「豆瓣电影 Top250 榜单」影片。

Where is top 250 movie ? 本仓库整理了腾讯视频、爱奇艺、优酷、哔哩哔哩等视频网站中,能够观看的「豆瓣电影 Top250 榜单」影片,点击 Badge 可跳转至相应的电影首页。

MayanDev 123 Dec 22, 2022
Code and yara rules to detect and analyze Cobalt Strike

Cobalt Strike Resources This repository contains: analyze.py: a script to analyze a Cobalt Strike beacon (python analyze.py BEACON) extract.py; extrac

Tek 224 Jan 04, 2023
Open-source library for analyzing the results produced by ABINIT

Package Continuous Integration Documentation About AbiPy is a python library to analyze the results produced by Abinit, an open-source program for the

ABINIT 91 Dec 09, 2022
A simple, fantasy and fast note taking program.

notes A simple, fantasy and fast note taking program Installation This program supposed to run in linux and may have some bugs on windows or any other

Ali Hosseinverdi 1 Apr 06, 2022
APC Power Usage is an application which shows power consuption overtime for UPS units manufactured by APC.

APC Power Usage Introduction APC Power Usage is an application which shows power consuption overtime for UPS units manufactured by APC. Screenshoots G

Stefan Kondinski 3 Oct 08, 2021
A simple API to upload notes or files to KBFS

This API can be used to upload either secure notes or files to a secure KeybaseFS folder.

Dakota Brown 1 Oct 08, 2021
Companion Web site for Fluent Python, Second Edition

Fluent Python, the site Source code and content for fluentpython.com. The site complements Fluent Python, Second Edition with extra content that did n

Fluent Python 49 Dec 08, 2022
Covid-ml-predictors - COVID predictions using AI.

COVID Predictions This repo contains ML models to be trained on COVID-19 data from the UK, sourced off of Kaggle here. This uses many different ML mod

1 Jan 09, 2022
Strawberry Benchmark With Python

Strawberry benchmarks these benchmarks have been made to compare the performance of dataloaders and joined database queries. How to use You can run th

Doctor 4 Feb 23, 2022