Baseline for the 2021 Sohu Campus Text Matching Algorithm Competition

Last update: Sep 06, 2022

Overview

sohu2021-baseline

Introduction

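This repository shares a baseline for the Sohu text matching task. Its main idea is to use conditional LayerNorm to add diversity to the model, so that a single model can handle different types of data and produce a different output for each type.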
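The idea of conditional LayerNorm can be sketched in a few lines of plain Python. This is a minimal illustration, not the repository's code: the function names, the one-hot condition vectors, and the weight values are all hypothetical, and the sketch assumes the common formulation in which the condition is linearly projected and added to the usual scale (gamma) and shift (beta) parameters.

```python
import math

def layer_norm(x, eps=1e-12):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def matvec(W, c):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * ci for w, ci in zip(row, c)) for row in W]

def conditional_layer_norm(x, cond, gamma, beta, W_gamma, W_beta, eps=1e-12):
    """LayerNorm whose scale and shift are offset by projections of `cond`.

    Assumed formulation: y = norm(x) * (gamma + W_gamma @ cond) + (beta + W_beta @ cond)
    """
    d_gamma = matvec(W_gamma, cond)
    d_beta = matvec(W_beta, cond)
    normed = layer_norm(x, eps)
    return [
        n * (g + dg) + (b + db)
        for n, g, dg, b, db in zip(normed, gamma, d_gamma, beta, d_beta)
    ]

# Hypothetical setup: hidden size 4, two data types encoded as one-hot vectors.
x = [1.0, 2.0, 3.0, 4.0]
gamma = [1.0] * 4
beta = [0.0] * 4
W_gamma = [[0.5, -0.5] for _ in range(4)]  # illustrative values only
W_beta = [[0.1, -0.1] for _ in range(4)]

out_a = conditional_layer_norm(x, [1.0, 0.0], gamma, beta, W_gamma, W_beta)
out_b = conditional_layer_norm(x, [0.0, 1.0], gamma, beta, W_gamma, W_beta)

# With zero projection weights the layer reduces to ordinary LayerNorm.
zeros = [[0.0, 0.0] for _ in range(4)]
out_plain = conditional_layer_norm(x, [1.0, 0.0], gamma, beta, zeros, zeros)
```

The same input `x` yields different outputs for the two condition vectors, which is how one set of shared weights can behave differently per data type. Starting the projections at zero makes the layer begin as an ordinary LayerNorm, which is a common choice when adding conditioning to a pretrained model.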

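The F1 score is about 0.74 on the offline validation set and about 0.73 on the online test set. The pretrained model is RoFormer; comparisons with other pretrained models are welcome.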
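For readers unfamiliar with the metric, the following is a small sketch of how a binary F1 score is computed from labels and predictions. The function name and the toy data are hypothetical, and the competition's exact averaging scheme is not stated here, so this shows only the plain binary case.

```python
def f1_score(y_true, y_pred):
    """Binary F1: harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels for illustration only: 1 = the two texts match, 0 = they do not.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_score(y_true, y_pred)
```

In this toy example there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.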

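Test environment: tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.5. If you hit errors with a different combination of versions, please adjust the code according to the error messages.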

For details, see: https://kexue.fm/archives/8337

Discussion

QQ discussion group: 808623966. To join the WeChat group, add the bot's WeChat ID spaces_ac_cn.

Owner

苏剑林(Jianlin Su)

Science enthusiast

