A set of procedures that can realize covid19 virus detection based on blood.

Overview

COVID19_detection

背景

​ 当前,世界正遭受全球COVID19大流行的困扰。数十亿人受到影响,数百万的人员伤亡已经发生。因此,鉴定受SARS-CoV-2病毒感染或已经受其污染的个人至关重要。 这种识别有助于公共卫生组织和政府制定行动计划,以减少这种大流行的影响。从这种意义上讲,Hilab是一家远程实验室公司,它执行数十种类型的血液检查,包括针对COVID19的血清学检查,该公司已经在巴西进行了数百万次检查。为了改善对这种病毒的检测,可以使用机器学习方法来帮助实验室专家进行决策。 因此,本项目将致力于解决构建用于检测COVID19的具有高置信度和准确性的机器学习模型的难题。

方法

  • 决策树(Decision tree)
  • 随机森林(Random forest)
  • 支持向量机(SVN)
  • 主成分分析(PCA)

数据集

数据集地址:https://drive.google.com/drive/folders/1FfIx5WmEc_C7d3Ai7ONIQE4s-o2xQZz5?usp=sharing

项目结构

/
-dataset/		#数据集存放目录
--test/			#测试集目录
---test.csv		#测试集文件
--train/  		#训练集目录
---train_1.csv	#训练集文件1(此文件与测试集相同,默认不使用)
---train_2.csv	#训练集文件2
.......
---train_7.csv	#训练集文件7

-data_preprocess.py	#数据集提取与预处理
-pca.py				#pca降维的相关实验
-decision_tree.py	#决策树
-random_forest.py	#随机森林
-SVM.py				#SVM
-README.md			#说明文件
Owner
Nuyoah-xlh
Nuyoah-xlh
Flenser is a simple, minimal, automated exploratory data analysis tool.

Flenser Have you ever been handed a dataset you've never seen before? Flenser is a simple, minimal, automated exploratory data analysis tool. It runs

John McCambridge 79 Sep 20, 2022
A tool to compare differences between dataframes and create a differences report in Excel

similarpanda A module to check for differences between pandas Dataframes, and generate a report in Excel format. This is helpful in a workplace settin

Andre Pretorius 9 Sep 15, 2022
Time ranges with python

timeranges Time ranges. Read the Docs Installation pip timeranges is available on pip: pip install timeranges GitHub You can also install the latest v

Micael Jarniac 2 Sep 01, 2022
Feature Detection Based Template Matching

Feature Detection Based Template Matching The classification of the photos was made using the OpenCv template Matching method. Installation Use the pa

Muhammet Erem 2 Nov 18, 2021
Pipeline and Dataset helpers for complex algorithm evaluation.

tpcp - Tiny Pipelines for Complex Problems A generic way to build object-oriented datasets and algorithm pipelines and tools to evaluate them pip inst

Machine Learning and Data Analytics Lab FAU 3 Dec 07, 2022
Randomisation-based inference in Python based on data resampling and permutation.

Randomisation-based inference in Python based on data resampling and permutation.

67 Dec 27, 2022
bigdata_analyse 大数据分析项目

bigdata_analyse 大数据分析项目 wish 采用不同的技术栈,通过对不同行业的数据集进行分析,期望达到以下目标: 了解不同领域的业务分析指标 深化数据处理、数据分析、数据可视化能力 增加大数据批处理、流处理的实践经验 增加数据挖掘的实践经验

Way 2.4k Dec 30, 2022
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Amundsen 3.7k Jan 03, 2023
Recommendations from Cramer: On the show Mad-Money (CNBC) Jim Cramer picks stocks which he recommends to buy. We will use this data to build a portfolio

Backtesting the "Cramer Effect" & Recommendations from Cramer Recommendations from Cramer: On the show Mad-Money (CNBC) Jim Cramer picks stocks which

Gábor Vecsei 12 Aug 30, 2022
Making the DAEN information accessible.

The purpose of this repository is to make the information on Australian COVID-19 adverse events accessible. The Therapeutics Goods Administration (TGA) keeps a database of adverse reactions to medica

10 May 10, 2022
Yet Another Workflow Parser for SecurityHub

YAWPS Yet Another Workflow Parser for SecurityHub "Screaming pepper" by Rum Bucolic Ape is licensed with CC BY-ND 2.0. To view a copy of this license,

myoung34 8 Dec 22, 2022
This is a tool for speculation of ancestral allel, calculation of sfs and drawing its bar plot.

superSFS This is a tool for speculation of ancestral allel, calculation of sfs and drawing its bar plot. It is easy-to-use and runing fast. What you s

3 Dec 16, 2022
Python Library for learning (Structure and Parameter) and inference (Statistical and Causal) in Bayesian Networks.

pgmpy pgmpy is a python library for working with Probabilistic Graphical Models. Documentation and list of algorithms supported is at our official sit

pgmpy 2.2k Dec 25, 2022
Aggregating gridded data (xarray) to polygons

A package to aggregate gridded data in xarray to polygons in geopandas using area-weighting from the relative area overlaps between pixels and polygons. Check out the binder link above for a sample c

Kevin Schwarzwald 42 Nov 09, 2022
Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions.

About Unsub is a collection analysis tool that assists libraries in analyzing their journal subscriptions. The tool provides rich data and a summary g

9 Nov 16, 2022
A lightweight, hub-and-spoke dashboard for multi-account Data Science projects

A lightweight, hub-and-spoke dashboard for cross-account Data Science Projects Introduction Modern Data Science environments often involve many indepe

AWS Samples 3 Oct 30, 2021
This tool parses log data and allows to define analysis pipelines for anomaly detection.

logdata-anomaly-miner This tool parses log data and allows to define analysis pipelines for anomaly detection. It was designed to run the analysis wit

AECID 32 Nov 27, 2022
A columnar data container that can be compressed.

Unmaintained Package Notice Unfortunately, and due to lack of resources, the Blosc Development Team is unable to maintain this package anymore. During

944 Dec 09, 2022
This is a python script to navigate and extract the FSD50K dataset

FSD50K navigator This is a script I use to navigate the sound dataset from FSK50K.

sweemeng 2 Nov 23, 2021
Instant search for and access to many datasets in Pyspark.

SparkDataset Provides instant access to many datasets right from Pyspark (in Spark DataFrame structure). Drop a star if you like the project. 😃 Motiv

Souvik Pratiher 31 Dec 16, 2022