Using the provided dataset which includes various book features, in order to predict the price of books, using various proposed methods and models.

Overview

Predict-The-Price-Of-Books

For this task, a big dataset which consists of book of different genres and authors was utilized. The provided dataset included various book features, such as Author, Edition, Reviews, etc. Those features have been used as regressors in order to predict the price of books, using various proposed methods and models.

Author: Nikolas Petrou, MSc in Data Science

Technical-Report and Code Availability

  • A complete file-folder guide is located in the folder-file guide folder
  • The technical report and analysis of the work is available and located in report.pdf file
  • The implementation and code of the project is located in the code files folder

Dataset Overview

Regarding the data of this work, there is an online competition for this task, which has been up since 27/09/2019. Currently, the competition has 3579 participants in total. The data was downloaded directly from MachineHack. There were two files forthe train and test sets. The training and test sets included 6237 and 1560 records respectively. In addition, the values of the target variable (Price) were not included in the test set, as the evaluation of the test set is employed through the website of MachineHack.

Methodology

Some of the key methods which were used throughout the work are:

  • Visualization
  • TF-IDF and LDA Topic Extraction
  • Text-tranlsation using Google Trasnlate Ajax API
  • Cyclical feature encoding for time-based feature extraction
  • Price Prediction using different conventional and advanced algorithms (e.g. GBM, RF, SVM, CatBoost, LightGBM)

An abstract methodology scheme of the work is illustrated in the following Figure.

Summarizing, firstly the exploratory data understanding process was commenced. Each feature was assessed in order to obtain a better understanding of what it represents and how it could affect book pricing. Next, each future was brought into a format that was appropriate for model development. Following, through visualization, it was examined how the different features were correlated to the dependent-target variable. Furthermore, the processed data were used to implement the employed models. The prediction-modelling phase was conducted with two different approaches. Finally, the whole methodology procedure followed a cyclical behaviour, until the final prediction model was implemented.

Owner
Nikolas Petrou
M.Sc. Data Science student, University of Cyprus (UCY) Research Assistant at the Laboratory of Internet Computing (LInC) B.Sc degree in Computer Science
Nikolas Petrou
A python implementation of Yolov5 to detect fire or smoke in the wild in Jetson Xavier nx and Jetson nano

yolov5-fire-smoke-detect-python A python implementation of Yolov5 to detect fire or smoke in the wild in Jetson Xavier nx and Jetson nano You can see

20 Dec 15, 2022
MediaPipeで姿勢推定を行い、Tokyo2020オリンピック風のピクトグラムを表示するデモ

Tokyo2020-Pictogram-using-MediaPipe MediaPipeで姿勢推定を行い、Tokyo2020オリンピック風のピクトグラムを表示するデモです。 Tokyo2020Pictgram02.mp4 Requirement mediapipe 0.8.6 or later O

KazuhitoTakahashi 295 Dec 26, 2022
Streamlit component for TensorBoard, TensorFlow's visualization toolkit

streamlit-tensorboard This is a work-in-progress, providing a function to embed TensorBoard, TensorFlow's visualization toolkit, in Streamlit apps. In

Snehan Kekre 27 Nov 13, 2022
Human Dynamics from Monocular Video with Dynamic Camera Movements

Human Dynamics from Monocular Video with Dynamic Camera Movements Ri Yu, Hwangpil Park and Jehee Lee Seoul National University ACM Transactions on Gra

215 Jan 01, 2023
PyTorch implementation of DUL (Data Uncertainty Learning in Face Recognition, CVPR2020)

PyTorch implementation of DUL (Data Uncertainty Learning in Face Recognition, CVPR2020)

Mouxiao Huang 20 Nov 15, 2022
My personal code and solution to the Synacor Challenge from 2012 OSCON.

Synacor OSCON Challenge Solution (2012) This repository contains my code and solution to solve the Synacor OSCON 2012 Challenge. If you are interested

2 Mar 20, 2022
VOS: Learning What You Don’t Know by Virtual Outlier Synthesis

VOS This is the source code accompanying the paper VOS: Learning What You Don’t

248 Dec 25, 2022
PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

PyTorch code for DriveGAN: Towards a Controllable High-Quality Neural Simulation

76 Dec 24, 2022
A3C LSTM Atari with Pytorch plus A3G design

NEWLY ADDED A3G A NEW GPU/CPU ARCHITECTURE OF A3C FOR SUBSTANTIALLY ACCELERATED TRAINING!! RL A3C Pytorch NEWLY ADDED A3G!! New implementation of A3C

David Griffis 532 Jan 02, 2023
The code for SAG-DTA: Prediction of Drug–Target Affinity Using Self-Attention Graph Network.

SAG-DTA The code is the implementation for the paper 'SAG-DTA: Prediction of Drug–Target Affinity Using Self-Attention Graph Network'. Requirements py

Shugang Zhang 7 Aug 02, 2022
Code for NeurIPS 2021 paper "Curriculum Offline Imitation Learning"

README The code is based on the ILswiss. To run the code, use python run_experiment.py --nosrun -e your YAML file -g gpu id Generally, run_experim

ApexRL 12 Mar 19, 2022
PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation.

PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation. Warning: the master branch might collapse. To ob

559 Dec 14, 2022
RuleBERT: Teaching Soft Rules to Pre-Trained Language Models

RuleBERT: Teaching Soft Rules to Pre-Trained Language Models (Paper) (Slides) (Video) RuleBERT is a pre-trained language model that has been fine-tune

16 Aug 24, 2022
https://sites.google.com/cornell.edu/recsys2021tutorial

Counterfactual Learning and Evaluation for Recommender Systems (RecSys'21 Tutorial) Materials for "Counterfactual Learning and Evaluation for Recommen

yuta-saito 45 Nov 10, 2022
Exploring Simple Siamese Representation Learning

G-SimSiam A PyTorch implementation which refers to repo for the paper Exploring Simple Siamese Representation Learning by Xinlei Chen & Kaiming He Add

zhuyun 1 Dec 19, 2021
Pyeventbus: a publish/subscribe event bus

pyeventbus pyeventbus is a publish/subscribe event bus for Python 2.7. simplifies the communication between python classes decouples event senders and

15 Apr 21, 2022
A forwarding MPI implementation that can use any other MPI implementation via an MPI ABI

MPItrampoline MPI wrapper library: MPI trampoline library: MPI integration tests: MPI is the de-facto standard for inter-node communication on HPC sys

Erik Schnetter 31 Dec 22, 2022
Brain tumor detection using CNN (InceptionResNetV2 Model)

Brain-Tumor-Detection Building a detection model using a convolutional neural network in Tensorflow & Keras. Used brain MRI images. InceptionResNetV2

1 Feb 13, 2022
CTF challenges and write-ups for MicroCTF 2021.

MicroCTF 2021 Qualifications About This repository contains CTF challenges and official write-ups for MicroCTF 2021 Qualifications. License Distribute

Shellmates 12 Dec 27, 2022
Ansible Automation Example: JSNAPY PRE/POST Upgrade Validation

Ansible Automation Example: JSNAPY PRE/POST Upgrade Validation Overview This example will show how to validate the status of our firewall before and a

Calvin Remsburg 1 Jan 07, 2022