Data Scientist in Simple Stock Analysis of PT Bukalapak.com Tbk for Long Term Investment

Overview

Brief explanation of PT Bukalapak.com Tbk

Bukalapak was founded on January 10, 2010 by Achmad Zaky, Nugroho Herucahyono, and Muhamad Fajrin Rasyid in a boarding house while they were studying at the Bandung Institute of Technology. It is one of Indonesia's e-commerce companies. Bukalapak was originally an online store that allowed Small and Medium Enterprises (SMEs) to sell online. The company has since expanded into other business lines, including helping traditional warungs increase their sales through the Bukalapak Partner service. In 2017, Bukalapak became one of Indonesia's unicorn startups. At the time of writing, Bukalapak's valuation had reached 7.6 billion US dollars, or around Rp 110.2 trillion.

Bukalapak conducts Initial Public Offering of Shares

Bukalapak's listing is the largest Initial Public Offering (IPO) in Indonesia. It is the latest sign that the Southeast Asian startup community is growing. According to CNN.com on Saturday (8/7), Bukalapak raised US$1.5 billion, or Rp 21.4 trillion (at an exchange rate of Rp 14,300 per US dollar), from the corporate action. On the first day of trading, the share price jumped almost 25% in the first session, which indicates that many investors were hunting for Bukalapak's shares.

Mandiri Sekuritas said Bukalapak's IPO was oversubscribed 8.7 times, with orders from nearly 100 thousand investors. Bukalapak has developed into an e-commerce player in Southeast Asia. Its competitors include Shopee, Lazada, and Tokopedia.

The company is backed by major investors, including Microsoft (MSFT) and Standard Chartered (SCBFF). It plans to use the funds from the IPO to roll out more features, which will allow Bukalapak to offer more services and add new revenue streams. As Indonesia's first unicorn to be listed on the stock exchange, it has great potential. BUKA is part of a conglomerate that has entered various promising business lines.

As data analysts, we can run a simple analysis of Bukalapak shares to assess whether, on a long-term basis, the returns are attractive to investors or detrimental. In this simple analysis, we calculate the return on investment (ROI) of Bukalapak stock.

Simple Stock Analysis for Long Term Investment

A. Questions and Goals

  • If we want to invest over long-term intervals, how can we reduce risk while maximizing return on investment?
  • What is the best way to select stocks based on the criteria from the previous question?

The purpose of this analysis is to examine how well Bukalapak's stock provides a return on investment given the risks involved. The analysis must provide enough data to use the stock confidently in future analysis or decisions.

B. Data Collection

There are many ways to collect historical stock prices together with their fundamentals. For this analysis, the pandas ecosystem offers a library for retrieving data from multiple sources, called pandas-datareader. The library is a wrapper for retrieving data such as historical stock prices, country GDP, and world economic data. This analysis uses Yahoo Finance data because it is free and covers a very large number of stocks. Avoiding manual, tedious work such as downloading CSV files means the analysis can be applied to as many stocks as needed. The library is imported with `import pandas_datareader.data as web`.

Since the final analysis involves return on investment, the adjusted closing price is used. This choice simplifies the analysis. Also, because the adjusted close accounts for corporate actions over the period, it should best reflect the value of BUKA on a given day.
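The data collection step described above could look roughly like the sketch below. It is an illustration, not the author's original code: the function names `fetch_adjusted_close` and `adjusted_close` are hypothetical, the date range is left to the caller, and the `"yahoo"` source of pandas-datareader depends on Yahoo Finance's availability, which has been unreliable at times.

```python
import pandas as pd


def adjusted_close(prices):
    """Return the 'Adj Close' column of a price table, sorted by date."""
    return prices["Adj Close"].sort_index()


def fetch_adjusted_close(ticker, start, end):
    """Download daily prices for `ticker` and keep the adjusted close.

    The import is done lazily so the helper above stays usable even
    when pandas-datareader is not installed.
    """
    import pandas_datareader.data as web

    prices = web.DataReader(ticker, "yahoo", start, end)
    return adjusted_close(prices)


# Offline illustration with made-up numbers (not real BUKA quotes).
sample = pd.DataFrame(
    {"Close": [102.0, 100.0], "Adj Close": [101.0, 99.0]},
    index=pd.to_datetime(["2021-08-09", "2021-08-06"]),
)
adj = adjusted_close(sample)
```

With network access, a call such as `fetch_adjusted_close("BUKA.JK", start, end)` would return the series used in the rest of the analysis, where `start` and `end` are dates chosen by the reader.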

Plotting the Price of BUKA Stock

[Figure: price of BUKA stock from the IPO to November 2021]

The plot above shows the price of BUKA from its IPO until November 2021. The data shows a clear trend, namely a downtrend. As an example, consider the return on investment for someone who invested at the start of the IPO. Since plots of the same style are used throughout this analysis, a function can be written to simplify reuse.

Result Return On Investment

[Screenshot: return on investment result]

Using logarithmic returns computed with the `NumPy` library, the return on investment in BUKA.JK shares from the beginning of the IPO until November 2021 is -35.94%. So if we had invested Rp 10 million at the start of the IPO, our money would now be worth about Rp 6.5 million. We have clearly suffered a loss, but the analysis does not end here: we will dig deeper to see how attractive BUKA shares are in an investor's portfolio.
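The calculation above can be sketched as follows. This is a minimal illustration with hypothetical prices, not the actual BUKA quotes, and the helper names are made up for this example. It also shows how a log return converts to a simple (percentage) return, since the two differ noticeably for large moves.

```python
import numpy as np


def total_log_return(first_price, last_price):
    """Log return between the first and the last price of a period."""
    return float(np.log(last_price / first_price))


def log_to_simple(log_return):
    """Convert a log return into a simple return: exp(r) - 1."""
    return float(np.expm1(log_return))


def final_value(initial_amount, log_return):
    """Value of an initial investment after earning `log_return`."""
    return float(initial_amount * np.exp(log_return))


# Hypothetical prices chosen only for illustration.
r = total_log_return(1000.0, 700.0)
simple = log_to_simple(r)          # the price fell by 30%
value = final_value(10_000_000, r)  # what Rp 10 million would become
```

Note that if the -35.94% figure is read as a log return, the equivalent simple return is about -30.2% and Rp 10 million would become roughly Rp 6.98 million; if it is read as a simple return, the result is roughly Rp 6.41 million.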

C. Data Analysis

A quick glance at the data shows that some work is needed before it can be used. One step is computing the return on investment for every day. This step is necessary because, in investing, the stock price on a given day is not very informative on its own. What matters is the price difference between two different days.
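As a minimal sketch of this step, daily log returns can be derived from a price series by comparing each day with the previous one. The function name and the prices below are hypothetical and only serve as an illustration.

```python
import numpy as np
import pandas as pd


def daily_log_returns(prices):
    """Log return of each day relative to the previous trading day."""
    # shift(1) aligns every price with the price of the day before;
    # the first day has no predecessor and is dropped.
    return np.log(prices / prices.shift(1)).dropna()


# Made-up prices for three consecutive trading days.
prices = pd.Series(
    [100.0, 110.0, 99.0],
    index=pd.to_datetime(["2021-08-06", "2021-08-09", "2021-08-10"]),
)
daily = daily_log_returns(prices)
```

The result has one fewer entry than the price series, because a return always needs two prices.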

Visualization of the Daily ROI of BUKA Stock

[Figure: daily ROI of BUKA stock]

Although the differences are a bit hard to see in the graph above, some clues can be taken from it. For example, the worst daily ROI occurred around August and October 2021, when it fell below -6%. The best day came around November 2021, when ROI exceeded 8% in a single day. However, as stated at the outset, the analysis is meant for long-term investing, so the daily ROI is less relevant because the interval is too short. Resampling is a good way to convert daily ROI into monthly ROI.
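The conversion from daily to monthly ROI can be sketched as below, assuming the daily values are log returns, which add up over time. pandas' `Series.resample` is the usual tool for this; the sketch groups by calendar month instead, which gives the same monthly sums without depending on the month-end frequency alias that changed between pandas versions. The function name and the numbers are hypothetical.

```python
import pandas as pd


def monthly_log_returns(daily):
    """Sum daily log returns within each calendar month."""
    months = daily.index.to_period("M")
    return daily.groupby(months).sum()


# Made-up daily log returns spanning two months.
daily = pd.Series(
    [0.01, -0.02, 0.03, -0.01],
    index=pd.to_datetime(
        ["2021-08-30", "2021-08-31", "2021-09-01", "2021-09-02"]
    ),
)
monthly = monthly_log_returns(daily)
```

If the daily values were simple returns rather than log returns, they would need to be compounded, for example with `(1 + daily).groupby(months).prod() - 1`, instead of summed.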

Visualization of the Monthly ROI of BUKA Stock

[Figure: monthly ROI of BUKA stock]

Describe

[Screenshot: summary statistics of the monthly ROI]

From the visualization and summary above, the data is now easier to read. For example, although the ROI fluctuates, it does not stray far from a certain point. This is called oscillation. The center of the oscillation is the mean of the data, which here is -8.98. In other words, on average one would earn an estimated ROI of -8.98% per month over the 4-month period. Investors do not want this, because it results in losses. The oscillation also has a typical spread above and below the mean, which is measured by the standard deviation, or std. With a slight modification to the code, everything can be visualized as follows.
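A minimal sketch of the summary described above: the mean, the standard deviation, and a band of one standard deviation around the mean that could be drawn on top of the monthly ROI plot. The function name and the input values are hypothetical, not the BUKA figures.

```python
import pandas as pd


def roi_summary(returns):
    """Mean, sample standard deviation, and a one-std band around the mean."""
    mean = float(returns.mean())
    std = float(returns.std())  # sample std (ddof=1), as reported by describe()
    return {
        "mean": mean,
        "std": std,
        "lower": mean - std,
        "upper": mean + std,
    }


# Made-up monthly ROI values in percent.
monthly = pd.Series([1.0, 2.0, 3.0])
summary = roi_summary(monthly)
```

With only four monthly observations, as in this analysis, both the mean and the std are rough estimates and can shift considerably when a new month is added.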

Visualization of the Volatility of the Monthly ROI of BUKA Stock

[Figure: monthly ROI of BUKA stock with mean and standard deviation]

For stock prices, the std of returns is called volatility. It is an important metric because, when large amounts of money are involved, less volatile stocks are preferable: they are less risky and their returns are easier to predict. Moreover, the mean and std describe the data best when it is normally distributed, so it is worth checking whether that is the case. One suitable tool is the Q-Q plot.

Visualization Q-Q Plot

[Figure: Q-Q plot of the ROI data]

Using the statsmodels library, a Q-Q plot shows whether the dataset is normally distributed. If most of the points fall on the red line, the data is approximately normal. Unfortunately, that is not the case for this data. In short, collecting stock data over a 4-month period yields only a rough estimate of the expected ROI and risk, expressed as the mean and standard deviation.
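To make clear what a Q-Q plot compares, the sketch below computes its points by hand instead of calling statsmodels: the sorted, standardized sample on one axis and the matching quantiles of a standard normal distribution on the other. The function name, the plotting positions `(i - 0.5) / n`, and the sample values are assumptions made for this illustration.

```python
from statistics import NormalDist

import numpy as np


def qq_points(sample):
    """Return (theoretical normal quantiles, standardized sorted sample)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    # Plotting positions: the midpoint of each of the n equal-probability bins.
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    standardized = (x - x.mean()) / x.std(ddof=1)
    return theoretical, standardized


# Made-up sample, for illustration only.
theoretical, observed = qq_points([1.0, 2.0, 3.0, 4.0, 5.0])
```

If the sample is close to normal, plotting `observed` against `theoretical` gives points near the 45-degree line, which is the red reference line mentioned above. With very few observations the plot is hard to interpret either way.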

Conclusion

We can conclude that over the past 4 months, BUKA shares have been unattractive in terms of ROI. However, the stock may still be attractive for an investor's portfolio because its price is already relatively low. For long-term investment, the stock is attractive to buy considering that Bukalapak is a startup with a high valuation, and its price may increase in the long term. Nevertheless, determining whether the risk and ROI are high or low requires a comparison with other stocks.

Owner
Najibulloh Asror