Research into the mail services used by different business sectors.

Overview

A Brief Research in Mail Service Used by Organisations

Author: Zeji Chen (email), student @ Lancaster University.


Introduction

This project was conducted to supply evidence for the author's essay assignment. It uses a Python crawler to look up the MX records (mail server records) of individual domains.

The Python crawler was written by Ryan Zhao (email), a student at Manchester University, UK. Ryan has agreed to and authorised the use of his script in this research.

The data, scripts and other files of this project are public at https://github.com/ChenFocus/alexa-top-sites-mx. The repository also contains some files from the early stage of the project that may no longer be relevant.

Aim of Research

This research attempts to find out which of Gmail and Outlook holds the greater market share, by looking at a large sample of organisations' domains in the technology and education sectors. Please refer to Datasets Used.

Datasets Used

Alexa Top 1M Sites

Alexa is a web analytics and intelligence company owned by Amazon. The Alexa world site rank is often regarded as the most representative ranking of websites worldwide.

World Global University Dataset

Datasets Availability

Alexa Top 1M Sites Dataset: http://s3.amazonaws.com/alexa-static/top-1m.csv.zip

Please be aware that although this data comes officially from Alexa, it is reportedly no longer updated.

Global University Dataset: https://raw.githubusercontent.com/Hipo/university-domains-list/master/world_universities_and_domains.json

Please be aware that this data comes from an open-source project whose content may change at any time. To get the same dataset as this project used, as it stood when the research was conducted, download the data from this repository instead, or from https://raw.githubusercontent.com/ChenFocus/university-domains-list/master/world_universities_and_domains.json.

Method

Using the dig command-line tool, one can find the MX record (mail server record) of a domain by running this command:

dig +short mx example.com

For example, to find which mail service is used by Twitter, Inc., we can run:

dig +short mx twitter.com

# response
10 aspmx.l.google.com.
30 aspmx3.googlemail.com.
20 alt1.aspmx.l.google.com.
20 alt2.aspmx.l.google.com.
30 aspmx2.googlemail.com.

Therefore we know that Twitter uses Google's mail servers (google.com and googlemail.com), that is, Gmail, for its domain twitter.com.
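The reasoning above, reading the MX hosts and mapping them to a provider, can be sketched in Python. This is only an illustration: the suffix lists in `PROVIDER_SUFFIXES` and the helper names are assumptions, and the matching rules of the actual crawler are not shown in this document. The `lookup` helper shells out to dig and needs network access, so the demonstration at the end uses a pasted response instead.

```python
import subprocess

# Assumed suffix rules; the real crawler's matching logic may differ.
PROVIDER_SUFFIXES = {
    "Gmail": ("google.com", "googlemail.com"),
    "Outlook": ("outlook.com",),
}


def parse_mx_lines(dig_output):
    """Turn `dig +short mx` output into (priority, host) pairs, most preferred first."""
    records = []
    for line in dig_output.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank lines and anything that is not "<priority> <host>"
        records.append((int(parts[0]), parts[1].rstrip(".").lower()))
    return sorted(records)


def classify(records):
    """Label a domain by its most preferred MX host."""
    if not records:
        return "N/A"
    host = records[0][1]
    for provider, suffixes in PROVIDER_SUFFIXES.items():
        if any(host == s or host.endswith("." + s) for s in suffixes):
            return provider
    return "Others"


def lookup(domain):
    """Live lookup through dig (requires dig to be installed and network access)."""
    result = subprocess.run(["dig", "+short", "mx", domain],
                            capture_output=True, text=True, check=True)
    return classify(parse_mx_lines(result.stdout))


SAMPLE = """\
10 aspmx.l.google.com.
30 aspmx3.googlemail.com.
20 alt1.aspmx.l.google.com.
"""
print(classify(parse_mx_lines(SAMPLE)))  # prints "Gmail"
```

Classifying by the lowest-priority-number host is a design choice of this sketch; a domain that mixes providers across its MX records would need a more careful rule.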

The Python crawler reads a single-column .csv file (a data-sheet format), performs the equivalent of the dig query through the Python library (dependency) dnspython, prints its progress and the statistics gathered so far while querying MX records, and eventually outputs the result like this:

Gmail:  305 
Outlook:  83 
Others:  524 
N/A:  88

Explanation:

| Service name/type | Description |
|---|---|
| Gmail | The count of domains in the dataset provided that use Gmail on their domain |
| Outlook | The count of domains in the dataset provided that use Outlook on their domain |
| Others | The count of domains in the dataset provided that use other email services on their domain |
| N/A | No relevant record detected; the domain may not have been configured to run an email service |
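A minimal sketch of the counting step is given here, to show how the four totals could be produced from a single-column CSV. It is not the crawler's actual code: `tally`, `format_summary` and the stand-in `fake_classify` are hypothetical names, and the fake lookup table replaces the real DNS query so that the example runs offline.

```python
import csv
import io
from collections import Counter

LABELS = ("Gmail", "Outlook", "Others", "N/A")


def tally(csv_file, limit, classify_domain):
    """Classify the first `limit` rows of a one-domain-per-row CSV and count each label."""
    counts = Counter({label: 0 for label in LABELS})
    for index, row in enumerate(csv.reader(csv_file)):
        if index >= limit:
            break
        if not row:
            continue  # ignore empty rows
        counts[classify_domain(row[0].strip())] += 1
    return counts


def format_summary(counts):
    """Render the counts in the same shape as the output shown above."""
    return "\n".join(f"{label}:  {counts[label]}" for label in LABELS)


# Stand-in for a real MX lookup (hypothetical data, so the sketch runs offline).
FAKE_RESULTS = {"a.example": "Gmail", "b.example": "Outlook", "c.example": "Others"}


def fake_classify(domain):
    return FAKE_RESULTS.get(domain, "N/A")


sample = io.StringIO("a.example\nb.example\nc.example\nd.example\n")
print(format_summary(tally(sample, 1000, fake_classify)))
```

Passing the lookup in as a function keeps the counting logic separate from the network code, so a real resolver can be swapped in without changing `tally`.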

Accuracies/Error Analysis

The Alexa dataset may contain asset domains (domains used solely to distribute static files or other resources), such as software-download.microsoft.com. For these, dig +short mx software-download.microsoft.com returns nothing, because no mail service runs on the domain; the same happens when no mail service runs on the root domain. All calculations made from the results exclude such cases.

It's known that the World Global University Dataset may be incomplete and may contain some inaccurate information (for example, the domain of Lancaster University in the dataset is lancs.ac.uk, instead of lancaster.ac.uk). This may affect the conclusions drawn from this dataset.

It's also known that different runs of the script may produce slightly different results, as a domain's DNS record may fail to be fetched due to server or network errors. Running the script multiple times should partially eliminate this error.
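One way several runs could be combined is sketched below. It assumes each run is saved as a per-domain mapping of labels, which this document does not say the script does (the script is only described as printing totals), and `merge_runs` is a hypothetical helper. A domain stays "N/A" only if every run returned "N/A" for it.

```python
def merge_runs(runs):
    """Combine per-domain labels from several runs, preferring any non-"N/A" answer."""
    merged = {}
    for run in runs:
        for domain, label in run.items():
            # Keep the first definite answer; overwrite only a missing or "N/A" entry.
            if merged.get(domain, "N/A") == "N/A":
                merged[domain] = label
    return merged


# Hypothetical data: the first run failed to fetch a.example, the second succeeded.
run_1 = {"a.example": "N/A", "b.example": "Gmail"}
run_2 = {"a.example": "Outlook", "b.example": "Gmail"}
print(merge_runs([run_1, run_2]))
```

This only recovers from transient failures; a domain that genuinely has no MX record remains "N/A" however many runs are merged.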

Effort has been made to eliminate errors from all calculations made and conclusions drawn, to the best of the author's knowledge and capability and within computer/network limits.

Method

Getting the usage of mail service in Top 1k, 10k, 100k, 1M Alexa Ranked Sites

On a Linux computer:

git clone https://github.com/ChenFocus/alexa-top-sites-mx
cd alexa-top-sites-mx
cd python_method

# make sure Python 3 and pip 3 are installed
pip install dnspython
python main.py 1000 | tee result_1000.txt # change `1000` to 10000 or more to get the other results
# wait for completion of the script

A cloud computer (server) was used to calculate the results for the above targets.

Further down the Alexa rank list, more irrelevant or personal domains may be recorded, which makes the results less accurate.

Getting the usage of mail service in Global University Dataset

On a Linux computer:

git clone https://github.com/ChenFocus/alexa-top-sites-mx
cd alexa-top-sites-mx
cd python_method

# make sure Python 3 and pip 3 are installed
pip install dnspython
python main.py 9773 | tee result.txt # Check whole dataset - usage of mail services
python us_uni.py 2730 | tee result_us_uni.txt # Check for US universities only
python uk_uni.py 161 | tee result_uk_uni.txt # Check for UK universities only
# wait for completion of the script

The original .json data was converted to .csv format through a series of replace, regex match and plugin operations in Visual Studio Code.
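The same conversion could also be scripted, which makes it repeatable. The sketch below is an alternative to the manual editor steps, not the method used in this research. It assumes each entry of the JSON file carries a `domains` list and an `alpha_two_code` country code; the helper names and the "Example University" entry are made up for illustration.

```python
import csv
import io
import json


def domains_from_json(text, country_code=None):
    """Collect every domain from the dataset, optionally for one country only."""
    domains = []
    for entry in json.loads(text):
        if country_code and entry.get("alpha_two_code") != country_code:
            continue
        domains.extend(entry.get("domains", []))
    return domains


def write_single_column_csv(domains, out_file):
    """Write one domain per row, the shape the crawler is described as reading."""
    writer = csv.writer(out_file, lineterminator="\n")
    for domain in domains:
        writer.writerow([domain])


# Small inline sample instead of the full downloaded file.
SAMPLE_JSON = """[
  {"name": "Lancaster University", "alpha_two_code": "GB", "domains": ["lancs.ac.uk"]},
  {"name": "Example University", "alpha_two_code": "US", "domains": ["example.edu"]}
]"""

buffer = io.StringIO()
write_single_column_csv(domains_from_json(SAMPLE_JSON, "GB"), buffer)
print(buffer.getvalue(), end="")  # prints "lancs.ac.uk"
```

Filtering on the country code is how the US-only and UK-only lists might be derived from the same file.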

Results

Alexa Dataset

| Dataset Name/Variation | Gmail Counts | Outlook Counts | Others Counts | N/A Counts | Total Counts | Total Valid Counts | Calculated Gmail Share | Calculated Outlook Share | Gmail Outcompete Outlook in Technology Sector By |
|---|---|---|---|---|---|---|---|---|---|
| Alexa Top 1M - 1k list | 305 | 83 | 524 | 88 | 1000 | 912 | 33.4% | 9.1% | 73% |
| Alexa Top 1M - 10k list | 2779 | 873 | 4961 | 1387 | 10000 | 8613 | 32.3% | 10.1% | 69% |
| Alexa Top 1M - 100k list | 22527 | 8461 | 51942 | 17070 | 100000 | 82930 | 27.2% | 10.2% | 62% |

Note that the result for the whole list of 1 million sites is currently being calculated on a cloud server and may be added in the future.
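The derived columns can be reproduced from the raw counts. In the sketch below, the valid total excludes N/A domains and each share is a count divided by that valid total. The formula for the last column, (leader - trailer) / leader, is not stated in this document; it is inferred because it reproduces the published figures. The function name `shares` is hypothetical.

```python
def shares(gmail, outlook, others, na):
    """Return (valid total, Gmail share %, Outlook share %, leader's margin %)."""
    total = gmail + outlook + others + na
    valid = total - na  # domains with no MX record are excluded
    gmail_share = 100 * gmail / valid
    outlook_share = 100 * outlook / valid
    leader, trailer = max(gmail, outlook), min(gmail, outlook)
    margin = 100 * (leader - trailer) / leader  # inferred formula, see note above
    return valid, gmail_share, outlook_share, margin


# Counts from the "Alexa Top 1M - 1k list" row.
valid, gmail_share, outlook_share, margin = shares(305, 83, 524, 88)
print(valid, f"{gmail_share:.1f}% {outlook_share:.1f}% {margin:.0f}%")
# prints: 912 33.4% 9.1% 73%
```

Under this reading, the last column says how far the trailing provider's domain count falls below the leader's, rather than how much larger the leader's share is.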

Global University Dataset

| Dataset Name/Variation | Gmail Counts | Outlook Counts | Others Counts | N/A Counts | Total Counts | Total Valid Counts | Calculated Gmail Share | Calculated Outlook Share | Outlook Outcompete Gmail in Education Sector By |
|---|---|---|---|---|---|---|---|---|---|
| World University list | 1296 | 2158 | 4691 | 1628 | 9773 | 8145 | 15.9% | 26.5% | 40% |
| World University list - US only | 298 | 923 | 876 | 273 | 2370 | 2097 | 14.2% | 44.0% | 68% |
| World University list - UK only | 8 | 63 | 65 | 25 | 161 | 136 | 5.9% | 46.3% | 87% |

Simple Conclusions

From the results, Gmail leads in the technology sector, holding a 33.4% share of the Top 1000 websites ranked by Alexa that have a mail service configured. The share is 32.3% and 27.2% when the top 10k and 100k sites are used as samples. In every available sample, Outlook's domain count is more than 60% below Gmail's.

Outlook outcompetes Gmail in the education sector, holding 44.0% and 46.3% of the market in the US and UK respectively; Gmail's domain count there is 68% and 87% below Outlook's.

Result Tables

Technology Sector - From Alexa Top 1M Dataset

| Dataset Name/Variation | Calculated Gmail Share | Gmail Outcompete Outlook in Technology Sector By |
|---|---|---|
| Alexa Top 1M - 1k list | 33.4% | 73% |
| Alexa Top 1M - 10k list | 32.3% | 69% |
| Alexa Top 1M - 100k list | 27.2% | 62% |

Education Sector - From Global University Dataset

| Dataset Name/Variation | Calculated Outlook Share | Outlook Outcompete Gmail in Education Sector By |
|---|---|---|
| World University list | 26.5% | 40% |
| World University list - US only | 44.0% | 68% |
| World University list - UK only | 46.3% | 87% |