Comment Webpage Screenshot

Overview

Comment Webpage Screenshot is a GitHub Action that captures screenshots of web pages and HTML files located in the repository, uploads them to an image upload service, and posts the screenshots as a comment on the pull request that triggered the action.

Note: This action only works on pull requests.

Workflow inputs

These are the inputs that can be provided to the workflow.

upload_to (optional, default: github_branch)
  Image upload service name. Options: github_branch, imgur. See "Available Image Upload Services" below for more details.

capture_changed_html_files (optional, default: yes)
  Enable or disable screenshot capture for HTML files changed on the pull request. Options: yes, no.

capture_html_file_paths (optional, default: null)
  Comma-separated paths to the HTML files to be captured (example: /pages/index.html, about.html).

capture_urls (optional, default: null)
  Comma-separated URLs to be captured (example: https://dev.example.com, https://dev.example.com/about.html).

github_token (optional, default: github.token)
  GITHUB_TOKEN provided by the workflow run, or a Personal Access Token (PAT).

Example Workflow

name: Comment Webpage Screenshot

on:
  pull_request:
    types: [opened, reopened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2

      - name: Comment Webpage Screenshot
        uses: saadmk11/comment-webpage-screenshot@v0.5
        with:
          # Optional, the action will create a new branch and
          # upload the screenshots to that branch.
          upload_to: github_branch  # Or, imgur
          # Optional, the action will capture screenshots
          # of all the changed html files on the pull request.
          capture_changed_html_files: yes  # Or, no
          # Optional, the action will capture screenshots
          # of the html files provided in this input.
          # Comma separated file paths are accepted
          capture_html_file_paths: "/pages/index.html, about.html"
          # Optional, the action will capture screenshots
          # of the URLs provided in this input.
          # You can add URLs of your development server or
          # run the server in the previous step
          # and add that URL here (For Example: http://172.17.0.1:8000/).
          # Comma separated URLs are accepted.
          capture_urls: "https://dev.example.com, https://dev.example.com/about.html"
          # Optional
          github_token: ${{ secrets.MY_GITHUB_TOKEN }}
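
Since every input is optional, a minimal configuration can rely entirely on the defaults. A minimal sketch (pinned to the v0.5 release for illustration):

name: Comment Webpage Screenshot

on:
  pull_request:
    types: [opened, reopened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      # With no inputs set, the action captures the changed HTML files
      # on the pull request and uploads the screenshots to a GitHub
      # branch (the github_branch default).
      - uses: saadmk11/comment-webpage-screenshot@v0.5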

Run Local Development Server Inside the Workflow and Capture Screenshots

If you want to run your application's development server inside the workflow and capture screenshots from that server, you can structure the workflow YAML file like this:

Using Docker to Run The Application:

name: Comment App Screenshot

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    # Checkout your pull request code
    - uses: actions/checkout@v2
    
    # Build Development Docker Image
    - run: docker build -t local .
    # Run the Docker Image
    # You need to run this detached (-d)
    # so that the action is not blocked
    # and can move on to the next step
    # You need to publish the port on the host (-p 8000:8000)
    # So that it is reachable outside the container
    - run: docker run --name demo -d -p 8000:8000 local
    # Sleep for a few seconds to let the container start
    - run: sleep 10
    
    # Run Screenshot Comment Action
    - name: Run Screenshot Comment Action
      uses: saadmk11/comment-webpage-screenshot@v0.5
      with:
        upload_to: github_branch
        capture_changed_html_files: no
        # You must use `172.17.0.1` if you are running
        # the application locally inside the workflow
        # Otherwise the container which will run this action 
        # will not be able to reach the application
        capture_urls: 'http://172.17.0.1:8000/, http://172.17.0.1:8000/admin/login/'
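
Instead of a fixed sleep, the workflow can poll the published port until the server responds. A minimal sketch of a replacement for the sleep step, assuming the application answers requests on / (curl ships with the ubuntu-latest runner):

    # Poll (for up to 60 seconds) until the containerized app responds,
    # rather than sleeping for a fixed duration
    - run: timeout 60 sh -c 'until curl -sf http://172.17.0.1:8000/ > /dev/null; do sleep 1; done'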

Directly Running The Application:

name: Comment App Screenshot

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Use Node.js
        uses: actions/setup-node@v2
        with:
          node-version: '16.x'
      # Use `nohup` to run the node app
      # so that the execution of the next steps is not blocked
      - run: nohup node main.js &
      # Sleep for a few seconds to let the server start
      - run: sleep 5

      # Run Screenshot Comment Action
      - name: Run Screenshot Comment Action
        uses: saadmk11/comment-webpage-screenshot@v0.5
        with:
          upload_to: imgur
          capture_changed_html_files: no
          # You must use `172.17.0.1` if you are running
          # the application locally inside the workflow
          # Otherwise, the container which will run this action 
          # will not be able to reach the application
          capture_urls: 'http://172.17.0.1:8081'

Important Note:

If you run the application server inside the GitHub Actions Workflow:

  • You need to run it in the background or detached mode.

  • If you are using Docker to run your application server, you need to publish the port to the host (for example: -p 8000:8000).

  • You cannot use a localhost URL in capture_urls. You need to use 172.17.0.1 so that the comment-webpage-screenshot action can send requests to the server running locally. So, http://localhost:8081 becomes http://172.17.0.1:8081.

Examples including application code can be found here: Example Projects

Run External Development Server and Capture Screenshots

If your application has an external development server that deploys changes on every pull request, you can add the URLs of that server to the capture_urls input. This lets the action capture screenshots from the external development server after each deployment.

Example:

name: Comment App Screenshot

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    # Run Screenshot Comment Action
    - name: Run Screenshot Comment Action
      uses: saadmk11/comment-webpage-screenshot@v0.5
      with:
        upload_to: github_branch
        capture_changed_html_files: no
        # Add your external development server URLs
        capture_urls: 'https://dev.example.com, https://dev.example.com/about.html'

Capture Screenshots for Static HTML Pages

If your repository contains only static files and does not require a server, you can simply list the paths of the HTML files you want screenshots of.

Example:

name: Comment Static Site Screenshot

on:
  pull_request:
    types: [opened, synchronize]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    # Run Screenshot Comment Action
    - name: Run Screenshot Comment Action
      uses: saadmk11/comment-webpage-screenshot@v0.5
      with:
        upload_to: imgur
        # Capture Screenshots of Changed HTML Files
        capture_changed_html_files: yes
        # Comma separated paths to any other HTML files
        capture_html_file_paths: "/pages/index.html, about.html"

Available Image Upload Services

As GitHub does not allow uploading images to a comment through the API, the action relies on other services to host the screenshots.

These are the currently available image upload services.

Imgur

If the value of the upload_to input is imgur, the screenshots will be uploaded to Imgur. Keep in mind that the uploaded screenshots will be public and anyone can see them. Imgur also limits how many images can be uploaded per hour; refer to Imgur's Rate Limits documentation for more details. This option is suitable for small open source repositories.

Please also refer to Imgur's terms of service.

GitHub Branch (Default)

If the value of the upload_to input is github_branch, the screenshots will be pushed to a branch that the action creates in your repository. The comments will reference the images pushed to this branch.

This is suitable for open source and private repositories.
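
The github_branch mode pushes commits and posts a pull request comment, so in repositories where the default GITHUB_TOKEN permissions are restricted you may need to grant write access explicitly. A sketch of such a permissions block (an assumption based on standard GitHub Actions token scoping, not something documented by this action):

jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: write        # push the screenshots branch
      pull-requests: write   # comment on the pull request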

If you want to add or use a different image upload service, feel free to create a new issue or pull request.

Examples

You can find some example use cases of this action here: Example Projects

Here are some comments created by this action on the Example Project:

(Screenshots: example comments posted by the action on a saadmk11/django-demo pull request, captured 2021-11-18.)

License

The code in this project is released under the GNU GENERAL PUBLIC LICENSE Version 3.
