jupiterweb data extractor

Overview

The program consists of two files:

  • One containing only the supporting classes that represent the units and the courses
  • Another that runs the process of extracting the data from jupiterweb

In essence, the program sends a GET request to the jupiterweb page that lists the units, and from that page it extracts which units are active along with some information about them (name and code).
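The unit-listing step can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: the names `UnitLinkParser`, `parse_units` and `fetch_html` are hypothetical, and so is the idea that each unit appears as a link whose query string carries the unit code in a parameter (called `codcg` here purely as a guess). The sample HTML is made up, and the real page layout may differ.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse
from urllib.request import urlopen


class UnitLinkParser(HTMLParser):
    """Collects {code: name} from <a> tags whose href carries a unit code."""

    def __init__(self, code_param="codcg"):  # parameter name is an assumption
        super().__init__()
        self.code_param = code_param
        self.units = {}
        self._current_code = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        codes = parse_qs(urlparse(href).query).get(self.code_param)
        if codes:
            # Start capturing the link text as the unit name.
            self._current_code = codes[0]
            self._text = []

    def handle_data(self, data):
        if self._current_code is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current_code is not None:
            # Collapse runs of whitespace in the captured name.
            name = " ".join("".join(self._text).split())
            self.units[self._current_code] = name
            self._current_code = None


def parse_units(html, code_param="codcg"):
    """Return a {unit_code: unit_name} dict parsed from the listing HTML."""
    parser = UnitLinkParser(code_param)
    parser.feed(html)
    parser.close()
    return parser.units


def fetch_html(url, encoding="utf-8"):
    """Plain GET request; the URL and encoding must be supplied by the caller."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode(encoding, errors="replace")


# Made-up sample standing in for the real listing page.
sample = """
<table>
  <tr><td><a href="lista?codcg=45">Instituto de Exemplo</a></td></tr>
  <tr><td><a href="lista?codcg=3"> Escola  de Exemplo </a></td></tr>
  <tr><td><a href="ajuda.html">Ajuda</a></td></tr>
</table>
"""
units = parse_units(sample)
```

Keeping the parsing separate from the GET request means the parser can be exercised against a saved copy of the page without touching the network.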

From that information, the program extracts the courses each unit currently teaches and, for discontinued courses, the ones it taught in the past. Each course has basic information (code, name, activation date and deactivation date) and complementary information, which may or may not be mandatory (credits, method, instructors, type of recovery exam, etc.).
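One way to model the split between basic and complementary information is a pair of small classes. This is a sketch only: the class names `Course` and `Unit`, the `add_course` helper, and the choice of which fields default to `None` are assumptions made for illustration; the field names follow the keys of the JSON output, and the codes in the example are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Course:
    # Basic information.
    codigo: str
    nome: str
    ativacao: str
    # Assumed optional here: an active course may have no deactivation date.
    desativacao: Optional[str] = None
    # Complementary information, which may be missing for a given course.
    credito_aula: Optional[str] = None
    credito_trabalho: Optional[str] = None
    tipo: Optional[str] = None
    docentes: Optional[str] = None
    metodo: Optional[str] = None
    norma_de_recuperacao: Optional[str] = None


@dataclass
class Unit:
    code: str
    nome: str
    # Courses indexed by course code, mirroring the JSON layout.
    disciplinas: Dict[str, Course] = field(default_factory=dict)

    def add_course(self, course: Course) -> None:
        self.disciplinas[course.codigo] = course


# Invented example values.
unit = Unit(code="00", nome="Unidade de Exemplo")
unit.add_course(Course(codigo="ABC0001", nome="Curso de Exemplo", ativacao="01/01/2020"))
```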

The main code has a toJson method that exports the extracted data for each unit and its courses to a file in JSON format.
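A minimal sketch of such an export, assuming the extracted data is already held as nested dictionaries keyed by unit code and course code. The standalone function `to_json` below is hypothetical and is not the repository's `toJson` method; the sample values are invented.

```python
import json
import os
import tempfile


def to_json(units, path):
    """Write the nested unit/course mapping to `path` as UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as fh:
        # ensure_ascii=False keeps accented Portuguese text readable in the file.
        json.dump(units, fh, ensure_ascii=False, indent=2)


# Invented sample data.
units = {
    "00": {
        "nome": "Unidade de Exemplo",
        "code": "00",
        "disciplinas": {
            "ABC0001": {"nome": "Introdução de Exemplo", "codigo": "ABC0001"},
        },
    },
}

path = os.path.join(tempfile.mkdtemp(), "units.json")
to_json(units, path)

# Read the file back to confirm the round trip.
with open(path, encoding="utf-8") as fh:
    restored = json.load(fh)
```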

The format of the JSON file is:

{
  unit_code: {
    nome: "unit_name",
    code: "unit_code",
    disciplinas: {
      course_code: {
        nome: "course_name",
        codigo: "course_code",
        ativacao: "activation_date",
        desativacao: "deactivation_date",
        credito_aula: "number_of_credits",
        credito_trabalho: "number_of_credits",
        tipo: "semestral/anual",
        objetivos: "course_objectives",
        docentes: "instructors",
        programa: "course_syllabus",
        programa_resumido: "course_summary_syllabus",
        metodo: "evaluation_method",
        criterio: "approval_criterion",
        norma_de_recuperacao: "type_of_recovery_exam",
        bibliografia: "course_bibliography"
      },
      #other_courses_of_the_unit#
    }
  },
  #other_units#
}
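A file in this layout can be consumed with the standard `json` module. The sketch below assumes the real file is valid JSON with quoted keys; the helper names `courses_per_unit` and `find_course` are hypothetical, and the sample data is invented.

```python
import json

# Invented sample following the layout above (only a few keys shown).
sample = """
{
  "00": {
    "nome": "Unidade de Exemplo",
    "code": "00",
    "disciplinas": {
      "ABC0001": {"nome": "Curso A", "codigo": "ABC0001"},
      "ABC0002": {"nome": "Curso B", "codigo": "ABC0002"}
    }
  },
  "01": {
    "nome": "Outra Unidade",
    "code": "01",
    "disciplinas": {}
  }
}
"""


def courses_per_unit(data):
    """Map each unit name to the number of courses listed under it."""
    return {unit["nome"]: len(unit["disciplinas"]) for unit in data.values()}


def find_course(data, course_code):
    """Return (unit_code, course_dict) for a course code, or None if absent."""
    for unit_code, unit in data.items():
        course = unit["disciplinas"].get(course_code)
        if course is not None:
            return unit_code, course
    return None


data = json.loads(sample)
counts = courses_per_unit(data)
match = find_course(data, "ABC0002")
```

Because courses are keyed by their code inside each unit, a lookup is a dictionary access per unit rather than a scan over every course.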

The repository includes, as an example, a file extracted from jupiterweb on 02/12/2021 (DD/MM/YYYY). Keep in mind that changes to the jupiterweb layout may affect the performance and correct operation of the algorithm, since the data is obtained through web scraping and web crawling.

Owner
Bruno Aricó
Comp. Scientist and enthusiast in Hardware and Aviation