Get you an ultimate lexer generator using Fable; port OCaml sedlex to FSharp, Python and more!

Overview

NOTE: currently we support interpreted mode and Python source code generation.

It's EASY to compile compiled_unit into source code for C#, F# and others. See CodeGen.Python.fs for how to write a custom backend.

Fable.Sedlex

Thanks to the Fable.Python compiler, this project managed to modify code from the amazing OCaml sedlex, implementing an ultimate lexer generator for (currently) both Python and F#.

The most impressive feature of sedlex is that Sedlex statically analyses regular expressions built with lexer combinators, correctly ordering/merging/optimizing the lexical rules. In short, sedlex is correct and efficient.

For Python users: the generated Python code is located in fable_sedlex directory, and you can directly copy them to your Python package to access such lexer generator.

For .NET(C#, F#) users: copying Sedlex.fs will give you such lexer generator in .NET.

For users in other programming languages: you might access compiled_unit and write a code generator.

python test.py { token_id = 2 lexeme = 123 line = 0 col = 3 span = 3 offset = 0 file = } ... ">
from fable_sedlex.sedlex import *
from fable_sedlex.code_gen_python import codegen_python
from fable_sedlex.code_gen import show_doc

digit = pinterval(ord('0'), ord('9'))

dquote = pchar('"')
backslash = pchar('\\')
dot = pchar('.')

exp = pseq([por(pchar('E'), pchar('e')), pplus(digit)])
flt = pseq([pstar(digit), dot, pplus(digit), popt(exp)])
integral = pseq([pplus(digit), popt(exp)])
string_lit = pseq([dquote, pstar(por(pcompl(dquote), pseq([backslash,  pany]))), dquote])
space = pchars(["\n", "\t", "\r", ' '])

EOF_ID = 0
cu = build(
    [
        (space, Lexer_discard),
        (flt, Lexer_tokenize(1)),
        (integral, Lexer_tokenize(2)),
        (string_lit, Lexer_tokenize(3)),
        (pstring("+"), Lexer_tokenize(4)),
        (pstring("+="), Lexer_tokenize(4)),
        (peof, Lexer_tokenize(EOF_ID))
    ], "my error")

f = compile_inline_thread(cu)
# we can also generate Python source code for equivalent lexer via:
#    `code = show_doc(codegen_python(cu))`
# see `test_codegen_py.py`, `generated.py` and `test_run_generated.py` 
# for more details


buf = from_ustring(r'123 2345 + += 2.34E5 "sada\"sa" ')

tokens = []
while True:
    try:
        x = f(buf)
    except Exception as e:
        print(e)
        raise
    print(x)
    if x is None:
        continue
    if x.token_id == EOF_ID:
        break
    tokens.append(x)

print(tokens)

> python test.py

{ token_id = 2
  lexeme = 123
  line = 0
  col = 3
  span = 3
  offset = 0
  file =  }
 ...
You might also like...
An universal linux port of deezer, supporting both Flatpak and AppImage

Deezer for linux This repo is an UNOFFICIAL linux port of the official windows-only Deezer app. Being based on the windows app, it allows downloading

This tool for beginner and help those people they gather information about Email Header Analysis, Instagram Information, Instagram Username Check, Ip Information, Phone Number Information, Port Scan

This tool for beginner and help those people they gather information about Email Header Analysis, Instagram Information, Instagram Username Check, Ip Information, Phone Number Information, Port Scan. This tool shows your hostname and public IP first, then user give input and according to option this tool work. This tool work diffrent Oprating system.

Rufus port to linux, writed on Python3

Rufus-for-Linux Rufus port to linux, writed on Python3 Программа будет иметь тот же интерфейс что и оригинал, и тот же функционал. Программа создается

Dump Data from FTDI Serial Port to Binary File on MacOS

Dump Data from FTDI Serial Port to Binary File on MacOS

Rufus port to linux, writed on Python3

Rufus-for-Linux Rufus port to linux, writed on Python3 Программа будет иметь тот же интерфейс что и оригинал, и тот же функционал. Программа создается

ASCII-Wordle - A port of the game Wordle to terminal emulators/CMD
ASCII-Wordle - A port of the game Wordle to terminal emulators/CMD

ASCII-Wordle A 'port' of Wordle to text-based interfaces A near-feature complete

This is a vscode extension with a Virtual Assistant that you can play with when you are bored or you need help..

VS Code Virtual Assistant This is a vscode extension with a Virtual Assistant that you can play with when you are bored or you need help. Its currentl

Gives you more advanced math in python.

AdvancedPythonMath Gives you more advanced math in python. Functions .simplex(args: {number}) .circ(args: {raidus}) .pytha(args: {leg_a + leg_2}) .slo

🤖🤖 Jarvis is an virtual assistant which can some tasks easy for you like surfing on web opening an app and much more... 🤖🤖

Jarvis 🤖 🤖 Jarvis is an virtual assistant which can some tasks easy for you like surfing on web opening an app and much more... 🤖 🤖 Developer : su

Releases(v0.1.0)
Owner
Taine Zhao
Taine Zhao
E5 自动续期

请选择跳转 新版本系统 (2021-2-9采用): 以后更新都在AutoApi,采用v0.0版本号覆盖式更新 AutoApi : 最新版 保留1到2个稳定的简易版,防止萌新大范围报错 AutoApi'X' : 稳定版1 ( 即本版AutpApiP ) AutoApiP ( 即v5.0,稳定版 ) —

95 Feb 15, 2021
SimCSE在中文任务上的简单实验

SimCSE 中文测试 SimCSE在常见中文数据集上的测试,包含ATEC、BQ、LCQMC、PAWSX、STS-B共5个任务。 介绍 博客:https://kexue.fm/archives/8348 论文:《SimCSE: Simple Contrastive Learning of Sente

苏剑林(Jianlin Su) 504 Jan 04, 2023
Larvamatch - Find your larva or punk match.

LarvaMatch Find your larva or punk match. UI TBD API (not started) The API will allow you to specify a punk by token id to find a larva match, and vic

1 Jan 02, 2022
Read and write life sciences file formats

Python-bioformats is a Python wrapper for Bio-Formats, a standalone Java library for reading and writing life sciences image file formats. Bio-Formats

CellProfiler 106 Dec 19, 2022
Lock a program and kills it indefinitely if it is started.

Kill By Lock Lock a program and kills it indefinitely if it is started. How start it? It' simple, you just have to double-click on the python file (.p

1 Jan 12, 2022
Linux Pressure Stall Information (PSI) Status App

Linux Pressure Stall Information (PSI) Status App psistat is a simple python3 program to display the PSIs and to capture/display exception events. psi

Joe D 3 Sep 18, 2022
A Lite Package focuses on making overwrite and mending functions easier and more flexible.

Overwrite Make Overwrite More flexible In Python A Lite Package focuses on making overwrite and mending functions easier and more flexible. Certain Me

2 Jun 15, 2022
Wrapper around anjlab's Android In-app Billing Version 3 to be used in Kivy apps

IABwrapper Wrapper around anjlab's Android In-app Billing Version 3 to be used in Kivy apps Install pip install iabwrapper Important ( Add these into

Shashi Ranjan 8 May 23, 2022
Python Projects is an Open Source to enhance your python skills

Welcome! 👋🏽 Python Project is Open Source to enhance your python skills. You're free to contribute. 🤓 You just need to give us your scripts written

Tristán 6 Nov 28, 2022
Open Source Management System for Botanic Garden Collections.

BotGard 3.0 Open Source Management System for Botanic Garden Collections built and maintained by netzkolchose.de in cooperation with the Botanical Gar

netzkolchose.de 1 Dec 15, 2021
Convert .1pux to .csv

1PasswordConverter Convert .1pux to .csv 1Password uses this new export format .1pux, I assume stands for 1 Password User eXport. As of right now, 1Pa

Shayne Hartford 7 Dec 16, 2022
A web application which you can search, buy or sell shares with current prices which provided by IEX.

CS50 - Stock Exchange A web application which you can search, buy or sell shares with current prices which provided by IEX. Table of Contents Setup St

1 May 28, 2022
Automatically load and dump your dataclasses 📂🙋

file dataclasses Installation By default, filedataclasses comes with support for JSON files only. To support other formats like YAML and TOML, filedat

Alon 1 Dec 30, 2021
LibreMind is a free meditation app made in under 24 hours. It has various meditation, breathwork, and visualization exercises.

libreMind Meditation exercises What is it? LibreMind is a free meditation app made in under 24 hours. It has various meditation, breathwork, and visua

1 May 24, 2022
Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells.

Identifies the faulty wafer before it can be used for the fabrication of integrated circuits and, in photovoltaics, to manufacture solar cells. The project retrains itself after every prediction, mak

Arun Singh Babal 2 Jul 01, 2022
Pygments is a generic syntax highlighter written in Python

Welcome to Pygments This is the source of Pygments. It is a generic syntax highlighter written in Python that supports over 500 languages and text for

1.2k Jan 06, 2023
Location of public benchmarking; primarily final results

CSL_public_benchmark This repo is intended to provide a periodically-updated, public view into genome sequencing benchmarks managed by HudsonAlpha's C

HudsonAlpha Institute for Biotechnology 15 Jun 13, 2022
TallerStereoVision Convencion Python Chile 2021

TallerStereoVision Convencion Python Chile 2021 Taller Stereo Vision & Python PyCon.cl 2021 Instalación Se recomienta utilizar Virtual Environment pyt

2 Oct 20, 2022
A subleq VM/interpreter created by me for no reason

What is Dumbleq? Dumbleq is a dumb Subleq VM/interpreter implementation created by me for absolutely no reason at all. What is Subleq? If you haven't

Phu Minh 2 Nov 13, 2022
Catalogue CRUD Application

This Python program creates a relational SQL database hosted on the Snowflake platform, then opens a CRUD GUI to manipulate and view the data. In this application, it is used as a book catalogue. CUR

0 Dec 13, 2022