Can SQL Do AI Too? Yes! MLOps Meetup V3 Review: OpenMLDB + SQLFlow + Byzer
2022-07-18 09:21:00 【51CTO】
On July 10, the third 「MLOps Meetup」 hosted by the Xingce community was held online. The event was live-streamed by 51CTO, OSCHINA, CSDN, SegmentFault, Sheshuo, and the Cloud Native Community, and drew more than 15,000 viewers in total.
The event centered on "how to use SQL to implement the whole industrial machine learning process." Tan Zhongyi, initiator of the Xingce community, opened the event and introduced the background behind the theme "SQL boys can do AI too." Chen Dihao, PMC member of the open source project OpenMLDB and platform architect at 4Paradigm, explained how to do feature engineering with SQL. Wuyi, a core developer and contributor to Baidu PaddlePaddle, EDL, SQLFlow, and Couler, shared how to do model training and prediction with SQL. Zhu Hailin, PMC member of the Byzer community and technical partner at Kyligence, explained how to complete the end-to-end machine learning process with SQL.
This article summarizes the key points shared by the four speakers. The video replays are linked at the end of the article. For the slides, follow the official account 「Xingce open source」 and reply 「0710」.
Part 1: SQL Boys Can Do AI Too (Tan Zhongyi)

Why is SQL still popular today?
SQL has been around since 1978 and is still very popular in the industry, for three reasons. First, SQL is a declarative programming language: you only express the result you want and do not have to care about how it is computed. Second, SQL is standardized: SQL that conforms to a standard (ANSI and others) can run on different machines and in different environments and produce the same result. Finally, SQL's biggest advantage is simplicity: engineers can learn and use it easily.
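To make the "declarative" point concrete, here is a minimal sketch that computes the same per-customer total twice, once with an imperative loop and once with a single SQL statement. It uses Python's built-in `sqlite3` module; the table name, column names, and sample rows are made up for illustration and do not come from the talk.

```python
import sqlite3

# Hypothetical sample data: (customer, amount) order records.
orders = [("alice", 30.0), ("bob", 12.5), ("alice", 20.0), ("carol", 7.5)]

# Imperative version: spell out how to loop over rows and accumulate totals.
totals_imperative = {}
for customer, amount in orders:
    totals_imperative[customer] = totals_imperative.get(customer, 0.0) + amount

# Declarative version: state the desired result and let the engine decide
# how to scan, group, and aggregate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()
totals_declarative = dict(rows)

# Both approaches produce the same totals.
print(totals_declarative == totals_imperative)
```

The SQL version says nothing about iteration order or intermediate state, which is what lets the same statement be handed to a different engine unchanged.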
Uses of SQL
SQL has a wide range of uses. In traditional enterprise business systems, CRUD applications are built on MySQL, Microsoft SQL Server, and similar databases. In big data analytics there are Spark SQL, Hive SQL, and others. Beyond that, SQL can also be applied to machine learning: it can be used for feature engineering and model training, and even for end-to-end machine learning. This Meetup introduced how to do all of these with SQL.
Part 2: OpenMLDB, a SQL-Centric Feature Platform with Online-Offline Consistency (Chen Dihao)

Data and feature challenges in putting AI into production
According to Gartner survey statistics, 95% of the time and effort in the AI field is now spent on data. Supplying AI with data and features correctly and efficiently has become a new challenge on the data side. The whole process of taking a machine learning application from development to production (MLOps) can be divided into offline development and online serving, and both include three steps: DataOps, FeatureOps, and ModelOps. Feature problems are especially thorny. For example, verifying online-offline consistency in FeatureOps (feature engineering) makes engineering delivery expensive. To solve this problem, the top 1% of enterprises spend thousands of hours building their own platforms, and some other enterprises purchase expensive SaaS tools and services. OpenMLDB offers another option: with SQL at its core, it provides a low-cost, efficient, production-grade feature computing platform with online-offline consistency.
OpenMLDB: develop with SQL, then go online
OpenMLDB is an open source machine learning database that provides a feature platform with online-offline consistency. Its overall architecture is shown in the figure below. Like the tool chain needed to put an AI application into production, the architecture is divided into an offline part and an online part: a batch SQL engine based on Spark++ and a real-time SQL engine based on a self-developed time-series database. The middle layer provides a SQL-based consistent execution plan generator. In short, with OpenMLDB, a developer who can write SQL can go from development to launch in just three steps.

A SQL-centric development and management experience
With OpenMLDB, SQL boys can do machine learning too. As shown in the figure below, the OpenMLDB command line resembles an ordinary SQL command line. After entering the OpenMLDB CLI, users can run SQL statements directly to compute offline features, and use the Deploy statement to bring the SQL plan online. Once it is online, clients can send online requests. The whole experience is based on SQL, which lowers the barrier to feature engineering for machine learning.
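The idea of online-offline consistency can be sketched without OpenMLDB itself: define a feature once in SQL, then evaluate that same definition both as a batch job over all rows and for a single incoming request. The example below is only an illustration using Python's built-in `sqlite3` (window functions need SQLite 3.25 or later). It is not OpenMLDB's API or SQL dialect, and the `txns` table, the column names, and the sample data are assumptions made for the sketch.

```python
import sqlite3

# One feature definition shared by the offline and online paths:
# the sum of a card's amounts over the current and two preceding transactions.
FEATURE_SQL = """
SELECT card, ts,
       SUM(amount) OVER (
           PARTITION BY card ORDER BY ts
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       ) AS amount_sum_last3
FROM txns
"""


def connect(rows):
    """Create an in-memory database holding the given transactions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE txns (card TEXT, ts INTEGER, amount REAL)")
    conn.executemany("INSERT INTO txns VALUES (?, ?, ?)", rows)
    return conn


# Hypothetical data: stored history plus one incoming request row.
history = [("c1", 1, 10.0), ("c1", 2, 20.0), ("c1", 3, 5.0), ("c2", 1, 7.0)]
request = ("c1", 4, 40.0)

# Offline path: batch-compute the feature for every row (e.g. training data).
offline = connect(history + [request])
batch = offline.execute(FEATURE_SQL + " ORDER BY card, ts").fetchall()
offline.close()
offline_value = [r[2] for r in batch if (r[0], r[1]) == (request[0], request[1])][0]

# Online path: evaluate the same definition only for the request row.
online = connect(history + [request])
online_value = online.execute(
    "SELECT amount_sum_last3 FROM (" + FEATURE_SQL + ") WHERE card = ? AND ts = ?",
    (request[0], request[1]),
).fetchone()[0]
online.close()

# The two paths agree because they share one feature definition.
print(offline_value, online_value)
```

Here both paths run on the same engine, so agreement is trivial. The harder problem the talk describes is getting the same guarantee when the batch and real-time engines are different systems, which is the role of a shared execution plan generator.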
OpenMLDB GitHub( https://github.com/4paradigm/OpenMLDB)
Part 3: From SQLFlow to "3-Flow" (Wuyi)

SQLFlow
SQLFlow is a compiler that compiles a SQL program into a workflow that runs on Kubernetes. The input is a SQL program written in an extended SQL syntax that supports AI jobs, including training, prediction, model evaluation, model explanation, custom jobs, and mathematical programming. The output is an Argo workflow that runs on a distributed Kubernetes cluster. SQLFlow supports various database systems, such as MySQL, MariaDB, TiDB, Hive, and MaxCompute, and many machine learning toolkits, such as TensorFlow, Keras, and XGBoost.
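As a rough illustration of what "extended SQL syntax" means, the sketch below takes a training statement written in the `TO TRAIN ... WITH ... LABEL ... INTO ...` style used in SQLFlow's documentation and splits it into the standard query and the training clause. The regular expression and the `split_train_statement` helper are hypothetical and written for this article. SQLFlow's real parser and code generator are far more complete, and nothing here calls SQLFlow.

```python
import re

# A statement in the style of SQLFlow's extended syntax (sample names only).
STATEMENT = """
SELECT * FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
LABEL class
INTO sqlflow_models.my_dnn_model;
"""

# Toy pattern: standard SELECT first, then the training extension clauses.
PATTERN = re.compile(
    r"^(?P<select>.+?)\s+TO\s+TRAIN\s+(?P<estimator>\S+)"
    r"\s+WITH\s+(?P<attrs>.+?)"
    r"\s+LABEL\s+(?P<label>\S+)"
    r"\s+INTO\s+(?P<model>[^\s;]+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def split_train_statement(statement):
    """Split an extended statement into its query and training parts."""
    match = PATTERN.match(statement.strip())
    if match is None:
        raise ValueError("not a TO TRAIN statement")
    # Collapse internal whitespace so each part reads as a single line.
    return {key: " ".join(value.split()) for key, value in match.groupdict().items()}


plan = split_train_statement(STATEMENT)
for key, value in plan.items():
    print(f"{key}: {value}")
```

The split mirrors the division of labor described above: the `select` part can be sent to the database as ordinary SQL, while the remaining fields describe the training job that a compiler would turn into workflow steps.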
Why use SQLFlow
When AI computation is expressed in SQL, the user does not need to know the computational details: defining which model to train and with which parameters is enough to run the whole training job. SQLFlow's goal is to lower the barrier to building machine learning and AI applications overall. Because declarative and imperative styles coexist in it, SQLFlow can define the flow of the whole model and its data at a higher level.
Advantages of SQLFlow
As shown in the figure below, compared with Microsoft SQL Server, Teradata SQL for DL, and Google BigQuery, SQLFlow has four advantages: it works with mainstream database systems; it adapts to the SQL dialect of each database system; it defines extended syntax for training, prediction, explanation, and linear programming; and it supports a model library and custom models.

For the design details of how SQLFlow trains DL models such as DNNs, tunes parameters automatically, trains XGBoost models, runs predictions, explains models (SHAP), uses models from the model library or custom models, solves linear programming problems, and runs custom programs, see the full video at the end of the article.
Implementation in MLOps
"3-Flow" is the integration of KubeFlow, MLFlow, and SQLFlow, and it can cover most of the MLOps process. At present, the project uses a Helm chart to quickly install and configure Kubeflow in other environments, and the integrated deployment of SQLFlow and its corresponding functions has been completed. In the future, modules such as the ParaFlow SDK will be added to keep improving its MLOps capabilities.

SQLFlow GitHub (https://github.com/sql-machine-learning/sqlflow)
Part 4: OpenMLDB + Byzer, Completing the End-to-End Machine Learning Process with SQL (Zhu Hailin)

What is Byzer?
Byzer is a cloud-native, SQL-like language for the Data + AI domain. At present, Byzer supports ETL, data mining and analysis, machine learning modeling, and model deployment. With Byzer you can easily complete the whole machine learning pipeline: loading data, processing data, model training (with support for multiple parameter groups, model versions, and so on), batch prediction, model evaluation, and deployment as an API service. The architecture is as follows.

Why Byzer
Today the barrier to big data and AI is still high. Fragmented platforms, fragmented languages, difficult maintenance, and difficulty of use deter many people. Byzer helps enterprises and individuals mine the value of their data at the lowest cost and greatly lowers the barrier to Data + AI.
Byzer + OpenMLDB
Byzer can already complete the whole machine learning pipeline with almost no programming. For feature engineering, however, and especially real-time online feature computation, Byzer still has weaknesses. Introducing OpenMLDB addresses this weakness. Using the Kaggle taxi trip duration prediction task as an example, the event demonstrated in detail how to combine OpenMLDB and Byzer to build a complete machine learning application, covering loading data from the lakehouse, feature computation, model training, deployment, and serving end-to-end predictions through a REST API. See the full video at the end of the article.
In short, when OpenMLDB and Byzer are used together, the whole feature computation is handed to OpenMLDB through the FeatureStoreExt plug-in. The online part requests OpenMLDB in real time through a REST function to get feature values. As a streaming engine, Byzer can normalize data and write it to Kafka, from which it flows into OpenMLDB. This shows that Byzer is highly extensible: algorithm engineers, data engineers, analysts, and others can connect to more ecosystems from within a notebook.

Byzer GitHub (https://github.com/byzer-org/byzer-lang)
Summary
With the continuing development of artificial intelligence and the iterative evolution of the technology, machine learning is no longer out of reach. Using a simpler, easier, low-barrier language such as SQL, together with open source tools such as OpenMLDB, SQLFlow, and Byzer, engineers can quickly implement the whole process of a machine learning application and solve problems at every stage of machine learning. We believe that putting AI into production will keep getting simpler and faster. For the implementation details of each project, watch the full video replays of this event. We hope you find them useful.
Finally, you are welcome to keep following MLOps and to join the MLOps enthusiast exchange group to discuss these topics with us.
Video replays:
- SQL Can Do Industrial AI Too! (Tan Zhongyi)
- https://www.bilibili.com/video/BV14g411f7Ut/
- OpenMLDB, a SQL-Centric Feature Platform with Online-Offline Consistency (Chen Dihao)
- https://www.bilibili.com/video/BV1eg411f7sC/
- From SQLFlow to "3-Flow": AI Modeling with SQL (Wuyi)
- https://www.bilibili.com/video/BV1hT411g7hz/
- OpenMLDB + Byzer: Completing the Whole End-to-End Machine Learning Process with SQL! (Zhu Hailin)
- https://www.bilibili.com/video/BV1te4y1R7yn/