ETL tool: simple data migration with Kettle
2022-07-19 05:29:00 【Small mayfly star Wei】
1. Kettle concepts
Kettle is an open-source ETL tool written in pure Java. It runs on Windows, Linux, and Unix, needs no installation (just unpack it), and extracts data efficiently and stably.
Kettle is an ETL toolset that lets you manage data from different databases. Its graphical environment lets you describe what you want to do, not how you want to do it.
Kettle has two kinds of script files, transformations and jobs. A transformation performs the basic data transformation; a job controls the overall workflow.
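As a rough illustration of the two script types, the sketch below picks a command-line launcher by file extension. It assumes the usual Kettle conventions, which the article itself does not mention: transformations are saved as .ktr files and run with Pan, jobs are saved as .kjb files and run with Kitchen. The paths `/opt/kettle` and `/etl/migrate_user.ktr` are hypothetical.

```python
from pathlib import PurePosixPath

# Assumption: Pan runs transformations (.ktr) and Kitchen runs jobs (.kjb);
# both launchers sit in the unpacked Kettle directory on Linux.
RUNNERS = {".ktr": "pan.sh", ".kjb": "kitchen.sh"}


def build_command(kettle_home, script, level="Basic"):
    """Return the argument list for running a Kettle script from a shell."""
    suffix = PurePosixPath(script).suffix.lower()
    if suffix not in RUNNERS:
        raise ValueError(f"not a Kettle transformation or job: {script}")
    launcher = str(PurePosixPath(kettle_home) / RUNNERS[suffix])
    # -file names the script to run, -level sets the logging level.
    return [launcher, f"-file={script}", f"-level={level}"]


# Hypothetical paths, for illustration only.
command = build_command("/opt/kettle", "/etl/migrate_user.ktr")
```

The resulting list could be passed to `subprocess.run` on a machine where Kettle is unpacked at that location.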
2. Installation and startup
Kettle is portable software: unpack the archive and it is ready to use. On Linux it is started through the .sh files; on Windows the corresponding files are .bat.
If the spoon.bat window flashes and closes without opening, the cause is the memory settings.
Modify this line in spoon.bat:
if "%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS="-Xms512m" "-Xmx512m" "-XX:MaxPermSize=256m"
3. Common components
- Database connection
- SQL script
- Table input (source table)
- Field selection
- Table output (writes to the target table)
- Insert / update
Kettle has many more components, of course; these are the only ones I have used so far. Kettle also supports files as input and output.
4. Specific cases
A simple data migration between pgsql on a server and a local sqlserver.
pgsql: result (user_number, user_name) -> sqlserver: demo_user (user_id, user_name)
Overall structure:
The SQL script first clears the demo_user table in sqlserver.
Table input: I selected all fields of the pgsql result table.
Field selection: I kept user_number and user_name and renamed them to user_id and user_name.
Table output: choose sqlserver: demo_user.
Connect components by holding Shift and dragging with the left mouse button.
The standalone Table input 2 at the bottom of the figure is only there to check that the migration worked; querying the database directly does the same job.
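The four steps above can be sketched outside Kettle as plain code. This is only an illustration of the data flow, not how Kettle works internally: two in-memory SQLite databases stand in for pgsql and sqlserver, and the sample rows and the extra `note` column are made up.

```python
import sqlite3

# Assumption: in-memory SQLite databases stand in for the server pgsql
# (source) and the local sqlserver (target).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.execute("CREATE TABLE result (user_number INTEGER, user_name TEXT, note TEXT)")
source.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [(1, "alice", "x"), (2, "bob", "y")],  # made-up sample rows
)
target.execute("CREATE TABLE demo_user (user_id INTEGER, user_name TEXT)")
target.execute("INSERT INTO demo_user VALUES (99, 'stale')")  # leftover row


def migrate(src, dst):
    """Mirror the steps: SQL script, table input, field selection, table output."""
    # SQL script: clear the target table first.
    dst.execute("DELETE FROM demo_user")
    # Table input: read every field of the source table.
    cursor = src.execute("SELECT * FROM result")
    columns = [d[0] for d in cursor.description]
    # Field selection: keep two fields and rename user_number to user_id.
    rename = {"user_number": "user_id", "user_name": "user_name"}
    rows = [
        {rename[c]: v for c, v in zip(columns, row) if c in rename}
        for row in cursor
    ]
    # Table output: write the selected fields into the target table.
    dst.executemany(
        "INSERT INTO demo_user (user_id, user_name) VALUES (:user_id, :user_name)",
        rows,
    )
    dst.commit()
    return len(rows)


migrated = migrate(source, target)
# Counterpart of "Table input 2": read the target back to check the result.
check = target.execute(
    "SELECT user_id, user_name FROM demo_user ORDER BY user_id"
).fetchall()
```

Reading the target back at the end plays the same role as the standalone Table input 2 in the figure.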
4.1 Database connection
For sqlserver you need to download the driver JAR manually and put it in Kettle's lib directory: https://sourceforge.net/projects/jtds/files/latest/download
Here are the connections for the server pgsql and the local sqlserver:
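For reference, the two connections correspond to JDBC URLs of roughly the following shape. This is a small sketch, assuming the jTDS driver linked above for sqlserver and the standard PostgreSQL JDBC driver for pgsql; the host and database names are hypothetical, and in Kettle you would normally enter host, port, and database name in the connection dialog instead of typing a URL.

```python
# URL templates and default ports for the two drivers.
# Assumption: jTDS for SQL Server, the standard PostgreSQL JDBC driver for pgsql.
TEMPLATES = {
    "postgresql": ("jdbc:postgresql://{host}:{port}/{database}", 5432),
    "sqlserver": ("jdbc:jtds:sqlserver://{host}:{port}/{database}", 1433),
}


def jdbc_url(kind, host, database, port=None):
    """Build a JDBC URL, falling back to the driver's default port."""
    template, default_port = TEMPLATES[kind]
    return template.format(host=host, port=port or default_port, database=database)


# Hypothetical host and database names, for illustration only.
pg_url = jdbc_url("postgresql", "db.example.com", "reports")
mssql_url = jdbc_url("sqlserver", "localhost", "demo")
```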

4.2 SQL script
Clear the rc_user table in pgsql so that the data migrated from sqlserver is easy to check afterwards.
4.3 Table input
Database connection -> table -> SQL statement -> preview
From the server pgsql to the local sqlserver
Only user_number and user_name are migrated
pgsql: result (user_number, user_name) -> sqlserver: demo_user (user_id, user_name)

4.4 Field selection
Get the selected fields -> modify or remove
You can also filter the fields directly in the SQL statement, which is easier to maintain.
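Filtering and renaming in the SQL itself means the Table input query already produces the target column names. A minimal sketch, assuming the same table and field names as in this example; the helper `select_with_aliases` is hypothetical and only builds the query text.

```python
def select_with_aliases(table, mapping):
    """Build a SELECT that keeps only the mapped columns and renames them.

    Identifiers are inserted as-is, so this is only suitable for
    trusted, hard-coded table and column names.
    """
    parts = [
        src if src == dst else f"{src} AS {dst}"
        for src, dst in mapping.items()
    ]
    return f"SELECT {', '.join(parts)} FROM {table}"


query = select_with_aliases(
    "result", {"user_number": "user_id", "user_name": "user_name"}
)
```

With a query like this in Table input, the separate Field selection step is no longer needed for this case.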
4.5 Table output
Select the target database and table for the output.

Click Run to complete the data migration.
pgsql: result (user_number, user_name) -> sqlserver: demo_user (user_id, user_name)