ETL tool: simple data migration with Kettle
2022-07-19 05:29:00 【Small mayfly star Wei】
1. Kettle concept
Kettle is an open-source ETL tool from overseas, written in pure Java. It runs on Windows, Linux, and Unix, is portable (no installation needed), and extracts data efficiently and stably.
Kettle is an ETL toolset that lets you manage data from different databases. Its graphical environment lets you describe what you want to do rather than how to do it.
Kettle has two kinds of script files, transformation and job: a transformation performs the basic data transformations, while a job controls the overall workflow.
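To make the split concrete, here is a minimal Python sketch. It is not Kettle's API: the names `transformation`, `job`, `uppercase_names`, and the sample row are all hypothetical, and only illustrate the division of labour, where a transformation reshapes rows and a job sequences the work.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Step = Callable[[Iterable[Row]], Iterable[Row]]


def transformation(rows: Iterable[Row], steps: List[Step]) -> List[Row]:
    """Pass rows through each step in order, like a chain of transformation steps."""
    for step in steps:
        rows = step(rows)
    return list(rows)


def job(entries: List[Callable[[], bool]]) -> bool:
    """Run entries in sequence and stop at the first failure (workflow control)."""
    for entry in entries:
        if not entry():
            return False
    return True


def uppercase_names(rows: Iterable[Row]) -> Iterable[Row]:
    # Hypothetical row-level step: normalise the user_name field.
    for row in rows:
        yield {**row, "user_name": str(row["user_name"]).upper()}


result: List[Row] = []


def run_transformation() -> bool:
    # A job entry that runs one transformation and reports success.
    result.extend(transformation([{"user_name": "alice"}], [uppercase_names]))
    return True


job_ok = job([run_transformation])
```

In this sketch the job knows nothing about rows; it only decides what runs next, which is the role the article assigns to a Kettle job.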
2. Installation and startup
Kettle is portable software: unzip the archive and it is ready to use. On Linux it is started through the .sh file, and on Windows through the corresponding .bat file.
If the spoon.bat window flashes and closes without opening, the cause is the memory settings.
Modify spoon.bat:
if "%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS="-Xms512m" "-Xmx512m" "-XX:MaxPermSize=256m"
3. Common components
- Database connection
- SQL script
- Table input (source table)
- Field selection
- Table output (writes to the new table)
- Insert / update
Of course, Kettle has many more components; these are the only ones I have used so far. Kettle also supports files as input and output.
4. A concrete case
A simple data migration between pgsql on the server and a local sqlserver.
pgsql: result (user_number, user_name) -> sqlserver: demo_user (user_id, user_name)
The overall structure:
The SQL script first clears the demo_user table in sqlserver.
Next comes the table input, where I selected all fields of the pgsql result table.
In field selection, I picked user_number and user_name and renamed user_number to user_id (user_name keeps its name).
For table output, I chose sqlserver: demo_user.
Connect components by holding Shift and dragging with the left mouse button.
The standalone table input 2 at the bottom of the figure is only used to check whether the migration worked; querying the database directly works just as well.
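The four steps above can be sketched in Python. This is only an illustration under loud assumptions: two in-memory SQLite databases stand in for the server-side pgsql and the local sqlserver, the extra `note` column and the sample rows are invented, and the `migrate` helper is hypothetical rather than anything Kettle generates.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for both real databases.
# `source` plays the server-side pgsql, `target` plays the local sqlserver.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")

source.execute("CREATE TABLE result (user_number INTEGER, user_name TEXT, note TEXT)")
source.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [(1, "alice", "x"), (2, "bob", "y")],  # hypothetical sample rows
)
target.execute("CREATE TABLE demo_user (user_id INTEGER, user_name TEXT)")
target.execute("INSERT INTO demo_user VALUES (99, 'stale')")  # leftover row to be cleared


def migrate(src: sqlite3.Connection, dst: sqlite3.Connection) -> int:
    """Mirror the four steps: SQL script, table input, field selection, table output."""
    # 1. SQL script: clear the target table first.
    dst.execute("DELETE FROM demo_user")
    # 2. Table input: read all fields from the source table.
    cursor = src.execute("SELECT * FROM result")
    columns = [c[0] for c in cursor.description]
    # 3. Field selection: keep two fields; user_number becomes user_id.
    rows = [
        (row[columns.index("user_number")], row[columns.index("user_name")])
        for row in cursor
    ]
    # 4. Table output: write the selected rows into the target table.
    dst.executemany("INSERT INTO demo_user (user_id, user_name) VALUES (?, ?)", rows)
    dst.commit()
    return len(rows)


migrated = migrate(source, target)
# Counterpart of the standalone table input 2: query the target to verify.
check = target.execute(
    "SELECT user_id, user_name FROM demo_user ORDER BY user_id"
).fetchall()
```

Clearing the target before loading keeps the run repeatable: executing it twice leaves the same rows in demo_user instead of duplicating them.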
4.1 Database connection
For sqlserver, you need to download the driver jar manually and put it in Kettle's lib directory: https://sourceforge.net/projects/jtds/files/latest/download
The connections here are the server-side pgsql and the local sqlserver.

4.2 SQL script
Clear the rc_user table in pgsql, so that the data migrated from sqlserver is easy to check afterwards.
4.3 Table input
Database connection -> table -> SQL statement -> preview
From the server-side pgsql to the local sqlserver.
Only user_number and user_name are migrated.
pgsql: result (user_number, user_name) -> sqlserver: demo_user (user_id, user_name)

4.4 Field selection
Get the fields to select -> modify or remove them
You can also filter the fields directly in the SQL, which is easier to maintain.
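The two options can be compared in a short Python sketch. The `select_fields` helper, the `RENAMES` mapping, and the sample row are hypothetical; the `QUERY` string shows the alternative of filtering and renaming inside the table input SQL, using the table and field names from this case.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping: source field -> target field (unlisted fields are dropped).
RENAMES: Dict[str, str] = {"user_number": "user_id", "user_name": "user_name"}


def select_fields(
    rows: Iterable[Dict[str, object]], renames: Dict[str, str]
) -> List[Dict[str, object]]:
    """Keep only the mapped fields and rename them, as a field selection step would."""
    return [{new: row[old] for old, new in renames.items()} for row in rows]


# Alternative: do the same filtering and renaming in the table input SQL.
QUERY = "SELECT user_number AS user_id, user_name FROM result"

selected = select_fields(
    [{"user_number": 7, "user_name": "carol", "note": "dropped"}],  # sample row
    RENAMES,
)
```

With the SQL alias, the mapping lives in a single place (the query), which is one reading of why the author finds it easier to maintain.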
4.5 Table output
Select the target database and table for the output.

Click execute to complete the data migration.
pgsql: result (user_number, user_name) -> sqlserver: demo_user (user_id, user_name)