Question

What is the most appropriate, best-practice way of configuring my transformations?

In other words, imagine I have a big ETL solution based on Kettle that connects to many different data sources. I would like to store the definitions of these data sources in a centralized location and have each transformation look them up every time it needs to connect somewhere.

In SSIS there is package configuration. What is the equivalent in Pentaho?

PS: I do not want to install any third-party framework.

Thank you

Answer

This can be done in various ways.

  1. Parameterise the database connections and set the values via kettle.properties. That kettle.properties file can itself be kept in a shared location.

  2. As above, but populate the connection parameters by reading the credentials from a database. This has to be hand-crafted, but it can be made to work, with some caveats.

  3. If you use the repository, the database connections are stored centrally anyway. So if you have a dev and a prd repository, don't promote the database connection itself when you promote. This is trickier than it sounds, though.
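
To make option 1 more concrete, here is a minimal sketch of what the kettle.properties entries might look like. The variable names and values are made up for illustration; only the file name and the `${VARIABLE}` reference syntax come from Kettle itself.

```properties
# kettle.properties, found in the .kettle directory under KETTLE_HOME
# (or under the user's home directory if KETTLE_HOME is not set).
# All names and values below are hypothetical; pick your own.
SALES_DB_HOST=db01.example.internal
SALES_DB_PORT=5432
SALES_DB_NAME=sales
SALES_DB_USER=etl_user
SALES_DB_PASSWORD=change_me
```

In the database connection dialog you would then enter `${SALES_DB_HOST}`, `${SALES_DB_PORT}` and so on instead of literal values. Pointing KETTLE_HOME at a shared directory is one way to have every machine pick up the same file, and keeping a separate file per environment means the transformations themselves do not change between dev and prd.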

On top of all that, the new 4.4(?) release should have proper lifecycle management, which should make dealing with all of this a lot easier!

License: CC-BY-SA with attribution
Not affiliated with StackOverflow