The bridge between operational systems and Business Intelligence systems is the main process of extract, transform and load (ETL) the results of the first to used by the last. These ETL processes are highly specialized for select, extract or take data from transactional repositories, convert or translate this data to give it meaning and appropriate format for Business Intelligence querying and final users’ understanding.
To achieve this objective, the original or operational data could pass several intermediate processes and it could be saved on intermediate repositories or staging areas.
Some aspects to consider are:
- Know the needed data: it is useful to know the business in order to establish the valuable information and key performance indicators. With these indicators as starting point, determine and document starting production data and underlying relationships to be considered and processed.
- Which data have to be extracted, when it has to be extracted: for complement previous consideration, it is needed to check with precision and establish the fields to be extracted, the timing and load for extraction processes without negative effects for origin systems performance or data quality for Business Intelligence purposes.
- Which data have to be transformed, which it has to be fixed: it is necessary to give useful meaning to extracted data, so some data has to be transformed and some data has to be fixed, applying decision and business rules for its following analysis.
- How to choose tools to build these data processes: depending on operational data platform, available resources and timing, it is vital, because performance issues, to determine appropriate tools to build these processes. It is necessary to coding custom processes? It is possible to use existent and tested tools? What about Open Source tools?