But how fast can you load?
Posted at 3/05/2008 07:06:00 AM
When I talk to people about the Illuminate CDBMS, the first reaction I usually get from them is, “That sounds great, but how fast can you load data?”
My usual answer is “How fast is fast enough?”
This is not a flippant answer, but an honest question that is intended to start a discussion on the meaning of load speed. The real measurement of load speed should be the time needed from starting the transfer of source data from the operational data stores, to the availability of the business information for the end user. In a multi-layer information infrastructure, that will include time for:
In an RDBMS environment, steps 1 and 2 are very fast since there is very little indexing on the foundation data layer. Depending on the hardware, communications methods and schema, this could be at a rate of perhaps 2 to 10 gigabytes per hour or even more in extreme cases with extreme hardware.
Steps 3 and 4 are the most unpredictable since every organization will have differing amounts of work in each layer. However, it is safe to say that this will be the longest part of the load process. A factor of ten times the duration of steps 1 and 2 would not be unusual. An overnight load process that loads for thirty minutes would then need five more hours to complete all of the remaining preparation steps.
In the illuminate environment, the load is a single step with some aggregations needed in limited cases. When data is loaded into iLuminate, the load process indexes everything. This means that step 2 in this case is zero.
Because everything is indexed, there is no star schema or OLAP cube building required. The foundation layer with 100% indexing already provides the full functionality of those structures. Also, due to the indexing, aggregations will not be needed as soon as they are with an RDBMS. The iLuminate database will minimize the I/O needed for the aggregation so the computational speed of the computer is the determining factor for aggregations. Small aggregations can be done as a part of the query process without impacting the user response time.
While an exact comparison can not be made because of variability of hardware, schema design and the number of information layers, an example comparison can demonstrate the conceptual difference between loading with a conventional system and loading with iLuminate.

So the answer to the original question is that iLuminate loads very fast. More fast enough for almost any enterprise.
My usual answer is “How fast is fast enough?”
This is not a flippant answer, but an honest question that is intended to start a discussion on the meaning of load speed. The real measurement of load speed should be the time needed from starting the transfer of source data from the operational data stores, to the availability of the business information for the end user. In a multi-layer information infrastructure, that will include time for:
- Importing raw data
- Indexing the imported data
- Running all aggregation processes and building intermediate layer access structures as needed
- Further aggragations, rebuilding and re-indexing star schemas, OLAP cubes and any other customized, end use information delivery structures.
In an RDBMS environment, steps 1 and 2 are very fast since there is very little indexing on the foundation data layer. Depending on the hardware, communications methods and schema, this could be at a rate of perhaps 2 to 10 gigabytes per hour or even more in extreme cases with extreme hardware.
Steps 3 and 4 are the most unpredictable since every organization will have differing amounts of work in each layer. However, it is safe to say that this will be the longest part of the load process. A factor of ten times the duration of steps 1 and 2 would not be unusual. An overnight load process that loads for thirty minutes would then need five more hours to complete all of the remaining preparation steps.
In the illuminate environment, the load is a single step with some aggregations needed in limited cases. When data is loaded into iLuminate, the load process indexes everything. This means that step 2 in this case is zero.
Because everything is indexed, there is no star schema or OLAP cube building required. The foundation layer with 100% indexing already provides the full functionality of those structures. Also, due to the indexing, aggregations will not be needed as soon as they are with an RDBMS. The iLuminate database will minimize the I/O needed for the aggregation so the computational speed of the computer is the determining factor for aggregations. Small aggregations can be done as a part of the query process without impacting the user response time.
While an exact comparison can not be made because of variability of hardware, schema design and the number of information layers, an example comparison can demonstrate the conceptual difference between loading with a conventional system and loading with iLuminate.

So the answer to the original question is that iLuminate loads very fast. More fast enough for almost any enterprise.