Flat Files vs Cubes
July 2, 2012
Both have value, but they’re not interchangeable; one is just data, the other is information
On my long drive back from vacation last week, my seven-year-old was watching Avatar on my iPhone. That got me thinking about how much visual and audio information she was missing compared to the 3D IMAX version with THX surround sound.

It made me realize that using flat files for data analysis is like watching a flat 2D version of a movie, while using cubes is closer to the rich experience that 3D brings you.
There is no question that flat files have some advantages: it is much easier to insert and update data in them. For an experienced database administrator, flat files can also be faster to search with simple queries. Since the data is, quite literally, flat (text fields, fixed lengths), flat files are easier to manage because the structure is much simpler. When you have a small data set and inexperienced users, flat files are easier to implement. They can also be produced without purchasing additional software: you can use any spreadsheet or word processing program that you are already familiar with.
However, while flat files make it easier to insert data, they make complex searches much harder. As the data grows, so do the complexity and redundancy involved in entering new fields and new records, which can lead to inconsistent and insecure data.
Like viewing Avatar on an iPhone, flat files can be difficult for the end user to interpret because so much of the detail is lost. Since flat files exist without relationships, the end user must understand the relationships between the data themselves. They must imagine – and more importantly, have the knowledge to imagine – how each piece of data relates to other pieces of data – much like imagining what 3D looks like.
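To make that concrete, here is a minimal sketch in Python. The two flat files, their column names, and the sales_by_region helper are all made up for illustration; the point is only that nothing in either file says the two are related, so the person writing the query has to supply that knowledge.

```python
import csv
import io
from collections import defaultdict

# Two hypothetical flat files. Neither one records how it relates to the other.
ORDERS_CSV = """order_id,customer_id,amount
1,C1,100
2,C2,250
3,C1,50
"""

CUSTOMERS_CSV = """customer_id,region
C1,East
C2,West
"""


def sales_by_region(orders_text, customers_text):
    """Total order amounts per region by joining the two files by hand."""
    # The analyst has to already know that customer_id links the two files.
    region_of = {
        row["customer_id"]: row["region"]
        for row in csv.DictReader(io.StringIO(customers_text))
    }
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(orders_text)):
        totals[region_of[row["customer_id"]]] += int(row["amount"])
    return dict(totals)


totals = sales_by_region(ORDERS_CSV, CUSTOMERS_CSV)
print(totals)
```

Every new question (sales by month, by product, by salesperson) means writing another piece of glue like this, and it only works if whoever writes it knows which columns belong together.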
Cubes, however, give a much richer 3D view of the data. It may take more work on the back end to structure and format the database, but once it is set up, users can make complex queries themselves and get a visual result. The relationships between the data exist in the cube, not just in the end user’s mind. Cubes take raw data and transform it into usable information that the user can literally see.
Even better than just showing the information in an easily processed way, cubes allow users to control and modify their own queries. They can play with the data: they can slice and dice and drill, performing their own analysis according to their specific needs at that time. Users don’t need to understand SQL to extract complicated information from their data. In fact, they don’t need any underlying knowledge of database systems at all. Like the IMAX version of the movie, they can sit back and enjoy the complex relationships that exist between the data.
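For readers who want to see what slicing, dicing, and drilling amount to, here is a toy sketch in Python. It is not how any real cube product is implemented; the dimensions (region, product, quarter), the sales figures, and the function names are all assumptions chosen for illustration. A real cube tool puts a point-and-click interface on top of operations like these.

```python
from collections import defaultdict

# Assumed dimensions for this toy cube.
DIMENSIONS = ("region", "product", "quarter")

# Hypothetical fact rows: (region, product, quarter, sales)
FACTS = [
    ("East", "Widget", "Q1", 100),
    ("East", "Widget", "Q2", 120),
    ("East", "Gadget", "Q1", 80),
    ("West", "Widget", "Q1", 90),
    ("West", "Gadget", "Q2", 60),
]


def build_cube(facts):
    """Aggregate fact rows into cells keyed by (region, product, quarter)."""
    cube = defaultdict(int)
    for *coords, sales in facts:
        cube[tuple(coords)] += sales
    return dict(cube)


def dice(cube, **allowed):
    """Keep only cells whose coordinates are in the allowed set for each named dimension."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[DIMENSIONS.index(dim)] in values for dim, values in allowed.items())
    }


def slice_cube(cube, dim, value):
    """A slice is a dice that fixes one dimension to a single value."""
    return dice(cube, **{dim: {value}})


def roll_up(cube, *keep):
    """Sum sales over every dimension that is not listed in keep."""
    indexes = [DIMENSIONS.index(dim) for dim in keep]
    result = defaultdict(int)
    for key, value in cube.items():
        result[tuple(key[i] for i in indexes)] += value
    return dict(result)


cube = build_cube(FACTS)

# Summary view: total sales per region.
by_region = roll_up(cube, "region")

# Drill down: open up the East total to see product and quarter detail.
east_detail = roll_up(slice_cube(cube, "region", "East"), "product", "quarter")

# Dice: only Widget sales in Q1, across all regions.
widget_q1 = dice(cube, product={"Widget"}, quarter={"Q1"})
```

The relationships between region, product, and quarter live in the structure itself, so each new question is a one-line call rather than a new hand-written join.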
Of course, there can be disadvantages to using cubes: sometimes you don’t need analysis. If all you need to do is store a small amount of data, it may be simpler to use flat files. Databases take more time and space, and they take technical skill to structure. For simple text files, like lists or static data, flat files might be the most parsimonious answer.
I think the key thing here, when thinking about flat files versus cubes, is that one must understand the difference between data and information. The iPhone version of the movie provided “data”: the storyline was there, the characters were blue. But it lacked the richness of “information” provided by the IMAX 3D version. Data, on its own, is limited. Data doesn’t lead to decisions. You can have as much data as possible on your potential customers, but all of that data is worthless if you can’t analyze it and extract meaningful information from it.
Information, on the other hand, is the result of data analysis. Data becomes powerful only when we apply relational knowledge to it. In this way, data plus knowledge equals information. And information is what you need to make clear and powerful decisions.
When you look at the differences between flat files and cubes, it comes down to a choice between two-dimensional data and multidimensional information. The choice is yours.







