Tuesday, March 4, 2008

uploading and normalization

When uploading data into glasshouse, make sure the table you have created is fully normalized. Each column of the table should be a dimension in its own right, not one value of a dimension. Consider some data recording temperatures by day for two cities, Prague and Memphis. You might have recorded it this way:

day,prague temp,memphis temp
Sunday,25,29
Monday,26,30
Tuesday,30,29


What works best for glasshouse, and reflects the fact that Prague and Memphis are both really just values of an implicit dimension (geographical location), is the following arrangement:

day,city,temp
Sunday,Prague,25
Sunday,Memphis,29
Monday,Prague,26
Monday,Memphis,30
Tuesday,Prague,30
Tuesday,Memphis,29
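
If your data is already in the first layout, the conversion can be scripted. The following is only a minimal sketch in Python using the standard csv module. The function names (normalize, to_csv) and the assumption that every non-key column is named "<city> temp" are mine for illustration; they are not part of glasshouse.

```python
import csv
import io

# The wide layout from the first example above.
WIDE = """day,prague temp,memphis temp
Sunday,25,29
Monday,26,30
Tuesday,30,29
"""

def normalize(wide_csv, key="day", suffix=" temp"):
    """Turn one-column-per-city rows into one row per (day, city) pair.

    Assumes (hypothetically) that every column other than `key` is
    named "<city><suffix>", as in the sample data.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(wide_csv)):
        for column, value in record.items():
            if column == key:
                continue
            # "prague temp" -> "Prague"
            city = column[:-len(suffix)] if column.endswith(suffix) else column
            rows.append({"day": record[key],
                         "city": city.capitalize(),
                         "temp": value})
    return rows

def to_csv(rows):
    """Serialize the normalized rows back to CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["day", "city", "temp"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

if __name__ == "__main__":
    print(to_csv(normalize(WIDE)), end="")
```

Run on the sample data, this should reproduce the day,city,temp arrangement shown above. Adding a third city then means adding rows rather than adding a column, which is the point of the normalized layout.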


ark
