I keep running into people that represent each month, January through
December, as a column in a table. Considering the nature of some of the
reports generated from this data, storing it in this way can certainly be
convenient; however I personally find the idea repugnant. I've been doing
this a while and think I have a pretty good intuitive grasp on what is good
database design and what isn't... but in this case the other designer could
come back and argue that since the number of months never changes,
representing each of them with a column is perfectly valid. Then I could
argue that that greatly complicates retrieving aggregate data, but then he
could say that my approach greatly complicates presenting the data with a
column per month...
In the beginning of my career as a database designer I never had issues like
this, I didn't need to work with anyone. Now it seems at every turn I end up
arguning with someone about what is good design and what isn't. Usually it's
regrettably obvious that the other person doesn't know what he's doing
("MUST you use nested cursors when all you need is a JOIN!?"), but in this
case I'm not so sure. Before I start another argument, I would like to hear
other people's thoughts in the matter.
Comments?
PaulYour colleague seems to favour denormalization without justification. By
normalizing your data, you make querying straightforward. Same goes for
maintenance. It's easy to craft a pivot query to present the results he
wants. It's messier to take his denormalized table and then run various
queries against it.
Tom
----
Thomas A. Moreau, BSc, PhD, MCSE, MCDBA
SQL Server MVP
Toronto, ON Canada
.
"PJ6" <nobody@.nowhere.net> wrote in message
news:u21R9lWYGHA.4144@.TK2MSFTNGP04.phx.gbl...
I keep running into people that represent each month, January through
December, as a column in a table. Considering the nature of some of the
reports generated from this data, storing it in this way can certainly be
convenient; however I personally find the idea repugnant. I've been doing
this a while and think I have a pretty good intuitive grasp on what is good
database design and what isn't... but in this case the other designer could
come back and argue that since the number of months never changes,
representing each of them with a column is perfectly valid. Then I could
argue that that greatly complicates retrieving aggregate data, but then he
could say that my approach greatly complicates presenting the data with a
column per month...
In the beginning of my career as a database designer I never had issues like
this, I didn't need to work with anyone. Now it seems at every turn I end up
arguning with someone about what is good design and what isn't. Usually it's
regrettably obvious that the other person doesn't know what he's doing
("MUST you use nested cursors when all you need is a JOIN!?"), but in this
case I'm not so sure. Before I start another argument, I would like to hear
other people's thoughts in the matter.
Comments?
Paul|||I'll sometimes do this but will treat it as redundant data.
The source will be the normalised structure and the denormalised created
from it - maybe recreated from scratch sometines maintained depending on the
requirement.
The reason would be performance - if there is a lot of data then this
structure can help a lot.
Unfortunately the tools available for database manipulation (especially sql
server) are so user friendly now that people are able to do quite a lot
without any real understanding of what they are doing so you end up having t
o
work with collegues who've got quite a long way without much basic knowledge
.
And in a company being right isn't always appreiated.
I've had that recently with a company that's full of application and excel
programmers. They refuse to listen to anything about relational theory and
even the sql server "experts" try to make the database into a glorified exce
l
spreadsheet.
I was moved to another seat because the guy who wants to ru things got upset
that I kept correcting him when he was diseminating knowledge about sql
server.|||I once had a system where monthly accounting summaries were held this way.
Made reports easy, but updates were obviously horrendous.
The real fun came when the company wanted to change from calendar month
periods to fixed 4-wly, so you have 13 (ish) in a year, Loads of code to
be redone!
Separate rows with appropriate start and end date columns is far more
flexible and probably the better way to go.
Mike John
"PJ6" <nobody@.nowhere.net> wrote in message
news:u21R9lWYGHA.4144@.TK2MSFTNGP04.phx.gbl...
>I keep running into people that represent each month, January through
>December, as a column in a table. Considering the nature of some of the
>reports generated from this data, storing it in this way can certainly be
>convenient; however I personally find the idea repugnant. I've been doing
>this a while and think I have a pretty good intuitive grasp on what is good
>database design and what isn't... but in this case the other designer could
>come back and argue that since the number of months never changes,
>representing each of them with a column is perfectly valid. Then I could
>argue that that greatly complicates retrieving aggregate data, but then he
>could say that my approach greatly complicates presenting the data with a
>column per month...
> In the beginning of my career as a database designer I never had issues
> like this, I didn't need to work with anyone. Now it seems at every turn I
> end up arguning with someone about what is good design and what isn't.
> Usually it's regrettably obvious that the other person doesn't know what
> he's doing ("MUST you use nested cursors when all you need is a JOIN!?"),
> but in this case I'm not so sure. Before I start another argument, I would
> like to hear other people's thoughts in the matter.
> Comments?
> Paul
>|||RE:
<< They refuse to listen to anything about relational theory and even the
sql server "experts" try to make the database into a glorified excel
spreadsheet.>>
And to make matters worse, Microsoft is intentionally blurring the lines
between a spreadsheet and a database
Check out the second paragraph in this PC Magazine review of MS Access 12:
http://www.pcmag.com/article2/0,1895,1888065,00.asp
It's a shame because it explicitly promotes the "spreadsheet mentality" of
database design.
"Nigel Rivett" <NigelRivett@.discussions.microsoft.com> wrote in message
news:E3AF2A1A-F935-4CC8-8659-70E0A6801053@.microsoft.com...
> I'll sometimes do this but will treat it as redundant data.
> The source will be the normalised structure and the denormalised created
> from it - maybe recreated from scratch sometines maintained depending on
> the
> requirement.
> The reason would be performance - if there is a lot of data then this
> structure can help a lot.
> Unfortunately the tools available for database manipulation (especially
> sql
> server) are so user friendly now that people are able to do quite a lot
> without any real understanding of what they are doing so you end up having
> to
> work with collegues who've got quite a long way without much basic
> knowledge.
> And in a company being right isn't always appreiated.
> I've had that recently with a company that's full of application and excel
> programmers. They refuse to listen to anything about relational theory and
> even the sql server "experts" try to make the database into a glorified
> excel
> spreadsheet.
> I was moved to another seat because the guy who wants to ru things got
> upset
> that I kept correcting him when he was diseminating knowledge about sql
> server.|||PJ6 (nobody@.nowhere.net) writes:
> I keep running into people that represent each month, January through
> December, as a column in a table. Considering the nature of some of the
> reports generated from this data, storing it in this way can certainly
> be convenient; however I personally find the idea repugnant. I've been
> doing this a while and think I have a pretty good intuitive grasp on
> what is good database design and what isn't... but in this case the
> other designer could come back and argue that since the number of months
> never changes,
And then the company decides to change its fiscal year to run from
April to March. Or Mike John's example where they moved to four-w
periods.
> representing each of them with a column is perfectly valid. Then I could
> argue that that greatly complicates retrieving aggregate data, but then he
> could say that my approach greatly complicates presenting the data with a
> column per month...
Yes, the last comment has some relevance, but
CREATE VIEW ... AS
SELECT product,
Jan = SUM(CASE month WHEN 1 THEN sales ELSE 0 END),
Feb = SUM(CASE month WHEN 2 THEN sales ELSE 0 END),
..
FROM monthsales
GROUP BY product
takes care of that.
> In the beginning of my career as a database designer I never had issues
> like this, I didn't need to work with anyone. Now it seems at every turn
> I end up arguning with someone about what is good design and what isn't.
It cannot be denied that database design in real life is about trade-offs,
and there are often more than one valid design.
There are plenty of rules, and while I mostly obey them, I've broken
them all. Including "no repeating groups". What you need is a good
feeling for when it's right to break the rule.
And it matters a lot what the purpose of the database is. They say that
OLAP databases don't look like OLTP databases. (I haven't worked with
OLAP, so I don't know.)
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx|||This denormalized design is fine in a Data Warehouse or OLAP schema,
where the data is static, the range of the data is static and the data
validated before it goes into the tables.
It is a nightmare for OLTP. I would construct a Calendar table with
ranges for fiscal periods, holidays, promotional periods, etc. set it
up for 10-20 years and join to it to get your "annual window" for
reports.
As an aside, years ago there was an article in DB2 magazine comparing
performance and storage usage between the two designs. I vaguely
remember that the normalized design performed better but took up more
space after April in the sample data. Disk is cheap; time and
integrity are not.
Wednesday, March 7, 2012
representing months with columns
Labels:
column,
columns,
considering,
database,
microsoft,
mysql,
nature,
oracle,
represent,
representing,
running,
server,
sql,
table,
thereports,
throughdecember
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment