I appreciate you coming back and saying that. Your company has been very open to our
input. However, I respectfully disagree
with almost everything you've said in your original post. Below are my responses:
“I agree with you, dynamic data structures will always be slower compared to the traditional design for relational databases, but let’s face it, retrieving data from the database is just a fraction of the time needed to render a page. Besides, with support for output caching, performance should not be a concern at all.”
I think you’re dead wrong here. I would argue that the GREAT MAJORITY of performance issues in any application (desktop or web) are with remote calls over the wire, such as remote procedures, web services, etc. This includes database calls, which are basically the most common remote calls that web pages make. Excluding bandwidth issues due to lots of HTML, images, or large viewstate, rendering a page on the server takes very little time. However, retrieving database records over and over can kill performance. We’re not talking milliseconds here…we’re talking seconds or even minutes. Note that this may not have anything to do with how long it takes the database to process the query--that may in fact take very little time. It has to do with the time it takes to return those records over the wire from the database server to the application, because it’s asking for the data over and over instead of in one batch.
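To make this concrete, here’s roughly the difference in plain ADO.NET. This is just a sketch: the table and column names are placeholders, and it assumes an open SqlConnection named conn, a List<int> named newsIds, and the usual System.Data.SqlClient and System.Collections.Generic imports.

    // Chatty: one round trip PER RECORD -- the latency of each call adds up fast.
    List<string> titles = new List<string>();
    foreach (int id in newsIds)
    {
        using (SqlCommand cmd = new SqlCommand(
            "SELECT Title FROM News WHERE NewsId = @id", conn))
        {
            cmd.Parameters.AddWithValue("@id", id);
            titles.Add((string)cmd.ExecuteScalar());
        }
    }

    // Batched: ONE round trip for the whole set.
    using (SqlCommand cmd = new SqlCommand("SELECT Title FROM News", conn))
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
            titles.Add(reader.GetString(0));
    }

At even 5 ms of network latency per call, a thousand of the per-record queries cost five seconds before the database has done any real work; the batched version pays that latency once.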
Many of the books on O/R mapping will tell you that one of the major performance killers is a lot of unneeded lazy loads or other data-centric retrieval problems. This matches my own experience. A few extra lazy loads or database calls, even 10 or 20 on a page, will not hurt. But a hundred or a thousand will. Note that I’m not against using O/R mapping or the lazy load features within it--they just need to be used judiciously. And in your case, you’re lazy loading child records that should be fields in a single table.
“Yes, SQL’s built-in reporting will not do much good, but most contemporary reporting tools support object data sources.”
Can you mention some reporting tools that will support calling .NET objects without a lot of code wrangling? I’m not being aggressive here…I really would like to know. I’ve mostly used SQL Reporting Services, and I don’t think it will natively use a business object or collection as a data source, but I could be wrong.
But let’s assume it does. Let’s say a year from now my client has 5000 records of some sort, whether News, or Blogs, or some other module that is using the meta-data structure you currently have in News. Let’s say they want a report with some simple aggregation (sums, averages) on this data. If I run the report by tapping into the Nolics business objects, I will not be able to use the built-in SUM or AVG functions in SQL Server, because I’m calling your objects to get my data. I will instead have to load the entire result set to get the totals. When I get each object, it will then have to lazy load all its related fields, such as Title, Summary, etc., to retrieve the field that I want a count, sum, or average on. This could potentially mean 5000 queries on one report, each call going back and forth over the wire to the SQL server (a call out to send the SQL or stored proc name, and data coming back in), and each query returning 5-10 records of its own, for a total of 25,000-50,000 records. This is going to run abysmally slow. The same thing could happen on the pages that are using the module.
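Here’s a sketch of what I mean. The class and method names below are stand-ins I made up, not the real Nolics API, but the shape of the problem is the same:

    // Stand-in for an O/R-mapped object whose fields live in meta-data rows.
    public class NewsItem
    {
        public decimal ViewCount
        {
            // each property access lazy-loads its own meta-data row: one more query
            get { return LoadField("ViewCount"); }
        }
        private decimal LoadField(string name)
        {
            // e.g. SELECT FieldValue FROM NewsMetaData WHERE NewsId = ... AND FieldName = @name
            return 0; // placeholder
        }
    }

    // The report's "aggregation", done in memory:
    List<NewsItem> items = GetAllNewsItems();   // query #1: the list of items
    decimal total = 0;
    foreach (NewsItem item in items)
        total += item.ViewCount;                // queries #2 through #5001
    decimal average = items.Count > 0 ? total / items.Count : 0;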
If all the fields were instead in one "News" table or "Blog" table, 5000 records (or objects, if the report package supports it) could be returned in one database call (one query). Maybe not lightning fast, but acceptable performance given the amount of data. My point is that the database should be doing the work in these set-based and batch operations, because that’s what it’s designed to do. C# and most 3rd-party reporting software are not designed to do this work…they rely on the database.
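Compare that to letting SQL Server do the set-based work itself -- one query, one round trip. Again, the "News" table and numeric ViewCount column are my inventions for the example, and conn is an open SqlConnection:

    using (SqlCommand cmd = new SqlCommand(
        "SELECT COUNT(*), SUM(ViewCount), AVG(ViewCount) FROM News", conn))
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        if (reader.Read())
        {
            int count = reader.GetInt32(0);
            decimal sum = reader.GetDecimal(1);
            decimal avg = reader.GetDecimal(2);
            // the engine did the aggregation; we moved 1 row, not 25,000-50,000
        }
    }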
But even after I get all the data, you have not addressed the problem that my report will then have to take all the rows and pivot them, as I mentioned in my earlier post. I will have to take each batch of 5-10 rows (which really represent fields such as Title, Summary, ExpirationDate, etc.) and flip them to display them left to right on the report. I’m sure that this can be done somehow, but it’s not exactly simple.
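For the curious, the "somehow" looks something like this in C# 2.0: flip the (NewsId, FieldName, FieldValue) rows into one logical row per item. The column names are my guess at the meta-data shape, and conn is again an open SqlConnection:

    Dictionary<int, Dictionary<string, string>> pivoted =
        new Dictionary<int, Dictionary<string, string>>();

    using (SqlCommand cmd = new SqlCommand(
        "SELECT NewsId, FieldName, FieldValue FROM NewsMetaData", conn))
    using (SqlDataReader reader = cmd.ExecuteReader())
    {
        while (reader.Read())
        {
            int id = reader.GetInt32(0);
            if (!pivoted.ContainsKey(id))
                pivoted.Add(id, new Dictionary<string, string>());
            pivoted[id][reader.GetString(1)] = reader.GetString(2);
        }
    }
    // pivoted[someId]["Title"], pivoted[someId]["Summary"], etc. can now be
    // laid out left to right on the report.

It works, but that’s plumbing the report writer shouldn’t have to own; a normal table gives you that shape for free.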
“It is also true it
will be much more difficult to use DTS but why would you? It will take less
than ten lines of code to import or export data using the Content Manager.”
I WOULD like to look at the content import feature and the ability to change the News items data provider. Can you point me in the right direction? I agree your data structure will work fine for a small number of news items. But if I have to import thousands or hundreds of thousands of records into one of your modules, DTS or SSIS will be much, much faster than using a content importer written in C#. Also, most of us developer types are used to working with normalized data and tools like SQL, SSIS, and DTS.
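For a sense of scale: if the data lived in one normalized table, even a plain .NET bulk load is only a few lines with SqlBulkCopy (in the framework since 2.0), and DTS/SSIS is faster still. The destination table name and connection string below are made up for the example:

    DataTable newsRows = new DataTable();
    // ... fill newsRows from the source file (details omitted) ...

    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (SqlBulkCopy bulk = new SqlBulkCopy(conn))
        {
            bulk.DestinationTableName = "News";
            bulk.WriteToServer(newsRows);  // streams all rows in one batch
        }
    }

Compare that to creating and saving one business object per record, a hundred thousand times over.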
“With the upcoming technologies like LINQ and Lambda Expressions, object databases are advancing quickly, and with everyday’s increasing computer power, dynamism seems more important to me than saving a few milliseconds per request. This design allows fields to be added and removed dynamically by simply declaring them in the web.config.”
You mentioned LINQ and lambda expressions. I’m confused about what lambda expressions have to do with this discussion, since these are just anonymous delegates and don’t really affect database access. Again, I’m not trying to be mean, I just really don’t know how these are related…could just be my ignorance.
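Just to illustrate what I mean: the upcoming lambda syntax is shorthand for a delegate that C# 2.0 can already express today, and neither version knows anything about the database.

    // C# 2.0 anonymous delegate:
    Predicate<DateTime> expired = delegate(DateTime d) { return d < DateTime.Now; };

    // Upcoming C# 3.0 lambda expression -- the same delegate, shorter syntax:
    Predicate<DateTime> expiredLambda = d => d < DateTime.Now;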
I can understand, as you said, that you did this for flexibility…I looked in the web.config and I can see where you are adding the columns, and indeed it is very flexible. But for large projects or larger amounts of data it will cause problems. Also, it seems to me that you are basically recreating in the web.config and in your framework what the database itself was designed for! Relational databases were designed to hold things like “Title”, “Summary”, and “DateExpired” as FIELDS, not rows. One of the developers at my company argued that your team is well on its way to the “inner platform” anti-pattern if you keep creating modules this way.
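Here’s a rough sketch of what he meant, with the meta-data design modeled in C#. The class and field names are mine, not yours:

    // The dynamic design: every "field" is a generic name/value row,
    // typed as a string and loaded separately.
    public class MetaField
    {
        public string Name;    // "Title", "Summary", "DateExpired", ...
        public string Value;   // everything squeezed into a string
    }
    public class ContentItem
    {
        public int Id;
        public List<MetaField> Fields;  // 5-10 rows per item
    }

    // What the relational model already gives you: real columns, real
    // types, plus indexes, WHERE, SUM, and AVG for free.
    public class NewsItem
    {
        public int Id;
        public string Title;
        public string Summary;
        public DateTime DateExpired;
    }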
I think you were also referring to my comment about multiple values being stored in a column delimited by semi-colons. LINQ may have some better ways of querying “unique” data structures, but it’s not out yet, and it will still rely to a great degree on a properly normalized relational database. It’s supposed to ease querying and mapping from objects to a relational database, not “fix” a non-standard relational database design. Again, columns were designed to hold ONE value, not multiples. You should split those semi-colon delimited values into separate columns or rows (I can’t say which, because I don’t fully understand what you’re using them for). What you’re doing in the database would be like taking a C# class, creating a string field, and storing a comma- or semi-colon-delimited string in it instead of just using an array or collection.
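In C# terms, the analogy looks like this (the category values are just examples):

    public class NewsItemWithDelimitedField
    {
        // the in-memory equivalent of the semi-colon column
        public string Categories = "Sports;Politics;Weather";
    }

    public class NewsItemWithCollection
    {
        // what any of us would actually write
        public List<string> Categories = new List<string>();
    }

Nobody would write the first class on purpose, but that’s exactly the shape the semi-colon column forces onto the database.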
Again, I’m sorry for the harshness and length of this post. As you can probably tell, I’m very passionate about proper database design. I have both read about and worked for companies that ignore good design, and I have seen them pay for it. I enjoy your company’s openness, and I see a lot of great improvements in Sitefinity. I just think that you are ignoring very basic database design principles.