On Tuesday, February 19, 2002, at 09:13 AM, Joel Rosi-Schwartz wrote:
The model of multiple front end app servers is perfect. But why do you feel the
need for callback notification? This is what clustering technology in the app
server already solves very nicely. In another thread I championed a disciplined
approach to the architecture, this is simply an extension thereof with multiple
servers.
The goal is to lighten the database server. In the relational world there is a tendency to want to push more and more stuff into the database. This makes sense because relational data is normally highly normalized, which means it's kind of difficult to move around. XML is different. From what I've seen, applications built on XML databases store data in larger chunks. People are also used to moving chunks of XML around. In fact many people think that's the only thing it's good for.
What I'm saying is that you might be able to build a more scalable and better performing system by building a lighter overall database engine. I'm assuming that updates to data will be considerably fewer than reads. If you have a notification mechanism for modifications then your app servers are free to always serve data out of cache until they're told otherwise. It should significantly increase the effectiveness of your cache. Without notification you'd have to ping the database to verify the cache every time you access the data. You gain the benefit of getting the data closer to the client, while eliminating the concern of cache staleness.
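To make the notification idea concrete, here is a minimal single-process sketch in Python. Everything in it is hypothetical: `Database`, `AppServerCache`, `subscribe`, and the counters are names invented for illustration, and the callback is a plain function call standing in for whatever network notification a real system would use. It is not how Xindice or any particular app server does it.

```python
class Database:
    """Hypothetical stand-in for the database server."""

    def __init__(self):
        self._docs = {}
        self._listeners = []
        self.reads = 0  # how many times an app server had to hit the database

    def subscribe(self, callback):
        # An app server registers to be told about modifications.
        self._listeners.append(callback)

    def get(self, key):
        self.reads += 1
        return self._docs[key]

    def put(self, key, xml):
        self._docs[key] = xml
        # Notify every subscribed app server that this document changed.
        for notify in self._listeners:
            notify(key)


class AppServerCache:
    """Hypothetical app-server cache that trusts its entries until notified."""

    def __init__(self, db):
        self._db = db
        self._cache = {}
        db.subscribe(self._invalidate)

    def _invalidate(self, key):
        # Drop the stale entry; it is refetched lazily on the next read.
        self._cache.pop(key, None)

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._db.get(key)
        return self._cache[key]


db = Database()
db.put("doc1", "<a>1</a>")
servers = [AppServerCache(db) for _ in range(3)]

for server in servers:
    server.get("doc1")  # first read goes to the database
    server.get("doc1")  # second read is served from cache

db.put("doc1", "<a>2</a>")        # invalidates the entry on every app server
fresh = servers[0].get("doc1")    # refetched once, then cached again
```

In this sketch the database sees one read per app server per change to a document, no matter how many times clients read it in between.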
Think about a system with, say, a 1GB database, fronted by 10 app servers each with 2GB of RAM. If update volume is low enough, and in many, many apps it will be, you could effectively have in-memory copies of your database on each of your app servers with a very high degree of confidence that the caches are accurate. If you do things right and your caches are smart enough, you'll even have a system that could keep running, sans writes, even if the database server is down. Another way to think about this is to view the caches as in-memory replication slaves, except they only replicate on an as-needed basis. This is actually how I see it, because I'd want as much as possible to be executing within the app server, even queries.
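The "keep running, sans writes" behavior could look roughly like the following Python sketch. The names `FlakyDatabase`, `ReplicaCache`, and `DatabaseDown` are made up for illustration, the `up` flag simulates an outage, and change notification is left out to keep it short. A real cache would also need to decide how long it is willing to trust its data during an outage.

```python
class DatabaseDown(Exception):
    """Raised by the hypothetical database when it is unreachable."""


class FlakyDatabase:
    """Hypothetical database whose availability can be toggled."""

    def __init__(self, docs):
        self._docs = dict(docs)
        self.up = True

    def get(self, key):
        if not self.up:
            raise DatabaseDown(key)
        return self._docs[key]

    def put(self, key, value):
        if not self.up:
            raise DatabaseDown(key)
        self._docs[key] = value


class ReplicaCache:
    """Cache acting as an as-needed, in-memory replica on one app server."""

    def __init__(self, db):
        self._db = db
        self._cache = {}

    def get(self, key):
        # Anything already replicated is served locally, database up or not.
        if key in self._cache:
            return self._cache[key]
        value = self._db.get(key)  # raises DatabaseDown if never replicated
        self._cache[key] = value
        return value

    def put(self, key, value):
        # Writes always go through the database, so they fail during an outage.
        self._db.put(key, value)
        self._cache[key] = value


db = FlakyDatabase({"doc1": "<a>1</a>", "doc2": "<b>2</b>"})
cache = ReplicaCache(db)
cache.get("doc1")      # replicated on first access

db.up = False          # simulate the database server going down
still_served = cache.get("doc1")

try:
    cache.put("doc1", "<a>9</a>")
    write_failed = False
except DatabaseDown:
    write_failed = True
```

Documents that were never read before the outage (`doc2` here) are not available, which is the price of replicating only on demand.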
There are boatloads of questions of course. Like where does the system fall over? There's going to come a point where the update volume forces it to start thrashing. Also you first have to find out if the cost of notification is actually lower and results in overall better performance than constantly pinging the database. I'm guessing this will be the case in most real-world apps, but I don't know for sure. Then you have to look at queries and how they would most effectively work.
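A back-of-the-envelope way to frame the ping-versus-notify question is to count messages that touch the database. The Python sketch below uses a deliberately crude model that I'm assuming for illustration: every cached read costs one ping, and every update costs at most one notification plus one refetch per app server. The workload numbers are invented, and real costs (message size, latency, query work) are ignored.

```python
def ping_cost(reads):
    """Messages hitting the database if every read verifies the cache."""
    return reads


def notify_cost(updates, app_servers):
    """Worst-case messages with notification: for each update, every app
    server gets one notification and later refetches the document once."""
    return updates * app_servers * 2


def notification_wins(reads, updates, app_servers):
    """True when notification puts less load on the database than pinging."""
    return notify_cost(updates, app_servers) < ping_cost(reads)


# Hypothetical read-heavy workload spread over 10 app servers.
reads, updates, app_servers = 1_000_000, 1_000, 10
read_heavy = notification_wins(reads, updates, app_servers)

# Hypothetical update-heavy workload, where the scheme starts to thrash.
update_heavy = notification_wins(reads=100_000, updates=50_000, app_servers=10)
```

Under this model notification wins whenever reads outnumber updates by more than twice the number of app servers, which gives a rough feel for where the thrashing point would be.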
This isn't really a new idea; some object databases already do this. XML is just a more compelling and general purpose data model.
BTW, just to be clear, I'm just speaking hypothetically. I'm not suggesting we build something like this.
Kimbro Staken - http://www.kstaken.org - http://www.xmldatabases.org
Apache Xindice native XML database http://xml.apache.org
XML:DB Initiative http://www.xmldb.org
Senior Technologist (Your company name here)
