This is just something I've been pondering lately. I've been doing a lot of work this year with Lucene.Net (a .NET port of Lucene, which is written in Java) to manage a search engine. In our configuration, it uses RAMDirectory objects to retain the indexes in memory, then searches the indexed content as though it were on disk. It takes up a lot of RAM, but it's very performant. A search query, including the network load of transferring the XML-based query and the XML-based result set (over Windows Communication Foundation), typically takes about 0.05 seconds over a gigabit switch on standard, low-end, modern server hardware.
We don't just spider our sites with this stuff the way Google (or Nutch) does. We manually index our content using real field names and values per index, much like SQL Server tables, except that a single index record (a "document") can have multiple same-name fields with different values, which is great for multiple keywords. If we could get it to properly join indexes on fields the way SQL Server joins tables on fields, and to accept arbitrary functions or delegates as query parameters (which is doable!), we'd have something useful enough to consider throwing SQL Server completely out the window for read-only tasks and getting a hundredfold performance boost. Yes, I just said that!
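To make the idea concrete, here's a toy sketch (this is not the actual Lucene.Net API, and the field names are made up for illustration): a "document" is a multimap, so the same field name can hold several values, and a join across two indexes on a shared field takes a delegate (a `Predicate`) as a query parameter.

```java
import java.util.*;
import java.util.function.Predicate;

public class IndexJoinSketch {
    // One index record ("document"): field name -> list of values, so the
    // same field name can appear multiple times with different values.
    static Map<String, List<String>> doc(String... pairs) {
        Map<String, List<String>> d = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2)
            d.computeIfAbsent(pairs[i], k -> new ArrayList<>()).add(pairs[i + 1]);
        return d;
    }

    // Join two indexes where any value of `field` matches on both sides,
    // filtered by a delegate passed in as a query parameter.
    static List<String> join(List<Map<String, List<String>>> left,
                             List<Map<String, List<String>>> right,
                             String field,
                             Predicate<Map<String, List<String>>> filter) {
        List<String> results = new ArrayList<>();
        for (Map<String, List<String>> l : left)
            for (Map<String, List<String>> r : right)
                if (!Collections.disjoint(l.getOrDefault(field, List.of()),
                                          r.getOrDefault(field, List.of()))
                        && filter.test(l))
                    results.add(l.get("title").get(0) + " <-> " + r.get("name").get(0));
        return results;
    }

    public static void main(String[] args) {
        List<Map<String, List<String>>> articles = List.of(
            doc("title", "In-memory search", "keyword", "lucene",
                "keyword", "ram", "authorId", "7"),
            doc("title", "Disk tuning", "keyword", "scsi", "authorId", "9"));
        List<Map<String, List<String>>> authors = List.of(
            doc("name", "Jon", "authorId", "7"));

        // Delegate as a query parameter: only articles tagged "lucene".
        System.out.println(join(articles, authors, "authorId",
            a -> a.getOrDefault("keyword", List.of()).contains("lucene")));
    }
}
```

Notice the first article carries two "keyword" values in one record; that's the part SQL Server tables can't do directly, and the join-plus-delegate is the part Lucene can't do directly. Bolting the two together is the whole wish.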
Because of the load we put on RAM, trying to keep the I/O off the SCSI adapter and limit it to the memory bus, all of this has led me to question why network and RAM capacities have not evolved nearly as fast as hard drive capacities. It seems to me that a natural and clean way of managing the performance of any high-traffic, database-driven web site is to minimize I/O contention, period. I hear about people spending big money on redundant database servers with terabytes of storage space, but then only, say, 16 GB of RAM and a gigabit switch. And that's fine, I guess, considering that when the scale goes much higher than that, the prices escalate out of control.
That, then, is my frustration. I want 10-gigabit switches and adapters NOW. I want 128 GB of RAM on a single motherboard NOW. I want 512 GB solid-state drives NOW. And I want it all for less than fifteen grand. Come on, industry. Hurry up. :P
But assuming the hardware became available, this kind of architectural shift would also directly affect how server-side software is constructed. Microsoft Windows and SQL Server, in my opinion, should be overhauled. Windows should natively support RAM disks. Microsoft yanked an in-memory OLE DB database provider a few years ago, and I never understood why. And while I realize that SQL Server needs to be rock-solid about reliably persisting committed database transactions to long-term storage, there should be greater design flexibility in the database configuration, and greater runtime flexibility, such as in the Transact-SQL language, over how transactions persist (lazily or atomically).
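Here's a hedged sketch of the kind of per-transaction flexibility I mean (this is not a real SQL Server feature, just the shape of the idea): the caller chooses whether a commit persists atomically, flushed to durable storage before the call returns, or lazily, acknowledged immediately and batched for a later flush, trading durability for throughput.

```java
import java.util.*;

public class CommitPolicySketch {
    enum Durability { ATOMIC, LAZY }

    private final List<String> log = new ArrayList<>();     // stands in for "disk"
    private final List<String> pending = new ArrayList<>(); // RAM-resident buffer

    void commit(String txn, Durability d) {
        if (d == Durability.ATOMIC) {
            log.add(txn);     // persisted before the call returns
        } else {
            pending.add(txn); // acknowledged now, persisted at the next flush
        }
    }

    void flush() {            // a background thread would drive this
        log.addAll(pending);
        pending.clear();
    }

    List<String> persisted() { return log; }

    public static void main(String[] args) {
        CommitPolicySketch db = new CommitPolicySketch();
        db.commit("UPDATE orders ...", Durability.ATOMIC); // must be durable
        db.commit("UPDATE stats ...",  Durability.LAZY);   // can afford to lag
        System.out.println(db.persisted()); // only the atomic commit so far
        db.flush();
        System.out.println(db.persisted()); // now both are durable
    }
}
```

The point is that the transaction, not a server-wide setting, declares how much durability it actually needs; hit counters and search indexes could run lazy while orders stay atomic.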
Maybe I missed stuff that's already there, which is actually quite likely. I'm not exactly an extreme expert on SQL Server. I just find these particular aspects of data service optimization an area of curiosity.