Looking for ways to make db4o faster, we recently invested some work into better caching. We found a nice paper here and decided to implement an LRU/2Q cache for db4o. Our new cache implementations can now be found in the classes LRU2QCache, LRU2QXCache and LRUCache. Feel free to use them anywhere the GPL allows you to.
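For readers who have not met 2Q before, here is a rough sketch of the idea. It is not the code of LRU2QCache: the class name TwoQueueCacheSketch, the fixed split of the capacity (a quarter for first-time entries, half for remembered keys) and the producer-style get method are all assumptions made for illustration. New entries land in a small FIFO queue, evicted keys are remembered without their values, and only a key that comes back while it is still remembered is promoted to the LRU-managed "hot" area.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.function.Function;

/** Simplified 2Q sketch (hypothetical, not db4o's implementation). Values must be non-null. */
class TwoQueueCacheSketch<K, V> {
    private final int a1inCapacity;   // FIFO queue for entries seen once
    private final int a1outCapacity;  // remembered keys of entries evicted from the FIFO
    private final int amCapacity;     // LRU area for entries seen again

    private final LinkedHashMap<K, V> a1in = new LinkedHashMap<>();              // insertion order
    private final LinkedHashSet<K> a1out = new LinkedHashSet<>();                // keys only
    private final LinkedHashMap<K, V> am = new LinkedHashMap<>(16, 0.75f, true); // access order

    TwoQueueCacheSketch(int capacity) {
        if (capacity < 4) {
            throw new IllegalArgumentException("capacity must be at least 4");
        }
        // The split below is an assumed tuning, chosen only to keep the sketch small.
        this.a1inCapacity = capacity / 4;
        this.amCapacity = capacity - a1inCapacity;
        this.a1outCapacity = capacity / 2;
    }

    /** Returns the cached value, or asks the producer for it on a miss. */
    V get(K key, Function<K, V> producer) {
        V value = am.get(key);            // a hit here also marks the entry as recently used
        if (value != null) {
            return value;
        }
        value = a1in.get(key);            // a hit in the FIFO queue does not reorder anything
        if (value != null) {
            return value;
        }
        value = producer.apply(key);
        if (a1out.remove(key)) {
            // The key was evicted recently and came back: treat it as hot.
            am.put(key, value);
            if (am.size() > amCapacity) {
                removeEldest(am.keySet().iterator());
            }
        } else {
            // First sighting: park it in the FIFO queue.
            a1in.put(key, value);
            if (a1in.size() > a1inCapacity) {
                a1out.add(removeEldest(a1in.keySet().iterator()));
                if (a1out.size() > a1outCapacity) {
                    removeEldest(a1out.iterator());
                }
            }
        }
        return value;
    }

    /** True if the key currently lives in the LRU-managed hot area. */
    boolean isHot(K key) {
        return am.containsKey(key);
    }

    private static <T> T removeEldest(Iterator<T> iterator) {
        T eldest = iterator.next();
        iterator.remove();
        return eldest;
    }
}
```

The point of the extra queues is that a one-off scan over many pages only churns the small FIFO queue and cannot push frequently used pages out of the hot area, which a plain LRU cache would allow.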
As the best testing ground for the new caches we identified our IoAdapters. Driven by the requirement to make the caching layer pluggable, we first refactored IoAdapters from their old factory-and-implementation-in-a-single-class approach to a clean factory concept. The roots of the new class hierarchy:
Storage - a factory class that knows how to open a Bin
Bin - a representation of a container for storage of data
If you are interested in details of the implementation, it's probably easiest if you browse the sources and open the type hierarchy for Storage and Bin from your favourite IDE. The existing old IoAdapter interface still continues to work, wrapped by an IoAdapterStorage.
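To give a feeling for the split between factory and container without opening the sources, here is a minimal sketch. The interfaces SketchStorage and SketchBin and all their method signatures are hypothetical stand-ins invented for this example; the real Storage and Bin types in db4o differ in their details.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a Bin: a container that stores bytes. */
interface SketchBin {
    long length();
    int read(long position, byte[] buffer, int length);
    void write(long position, byte[] buffer, int length);
    void sync();
    void close();
}

/** Hypothetical stand-in for a Storage: a factory that knows how to open a bin. */
interface SketchStorage {
    SketchBin open(String uri);
    boolean exists(String uri);
}

/** A storage that keeps every bin in memory, keyed by its name. */
class InMemorySketchStorage implements SketchStorage {
    private final Map<String, InMemoryBin> bins = new HashMap<>();

    public SketchBin open(String uri) {
        // Reopening the same name returns the same bin, so the data survives "close".
        return bins.computeIfAbsent(uri, name -> new InMemoryBin());
    }

    public boolean exists(String uri) {
        return bins.containsKey(uri);
    }

    private static final class InMemoryBin implements SketchBin {
        private byte[] data = new byte[0];

        public long length() {
            return data.length;
        }

        public int read(long position, byte[] buffer, int length) {
            int available = (int) Math.max(0, Math.min(length, data.length - position));
            if (available > 0) {
                System.arraycopy(data, (int) position, buffer, 0, available);
            }
            return available; // number of bytes actually copied
        }

        public void write(long position, byte[] buffer, int length) {
            int end = (int) position + length;
            if (end > data.length) {
                data = Arrays.copyOf(data, end); // grow, padding with zeros
            }
            System.arraycopy(buffer, 0, data, (int) position, length);
        }

        public void sync() {
            // Nothing to flush for an in-memory bin.
        }

        public void close() {
            // Nothing to release.
        }
    }
}
```

Because the caller only ever talks to the factory and the container interface, a wrapper that adds caching or changes the flushing behaviour can be slipped in between without the caller noticing.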
After we had the caching and a corrected Storage interface, we wrote a new CachingStorage that makes use of the caches. Benchmarking the new cache implementations against our old specialized CachingIoAdapter we found that the old cache was already very good. It took us quite a bit of tuning work to reach the same speed again, but we managed. Accordingly we could safely replace the old CachingIoAdapter with the new CachingStorage in the default db4o configuration.
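The general idea of a page-based caching layer can be sketched as follows. This is only an illustration under assumptions, not the code of CachingStorage: the class PageCacheSketch, the PageSource interface and the sourceReads counter are made up for the example, and a plain LRU map from the JDK stands in for the pluggable cache. Reads are cut into fixed-size pages, and a page is fetched from the underlying source only when it is not in the cache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Read-through page cache sketch (hypothetical, read path only). */
class PageCacheSketch {

    /** Hypothetical source of pages, for example a file; must return pageSize bytes. */
    interface PageSource {
        byte[] readPage(long pageIndex, int pageSize);
    }

    private final PageSource source;
    private final int pageSize;
    private final Map<Long, byte[]> pages;
    int sourceReads = 0; // how often the underlying source was hit

    PageCacheSketch(PageSource source, final int pageCount, int pageSize) {
        this.source = source;
        this.pageSize = pageSize;
        // An access-ordered LinkedHashMap evicts the least recently used page.
        this.pages = new LinkedHashMap<Long, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Long, byte[]> eldest) {
                return size() > pageCount;
            }
        };
    }

    /** Copies length bytes starting at position into target, page by page. */
    int read(long position, byte[] target, int length) {
        int copied = 0;
        while (copied < length) {
            long current = position + copied;
            long pageIndex = current / pageSize;
            int offsetInPage = (int) (current % pageSize);
            byte[] page = pages.get(pageIndex);
            if (page == null) {
                page = source.readPage(pageIndex, pageSize);
                sourceReads++;
                pages.put(pageIndex, page);
            }
            int chunk = Math.min(length - copied, pageSize - offsetInPage);
            System.arraycopy(page, offsetInPage, target, copied, chunk);
            copied += chunk;
        }
        return copied;
    }
}
```

In this sketch the page count and page size play the same role as the two parameters of the same names in the configuration examples: more pages mean fewer trips to the source, and larger pages mean more data is fetched per trip.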
Here are three of the most common configuration setups that you may be interested in for everyday use of db4o:
(1) Setting up db4o with a different cache page size and page count
final int cachePageCount = 128;
final int cachePageSize = 4096;
Storage cachedStorage = new CachingStorage(new FileStorage(), cachePageCount, cachePageSize) {
    // By overriding the newCache method you can plug in a different cache
    @Override
    protected Cache4 newCache() {
        return CacheFactory.new2QXCache(cachePageCount);
    }
};
// opening an embedded session
EmbeddedConfiguration embeddedConfiguration = Db4oEmbedded.newConfiguration();
FileConfiguration fileConfiguration = embeddedConfiguration.file();
fileConfiguration.storage(cachedStorage);
Db4oEmbedded.openFile(embeddedConfiguration, "myEmbeddedDb.db4o");
// opening a server for a client/server session
ServerConfiguration serverConfiguration = Db4oClientServer.newServerConfiguration();
FileConfiguration fileConfiguration = serverConfiguration.file();
fileConfiguration.storage(cachedStorage);
Db4oClientServer.openServer(serverConfiguration, "myServerDb.db4o", PORT);
(2) Using db4o as a fast in-memory database
// opening an embedded session
EmbeddedConfiguration embeddedConfiguration = Db4oEmbedded.newConfiguration();
FileConfiguration fileConfiguration = embeddedConfiguration.file();
fileConfiguration.storage(new MemoryStorage());
Db4oEmbedded.openFile(embeddedConfiguration, "myEmbeddedDb.db4o");
// opening a server for a client/server session
ServerConfiguration serverConfiguration = Db4oClientServer.newServerConfiguration();
FileConfiguration fileConfiguration = serverConfiguration.file();
fileConfiguration.storage(new MemoryStorage());
Db4oClientServer.openServer(serverConfiguration, "myServerDb.db4o", PORT);
(3) Using the NonFlushingStorage for highly improved db4o speed at the risk of corrupted database files in case of abnormal system failures
Storage nonFlushingStorage = new NonFlushingStorage(new CachingStorage(new FileStorage()));
// opening an embedded session
EmbeddedConfiguration embeddedConfiguration = Db4oEmbedded.newConfiguration();
FileConfiguration fileConfiguration = embeddedConfiguration.file();
fileConfiguration.storage(nonFlushingStorage);
Db4oEmbedded.openFile(embeddedConfiguration, "myEmbeddedDb.db4o");
// opening a server for a client/server session
ServerConfiguration serverConfiguration = Db4oClientServer.newServerConfiguration();
FileConfiguration fileConfiguration = serverConfiguration.file();
fileConfiguration.storage(nonFlushingStorage);
Db4oClientServer.openServer(serverConfiguration, "myServerDb.db4o", PORT);
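Why setup (3) trades safety for speed is easiest to see in a sketch of a non-flushing wrapper. The types below (FlushableBinSketch, RecordingBin, NonFlushingBinSketch) are hypothetical and exist only for this illustration; they are not db4o classes. Writes are passed through unchanged, while the request to force data to disk is simply dropped.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal view of a bin: it can be written to and flushed. */
interface FlushableBinSketch {
    void write(long position, byte[] data);
    void sync();
}

/** Test double that records which calls reach the "disk". */
class RecordingBin implements FlushableBinSketch {
    final List<String> calls = new ArrayList<>();

    public void write(long position, byte[] data) {
        calls.add("write@" + position);
    }

    public void sync() {
        calls.add("sync");
    }
}

/** Decorator that forwards writes but swallows every sync request. */
class NonFlushingBinSketch implements FlushableBinSketch {
    private final FlushableBinSketch delegate;

    NonFlushingBinSketch(FlushableBinSketch delegate) {
        this.delegate = delegate;
    }

    public void write(long position, byte[] data) {
        delegate.write(position, data);
    }

    public void sync() {
        // Intentionally empty: the operating system decides when data reaches
        // the disk, so a crash at the wrong moment can leave the file inconsistent.
    }
}
```

Skipping the flush saves the wait for the disk on every commit, which is where the speed comes from, and it is also exactly why an abnormal system failure can corrupt the file.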
In case you have missed Adriano's article, the above also demonstrates how the new configuration interface is intended to be used.
Now that we had a well-working caching architecture, we looked for further places to use it. We came up with the idea of caching BTree nodes directly. Since these nodes are used for class indexes and for field indexes, we expected improved performance for repeated queries.
To get caching of BTree nodes to work, we implemented a new CacheablePersistentBase class to override some of the default behaviour of PersistentBase and derived BTreeNode from this new class.
After we had everything up and running smoothly, we compared the performance of different setups with the Poleposition benchmark. We could indeed see much faster queries when the same query was executed a second time. Therefore we decided to leave this new second-level cache in db4o as the default, with a size of 30.
The size of the new node cache is also configurable as follows:
// opening an embedded session
EmbeddedConfiguration embeddedConfiguration = Db4oEmbedded.newConfiguration();
embeddedConfiguration.cache().slotCacheSize(64);
Db4oEmbedded.openFile(embeddedConfiguration, "myEmbeddedDb.db4o");
// opening a server for a client/server session
ServerConfiguration serverConfiguration = Db4oClientServer.newServerConfiguration();
serverConfiguration.cache().slotCacheSize(64);
Db4oClientServer.openServer(serverConfiguration, "myServerDb.db4o", PORT);
Do not expect wonders from the new caches, but please do play with all the parameters if you want to get the maximum speed out of db4o. The best cache settings will depend a lot on your individual use case. We would be very happy to hear about your experiences from testing the new caches. Thanks in advance!
Enjoy!