Optimizing SQL Based Cache Memory

Necmi Kılıç
2 min readAug 18, 2021

We all know how cache systems are very important for most of the applications. They provide faster access to data we need in spite of accessing databases each time we need. But it comes with a price: memory

Even though you already have a distributed cache system, you may have a limit and need to control the status of memory. If you don’t do it, you may encounter alarms or errors.

The first step to control your cache memory is to cleaning unused cache definitions or columns periodically. Unfortunately, most developers are lazy to drop caches or tables or columns after they retire them from application layer. They are afraid of crushing something or just postpone them. So, it would be efficient to scan your system for unused cache objects especially these you haven’t touch for a while.

Let me explain my main advice : optimizing your sql statement defined in the cache definition. Most of the cache definitions consist of 2 parts: keys and properties. You access the caches by keys and retrieve the data formatted by columns. Here is an example order data from the table Orders:

Suppose we need to retrieve orders by customerId from cache. Our key here is customerId. Our cached SQL probably would be like that:

SELECT customerID, orderId, orderDate, totalPrice, deliveryDate
FROM Orders
WHERE customerId = ?

There is no any problem for this structure. But, if you decide to decrease memory assigned for the cache, you just could aggregate data by the key. In this way, you can avoid caching the key for each row. It can be accomplished by using LISTAGG function for Oracle:

SELECT customerId, 
LISTAGG(orderId||','||orderDate||','||totalPrice||','|| deliveryDate, '|') WITHIN GROUP (ORDER BY orderDate, orderId)order
FROM Orders
WHERE customerId = ?
GROUP BY customerID

Now, instead of multiple rows to cache, you just have 1 row that contains concatenated columns. The price of this method is to parsing the properties at application level. The saved memory depends on the number of the keys and rows for each keys. The method saves more memory when there are multiple keys and more rows.

With this method you will cache below data anymore:

--

--