Recently I investigated a small performance issue in the search function of a Java application, where sampling had shown that a ton of time (91% of the total duration) was being spent in Oracle fetches.
After figuring out that the individual fetch times themselves were normal, I ended up putting together a small benchmark to isolate the round-trip overhead and fine-tune the fetch size. That way I was able to find out how much time I could save by getting rid of unnecessary round-trips between the application server and the Oracle database. I have to admit this was close to being a "mind candy" kind of exercise, as I had already convinced the developer to come up with a more selective query (he was basically querying and iterating through the entire table, barely filtering anything in his WHERE clause, and I knew that this was probably the wrong approach here). Nevertheless, I thought the benchmark would be interesting for future reference, and it would also optimize that function regardless of how selective the new query turned out to be.
This is what the sampling session looked like in djigger:
You can see the next() and fetch() call sequence taking 91% of the time in there.
The use case was a simple search screen in a Java application. A search returning no results would take 12 seconds to complete (not very nice for the end user). The result set iteration causing that 12-second delay didn't even involve the table against which the actual search was performed. It was just some kind of security table that needed to be queried before the actual search results could even be retrieved from the business table.
I also found out that, by default, the fetch size was set to 10. So I did a quick benchmark reusing the same select as the one from the application, running locally in my Eclipse workspace but connecting to the same database. I measured the query duration and, more importantly, the time it took to iterate over the result set, then changed the fetch size and ran the benchmark again. I made sure the query results were cached by Oracle, and I ran each configuration 3 times to make sure I wasn't picking up any strange statistical outliers (due to, say, network round-trip time variations or jobs running on the database or database host).
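For reference, checking and overriding the fetch size is just plain JDBC. The snippet below is only a sketch; the connection URL, credentials and query are placeholders, not the application's actual ones:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class FetchSizeCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder URL and credentials
        try (Connection con = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/SERVICE", "user", "pwd");
             PreparedStatement stmt = con.prepareStatement("SELECT * FROM some_table")) {

            // With the Oracle thin driver this reports 10 unless overridden
            System.out.println("Default fetch size: " + stmt.getFetchSize());

            // Ask the driver to retrieve 1000 rows per round-trip instead
            stmt.setFetchSize(1000);
        }
    }
}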
I tested the following values: 1, 10, 100, 1000 and 10000. More than that would have been pointless, since the table only contained 88k records.
Using the default size (10), iterating over the entire result set took pretty much as long as in the real-life scenario (just a little less, which makes total sense): 11 seconds. As you can see in the results below, the optimal value seemed to be floating around 1000, sparing us almost 6 seconds (half of the overall response time). I'd say that was worth the couple of hours I spent investigating the issue.
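To put the round trips into perspective with a rough back-of-the-envelope calculation: with 88k rows and a fetch size of 10, the driver has to perform roughly 8,800 fetch calls against the database, whereas a fetch size of 1000 brings that down to about 88. If each round trip costs somewhere in the range of half a millisecond to a millisecond of network latency, those 8,700 or so extra round trips alone account for several seconds, which lines up with the almost 6 seconds saved.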
Interestingly enough, the duration of the query execution itself increased slightly as I increased the fetch size. I'm not 100% sure yet (I could investigate that with djigger but probably won't), but I assume it has something to do with the initial / first fetch. Since the fetch size is set before the query is executed, I assume Oracle already sends a first, larger batch of rows back to the client when replying to the query execution call (at cursor return time). In any case, the overall duration (executeQuery + result set iteration) was greatly improved.
Here's the code I executed (I had to anonymize a few things, obviously, but you can recreate this very easily if you're interested):
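In essence it boils down to the following sketch; all identifiers, the query and the connection details here are placeholders rather than the original ones:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class FetchSizeBenchmark {

    // Placeholder connection details and query
    private static final String URL = "jdbc:oracle:thin:@//dbhost:1521/SERVICE";
    private static final String QUERY = "SELECT * FROM some_table";

    public static void main(String[] args) throws Exception {
        int[] fetchSizes = {1, 10, 100, 1000, 10000};
        try (Connection con = DriverManager.getConnection(URL, "user", "pwd")) {
            for (int fetchSize : fetchSizes) {
                // 3 runs per fetch size to smooth out outliers
                for (int run = 1; run <= 3; run++) {
                    benchmark(con, fetchSize);
                }
            }
        }
    }

    private static void benchmark(Connection con, int fetchSize) throws Exception {
        try (PreparedStatement stmt = con.prepareStatement(QUERY)) {
            stmt.setFetchSize(fetchSize);

            long t0 = System.currentTimeMillis();
            try (ResultSet rs = stmt.executeQuery()) {
                long t1 = System.currentTimeMillis();

                // This loop is where the fetch round-trips actually happen
                int rows = 0;
                while (rs.next()) {
                    rows++;
                }
                long t2 = System.currentTimeMillis();

                System.out.println("fetchSize=" + fetchSize + " rows=" + rows
                        + " executeQuery=" + (t1 - t0) + "ms"
                        + " iteration=" + (t2 - t1) + "ms");
            }
        }
    }
}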
EDIT: At some point I had a doubt that my benchmark wasn't realistic enough, since I was running my test from a "local" VM (which was still only 1 hop farther away on the internal network) and not from the server itself.
So I uploaded the benchmark to the server itself and ran it there, and I got basically the exact same results. I then double-checked the duration I was supposed to be getting with a fetch size of 10, and it matched exactly what I had initially observed in the real scenario and measured in the application itself via instrumentation (11 seconds). So everything makes sense now.
Signing off.