Friday, May 19, 2017

A simple Selenium load test based on step

This morning I published a tutorial on YouTube to help people get started from scratch with Selenium load testing, using step as the execution platform. In less than 30 minutes, you'll be able to create a script, design a test plan and use step's agent grid to run your Selenium scripts in a parallel and massively scalable way.

Here's a link to the video: https://www.youtube.com/watch?v=_D4PQjdbjMI

The main reason I'm writing this blog entry is that I wanted to publish the code of the script I ended up with at the end of the tutorial, so that people can easily copy-paste it if needed. Other than changing the path of the chromedriver binary, you should be able to use it as it comes.

So here it is:

package scripts;

import java.util.HashMap;

import org.junit.Test;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

import step.handlers.javahandler.AbstractScript;
import step.handlers.javahandler.Function;
import step.handlers.javahandler.ScriptRunner.ScriptContext;

public class SeleniumScripts extends AbstractScript {

    @Function
    public void CreateDriver() {
        System.setProperty("webdriver.chrome.driver", "path_to_chrome\\chromedriver.exe");
        ChromeDriver chrome = new ChromeDriver();

        // store the driver in the session so the other keywords can reuse it
        session.put("driver", chrome);
    }

    @Function
    public void SampleKeyword() {
        long sleepTime = input.getInt("sleepTime");

        ChromeDriver chrome = (ChromeDriver) session.get("driver");
        chrome.get("http://exense.ch");

        sleep(sleepTime);

        WebElement el = chrome.findElement(By.xpath("//a[text()='Consulting']"));
        el.click();
        chrome.findElement(By.xpath("//h3[text()='Performance Analysis']"));

        sleep(sleepTime);
    }

    private void sleep(long duration) {
        try {
            Thread.sleep(duration);
        } catch (InterruptedException e) {
            // restore the interrupt flag instead of just printing the stack trace
            Thread.currentThread().interrupt();
        }
    }

    @Function
    public void CloseDriver() {
        ChromeDriver chrome = (ChromeDriver) session.get("driver");
        // quit() rather than close(); see the note at the end of this post
        chrome.quit();
    }

    @Test
    public void testSampleKeyword() {
        ScriptContext sc = new ScriptContext(new HashMap<String, String>());
        // create the driver first so SampleKeyword finds it in the session,
        // and pass the sleepTime input the keyword expects
        sc.run("CreateDriver", "{}");
        sc.run("SampleKeyword", "{\"sleepTime\":1000}");
        sc.run("CloseDriver", "{}");
    }

}



One more thing I would like to point out is that I made a mistake while writing my keyword "on the fly" in the video: in the CloseDriver() method, we really should be using the quit() method instead of close(), because close() will just close the window but won't terminate the chromedriver process. So in order to avoid leaking chromedriver.exe process instances, make sure to call quit().
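In code, the difference between the two calls looks like this:

ChromeDriver chrome = (ChromeDriver) session.get("driver");

chrome.close(); // closes the current window only; chromedriver.exe keeps running
chrome.quit();  // closes all windows and terminates the chromedriver.exe process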

Also, here's a link to the docs, which essentially guide you through all of this and provide more details. Keep in mind that not all of step's functionality has been documented yet, and that you'll need to explore and poke around if you want to benefit from everything the platform has to offer. We're available on GitHub and via the contact form of our website if you need help with any aspect of the tool.

I'm probably going to redo this video because I believe I could do a much better job and pack it into a 5-minute clip. I'm also planning on doing another series of tutorials like this, as I did last year to get people started with djigger. We feel step is reaching a maturity level that allows us to produce this kind of content effectively: the APIs have stabilized, so the content will remain up-to-date, and it really is just that easy to get started with the tool now.

I hope this tutorial will help you get started and make you want to join our community, as we believe step really brings a lot to the table for anyone who wants to do automation in a way that is both effective and elegant.

Not only are we addressing important technical issues such as scalability, compatibility and portability, and enhancing comfort and ease of use with a modern, central application for our users, but we're also providing a ton of flexibility.

What I mean by flexibility is that if you adopt step, you won't have to build yet another cluster for every new tool you want to use. You'll be reusing our agent technology no matter which simulation engine you've selected. Hell, it could even be your own custom technology that you deploy on our agent grid! JMeter wants you to use its own master/slave logic. The Grinder wants you to use its workers. Selenium wants you to use its own grid. And so on. And it's the same problem when it comes to analyzing results: each tool has its own charts, logs, file formats, etc. So it's time we unify and rationalize this mess, and finally work together on one target platform for the execution of tests and the analysis of their results.

With step, we're doing just that. We're bringing that central solution that fits all of our test automation use cases and addresses the challenges we've come across as a community in about 10 years of testing applications together.
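As a purely hypothetical illustration of that flexibility (the class name, keyword name and input below are all made up), wrapping your own engine with the same Keyword API used in the Selenium script above could look like this:

package scripts;

import step.handlers.javahandler.AbstractScript;
import step.handlers.javahandler.Function;

// Hypothetical sketch: any Java client or simulation engine can be
// wrapped as a Keyword and scaled out on step's agent grid, instead
// of building a dedicated cluster for that one tool.
public class CustomEngineScript extends AbstractScript {

    @Function
    public void RunMyEngine() {
        int iterations = input.getInt("iterations");
        for (int i = 0; i < iterations; i++) {
            // drive your own protocol client / simulation engine here
        }
    }
}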

Wednesday, May 17, 2017

The iterative process of achieving scale (3/3) - Unnecessary synchronization

What you'll find in this blog entry: the end-to-end, step-by-step analysis of a performance problem, and a couple of concrete tips to help you spot synchronization issues faster.

I'll conclude this three-part analysis session with a simpler problem which still affected step's maximal keyword throughput in v3.2 after we corrected the aggregation problem (part 1) and the HTTP connection pooling problem (part 2). It seems to me it's a simple type of problem because I feel I can diagnose these quickly, but I think it's definitely worth going over, as I'll be able to showcase some of my favorite features of djigger, our open-source profiler and production monitoring tool.

So if you recall, at this point in our story, we're trying to optimize step's code to reach a desired target Keyword load of 2000 Keyword executions per second or more in a single-controller environment. I should point out that the term "optimization" was not entirely appropriate in our context up until now, since the first two problems were actually major issues affecting the stability of our test executions, thus destroying our ability to serve the user who wanted to use step for that particular kind of testing (which, again, was new for us). In particular, the HTTP connection pooling issue from part 2 was not even a throughput problem - constantly recreating HTTP connections doesn't cost that much CPU or time - but more of a stability issue, revealed only in an endurance test (leaky sockets).
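The fix for that kind of pooling issue follows a standard pattern. This is not step's actual controller code, just a minimal sketch using Apache HttpClient (the class name and pool sizes are made up): reuse one pooled client for all controller-to-agent calls instead of creating new connections, and thus new sockets, for every keyword execution.

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PooledClientHolder {

    private static final PoolingHttpClientConnectionManager CM =
            new PoolingHttpClientConnectionManager();

    static {
        CM.setMaxTotal(200);          // total connections across all routes
        CM.setDefaultMaxPerRoute(50); // connections per agent route
    }

    // one shared client, reused for every call instead of recreated each time
    public static final CloseableHttpClient CLIENT =
            HttpClients.custom().setConnectionManager(CM).build();
}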

At the end of part 2, however, we can pretty much say we're moving toward "optimization", because we're able to achieve stable throughput at a level close to our target. So now it's all about hunting down the last few bottlenecks and hopefully getting rid of them without having to undertake any major architectural rework.

So again, we started a load scenario with a target of about 2000 kw/s, with proper corresponding values for threading and pacing, and we found ourselves saturating around the 1600 kw/s mark. And again, I used djigger to produce a picture of the distribution of samples (which can be used as an approximation of elapsed time) across the different packages inside the controller component.

And looking at the Reverse view with a filter on step's active thread stacks, this is what I found:

[Screenshot: djigger's Reverse view of the controller's thread samples, filtered on step's active thread stacks]

See anything interesting in here?

Remember, when we look at the Reverse tree view, we see the top-most methods of the thread stacks. And after excluding park(), wait() and other intentionally "inactive" methods, and focusing on stacks going through our own packages (.*step.*), we end up with method calls which are supposed to be doing some concrete work for us.

We could take the liberty of calling these "on-CPU" methods, even though the term wouldn't technically be 100% correct. Either way, these leaves (or roots, depending on the view you've chosen) are usually low-level methods from java or sun packages (for instance, methods involved in data manipulation, I/O, etc.).

Knowing that, it makes sense to find methods like socketRead0(). Threads busy with this method call are either blocking synchronously, waiting for a message from a remote process (in our situation, the DB or an Agent), or actively reading that message, using up some CPU time. We can't make this distinction just by looking at this view, because it's a native method and, from the Java world, we can't see which line of code inside that method is being executed or which other native methods it calls.
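To make that concrete, here's a generic sketch (not step's code; the host and port are placeholders) of what a thread "stuck" in socketRead0() is typically doing:

import java.io.InputStream;
import java.net.Socket;

public class BlockingReadExample {

    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("agent-host", 8080)) {
            InputStream in = socket.getInputStream();
            byte[] buffer = new byte[8192];
            // blocks inside the native socketRead0() until the remote
            // process sends data; a sampler sees the thread "in" that
            // method whether it is waiting or actively copying bytes
            int n = in.read(buffer);
            System.out.println("read " + n + " bytes");
        }
    }
}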

Either way, the best way to investigate this time consumer, assuming you have a legitimate reason to do so, is to unfold that branch and further analyze which packages lead to these socketRead0() calls and whether their associated percentages make sense, according to what you know the application is doing or should be doing.

Here, since we're sampling step's controller, these calls are totally legitimate: the controller should spend most of its time delegating work to its agents, which results in blocking calls to socketRead0().

Now, something I found a lot more interesting is the method PluginManager.invoke(). First of all, it's the only method belonging to "business code" (i.e. non-java, non-sun, etc.) in which a significant amount of time is being spent. Secondly, just judging by its name, it's clearly not the kind of method that does a lot of work on its own (like iterating through a large collection and computing something): its name is "invoke". Of course, there's always a chance that the sampler catches any method, even the dumbest/fastest one, on top of a thread stack by "accident", but when 18% of the samples show that method on top of the stack, it can't be an accident anymore.

If you have prior experience analyzing these kinds of patterns, you probably already know what the problem is at this point. For the other readers, let's take a closer look at the context of this method call by unfolding that branch and displaying code line numbers as a node-differentiating criterion in our tree view:

[Screenshot: the unfolded PluginManager.invoke() branch, split by code line numbers]

So first of all, it's interesting to note that after splitting on code lines, we still find all of the samples busy in the exact same spot (i.e., line 72). This should be another strong hint at what is going on with that method.

Unfolding the branch allows us to confirm the origin of that method call. As a developer, I now know that this method is called at the end of the execution of each step of my test plan which, from the performance standpoint, is a big deal: it's a highly parallel and hence hot code path, and special care should be given to what kind of work is being done here and how it's being done.

Finally, let's take a look at that line of code:

[Screenshot: the source of PluginManager.invoke(), line 72, inside a synchronized block]

Yeah, of course, it's part of a synchronized block. I say "of course" because this was the only logical possibility that fits the symptoms we just went over.

Synchronization is a quick and easy way to make things thread-safe, and it's fine to use. But if you're going to synchronize methods or blocks which are part of very hot code paths, such as in this example, you'd better have a damn good reason to do so.

And in this case, we didn't. It was an unnecessary safety precaution around an object that was already thread-safe.
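To illustrate the pattern (this is made-up code, not step's actual PluginManager): if the shared collection is already thread-safe, say a CopyOnWriteArrayList, the synchronized block adds no safety and only serializes a hot code path.

import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class PluginManagerSketch {

    private final List<Runnable> plugins = new CopyOnWriteArrayList<Runnable>();

    // Before: a single lock serializes every keyword execution that
    // passes through here, capping overall throughput.
    public void invokeSynchronized() {
        synchronized (plugins) {
            for (Runnable p : plugins) {
                p.run();
            }
        }
    }

    // After: CopyOnWriteArrayList can be iterated safely by many
    // threads without a lock, so parallel executions no longer queue up.
    public void invoke() {
        for (Runnable p : plugins) {
            p.run();
        }
    }
}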

There you have it: we've taken premature optimization to a whole new level, or rather we've created a new branch in that family: premature thread-safety concerns.

In any case, we simply removed that synchronized block and were finally able to reach our target load.

I hope this series of step-by-step analysis sessions was either helpful or fun for most readers. I might do another round of these in the next few months, since I've again analyzed and fixed a bunch of performance issues while cleaning up R3.3. Each new feature (especially the ones that involve changes in core packages or in these "hot" code path areas) introduces a new set of risks, calls for a new series of tests, and leads to a new session of old-fashioned analysis and code cleaning.