Pentaho is slow on servers with too many home directories

Over the course of two years, browsing solutions on our Pentaho 5.4 server became progressively slower. It got to the point where you had to wait 2-3 minutes to see the list of solutions in the Pentaho User Console.

The catalina log didn’t say much and we didn’t have too many solutions (around 200), so I thought it was perhaps a database bottleneck. It all came to a screeching halt on a Friday afternoon (as usual) when, after a restart, the Pentaho Console simply stopped responding.

I turned all logging on and found that Pentaho was complaining about a lot of invalid users. Googling around, I found that 5.4 performs user permission checks on first login, calling UserDetailService for the owner of each folder in the Home directory. Judging from the logs, we had over 4000 folders in there, accumulated over two authentication scheme changes. I could not even open the Home folder in the user console.

Pentaho versions 6.1 and later have a config flag to skip this user verification. It’s called skipUserVerificationOnPrincipalCreation and lives in pentaho-solutions/system/jackrabbit/security.properties.

More info in the Pentaho documentation page "Jackrabbit Repository Performance Tuning".

All fine and dandy, but what do you do with a Pentaho 5.4 server? Or even, how do you fix this after your PUC becomes unresponsive?

I thought that the Pentaho REST API might help and, sure enough, we can delete folders with it. In our case, our users don’t save anything in their home folders, so all we needed to do was to delete these 4000+ folders.

This is a nuclear option, so don’t run this unless you know what you are doing. If your users have solutions saved in their home folders, you need to extend the following script to check for that, and then back up those solutions and/or skip the deletion.

Open up any browser JavaScript console, replace <server_url> with your Pentaho URL and run:

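// Requires jQuery on the page. Lists every folder under /home and permanently deletes each one.
$.getJSON("http://<server_url>:8080/pentaho/api/repo/files/:home/children", function(data){
    // The response is an object whose values are arrays of file nodes
    $.each(data, function(i, nodes){
        nodes.forEach(function(node){
            console.log(node.path, node.id);
            $.ajax({
                async: false, // synchronous, so the deletions run one at a time
                type: "PUT",
                url: "http://<server_url>:8080/pentaho/api/repo/files/deletepermanent",
                data: node.id // the request body is the id of the folder to delete
            });
        });
    });
});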
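
If some users might have saved content, the check mentioned earlier could look roughly like the sketch below. It is not part of the original gist and I have not run it against a live server. It assumes that each home folder's children can be listed with the same /children endpoint, with "/" in the path replaced by ":" (as in :home), and that an empty or missing response means an empty folder. The names toPathId, childNodes, planCleanup and browserFetchJson are made up for this example. It only builds a plan and deletes nothing.

```javascript
// Placeholder base URL, exactly as in the script above.
const BASE = "http://<server_url>:8080/pentaho/api/repo/files";

// The repo API addresses files by path with ":" in place of "/",
// e.g. "/home/jdoe" becomes ":home:jdoe".
function toPathId(path) {
    return path.replace(/\//g, ":");
}

// Normalize a children response to a plain array of nodes.
// Assumption: the payload is an object whose values are arrays of nodes
// (which is how the script above walks it); no payload means no children.
function childNodes(data) {
    if (!data) return [];
    return Object.values(data).flat();
}

// Split the home folders into empty ones (candidates for deletion) and
// non-empty ones (keep, or back up first). fetchJson is injected so the
// logic can be tried without a server: (url) => Promise<parsed JSON or null>.
// Note: this only sees what the children endpoint returns by default.
async function planCleanup(fetchJson) {
    const homes = childNodes(await fetchJson(BASE + "/:home/children"));
    const plan = { remove: [], keep: [] };
    for (const node of homes) {
        const url = BASE + "/" + encodeURI(toPathId(node.path)) + "/children";
        const kids = childNodes(await fetchJson(url));
        (kids.length === 0 ? plan.remove : plan.keep).push(node);
    }
    return plan;
}

// A possible fetchJson for the browser console, using the session cookie.
function browserFetchJson(url) {
    return fetch(url, { headers: { Accept: "application/json" }, credentials: "include" })
        .then(function (r) { return r.text(); })
        .then(function (t) { return t ? JSON.parse(t) : null; });
}
```

In the console you would call planCleanup(browserFetchJson), review plan.keep by hand, and only then send the ids in plan.remove to deletepermanent as in the script above. With thousands of folders this makes one extra request per folder, so expect it to take a while.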

You can fork the gist for this at https://gist.github.com/danielpradilla/72a603a5d0de71771e0b5836bde05479

By Daniel Pradilla

I am a software architect and I help people improve their lives using technology. On this site, I try to write about how to take advantage of the opportunities that a hyperconnected world offers us: how to live simpler lives while being more productive. I invite you to check out the featured articles section. If you want to contact me, or work with me, you can use the social links you will find below.