Unlike in previous competitions, the organizers chose the "right" difficulty level so that the majority of the participating planners could solve most of the instances, while the instances were still not too easy. This introduced a strong bias in favor of planners that are representative of the majority. Of course, most of the participating planners belonged to the HSP-FF-LAMA family.
What this meant in practice is that the best majority planners solved most of the problems (as was intended). For the best SAT-based planners, the difficulty level did not match the planners' capabilities: several domains were far too easy (all instances were solved in a fraction of a second) and some domains were far too difficult (not a single instance was solved).
If the difficulty level of all domains had been based on the SAT-based planners, we would have gotten exactly the opposite result: the best SAT-based planners would have solved (almost) all instances, and the best planners representing other paradigms would have had difficulties on several domains, in some cases not solving any instances.
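The calibration effect described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the domain names, the capability numbers, and the helpers `calibrate` and `solved_fraction` are all hypothetical and do not come from the competition. Each planner family is assumed to solve every instance up to some size limit per domain, and instances are picked around the limit of a chosen reference family.

```python
# Hypothetical capability limits: the largest instance size each planner
# family can solve in each domain. All numbers are made up for illustration.
CAPABILITY = {
    "heuristic-search": {"domain_a": 20, "domain_b": 20, "domain_c": 20},
    "sat-based": {"domain_a": 60, "domain_b": 5, "domain_c": 22},
}


def calibrate(reference, n_instances=10):
    """Pick instance sizes per domain around the reference family's limit.

    Sizes are spread evenly from 50% to 120% of the limit, so the reference
    family solves most instances but not all of them.
    """
    instances = {}
    for domain, limit in CAPABILITY[reference].items():
        low, high = 0.5 * limit, 1.2 * limit
        step = (high - low) / (n_instances - 1)
        instances[domain] = [low + i * step for i in range(n_instances)]
    return instances


def solved_fraction(family, instances):
    """Fraction of instances per domain that the given family can solve."""
    return {
        domain: sum(size <= CAPABILITY[family][domain] for size in sizes) / len(sizes)
        for domain, sizes in instances.items()
    }


if __name__ == "__main__":
    for reference in CAPABILITY:
        instances = calibrate(reference)
        print(f"instances calibrated to: {reference}")
        for family in CAPABILITY:
            print(f"  {family}: {solved_fraction(family, instances)}")
```

With these made-up numbers, calibrating to the heuristic-search family leaves the SAT-based family solving every instance of one domain and none of another, and calibrating to the SAT-based family reverses the picture. The model says nothing about the actual planners; it only shows how the choice of reference family shapes the outcome.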
Apparently, the possibility of this kind of bias was not noticed (or it was not considered a major issue) when the procedure for choosing problem instances for the competition was devised.