In my previous article, I showed 10 different ways to get all the Resources in a Resource Group. As I was writing it, I wondered about the relative execution speeds of the various methods. I had a general idea about how fast each one would be, but I couldn’t be sure without doing some benchmark tests. I nailed it on the fastest and slowest methods, but there were some surprises in the intermediate ones.
I thought the results deserved a separate article (this one), and decided that while I was at it, I’d discuss the process of benchmarking. The code below measures the average time it takes to execute each of the 10 functions.
Benchmark Guidelines
To do benchmarking, the first thing you need is a timer, preferably one that can measure elapsed time in microseconds. There is a way to get the time in nanoseconds, but we don’t need that kind of accuracy.
Here’s the code for the timer I used (the full code will be shown later in this article):
/* Uncomment to start with a cleared cache;
will take a *long* time to run */
// $cm = $modx->getCacheManager();
// $cm->refresh();
/* Start timer */
$tStart = microtime(true);
/* Execute the function */
$output .= $func($modx, $funcs[$func]['groupId'], $prefix, $lf);
/* End timer */
$tEnd = microtime(true);
/* get the total execution time for whatever
happened between $tStart and $tEnd */
$totalTime = ($tEnd - $tStart);
As called above with true as the only argument, the PHP microtime() function returns a floating-point number representing the number of seconds since the beginning of the “Unix Epoch” (0:00:00 January 1, 1970 GMT). A microsecond is a millionth of a second. The machine and the software running the code may not have that level of timing precision, but the resulting times should let us get a fair idea of how the functions compare in relative execution time.
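For comparison, here is a minimal sketch of the nanosecond timer mentioned earlier. It assumes PHP 7.3 or later, where the built-in hrtime() function is available, and the usleep() call is only a stand-in for the code being measured:

```php
<?php
/* hrtime(true) returns a monotonic timestamp in nanoseconds
   (an integer on 64-bit platforms) */
$start = hrtime(true);

usleep(2000); /* stand-in workload: sleep for roughly 2 ms */

$end = hrtime(true);

/* Convert nanoseconds to seconds to match what microtime(true) reports */
$elapsedSeconds = ($end - $start) / 1e9;

echo sprintf('%.6f s', $elapsedSeconds) . "\n";
```

Because hrtime() is monotonic, it is not affected by system clock adjustments during the run, which can matter for very short measurements.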
In the code, we start the timer, record the time, call the function being executed, then stop the timer and check the difference between the two times. The result will be the time in seconds that it took to execute the function. It would be slightly more accurate to put the code inside the function between the time checks, since the function call itself will take a little time that isn’t part of what we want to measure. However, the function call time should be the same for each call. Since we only care about the relative times of the various methods, it shouldn’t have any systematic effect, and doing it this way makes it much easier to create and modify the code.
The somewhat cryptic code between starting and ending the timer will be explained in a bit, after we’ve looked at the array containing the function names.
First, and foremost, for any benchmark speed test you want only the code you want to measure between the timer-start and timer-stop points. Any setup you need before timing the event should run before the timer starts. Any calculations or reporting of the results should run after the timer stops.
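One way to enforce that rule is to wrap the two timer readings in a small helper, so nothing else can slip in between them. This is only a sketch: timeCall() is a hypothetical name, not part of the benchmark script, and array_sum() stands in for the code being measured.

```php
<?php
/* Hypothetical helper: times a single call and returns
   array(result, elapsed seconds). Only the call itself sits
   between the two microtime() readings. */
function timeCall(callable $fn, array $args = array()) {
    $tStart = microtime(true);
    $result = call_user_func_array($fn, $args);
    $tEnd = microtime(true);
    return array($result, $tEnd - $tStart);
}

/* Setup runs before the timer starts */
$data = range(1, 10000);

/* Only array_sum() is timed */
list($sum, $seconds) = timeCall('array_sum', array($data));

/* Reporting runs after the timer stops */
echo 'Sum: ' . $sum . ' -- ' . sprintf('%.6f s', $seconds) . "\n";
```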
Our Setup
/* This section will only run if we're outside of MODX */
if (! defined('MODX_CORE_PATH')) {
/* get the MODX class file */
/* Important!: Uncomment one of the "require" lines below */
/* MODX 3 */
// require '/path/to/core/directory/src/Revolution/modX.php';
/* MODX 2 */
// require 'path/to/core/directory/model/modx/modx.class.php';
/* Instantiate the $modx object */
$modx = new modX();
if ((!$modx) || (!$modx instanceof modX)) {
echo 'Could not create MODX class';
}
/* initialize MODX and set current context */
$modx->initialize('mgr');
/* load the error handler */
$modx->getService('error', 'error.modError', '', '');
}
/* Set linefeed for browser and command line */
$lf = php_sapi_name() === 'cli'
? "\n"
: '<br>';
/* Make code work in MODX 2 and MODX 3 */
$prefix = $modx->getVersionData()['version'] >= 3
? 'MODX\Revolution\\'
: '';
/* Each iteration runs all functions once */
/* The $iterations value should be a multiple of
the number of functions */
$iterations = 10;
$output = '';
$groupId = 1; /* Resource Group ID */
$allDocsGroupId = 'Group1'; /* Resource Group Name */
In the code above, we first instantiate MODX. We’ll need it to run our functions.
The next bit lets our code run in a browser (as it would as a MODX Snippet), directly from the command line, or in a code editor. PHP’s php_sapi_name() function returns the string cli only if the code is running from the command line or in a code editor like PhpStorm that can run code as if it were run at the command line. If we get the “cli” value back, we set the $lf (for linefeed) variable to "\n", since <br> would be printed literally and the results would all be in one long line. Otherwise, we set $lf to "<br>", since the browser will render that as a line break.
Next, we set the number of iterations, initialize the $output variable, and set two variables: one ($groupId) holds the ID of the Resource Group; the other ($allDocsGroupId) holds the name of the Resource Group. The second one is needed by two of our functions.
Setting $iterations to 1 will execute each function only once. I found that this does not significantly alter the results, and it has the advantage of minimizing rounding errors and errors that accumulate due to the resolution of the timer. Using more iterations might be useful, though, for other benchmarking where the code being executed takes very little time, so that you want to average many executions. In those cases, though, you’d need a more accurate timer with higher resolution, such as one that uses the system clock.
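For that kind of very fast code, a common approach is to time one span that contains many executions and divide by the count. The sketch below assumes PHP 7.3+ for hrtime(); averageTime() is a hypothetical helper and the md5() call is just a stand-in workload.

```php
<?php
/* Hypothetical helper: runs $fn $runs times inside one timed
   span and returns the average seconds per run */
function averageTime(callable $fn, $runs) {
    $start = hrtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    $end = hrtime(true);
    return (($end - $start) / 1e9) / $runs;
}

$avg = averageTime(function () {
    return md5('benchmark'); /* stand-in for very fast code */
}, 10000);

echo sprintf('%.9f s per run', $avg) . "\n";
```

Reading the timer only twice for the whole batch keeps the timer's own overhead and resolution error from being added to every single run.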
The Function Array
Here’s a part of the array of functions to run:
/* Array of functions to run */
$allDocsGroupId = 'Group1'; // name of the Resource Group
$groupId = 12; // ID of the Resource Group;
$funcs = array(
'useAllDocs' => array(
'groupId' => $allDocsGroupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useAllDocs_iterator' => array(
'groupId' => $allDocsGroupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useOneAlias' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
/* seven more functions here - all use $groupId */
);
The array above has two jobs. One is to provide the name of the function to run and the appropriate Resource Group identifier. The first two functions require the name of the Group. The rest use the ID of the Group. The other job is to provide a place to put the results. The key of each subarray is the name of the function to run. The first member ('groupId') is the Group identifier.
The 'elapsedTime' element holds the time it took for the function to run and the 'executions' member holds the number of times the function has run. Both are incremented on each pass, so the first one holds the sum of all the execution times and the second holds the total number of executions. That way, we can divide the total time by the total number of executions and report the average time it took for the function to run.
After the timer ends for each function, we do the calculations and update the function array:
$totalTime = ($tEnd - $tStart);
/* Set values for results */
$funcs[$func]['executions']++;
$t = $funcs[$func]['elapsedTime'];
$t = $t + $totalTime;
$funcs[$func]['elapsedTime'] = $t;
In the code above, $funcs is the function array and $func is the name of the function, which is also the key of its subarray. That subarray’s values provide the Resource Group identifier, the total number of executions for that function ($funcs[$func]['executions']), and the total elapsed time for its executions ($funcs[$func]['elapsedTime']). For each one, we get the current value, increment it, then store it back to the function array.
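The accumulate-then-average bookkeeping can be seen in isolation with a stripped-down function array. The entry name and the three timings below are made up for illustration:

```php
<?php
/* Simplified function array with a single, made-up entry */
$funcs = array(
    'exampleFunc' => array(
        'elapsedTime' => 0.0,
        'executions' => 0,
    ),
);

/* Made-up timings (in seconds) standing in for three timed runs */
$samples = array(0.020, 0.022, 0.024);

foreach ($samples as $totalTime) {
    $funcs['exampleFunc']['executions']++;
    $funcs['exampleFunc']['elapsedTime'] += $totalTime;
}

/* Average = total elapsed time / number of executions */
$average = $funcs['exampleFunc']['elapsedTime']
    / $funcs['exampleFunc']['executions'];

echo sprintf('%.4f s', $average) . "\n"; /* prints "0.0220 s" */
```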
Now that we’ve seen the function array, we can explain the line of code between the timer’s start and end code:
$output .= $func($modx, $funcs[$func]['groupId'], $prefix, $lf);
PHP allows us to set a variable to the name of a function as a string, then call the function using that variable. Here’s an example using a simple addition function:
function add($arg1, $arg2) {
return $arg1 + $arg2;
}
$func = 'add';
$sum = $func(2, 3);
In the code above, $sum will be set to 5.
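If a name in the list is misspelled or the function has not been defined, calling it through a variable causes a fatal error. A minimal sketch of guarding against that with PHP's is_callable() follows; the 'subtract' name is deliberately left undefined for the demonstration.

```php
<?php
function add($arg1, $arg2) {
    return $arg1 + $arg2;
}

/* Hypothetical list of names; 'subtract' is deliberately undefined */
$names = array('add', 'subtract');
$results = array();

foreach ($names as $func) {
    if (is_callable($func)) {
        /* Call the function whose name is stored in $func */
        $results[$func] = $func(2, 3);
    } else {
        /* Record a null instead of triggering a fatal error */
        $results[$func] = null;
    }
}

var_dump($results['add']);      /* int(5) */
var_dump($results['subtract']); /* NULL */
```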
Here’s an abbreviated version of our code that runs and times the functions:
foreach ($funcs as $func => $options) {
/* Start the timer here */
/* Execute the function */
$output .= $func($modx, $funcs[$func]['groupId'],
$prefix, $lf);
/* Stop the timer here */
}
So $func is set to the name of the function. The first argument ($modx) is the modX object, which all our functions need to run. The second argument ($funcs[$func]['groupId']) is the appropriate Resource Group identifier for that function. The last two arguments are the Revo 2 or Revo 3 prefix and the appropriate linefeed, both of which are set near the top of the code. As written, the functions in the previous article return the pagetitle of each Resource. You’d remove that, or comment it out, if you were returning the array of Resources. It’s useful during development, though, so you can make sure each function is working properly.
Setting the $output variable inside the timing loop adds unnecessary time to each function, but it’s a trivial amount, and it should be the same amount for each function, so it won’t affect the relative times in the results.
The Outer Loop
We saw the inner timing loop, foreach ($funcs as $func => $options), above. It’s inside an outer loop that runs once for each iteration. It looks like this:
$count = $iterations;
while ($count > 0) {
foreach ($funcs as $func => $options) {
/* inner loop with function execution
and result storage */
}
if ($count) {
$funcs = rotate($funcs);
}
$count--;
}
The inner loop executes each function once. At the end of that loop, we rotate the functions in the function array by moving the last one to the top of the array. This is in case the execution speed of a function depends on its position in the function array. That’s why the $iterations variable’s value should be a multiple of the number of functions (10 in our case). That way each function is executed in every position the same number of times.
Here’s the code of the rotate() function:
/* Move last function to top for next run */
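function rotate($funcs) {
/* array_splice() removes the last element from $funcs and returns it;
array_merge() puts it in front of what remains */
$new = array_merge(array_splice($funcs, -1), $funcs);
return $new;
}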
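To see why the number of iterations should be a multiple of the number of functions, it helps to track which position each function occupies on each pass. The sketch below uses a three-entry array with made-up names and a hypothetical rotateLastToFront() helper that does the same rotation with array_slice():

```php
<?php
/* Same rotation idea, written with array_slice():
   move the last element to the front, keeping string keys */
function rotateLastToFront(array $items) {
    return array_slice($items, -1, 1, true)
        + array_slice($items, 0, -1, true);
}

$funcs = array('first' => 1, 'second' => 2, 'third' => 3);
$positions = array();

/* One pass per function, rotating after each pass */
for ($i = 0; $i < count($funcs); $i++) {
    foreach (array_keys($funcs) as $position => $name) {
        $positions[$name][] = $position;
    }
    $funcs = rotateLastToFront($funcs);
}

foreach ($positions as $name => $seen) {
    echo $name . ': ' . implode(', ', $seen) . "\n";
}
/* Output:
first: 0, 1, 2
second: 1, 2, 0
third: 2, 0, 1
*/
```

After three passes, each of the three functions has run once in every position. With a pass count that is not a multiple of three, some functions would spend more time in the early positions than others.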
The Report
The benchmark code is useless without a report of the results. Here’s the code that displays the results. It uses the number of iterations and the total time we stored in the last two members of each function’s array:
/* Report */
$output .= $lf . $lf . 'RESULTS' . $lf;
foreach($funcs as $func => $options) {
$output .= $lf . $func . ' -- executions: ' . $options['executions'] .
' -- ' . 'Avg. Time: ' . sprintf("%2.4f s", ($options['elapsedTime'] / $iterations)) ;
}
It loops through the function array and displays the name of the function, the number of iterations, and the average iteration time for each function. The average time is displayed to 4 decimal places. Mathematicians will be upset by this, since on many machines the timer’s resolution is coarser than that, but we don’t really care because it will still show the relative speeds closely enough for our purpose.
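The report lists the functions in whatever order the function array ends up in. If you would rather list them from slowest to fastest, one option is to sort the array by average time before the report loop. This is a sketch with made-up function names and times, assuming PHP 7 or later for the <=> operator:

```php
<?php
/* Made-up results in the same shape as the function array */
$funcs = array(
    'fastFunc' => array('elapsedTime' => 0.005, 'executions' => 10),
    'slowFunc' => array('elapsedTime' => 6.2, 'executions' => 10),
    'midFunc' => array('elapsedTime' => 0.4, 'executions' => 10),
);

/* Sort slowest first by average time, keeping the function-name keys */
uasort($funcs, function ($a, $b) {
    $avgA = $a['elapsedTime'] / $a['executions'];
    $avgB = $b['elapsedTime'] / $b['executions'];
    return $avgB <=> $avgA;
});

foreach ($funcs as $func => $options) {
    echo $func . ' -- Avg. Time: '
        . sprintf('%2.4f s', $options['elapsedTime'] / $options['executions'])
        . "\n";
}
```

uasort() is used rather than usort() because the keys are the function names and must survive the sort.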
The Full Code (Almost)
Here’s the full code of the benchmark script. The actual functions have been omitted to save space, so the code won’t run unless you add them all. You can find the functions, and an explanation of each one, in my previous article. Another reason I’ve left them out is that you can use the benchmark script to time other things by adding your own functions. You’ll have to modify the function array, but otherwise the code should run fine for any set of functions you want to compare for execution speed.
/* Adjust this path, the code instantiates the $modx object */
require 'c:/xampp/htdocs/addons/assets/mycomponents/instantiatemodx/instantiatemodx.php';
/* Set linefeed for browser and command line */
$lf = php_sapi_name() === 'cli'
? "\n"
: '<br>';
/* Make code work in MODX 2 and MODX 3 */
$prefix = $modx->getVersionData()['version'] >= 3
? 'MODX\Revolution\\'
: '';
/* Each iteration runs all functions once */
/* The $iterations value should be a multiple of
the number of functions */
$iterations = 10;
$output = '';
$groupId = 1; /* Resource Group ID */
$allDocsGroupId = 'Group1'; /* Resource Group Name */
/* Array of functions to run */
$funcs = array(
'useAllDocs' => array(
'groupId' => $allDocsGroupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useAllDocs_iterator' => array(
'groupId' => $allDocsGroupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useOneAlias' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useTwoAliases' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useNoAliases' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useGetCollectionModRgr' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useRgGetResources' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useInnerJoin' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'useGetCollectionGraph' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
'usePDO' => array(
'groupId' => $groupId,
'elapsedTime' => 0.0,
'executions' => 0,
),
);
$count = $iterations;
while ($count > 0) {
foreach ($funcs as $func => $options) {
/* Uncomment to start with a cleared cache;
will take a *long* time to run */
// $cm = $modx->getCacheManager();
// $cm->refresh();
/* Start timer */
$tStart = microtime(true);
/* Execute the function */
$output .= $func($modx, $funcs[$func]['groupId'], $prefix, $lf);
/* End timer */
$tEnd = microtime(true);
/* get the total execution time for whatever
happened between $tStart and $tEnd */
$totalTime = ($tEnd - $tStart);
/* Set values for results */
$funcs[$func]['executions']++;
$t = $funcs[$func]['elapsedTime'];
$t = $t + $totalTime;
$funcs[$func]['elapsedTime'] = $t;
}
if ($count) {
$funcs = rotate($funcs);
}
$count--;
}
/* Report */
$output .= $lf . $lf . 'RESULTS' . $lf;
foreach($funcs as $func => $options) {
$output .= $lf . $func . ' -- executions: ' . $options['executions'] .
' -- ' . 'Avg. Time: ' . sprintf("%2.4f s", ($options['elapsedTime'] / $iterations)) ;
}
displayOutput($output);
/* Move last function to top for next run */
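function rotate($funcs) {
/* array_splice() removes the last element from $funcs and returns it;
array_merge() puts it in front of what remains */
$new = array_merge(array_splice($funcs, -1), $funcs);
return $new;
}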
/* Functions go here -- here's one example */
function useRgGetResources($modx, $groupId, $prefix = '', $lf = '<br>') {
$output = '';
$resourceGroup = $modx->getObject($prefix . 'modResourceGroup', $groupId);
$resources = $resourceGroup->getResources();
foreach ($resources as $resource) {
// $output .= $lf . $resource->get('pagetitle');
}
if (!empty($output)) {
$output = $lf . $output;
}
return $output;
}
Results
If you’ve read this far, you’re probably curious about the relative speeds of the various functions. Here’s a typical result set for 10 iterations, in seconds, from slowest to fastest.
Note that the figures will vary slightly from run to run due to very small rounding errors and the resolution of the timer. The timer resolution on most Windows machines is about 15.6 milliseconds (the timer “ticks” roughly every 64th of a second), so the times can be off by that much in either direction each time we measure the time, and the errors can add up depending on the number of iterations. Occasionally, this is enough to change the order of the third through sixth results, but the differences are trivial (a few hundredths of a second).
RESULTS
Function | Executions | Average Time in seconds |
---|---|---|
useAllDocs | 10 | 0.6206 s |
useAllDocs_iterator | 10 | 0.5839 s |
useOneAlias | 10 | 0.0402 s |
useTwoAliases | 10 | 0.0386 s |
useNoAliases | 10 | 0.0383 s |
useGetCollectionModRgr | 10 | 0.0362 s |
useRgGetResources | 10 | 0.0212 s |
useInnerJoin | 10 | 0.0197 s |
useGetCollectionGraph | 10 | 0.0056 s |
usePDO | 10 | 0.0005 s |
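If you want a rough idea of the timer resolution on your own machine, one approach is to read microtime() in a tight loop until the value changes and look at the size of the step. This is only a sketch: observedTick() is a hypothetical helper, and the step it reports also includes the cost of the loop itself, so treat the result as an upper bound.

```php
<?php
/* Hypothetical helper: spins until microtime() reports a later value
   and returns the size of that step in seconds */
function observedTick() {
    $t0 = microtime(true);
    do {
        $t1 = microtime(true);
    } while ($t1 <= $t0);
    return $t1 - $t0;
}

/* Take the smallest step seen over several tries */
$smallest = observedTick();
for ($i = 0; $i < 20; $i++) {
    $smallest = min($smallest, observedTick());
}

echo sprintf('Smallest observed step: %.6f s', $smallest) . "\n";
```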
The first thing to notice is just how fast most of the methods are, though there’s a pretty dramatic difference between the first two and the rest. The third function is about 15 times faster than the first one.
The slow speed of the first two is understandable, since they’re processing every resource on the site, even the ones that aren’t in any Resource Group.
Between the first two, I was surprised to see that using getIterator() was consistently faster than using getCollection(). I expected the reverse to be true. I’ve generally tried to avoid using getIterator() unless memory use is an issue, but apparently I didn’t need to. Note that getIterator() is fairly fussy, depending on what you’re doing. I tried to integrate it into some of the other methods, and it often produced no results. This is because it doesn’t actually retrieve the Resources’ fields until they’re used for something.
Results three through six are close enough to each other that I consider them interchangeable. In fact, their order changes from run to run. This was also a surprise to me. I assumed that using useGetCollectionModRgr (querying the modResourceGroupResource object) would be the fastest of them and that the three “alias” ones would be slower and have consistent and significant differences.
The useInnerJoin method is slightly faster than the useRgGetResources method, which calls the modResourceGroup object’s getResources() method, but the difference is trivial. This makes sense because the getResources() method does the same inner join that the useInnerJoin method uses. With more accurate results, the useInnerJoin method would be consistently faster because it eliminates the function call to getResources(), though the difference would only be a few milliseconds.
I did not expect the useGetCollectionGraph method to be as fast as it is. I thought the need to parse the JSON string argument would slow it down, but apparently, that’s extremely fast, and the call makes only one query to the database, unlike most of the other methods.
A final surprise was the blinding speed of the last method (usePDO). I expected it to be the fastest method, but it’s 11 times faster than the next-fastest method, more than 40 times faster than calling the ResourceGroup’s getResources() method, and roughly 80 times faster than the “alias” methods.
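The ratios quoted in this section come straight from the averages in the results table. Here is the arithmetic as a short script; speedup() is a hypothetical helper. Keep in mind that the usePDO average is shown with only one significant digit, so the ratios involving it are approximate.

```php
<?php
/* Average times (seconds) copied from the results table */
$avg = array(
    'useAllDocs' => 0.6206,
    'useOneAlias' => 0.0402,
    'useRgGetResources' => 0.0212,
    'useGetCollectionGraph' => 0.0056,
    'usePDO' => 0.0005,
);

/* How many times faster $fast is than $slow, to one decimal place */
function speedup($slow, $fast) {
    return round($slow / $fast, 1);
}

echo speedup($avg['useAllDocs'], $avg['useOneAlias']) . "\n";       /* 15.4 */
echo speedup($avg['useGetCollectionGraph'], $avg['usePDO']) . "\n"; /* 11.2 */
echo speedup($avg['useRgGetResources'], $avg['usePDO']) . "\n";     /* 42.4 */
echo speedup($avg['useOneAlias'], $avg['usePDO']) . "\n";           /* 80.4 */
```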
Final Thoughts
For many use cases, you don’t really care about the speed, for example, when the code is used in a utility function that will only run once. Even if it will run more often, say, when used to display the Resource Groups on a front-end page, you may not care about how fast the method is if the page is cached and you don’t clear the cache very often.
When displaying the groups on an uncached page, however, you might choose to use one of the faster methods. Using PDO is by far the fastest (I use it extensively in SiteCheck, which is amazingly fast considering the thousands of complex tests it performs). The downside of using PDO is that it does not return Resource objects, just an associative array of their field values. If you need to modify and save the Resources it finds, you’d have to load each Resource by ID and then process and save them (though there’s a way to speed that up in my previous article). If you just need to display the field values, though, it’s great (assuming that you can create the raw MySQL query).
If you need the Resources, the next best option is getCollectionGraph(), which is much faster than the other methods that return Resource objects. If you’re not up to working out the call to getCollectionGraph(), the next best method would be calling the modResourceGroup object’s getResources() method, like this:
$group = $modx->getObject('modResourceGroup', $groupId);
$docs = $group->getResources();
foreach ($docs as $doc) {
/* Do something */
}
The method above is the easiest to write, and the code above (not counting what you do with each Resource) will take about 2 hundredths of a second.
The downside of this method is that it isn’t very adaptable to other uses. The MODX modResourceGroup object happens to have a built-in function to get its Resources. Most other MODX objects don’t have this. In those cases, though, you can usually use $object->getMany('Alias') to get what you want, for example: $resource->getMany('Children');.
See this page to find out what aliases are available for each object. If you know the object you want to check, you can tack # and the object name onto the end of the URL (e.g., #modResource) to get to that object’s section on the page.
If you care about speed, don’t know MySQL well (or need the objects themselves), and want a really fast method to get related objects, it’s worth learning how to use getCollectionGraph(). It’s not only very fast, it will let you get multiple related objects with one query. For example, you can use one getCollectionGraph() call to get the pages a user has created, edited, deleted, and published.
Wrapping Up
In the two articles (this one and my previous article), we’ve seen a lot of different ways to get the Resources in a particular Resource Group, and how they compare in execution speed. As I said in the first article, if you understand these methods, you can use them to get any selection of MODX objects in the database.
Bob Ray is the author of MODX: The Official Guide and dozens of MODX Extras including QuickEmail, NewsPublisher, SiteCheck, GoRevo, Personalize, EZfaq, MyComponent and many more. His website is Bob’s Guides. It not only includes a plethora of MODX tutorials, but there are some really great bread recipes there as well.