Scala (DSL)
The recommended way to interact with WARP is through the Scala DSL.
This API provides a richer feature set than the Java API, including the ability
to register custom MeasurementCollector
and Arbiter
instances, and add
new tags in the form of String metadata that will be persisted.
The DSL is implemented by using an immutable case class, ExecutionConfig
to hold
all configuration parameters, such as number of invocations, warmups, threadpool size, etc.
We give sane defaults to all parameters.
Calls to the various DSL methods invoke the compiler-generated copy
method to build up
new instances of ExecutionConfig
:
|
|
Note, however, that the DSL itself manages the measurement lifecycle. Thus, we do not recommend using @WarpTest
annotation
together with the DSL, as that would lead to doubly measured tests. The DSL can be especially useful in cases where users
already make heavy use of BeforeEach
/AfterEach
hooks. @WarpTest
annotation is implemented using JUnit before/after hooks, the order
of which cannot be controlled. Thus, it is possible that tests using @WarpTest
will have extra overhead from other hooks included in their
measurement metadata. The DSL is decoupled from JUnit and can be used with other JVM testing frameworks.
Finally, a call-by name block is passed to ExecutionConfig.measuring
:
|
|
Custom Arbiter
and MeasurementCollector
instances can be registered by calling the arbiters
and collectors
methods:
|
|
The arbiters
and collectors
methods both accept a call-by-name function that returns an Iterable
. We use an implicit
to lift a single instance into an Iterable
type.
The threshold defined by the should not exceed
syntax is implemented as a scalatest Matcher[Duration]
.
DSL Operations
The DSL provides a flexible way to describe experimental setups for conducting repeated trials, and supports the following operations:
-
testId(id: TestId)
sets a testId, the name under which results will be recorded in our database. (default"com.workday.warp.Undefined.undefined"
. Typically we use the fully qualified method name of the test being measured.) -
invocations(i: Int)
sets the number of measured trial invocations (default 1). -
warmups(w: Int)
sets the number of unmeasured warmups (default 0). -
threads(p: Int)
sets the thread pool size (default 1). -
distribution(d: DistributionLike)
sets aDistribution
to govern expected delay between scheduling invocations (default 0 delay). -
mode(m: ModeWord)
sets a “mode” for test measurement. This is an advanced feature that only applies to experiments with a threadpool of at least 2 threads. Thesingle
mode will treat the entire schedule of invocations as a single logical test. A single controller will be created to measure the entire schedule. Themulti
mode (which is the default) measures each invocation on an individual basis. -
only defaults
is a no-op included for more English-like readability. -
no arbiters
disables all existing defaultArbiter
. -
arbiters(a: => Iterable[ArbiterLike])
registers a collection of newArbiter
to act on the results of the measured test. -
only these arbiters(a: => Iterable[ArbiterLike])
composes the two above operations; disabling all existing arbiters and registering some new arbiters. -
no collectors
disables all existing default measurement collectors. -
collectors(c: => Iterable[AbstractMeasurementCollector])
registers a collection of new measurement collectors to measure the given test. -
only these collectors(c: => Iterable[AbstractMeasurementCollector])
composes the two above operations; disabling all existing collectors before registering some new collectors. -
tags(t: => Iterable[Tag])
registers a sequence of Tags that will be applied to the trials. -
measure(f: => T)
(and its synonym,measuring
) perform the measurement process wrapped aroundf
. We include themeasuring
synonym solely for more English-like readability when combined with a static threshold using our scalatestMatcher[Duration]
:This is naturally the most complex operation, as it involves creating a threadpool, layering collectors in the correct order, persisting results, etc.1 2
using only defaults measure someExperiment using only defaults measuring someExperiment should not exceed (5 seconds)
DSL Return Type
After we have constructed an ExecutionConfig
that fits the needs of our experiment, we call the measuring
(or measure
, a synonym) method
with the call-by-name function f
that we want to measure. The return type of measuring
is List[TrialResult[T]]
, where T
is the return type of f
. This gives us access to the returned values of each invocation of f
if they are needed for further
analysis.
Additionally, TrialResult
holds a TestExecutionRow
corresponding to the newly written row in our database, and
the measured response time of the trial.
For example, suppose we are interested in measuring the performance of List.fill[Int]
construction:
|
|
After measurement has been completed, we are able to access the return values of the function being measured.
The DSL provides a flexible way to customize the execution schedule of your experiment, including adding new measurement collectors and arbiters for defining failure criteria.