Workload usage

Note

This documentation is relevant to IBM Quantum® Platform Classic. If you need the newer version, go to the new IBM Quantum Platform documentation.

Usage is a measurement of the amount of time the QPU is locked for your workload, and it is calculated differently, depending on which execution mode you're using.

Session usage is the time from when the first job starts until the session goes inactive, is closed, or when its last job completes, whichever happens last. It includes both classical and quantum time (time spent by the QPU complex to process your job).
Batch usage is the sum of quantum time of all jobs in the batch.
Single job usage is the quantum time the job uses in processing.
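
The three definitions above can be sketched in plain Python. This is only an illustration of the arithmetic, not how the service computes usage internally; the JobRecord class, its fields, and the session_end argument are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical record of one job (not a qiskit-ibm-runtime type)."""
    start: float            # wall-clock second at which the job started running
    end: float              # wall-clock second at which the job completed
    quantum_seconds: float  # quantum time the job used


def job_usage(job):
    # Single job usage: the quantum time of that job.
    return job.quantum_seconds


def batch_usage(jobs):
    # Batch usage: the sum of the quantum time of all jobs in the batch.
    return sum(job.quantum_seconds for job in jobs)


def session_usage(jobs, session_end):
    # Session usage: from the first job start until whichever happens last,
    # the session going inactive or being closed (session_end) or the last
    # job completing.
    first_start = min(job.start for job in jobs)
    last_event = max(session_end, max(job.end for job in jobs))
    return last_event - first_start


jobs = [JobRecord(start=0.0, end=10.0, quantum_seconds=4.0),
        JobRecord(start=20.0, end=30.0, quantum_seconds=5.0)]
print(batch_usage(jobs))                      # sum of quantum time only
print(session_usage(jobs, session_end=35.0))  # wall-clock span of the session
```

Note that for the same two jobs, the session figure includes the idle gap between them, while the batch figure does not.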

The usage reported on the dashboard or by using the API is the time a QPU is locked for your workload. Failed or canceled jobs count toward your usage in certain circumstances. See the Usage for failed and canceled jobs section for details.

Your usage has different impacts, depending on which channel you're using:

For Qiskit Runtime on IBM Cloud® users, the usage determines how much the job costs. See Manage cost for details.
For IBM Quantum® Platform users, this translates to shares. The fair-share scheduler prioritizes instances with the most shares left. Thus, the higher your usage, the longer your next job stays in the queue.

Usage for failed and canceled jobs

When a job fails or is canceled, the reported usage is as follows:

Job or batch mode: The reported usage is the time the QPU was locked for executing your workload until the time it failed or was canceled. Therefore, if the failure or cancellation occurred before the lock, the reported usage is zero. Otherwise, the workload's reported usage is the amount of usage before the workload failed or was canceled. Thus, some failed jobs do not appear in your reported usage and others do.
Session mode: The reported usage is the wall-clock time from when the first job started executing in the session until the session terminates, regardless of the number of jobs that fail or are canceled.

Determine a workload's actual usage

After a workload has completed, there are several ways to view its actual usage:

Run batch.usage() or session.usage() in qiskit-ibm-runtime 0.30 or later. If you are using an older version of qiskit-ibm-runtime (>= 0.23 and < 0.30), the usage can still be found in session.details()["usage_time"] and batch.details()["usage_time"].

Call the GET usage REST API directly to see the total usage across all workloads for your account.

Use GET /sessions/{id} to see usage for a specific batch or session.
Use GET /jobs/{id} to see usage for a single job.

Estimate workload usage

After submitting a job to the IBM Quantum channel, you can see an estimate of how much quantum time the job will take to run by using job.usage_estimation. Quantum time is the duration, in seconds, that a QPU is committed to fulfilling a user request.

Alternatively, you can view this information on IBM Quantum Platform by opening the job details.

Note

This only applies to jobs that use primitives.

Example:

from qiskit import QuantumCircuit
from qiskit_ibm_runtime import QiskitRuntimeService, SamplerV2 as Sampler
from qiskit.transpiler import generate_preset_pass_manager
 
service = QiskitRuntimeService()
 
# Create a new circuit with two qubits (first argument) and two classical
# bits (second argument)
qc = QuantumCircuit(2, 2)
 
# Add a Hadamard gate to qubit 0
qc.h(0)
 
# Perform a controlled-X gate on qubit 1, controlled by qubit 0
qc.cx(0, 1)
 
# Measure qubit 0 to cbit 0, and qubit 1 to cbit 1
qc.measure(0, 0)
qc.measure(1, 1)
 
# Run on the least-busy device you have access to
backend = service.least_busy(simulator=False, operational=True)
 
# Generate ISA circuits
pm = generate_preset_pass_manager(backend=backend, optimization_level=1)
isa_circuit = pm.run(qc)
 
# Create a Sampler object
sampler = Sampler(backend)
 
# Submit the circuit to the sampler
job = sampler.run([isa_circuit])
 
print(job.usage_estimation)

Output:

{'quantum_seconds': 4.1058720028432445}

Estimate usage before submitting a job

While getting an accurate local estimation is complicated by the extra operations done for error suppression and mitigation, you can use this baseline formula to get an approximation of estimated usage:

<per sub-job overhead> + (rep_delay + <circuit length>) * <num executions>

<per sub-job overhead> is an overhead of approximately 2s per sub-job. This includes operations such as loading the payload into control electronics. Your primitive job may be divided into multiple sub-jobs if it is too large for the execution engine to process all at once.
rep_delay is a user-customizable option, and the default is given by backend.default_rep_delay, which is 250 microseconds on most IBM Quantum backends. Note that lowering rep_delay decreases the total QPU execution time, but at the expense of increased state preparation error rate; see the Dynamic repetition rate execution guide for more information.
<circuit length> is the total instruction length. Each instruction takes a different amount of time on the QPU, so the total length varies from circuit to circuit. A measurement, for example, can take 56 times longer than an x gate. You can use backend.target[<instruction>][<qubit>].duration to find the exact duration of each instruction. A typical circuit length is likely between 50 and 100 microseconds. If you use error suppression or mitigation techniques with the primitives, extra instructions might be inserted into your circuit, which increases the total circuit length.
<num executions> is the total number of circuits times the number of shots, where the circuits are those generated after PUB elements are broadcast. If you use error-mitigation techniques with the primitives, extra circuits can be run as part of the mitigation process, which increases the total number of executions. Advanced error-mitigation techniques such as PEA and PEC come with much higher overhead because they require running circuits for noise learning.
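
One rough way to approximate <circuit length> from per-instruction durations is to schedule each instruction as soon as all of its qubits are free and take the latest finish time. The sketch below does this with a plain dictionary of durations; the instruction names and duration values are hypothetical placeholders, and on a real backend each value would instead be read from backend.target[<instruction>][<qubit>].duration. It ignores any instructions inserted by error suppression or mitigation.

```python
def circuit_length(instructions, durations):
    """Estimate circuit length in seconds with as-soon-as-possible scheduling.

    instructions: list of (name, qubits) tuples in program order.
    durations: dict mapping (name, qubits) to a duration in seconds.
    """
    qubit_free_at = {}  # time at which each qubit finishes its last instruction
    for name, qubits in instructions:
        # An instruction starts once every qubit it touches is free.
        start = max((qubit_free_at.get(q, 0.0) for q in qubits), default=0.0)
        end = start + durations[(name, qubits)]
        for q in qubits:
            qubit_free_at[q] = end
    return max(qubit_free_at.values(), default=0.0)


# Hypothetical durations in seconds, for illustration only.
durations = {
    ("sx", (0,)): 50e-9,
    ("cx", (0, 1)): 500e-9,
    ("measure", (0,)): 1.5e-6,
    ("measure", (1,)): 1.5e-6,
}
instructions = [("sx", (0,)), ("cx", (0, 1)), ("measure", (0,)), ("measure", (1,))]
length = circuit_length(instructions, durations)
print(f"{length * 1e6:.2f} microseconds")
```

Because the two measurements act on different qubits, they overlap in this estimate rather than adding up.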

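If you aren't using any advanced error-mitigation techniques or a custom rep_delay, you can use 2 + 0.00035 * <num executions> as a quick formula. The 0.00035 factor corresponds to the default rep_delay of 250 microseconds plus a circuit length of about 100 microseconds, and the result is in seconds.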
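
The baseline formula translates directly into a short function. This is a rough sketch under stated assumptions: the function and parameter names are hypothetical, the default circuit_length of 100 microseconds is taken from the upper end of the typical range, and multiplying the overhead by num_sub_jobs is an assumption about how the overhead scales when a job is split.

```python
def estimate_usage_seconds(num_circuits, shots, rep_delay=250e-6,
                           circuit_length=100e-6, per_sub_job_overhead=2.0,
                           num_sub_jobs=1):
    """Approximate quantum time in seconds, ignoring error-mitigation overhead."""
    # Total executions: circuits (after PUB broadcasting) times shots.
    num_executions = num_circuits * shots
    # Assumption: each sub-job pays the overhead once.
    overhead = per_sub_job_overhead * num_sub_jobs
    return overhead + (rep_delay + circuit_length) * num_executions


def quick_estimate_seconds(num_executions):
    """Quick formula for default rep_delay and no advanced error mitigation."""
    return 2 + 0.00035 * num_executions


# Example: one circuit with 4096 shots.
print(round(estimate_usage_seconds(num_circuits=1, shots=4096), 4))
print(round(quick_estimate_seconds(4096), 4))
```

With the default arguments, the two functions agree, which is a useful check that the quick formula is just the baseline formula with typical values filled in.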

Next steps

Recommendations

Review these tips: Minimize job run time.
Set the Maximum execution time.
Learn how to transpile locally in the Transpile section.
Try the Submit pre-transpiled circuits tutorial.
