For more detailed benchmarks, please see quantum-benchmark.
Single Gate Benchmark
Benchmarks of a) Pauli-X gate; b) Hadamard gate; c) CNOT gate; d) Toffoli gate.
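To make concrete what a single-gate benchmark measures, here is a minimal NumPy sketch that applies a 2x2 gate to one qubit of a state vector and times it. It is an illustration only, not the code used by any of the benchmarked frameworks; the helper names `apply_single_qubit_gate` and `time_gate`, the qubit ordering (qubit 0 is the least significant bit), and the choice of register sizes are all assumptions.

```python
import time
import numpy as np

# Gate matrices for two of the benchmarked gates.
X = np.array([[0, 1], [1, 0]], dtype=np.complex128)
H = np.array([[1, 1], [1, -1]], dtype=np.complex128) / np.sqrt(2)


def apply_single_qubit_gate(state, gate, target, n_qubits):
    """Apply a 2x2 gate to qubit `target` of an n-qubit state vector.

    Assumes qubit 0 is the least significant bit of the basis-state index.
    """
    # View the vector as (high bits, target bit, low bits).
    psi = state.reshape(2 ** (n_qubits - target - 1), 2, 2 ** target)
    # Contract the gate with the target-qubit axis only.
    out = np.einsum("ab,ibj->iaj", gate, psi)
    return out.reshape(-1)


def time_gate(gate, n_qubits, target=0, repeats=5):
    """Return the best wall-clock time (seconds) over `repeats` applications."""
    state = np.zeros(2 ** n_qubits, dtype=np.complex128)
    state[0] = 1.0  # start in |00...0>
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        apply_single_qubit_gate(state, gate, target, n_qubits)
        best = min(best, time.perf_counter() - start)
    return best


# Small, hypothetical register sizes so the sketch runs quickly.
for n in (4, 8, 12):
    print(f"{n:2d} qubits: X {time_gate(X, n):.2e} s, H {time_gate(H, n):.2e} s")
```

Taking the minimum over several repeats reduces noise from the first call and from other processes; a dedicated benchmarking tool would do this more carefully.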
Parameterized Circuit Benchmark
b) Benchmarks of parameterized circuit. c) Benchmarks of parameterized circuit with batched registers (batch size = 1000).
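As a rough illustration of what "batched registers" means here, the sketch below applies one parameterized rotation gate to 1000 state vectors at once. It assumes the batch is stored as a `(batch, 2**n)` NumPy array and that qubit 0 is the least significant bit; both are assumptions made for this example, and the helper names `rx` and `apply_gate_batched` are hypothetical rather than taken from any benchmarked framework.

```python
import numpy as np


def rx(theta):
    """Rotation about the X axis: exp(-i * theta / 2 * X)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]], dtype=np.complex128)


def apply_gate_batched(states, gate, target, n_qubits):
    """Apply a 2x2 gate to qubit `target` of every register in the batch.

    `states` has shape (batch, 2**n_qubits); qubit 0 is assumed to be the
    least significant bit of the basis-state index.
    """
    batch = states.shape[0]
    # View each register as (high bits, target bit, low bits).
    psi = states.reshape(batch, 2 ** (n_qubits - target - 1), 2, 2 ** target)
    # One contraction updates the whole batch.
    out = np.einsum("ab,kibj->kiaj", gate, psi)
    return out.reshape(batch, -1)


batch_size, n_qubits = 1000, 4
states = np.zeros((batch_size, 2 ** n_qubits), dtype=np.complex128)
states[:, 0] = 1.0  # every register starts in |0000>

out = apply_gate_batched(states, rx(np.pi), target=0, n_qubits=n_qubits)
print(out.shape)
```

Updating the whole batch in one array operation, rather than looping over registers, is the kind of saving a batched benchmark is meant to expose.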
NOTE:
The Qiskit state vector simulator does not support rotation X/Z gates, so there are no Qiskit results for circuits that use them.
The PennyLane benchmark includes some overhead from error handling, since measurement is not included in this benchmark (#7).
CUDA performance may vary between machines (#6), although the difference is not large.