Maximum parallel throughput for training workloads

Best OCI Instances for ML Training

CPU-based machine learning training, such as gradient boosted trees (XGBoost, LightGBM), scikit-learn pipelines, and data preprocessing, scales close to linearly with parallel CPU throughput when the work splits into independent tasks. The relevant metric here is peak PassMark at maximum vCPU count, not value per vCPU. Bare Metal shapes with 128–192 OCPUs (256–384 vCPUs in the ranking below) let you run cross-validation folds and hyperparameter search in parallel on a single host, without the network overhead of a multi-node cluster. Flex shapes give you finer cost control: size up for training runs, then size down for inference serving within the same shape family.
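As a rough sketch of why core count matters for this kind of workload: a grid search with cross-validation is a set of independent fits, one per (candidate, fold) pair, and those fits can run in batches sized by the vCPU count. The helper below is hypothetical. It assumes every fit takes about the same time and uses a fixed number of threads, and the candidate and fold counts are illustrative, not taken from a benchmark.

```python
import math


def search_waves(n_candidates: int, n_folds: int, vcpus: int,
                 threads_per_fit: int = 1) -> int:
    """Number of sequential batches ("waves") needed to run every fit.

    Assumes each (candidate, fold) fit is independent, takes roughly the
    same time, and occupies `threads_per_fit` vCPUs while it runs.
    """
    total_fits = n_candidates * n_folds
    concurrent_fits = max(1, vcpus // threads_per_fit)
    return math.ceil(total_fits / concurrent_fits)


# Illustrative workload: 60 hyperparameter candidates x 5 folds = 300 fits,
# each fit configured to use 4 threads.
for vcpus in (16, 128, 384):
    waves = search_waves(60, 5, vcpus, threads_per_fit=4)
    print(f"{vcpus} vCPUs -> {waves} waves")
```

Under these assumptions, wall-clock time is roughly the number of waves times the duration of one fit, so the benefit of a larger shape flattens out once a single wave can hold every fit.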

What to look for

ML training jobs of this kind can parallelize across all available cores. Peak PassMark, the estimated total score at maximum vCPU count, approximates the ceiling on how fast your training loop can run. Bare Metal shapes avoid hypervisor CPU steal, which reduces run-to-run variability in training time.

→ Peak PassMark (at max vCPU config)
→ Max vCPUs
→ PassMark per vCPU
→ RAM (GB)
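
The relationship between the first three metrics can be sketched as a simple product. This assumes peak PassMark is a linear extrapolation of the per-vCPU score to the maximum vCPU count. The page does not state its formula, but the sample figures below, copied from the ranking, are consistent with that assumption to within rounding of the per-vCPU column. The function name is hypothetical.

```python
def estimated_peak_passmark(passmark_per_vcpu: float, max_vcpus: int) -> float:
    """Linear extrapolation: per-vCPU score times vCPU count (an assumption)."""
    return passmark_per_vcpu * max_vcpus


# (shape, PassMark per vCPU, max vCPUs, peak PassMark as listed in the ranking)
rows = [
    ("BM.Standard.E5.192", 649, 384, 249_274),
    ("BM.Standard3.64", 1_282, 128, 164_104),
    ("BM.HPC2.36", 1_436, 72, 103_400),
]

for shape, per_vcpu, vcpus, listed in rows:
    estimate = estimated_peak_passmark(per_vcpu, vcpus)
    # The per-vCPU column is rounded to an integer, so allow +/- 0.5 per vCPU.
    tolerance = 0.5 * vcpus
    within = abs(estimate - listed) <= tolerance
    print(f"{shape}: estimate {estimate:,.0f}, listed {listed:,}, "
          f"within rounding: {within}")
```

Because the estimate is linear, it does not account for memory bandwidth, NUMA effects, or serial sections of a training job, so it is best read as an upper bound.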

Ranked instances — 49 shapes

#	Arch	Shape	Peak PassMark	$/vCPU/hr	PassMark/vCPU	Max vCPUs
1	x86	BM.Standard.E5.192	249,274	$0.0150	649	384
2	x86	BM.DenseIO.E4.128	169,572	$0.0125	662	256
3	x86	BM.Standard.E4.128	169,572	$0.0125	662	256
4	x86	BM.DenseIO.E5.128	166,182	$0.0150	649	256
5	x86	BM.Standard3.64	164,104	N/A	1,282	128
6	x86	VM.Standard.E5.Flex	163,586	$0.0150	649	252
7	x86	VM.Standard.E4.Flex	151,025	$0.0125	662	228
8	x86	VM.Standard3.Flex	143,591	N/A	1,282	112
9	x86	BM.Standard.E3.128	139,003	$0.0125	543	256
10	x86	VM.Standard.E3.Flex	123,799	$0.0125	543	228
11	x86	BM.HPC2.36	103,400	$0.0375	1,436	72
12	ARM	BM.Standard.A1.160	90,152	$0.0100	563	160
13	x86	VM.DenseIO.E4.Flex	84,786	$0.0125	662	128
14	x86	BM.Optimized3.36	76,664	N/A	1,065	72
15	x86	BM.Standard.E2.64	76,221	$0.0150	595	128
16	x86	VM.DenseIO.E5.Flex	62,318	$0.0150	649	96
17	x86	BM.Standard.B1.44	59,059	$0.0319	671	88
18	x86	BM.Standard2.52	49,502	N/A	476	104
19	x86	BM.DenseIO2.52	49,502	N/A	476	104
20	ARM	VM.Standard.A1.Flex	45,076	$0.0100	563	80
21	x86	BM.Standard1.36	43,962	N/A	611	72
22	x86	VM.Optimized3.Flex	38,332	$0.0270	1,065	36
23	ARM	VM.Standard.A2.Flex	23,336	$0.0070	150	156
24	x86	VM.Standard2.24	22,847	N/A	476	48
25	x86	VM.DenseIO2.24	22,847	N/A	476	48
26	x86	VM.Standard.B1.16	21,476	$0.0319	671	32
27	x86	VM.Standard1.16	19,539	N/A	611	32
28	x86	VM.Standard2.16	15,231	N/A	476	32
29	x86	VM.DenseIO2.16	15,231	N/A	476	32
30	ARM	BM.Standard.A4.Ax.48	14,361	$0.0069	150	96
31	ARM	VM.Standard.A4.Ax.Flex	13,463	$0.0069	150	90
32	x86	VM.Standard.B1.8	10,738	$0.0319	671	16
33	x86	VM.Standard1.8	9,769	N/A	611	16
34	x86	VM.Standard.E2.8	9,528	$0.0150	595	16
35	x86	VM.Standard2.8	7,616	N/A	476	16
36	x86	VM.DenseIO2.8	7,616	N/A	476	16
37	x86	VM.Standard.B1.4	5,369	$0.0319	671	8
38	x86	VM.Standard1.4	4,885	N/A	611	8
39	x86	VM.Standard.E2.4	4,764	$0.0150	595	8
40	x86	VM.Standard2.4	3,808	N/A	476	8
41	x86	VM.Standard.B1.2	2,684	$0.0319	671	4
42	x86	VM.Standard1.2	2,442	N/A	611	4
43	x86	VM.Standard.E2.2	2,382	$0.0150	595	4
44	x86	VM.Standard2.2	1,904	N/A	476	4
45	x86	VM.Standard.B1.1	1,342	$0.0319	671	2
46	x86	VM.Standard1.1	1,221	N/A	611	2
47	x86	VM.Standard.E2.1	1,191	$0.0150	595	2
48	x86	VM.Standard.E2.1.Micro	1,191	$0.0150	595	2
49	x86	VM.Standard2.1	952	N/A	476	2
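
To compare shapes on cost as well as raw throughput, the listed columns can be combined. This is a minimal sketch, assuming the $/vCPU/hr figure scales linearly with vCPU count and covers compute only (memory and storage charges are not modeled). The helper name is hypothetical, and the rows are a small sample copied from the ranking above.

```python
def hourly_cost_at_max(price_per_vcpu_hr: float, max_vcpus: int) -> float:
    """Compute-only hourly cost at the maximum vCPU configuration (assumed linear)."""
    return price_per_vcpu_hr * max_vcpus


# (shape, peak PassMark, $/vCPU/hr, max vCPUs); shapes priced as N/A are omitted.
rows = [
    ("BM.Standard.E5.192", 249_274, 0.0150, 384),
    ("BM.Standard.E4.128", 169_572, 0.0125, 256),
    ("BM.HPC2.36", 103_400, 0.0375, 72),
    ("BM.Standard.A1.160", 90_152, 0.0100, 160),
]

report = []
for shape, peak_pm, price, vcpus in rows:
    cost = hourly_cost_at_max(price, vcpus)
    report.append((shape, round(cost, 2), round(peak_pm / cost)))

# Sort by PassMark points per dollar-hour, highest first.
report.sort(key=lambda r: r[2], reverse=True)
for shape, cost, pm_per_dollar in report:
    print(f"{shape}: ${cost:.2f}/hr at max config, "
          f"{pm_per_dollar:,} PassMark per $/hr")
```

Under these assumptions, a shape with a lower peak score can still deliver more benchmark points per dollar, so the ranking by peak PassMark identifies the fastest single host rather than the cheapest training run.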
