libtorch does not initialize OpenMP/MKL by default

Matrix multiplication appears to be slower through the C++ API, so I wrote the same code in C++ and Python and recorded the execution times. The code is as follows:


#include <torch/torch.h>
#include <chrono>
#include <iostream>

int main(){
	torch::Tensor tensor = torch::randn({2708, 1433});
	torch::Tensor weight = torch::randn({1433, 16});
	auto start = std::chrono::high_resolution_clock::now();
	// The operation being timed: a (2708 x 1433) by (1433 x 16) matrix multiplication.
	torch::Tensor result = tensor.mm(weight);
	auto end = std::chrono::high_resolution_clock::now();
	std::cout << "C++ Operation Time(s) " << std::chrono::duration<double>(end - start).count() << "s" << std::endl;
	return 0;
}


C++ Operation Time(s) 0.082496s


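import time

import torch
import torch.nn as nn
import torch.nn.functional as F

tensor = torch.randn(2708, 1433)
weight = torch.randn(1433, 16)
t0 = time.time()
# The operation being timed: the same matrix multiplication as in the C++ version.
result = tensor.mm(weight)
t1 = time.time()
print("Python Operation Time(s) {:.4f}".format(t1 - t0))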


Python Operation Time(s) 0.0114

Testing Environment:

Ubuntu 16.04
GCC version 5.4.0
Python version 3.7.3
PyTorch version 1.0.1

That is not a small difference. Why does this happen?


Answer from EsdeathYZH

I added a call to at::init_num_threads(), and after that the C++ time is similar to Python's.

C++ Operation Time(s) 0.00281327s

I think the missing thread initialization is probably the reason for the earlier, slower result.


