build: initial TBB support
Fomenko, Evarist M committed Sep 12, 2018
1 parent 038c4f5 commit 19588d1
Showing 12 changed files with 578 additions and 28 deletions.
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -61,6 +61,7 @@ include("cmake/utils.cmake")
include("cmake/options.cmake")
include("cmake/platform.cmake")
include("cmake/OpenMP.cmake")
include("cmake/TBB.cmake")
include("cmake/SDL.cmake")
include("cmake/MKL.cmake")
include("cmake/Doxygen.cmake")
84 changes: 73 additions & 11 deletions README.md
@@ -66,8 +66,9 @@ Please submit your questions, feature requests and bug reports on
**WARNING** The following functionality has preview status and might change
without prior notification in future releases:
* Convolutions with `s16` data type in source, weights or destination
* Convolutions and auxillary primitives for 3D spatial data
* Convolutions and auxiliary primitives for 3D spatial data
* RNN, LSTM and GRU primitives
* Intel Threading Building Blocks (Intel TBB\*) support

## How to Contribute
We welcome community contributions to Intel MKL-DNN. If you have an idea how to improve the library:
@@ -120,6 +121,7 @@ The implementation uses OpenMP\* 4.0 SIMD extensions. We recommend using
Intel(R) Compiler for the best performance results.

## Installation

Download [Intel MKL-DNN source code](https://github.com/intel/mkl-dnn/archive/master.zip)
or clone the repository to your system

@@ -147,18 +149,42 @@ libraries using provided script
```

or manually from [GitHub release section](https://github.com/intel/mkl-dnn/releases)
and unpack it to the `external` directory in the repository root.
and unpack it to the `external` directory in the repository root. Intel MKL-DNN
can also be built with full Intel MKL, if the latter is installed on the system.
You might need to set the `MKLROOT` environment variable to the path where full
Intel MKL is installed to help CMake locate the library.

You can choose to build Intel MKL-DNN without a binary dependency. The resulting
version will be fully functional; however, performance of certain convolution
shapes and sizes, and of inner product relying on the SGEMM function, may be suboptimal.

> **Note**
>
> Using Intel MKL small libraries currently works for Intel MKL-DNN built with
> OpenMP\* only. Building with Intel TBB requires either the full Intel MKL
> library or a standalone build.
Intel MKL-DNN uses a CMake-based build system

```
mkdir -p build && cd build && cmake .. && make
mkdir -p build && cd build && cmake $CMAKE_OPTIONS .. && make
```

Here `$CMAKE_OPTIONS` are options to control the build. Along with the standard
CMake options such as `CMAKE_INSTALL_PREFIX` or `CMAKE_BUILD_TYPE`,
the user can also pass Intel MKL-DNN specific ones:

|Option | Possible Values (defaults in bold) | Description
|:--- |:--- | :---
|MKLDNN_LIBRARY_TYPE | **SHARED**, STATIC | Defines resulting library type
|MKLDNN_THREADING | **OMP**, TBB | Defines threading type
|WITH_EXAMPLE | **ON**, OFF | Controls building examples
|WITH_TEST | **ON**, OFF | Controls building tests
|VTUNEROOT | *path* | Enables integration with Intel(R) VTune(TM) Amplifier

Please check [cmake/options.cmake](cmake/options.cmake) for more options
and details.
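
As an illustration of how these options combine, here is a minimal shell sketch that assembles `$CMAKE_OPTIONS` for a chosen threading runtime. The helper name `mkldnn_cmake_options` and the `/opt/intel/tbb` path are hypothetical; passing `-DTBBROOT=...` assumes the `TBBROOT` variable that `cmake/TBB.cmake` reads in this commit.

```shell
# Print cmake options for a given threading runtime (sketch, not part of the repo).
# $1: OMP or TBB, the documented values of MKLDNN_THREADING (defaults to OMP)
# $2: Intel TBB install prefix, only used when $1 is TBB (hypothetical path in the usage)
mkldnn_cmake_options() {
    threading="${1:-OMP}"
    tbbroot="${2:-}"
    opts="-DMKLDNN_THREADING=${threading}"
    if [ "${threading}" = "TBB" ] && [ -n "${tbbroot}" ]; then
        opts="${opts} -DTBBROOT=${tbbroot}"
    fi
    printf '%s\n' "${opts}"
}

# Show the configure command instead of running it.
echo "cmake $(mkldnn_cmake_options TBB /opt/intel/tbb) .."
# prints: cmake -DMKLDNN_THREADING=TBB -DTBBROOT=/opt/intel/tbb ..
```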

Intel MKL-DNN includes unit tests implemented using the googletest framework. To validate your build, run:

```
@@ -181,22 +207,32 @@ will place the header files, libraries and documentation in `/usr/local`. To ch
the installation path, use the option `-DCMAKE_INSTALL_PREFIX=<prefix>` when invoking CMake.

## Linking your application
Intel MKL-DNN include several header files providing C and C++ APIs for
the functionality and several dynamic libraries depending on how Intel MKL-DNN
was built. Intel OpenMP runtime and Intel MKL small libraries are not installed
for standalone Intel MKL-DNN build.

Intel MKL-DNN includes several header files providing C and C++ APIs for
the functionality and one or several dynamic libraries depending on how
Intel MKL-DNN was built. The minimal installation:

|File | Description
|:--- |:---
|include/mkldnn.h | C header
|include/mkldnn.hpp | C++ header
|include/mkldnn_types.h | auxiliary C header
|lib/libmkldnn.so | Intel MKL-DNN dynamic library
|lib/libmkldnn.a | Intel MKL-DNN static library (if built with `MKLDNN_LIBRARY_TYPE=STATIC`)


#### Intel MKL-DNN with OpenMP

If Intel MKL-DNN is built with Intel MKL small libraries, the following extra
libraries are installed:

|File | Description
|:--- |:---
|lib/libiomp5.so | Intel OpenMP* runtime library
|lib/libmklml_gnu.so | Intel MKL small library for GNU* OpenMP runtime
|lib/libmklml_intel.so | Intel MKL small library for Intel(R) OpenMP runtime
|include/mkldnn.h | C header
|include/mkldnn.hpp | C++ header
|include/mkldnn_types.h | auxillary C header

Intel MKL-DNN uses OpenMP* for parallelism and requires an OpenMP runtime
Intel MKL-DNN uses OpenMP\* for parallelism and requires an OpenMP runtime
library to work. As different OpenMP runtimes may not be binary compatible
it's important to ensure that only one OpenMP runtime is used throughout the
application. Having more than one OpenMP runtime initialized may lead to
@@ -227,6 +263,32 @@ same compiler correct OpenMP runtime will be used.
icpc -std=c++11 -qopenmp -I${MKLDNNROOT}/include -L${MKLDNNROOT}/lib simple_net.cpp -lmkldnn
```

#### Intel MKL-DNN with Intel TBB

Intel MKL-DNN built with Intel TBB doesn't require special handling:
```
g++ -std=c++11 -I${MKLDNNROOT}/include -L${MKLDNNROOT}/lib simple_net.cpp -lmkldnn -ltbb
```

Please note that Intel MKL-DNN has limited optimizations done for Intel TBB
and has some functional limitations if built with Intel TBB.

Functional limitations:
* Convolution with Winograd algorithm is not supported

Performance limitations (mostly less parallelism than in case of OpenMP):
* Batch normalization
* Convolution backward by weights
* mkldnn_sgemm

> **WARNING**
>
> If the library is built with full Intel MKL, the user is expected to set the
> `MKL_THREADING_LAYER` environment variable to either `tbb` or `sequential`
> to force Intel MKL to use Intel TBB for parallelization or to be sequential,
> respectively. Without this setting, Intel MKL (RT library) would by default
> try to use OpenMP for parallelization.
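
A minimal sketch of applying this setting per process rather than globally. The wrapper name `run_with_mkl_layer` is hypothetical; only the `MKL_THREADING_LAYER` variable and its `tbb` and `sequential` values come from the warning above.

```shell
# Run a command with Intel MKL pinned to the given threading layer (sketch).
# $1: tbb or sequential; remaining arguments: the command to run
run_with_mkl_layer() {
    layer="$1"
    shift
    case "${layer}" in
        tbb|sequential) ;;
        *) echo "unsupported layer: ${layer}" >&2; return 1 ;;
    esac
    # env scopes the variable to the child process only.
    env MKL_THREADING_LAYER="${layer}" "$@"
}

# Usage: a stand-in command that reports what the application would see.
run_with_mkl_layer tbb sh -c 'echo "MKL_THREADING_LAYER=${MKL_THREADING_LAYER}"'
# prints: MKL_THREADING_LAYER=tbb
```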
--------

[Legal Information](doc/legal_information.md)
12 changes: 10 additions & 2 deletions cmake/MKL.cmake
@@ -162,8 +162,16 @@ function(detect_mkl LIBNAME)
endif()
endfunction()

detect_mkl("mklml_intel")
detect_mkl("mklml")
# Both mklml_intel and mklml_gnu are OpenMP based.
# So in case of TBB link with Intel MKL (RT library) and either set:
# MKL_THREADING_LAYER=tbb
# to make Intel MKL use TBB threading as well, or
# MKL_THREADING_LAYER=sequential
# to make Intel MKL be sequential.
if(NOT MKLDNN_THREADING STREQUAL "TBB")
detect_mkl("mklml_intel")
detect_mkl("mklml")
endif()
detect_mkl("mkl_rt")

if(HAVE_MKL)
6 changes: 6 additions & 0 deletions cmake/OpenMP.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -22,11 +22,16 @@ if(OpenMP_cmake_included)
endif()
set(OpenMP_cmake_included true)

if(NOT MKLDNN_THREADING STREQUAL "OMP")
return()
endif()

include("cmake/MKL.cmake")

if(WIN32 AND ${CMAKE_CXX_COMPILER_ID} STREQUAL MSVC)
add_definitions(/Qpar)
add_definitions(/openmp)
add_definitions(-DMKLDNN_THR=MKLDNN_THR_OMP)
else()
find_package(OpenMP)
#newer version for findOpenMP (>= v. 3.9)
@@ -44,6 +49,7 @@ else()
endif()
if(OpenMP_CXX_FOUND)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${OpenMP_CXX_FLAGS}")
add_definitions(-DMKLDNN_THR=MKLDNN_THR_OMP)
endif()
endif()

48 changes: 48 additions & 0 deletions cmake/TBB.cmake
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
#===============================================================================
# Copyright 2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#===============================================================================

# Manage TBB-related compiler flags
#===============================================================================

if(TBB_cmake_included)
return()
endif()
set(TBB_cmake_included true)

if(NOT MKLDNN_THREADING STREQUAL "TBB")
return()
endif()

if (NOT TBBROOT)
if(DEFINED ENV{TBBROOT})
set (TBBROOT $ENV{TBBROOT})
else()
message("FATAL_ERROR" "TBBROOT is unset")
endif()
endif()

if(WIN32)
find_package(TBB REQUIRED tbb HINTS cmake/win)
elseif(APPLE)
find_package(TBB REQUIRED tbb HINTS cmake/mac)
elseif(UNIX)
find_package(TBB REQUIRED tbb HINTS cmake/lnx)
endif()

add_definitions(-DMKLDNN_THR=MKLDNN_THR_TBB)
list(APPEND mkldnn_LINKER_LIBS ${TBB_IMPORTED_TARGETS})

message(STATUS "Intel(R) TBB: ${TBBROOT}")
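
The lookup order implemented above (an explicit `TBBROOT` CMake variable first, then the `TBBROOT` environment variable, otherwise an error) can be mirrored in shell as a sketch. The function name `resolve_tbbroot` and the paths are hypothetical and not part of the build system; unlike CMake's `DEFINED ENV{...}` check, this sketch treats an empty environment variable as unset.

```shell
# Resolve the Intel TBB root the way cmake/TBB.cmake orders its checks (sketch).
# $1: explicit value, standing in for -DTBBROOT=... on the cmake command line
resolve_tbbroot() {
    explicit="${1:-}"
    if [ -n "${explicit}" ]; then
        printf '%s\n' "${explicit}"      # explicit value wins
    elif [ -n "${TBBROOT:-}" ]; then
        printf '%s\n' "${TBBROOT}"       # fall back to the environment
    else
        echo "TBBROOT is unset" >&2      # same message as the CMake script
        return 1
    fi
}

# Usage: the subshell keeps the assignment from leaking into the caller.
( TBBROOT=/hypothetical/env/tbb; resolve_tbbroot /hypothetical/cli/tbb )
# prints: /hypothetical/cli/tbb
```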
117 changes: 117 additions & 0 deletions cmake/lnx/TBBConfig.cmake
Original file line number Diff line number Diff line change
@@ -0,0 +1,117 @@
# Copyright (c) 2017-2018 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
#
#
#

# TBB_FOUND should not be set explicitly. It is defined automatically by CMake.
# Handling of TBB_VERSION is in TBBConfigVersion.cmake.

if (NOT TBB_FIND_COMPONENTS)
set(TBB_FIND_COMPONENTS "tbb;tbbmalloc;tbbmalloc_proxy")
foreach (_tbb_component ${TBB_FIND_COMPONENTS})
set(TBB_FIND_REQUIRED_${_tbb_component} 1)
endforeach()
endif()

# Add components with internal dependencies: tbbmalloc_proxy -> tbbmalloc
list(FIND TBB_FIND_COMPONENTS tbbmalloc_proxy _tbbmalloc_proxy_ix)
if (NOT _tbbmalloc_proxy_ix EQUAL -1)
list(FIND TBB_FIND_COMPONENTS tbbmalloc _tbbmalloc_ix)
if (_tbbmalloc_ix EQUAL -1)
list(APPEND TBB_FIND_COMPONENTS tbbmalloc)
set(TBB_FIND_REQUIRED_tbbmalloc ${TBB_FIND_REQUIRED_tbbmalloc_proxy})
endif()
endif()

set(TBB_INTERFACE_VERSION 10005)

# Intel MKL-DNN changes: use TBBROOT to locate Intel TBB
# get_filename_component(_tbb_root "${CMAKE_CURRENT_LIST_FILE}" PATH)
# get_filename_component(_tbb_root "${_tbb_root}" PATH)
set(_tbb_root ${TBBROOT})

set(_tbb_x32_subdir ia32)
set(_tbb_x64_subdir intel64)

if (CMAKE_SIZEOF_VOID_P EQUAL 8)
set(_tbb_arch_subdir ${_tbb_x64_subdir})
else()
set(_tbb_arch_subdir ${_tbb_x32_subdir})
endif()

if (CMAKE_CXX_COMPILER_LOADED)
set(_tbb_compiler_id ${CMAKE_CXX_COMPILER_ID})
set(_tbb_compiler_ver ${CMAKE_CXX_COMPILER_VERSION})
elseif (CMAKE_C_COMPILER_LOADED)
set(_tbb_compiler_id ${CMAKE_C_COMPILER_ID})
set(_tbb_compiler_ver ${CMAKE_C_COMPILER_VERSION})
endif()

# For non-GCC compilers try to find version of system GCC to choose right compiler subdirectory.
if (NOT _tbb_compiler_id STREQUAL "GNU")
execute_process(COMMAND gcc --version OUTPUT_VARIABLE _tbb_gcc_ver_output ERROR_QUIET)
string(REGEX REPLACE ".*gcc.*([0-9]+\\.[0-9]+)\\.[0-9]+.*" "\\1" _tbb_compiler_ver "${_tbb_gcc_ver_output}")
if (NOT _tbb_compiler_ver)
message(FATAL_ERROR "This Intel TBB package is intended to be used only environment with available 'gcc'")
endif()
unset(_tbb_gcc_ver_output)
endif()

set(_tbb_compiler_subdir gcc4.1)
foreach (_tbb_gcc_version 4.1 4.4 4.7)
if (NOT _tbb_compiler_ver VERSION_LESS ${_tbb_gcc_version})
set(_tbb_compiler_subdir gcc${_tbb_gcc_version})
endif()
endforeach()

unset(_tbb_compiler_id)
unset(_tbb_compiler_ver)

get_filename_component(_tbb_lib_path "${_tbb_root}/lib/${_tbb_arch_subdir}/${_tbb_compiler_subdir}" ABSOLUTE)

foreach (_tbb_component ${TBB_FIND_COMPONENTS})
set(_tbb_release_lib "${_tbb_lib_path}/lib${_tbb_component}.so.2")
set(_tbb_debug_lib "${_tbb_lib_path}/lib${_tbb_component}_debug.so.2")

if (EXISTS "${_tbb_release_lib}" AND EXISTS "${_tbb_debug_lib}")
add_library(TBB::${_tbb_component} SHARED IMPORTED)
set_target_properties(TBB::${_tbb_component} PROPERTIES
IMPORTED_CONFIGURATIONS "RELEASE;DEBUG"
IMPORTED_LOCATION_RELEASE "${_tbb_release_lib}"
IMPORTED_LOCATION_DEBUG "${_tbb_debug_lib}"
INTERFACE_INCLUDE_DIRECTORIES "${_tbb_root}/include")

# Add internal dependencies for imported targets: TBB::tbbmalloc_proxy -> TBB::tbbmalloc
if (_tbb_component STREQUAL tbbmalloc_proxy)
set_target_properties(TBB::tbbmalloc_proxy PROPERTIES INTERFACE_LINK_LIBRARIES TBB::tbbmalloc)
endif()

list(APPEND TBB_IMPORTED_TARGETS TBB::${_tbb_component})
set(TBB_${_tbb_component}_FOUND 1)
elseif (TBB_FIND_REQUIRED AND TBB_FIND_REQUIRED_${_tbb_component})
message(FATAL_ERROR "Missed required Intel TBB component: ${_tbb_component}")
endif()
endforeach()

unset(_tbb_x32_subdir)
unset(_tbb_x64_subdir)
unset(_tbb_arch_subdir)
unset(_tbb_compiler_subdir)
unset(_tbbmalloc_proxy_ix)
unset(_tbbmalloc_ix)
unset(_tbb_lib_path)
unset(_tbb_release_lib)
unset(_tbb_debug_lib)
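
To make the directory selection above concrete, the following shell sketch reproduces the `foreach` over `4.1 4.4 4.7`: the last listed version that does not exceed the compiler version wins, with `gcc4.1` as the fallback. The function name `tbb_compiler_subdir` is hypothetical, and the comparison assumes a `sort` that supports `-V` (as in GNU coreutils), which only approximates CMake's `VERSION_LESS`.

```shell
# Pick the TBB library subdirectory for a given GCC version (sketch).
# $1: compiler version such as 4.5 or 7.3.0
tbb_compiler_subdir() {
    ver="$1"
    subdir="gcc4.1"
    for candidate in 4.1 4.4 4.7; do
        # ver >= candidate exactly when candidate sorts first in version order.
        lowest="$(printf '%s\n%s\n' "${candidate}" "${ver}" | sort -V | head -n 1)"
        if [ "${lowest}" = "${candidate}" ]; then
            subdir="gcc${candidate}"
        fi
    done
    printf '%s\n' "${subdir}"
}

tbb_compiler_subdir 5.4
# prints: gcc4.7
```

With that result on a 64-bit host, the release library probed by the script would be `${TBBROOT}/lib/intel64/gcc4.7/libtbb.so.2`.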