ESP-NN

The library contains optimised NN (Neural Network) functions for various Espressif chips.

Supported platforms:
- TensorFlow Lite Micro (TFLite Micro), through the esp-tflite-micro repository

Supported ESP chips include:
- ESP32-S3 (Assembly versions optimised to benefit from vector instructions of ESP32-S3)
- ESP32-P4 (Optimised using PIE/QACC SIMD instructions)
- ESP32 (Generic optimisations)
- ESP32-C3 (Generic optimisations)

Performance

Kernelwise performance for s8 versions:

Kernelwise performance on ESP32-P4 chip

Numbers are the ticks each kernel takes to execute
Chip config: 360MHz, SPI-RAM: HEX 200MHz, L2-Cache: 128KB

Function	ANSI C	Optimized	Opt Ratio	Data info	Memory
elementwise_add	187971	173104	--	size = 1615	External
elementwise_mul	79898	71245	--	size = 1615	External
convolution	4005512	572459	7.00	input(10,10), filter(64x1x1x64), pad(0,0), stride(1,1)	External
convolution	249389	98319	2.54	input(8,8), filter(16x1x1x16), pad(0,0), stride(1,1)	External
convolution	816975	533318	1.53	input(10,10), filter(64x3x3x3), pad(0,0), stride(1,1)	External
depthwise conv	962834	482389	2.00	input(16,16), pad(0,0), stride(1,1), filter: 1x3x3x16	External
depthwise conv	1365066	703989	1.94	input(12,12), pad(1,1), stride(1,1), filter: 8x5x5x4	External
max pool	601843	592189	--	input(16,16), filter(1x3x3x16)	Internal
avg pool	392947	380527	--	input(16,16), filter(1x3x3x16)	Internal
fully connected	7692	7616	--	len: 271, ch = 3	Internal
prelu (relu6)	22487	18963	--	size = 1615	Internal
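
The Opt Ratio column is consistent with dividing the ANSI C tick count by the optimized tick count. A minimal sketch of that calculation follows; the function name `opt_ratio` is hypothetical and not part of ESP-NN.

```c
#include <assert.h>
#include <stdint.h>

/* Speed-up of the optimized kernel over the ANSI C reference,
 * assuming "Opt Ratio" = ANSI C ticks / optimized ticks.
 * opt_ratio is an illustrative name, not an ESP-NN function. */
double opt_ratio(uint32_t ansi_c_ticks, uint32_t optimized_ticks)
{
    assert(optimized_ticks > 0); /* guard against division by zero */
    return (double)ansi_c_ticks / (double)optimized_ticks;
}
```

For the first convolution row above, `opt_ratio(4005512, 572459)` comes to roughly 7.00, which matches the table.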

Kernelwise performance on ESP32-S3 chip

Numbers are the ticks each kernel takes to execute
Chip config: 240MHz, SPI: QPI 80MHz, Data cache: 64KB

Function	ANSI C	Optimized	Opt Ratio	Data info	Memory
elementwise_add	281337	74440	3.78	size = 1615	External
elementwise_mul	122703	35002	3.51	size = 1615	External
convolution	4712500	331008	14.24	input(10,10), filter(64x1x1x64), pad(0,0), stride(1,1)	External
convolution	312754	39022	8.01	input(8,8), filter(16x1x1x16), pad(0,0), stride(1,1)	External
convolution	2193289	394842	5.55	input(8,8), filter(64x3x3x3), pad(0,0), stride(1,1)	External
depthwise conv	1159831	184176	6.30	input(18,18), pad(0,0), stride(1,1), filter: 1x3x3x16	External
depthwise conv	1671363	372435	4.49	input(12,12), pad(1,1), stride(1,1), filter: 8x5x5x4	External
max pool	376294	48069	7.83	input(16,16), filter(1x3x3x16)	Internal
avg pool	427293	118052	3.62	input(16,16), filter(1x3x3x16)	Internal
fully connected	8443	1078	7.83	len: 271, ch = 3	Internal
softmax	15209	11107	1.37	h: 8, w: 32	Internal
prelu (relu6)	1125	98	11.48	size = 1615	Internal
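
To relate the tick counts to wall-clock time, the sketch below converts ticks to microseconds. It assumes one tick is one CPU cycle at the listed chip frequency, which this README does not state explicitly; `ticks_to_us` is a hypothetical helper, not part of ESP-NN.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a tick count to microseconds.
 * ASSUMPTION: one tick equals one CPU cycle at cpu_mhz (cycles per
 * microsecond). ticks_to_us is an illustrative name only. */
double ticks_to_us(uint32_t ticks, uint32_t cpu_mhz)
{
    assert(cpu_mhz > 0); /* guard against division by zero */
    return (double)ticks / (double)cpu_mhz;
}
```

Under that assumption, the optimized 64x1x1x64 convolution on ESP32-S3 (331008 ticks at 240MHz) works out to roughly 1379 microseconds.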

Model-level performance:

Person Detection (Visual Wake Words, INT8 quantized — from esp-tflite-micro)
- Numbers are time (ms) for invoke() call, using internal memory

Chip	CPU Freq	without ESP-NN	with ESP-NN
ESP32-P4	360MHz	1395ms	73ms
ESP32-S3	240MHz	2300ms	54ms
ESP32	240MHz	4084ms	380ms
ESP32-C3	160MHz	3355ms	426ms

MobileNetV3 Small (INT8 quantized, 224x224x3, 1000 classes)

Chip	CPU Freq	without ESP-NN	with ESP-NN
ESP32-S3	240MHz	26000ms	1434ms
ESP32-P4	360MHz	11600ms	1305ms

Note:

- The above is the time taken to execute the invoke() call.
- SPIRAM is used for the TensorArena.
- Person detection on ESP32-S3 with internal RAM: 47ms
- ESP32-P4 optimisation is a work in progress.

The "without ESP-NN" case is when ESP-NN is completely disabled by removing the flag below from CMakeLists.txt:

  # enable ESP-NN optimizations by Espressif
  target_compile_options(${COMPONENT_LIB} PRIVATE -DESP_NN)
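
As a rough illustration of what that flag does, the sketch below shows how a source file could branch on the `ESP_NN` macro that `-DESP_NN` defines. The function name `kernels_use_esp_nn` is hypothetical, and the actual guards in the calling code may be structured differently.

```c
#include <stdbool.h>

/* Reports which code path this file was compiled for.
 * ESP_NN is the macro defined by -DESP_NN in the CMake snippet above;
 * kernels_use_esp_nn is an illustrative name, not a real API. */
bool kernels_use_esp_nn(void)
{
#ifdef ESP_NN
    return true;   /* built with -DESP_NN: ESP-NN kernels would be called */
#else
    return false;  /* flag removed: reference kernels would be called */
#endif
}
```

Because the choice is made by the preprocessor, removing the flag excludes the ESP-NN path at build time rather than at run time.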

Configuration

To configure, run idf.py menuconfig and, under ESP-NN, select NN_OPTIMIZATIONS.
Two options are presented:
- Optimized versions
- ANSI C

Optimized versions is the default selection. For ESP32-S3 and ESP32-P4, assembly versions are selected automatically, whereas for other chips (viz., ESP32, ESP32-C3), generic optimisations are selected.
For debugging purposes, you may want to select the ANSI C reference versions.
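
The selection rules above can be sketched as a compile-time decision. This is only an illustration: `CONFIG_NN_ANSI_C` is an assumed name for the symbol generated by the ANSI C menu option, and `nn_variant_t` and `nn_kernel_variant` are hypothetical. The `CONFIG_IDF_TARGET_*` macros are the target macros ESP-IDF generates in sdkconfig.

```c
/* Kernel variants described in the Configuration section. */
typedef enum {
    NN_VARIANT_ANSI_C,    /* ANSI C reference versions */
    NN_VARIANT_ASSEMBLY,  /* assembly versions (ESP32-S3, ESP32-P4) */
    NN_VARIANT_GENERIC    /* generic optimisations (e.g. ESP32, ESP32-C3) */
} nn_variant_t;

/* Returns the variant that would be picked for this build.
 * CONFIG_NN_ANSI_C is an ASSUMED Kconfig symbol name; the function
 * itself is illustrative and not part of ESP-NN. */
nn_variant_t nn_kernel_variant(void)
{
#if defined(CONFIG_NN_ANSI_C)
    return NN_VARIANT_ANSI_C;
#elif defined(CONFIG_IDF_TARGET_ESP32S3) || defined(CONFIG_IDF_TARGET_ESP32P4)
    return NN_VARIANT_ASSEMBLY;
#else
    return NN_VARIANT_GENERIC;
#endif
}
```

Compiled on a host with none of these macros defined, the function falls through to the generic branch.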

Contributing

If you encounter an issue with ESP-NN, or wish to submit a feature request, please use the Issues section on GitHub.

For general questions related to this library, please use the esp32.com forum.

Please check CONTRIBUTING.md for further information if you'd like to contribute to ESP-NN.
