Cryptography is the science of securing communication and data through mathematical techniques, ensuring confidentiality, integrity, and authenticity. It is widely used in web services, load-balancing proxies, databases, and similar infrastructure.
Cryptography
Cryptography can be divided into three categories:
Symmetric key cryptography
Also known as secret key encryption, this method uses the same key for encryption and decryption. Popular symmetric key algorithms include Advanced Encryption Standard (AES), Data Encryption Standard (DES), ChaCha20, SM1, and SM4.
Asymmetric key cryptography
This method uses two keys or a key pair: a public key and a corresponding private key. The public key is publicly distributed, and can be used by anyone to encrypt messages, but only the recipient, who holds the corresponding private key, can decrypt those messages.
Popular algorithms used in asymmetric key cryptography are Rivest–Shamir–Adleman (RSA), Digital Signature Algorithm (DSA), Elliptic Curve Cryptography (ECC), Diffie-Hellman (DH), and SM2.
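To make the public/private key split concrete, here is a minimal sketch of textbook RSA using only Python's built-in integer arithmetic. The tiny primes and the helper names encrypt and decrypt are illustrative assumptions; production RSA uses moduli of 2048 bits or more together with a padding scheme, and should always come from a vetted library such as the ones discussed later.

```python
# Toy RSA with tiny textbook primes. Illustration only: real keys use
# 2048-bit or larger moduli plus padding, via a vetted crypto library.
p, q = 61, 53
n = p * q                  # modulus shared by both keys (3233)
phi = (p - 1) * (q - 1)    # Euler's totient of n (3120)
e = 17                     # public exponent, chosen coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e (Python 3.8+)


def encrypt(message: int) -> int:
    """Anyone who knows the public key (n, e) can encrypt."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private exponent d can decrypt."""
    return pow(ciphertext, d, n)


ciphertext = encrypt(65)
assert decrypt(ciphertext) == 65   # round trip recovers the message
```

The message must be an integer smaller than n, which is one reason real systems use RSA only to protect short values such as session keys or digests.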
Hash functions
Hash functions transform input data of any size into a fixed-size digest, or ‘fingerprint’. Unlike encryption, hashing is one-way: the digest cannot be decrypted back into the input. Hash functions are commonly used in message authentication, data integrity checks, and digital signatures. Examples include MD5, SHA-1, SHA-2, SHA-3, and SM3.
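The fixed-size property is easy to see with Python's standard hashlib module. The sketch below hashes a 3-byte input and a 300,000-byte input; the helper name fingerprint is an illustrative assumption.

```python
import hashlib


def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of data under the named hash algorithm."""
    return hashlib.new(algorithm, data).hexdigest()


small = fingerprint(b"abc")
large = fingerprint(b"abc" * 100_000)

# Both digests are 256 bits (64 hex characters), whatever the input size.
print(len(small), len(large))
# Any change to the input produces a different digest.
print(small == large)
```

Other algorithms from the SHA-2 and SHA-3 families can be selected by name, for example fingerprint(data, "sha384") or fingerprint(data, "sha3_256").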
Cipher Suites
Cipher suites are sets of algorithms that enable secure network connections over Transport Layer Security (TLS), the successor to Secure Sockets Layer (SSL). Each suite names the algorithms required to secure communication between a client and a server.
During connection initiation, the web server and the client perform a TLS (SSL) handshake. In this handshake the two parties agree on a mutually supported cipher suite, which is then used to establish the secure HTTPS connection.
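As a rough illustration of what a client can offer during the handshake, Python's standard ssl module can list the cipher suites enabled in a default client context. The helper name enabled_cipher_suites is an assumption, and the output depends on the crypto library and version that the local Python build is linked against.

```python
import ssl
from typing import List


def enabled_cipher_suites() -> List[str]:
    """List the cipher suites a default client context would offer."""
    context = ssl.create_default_context()
    # get_ciphers() returns one dict per suite, with keys such as
    # "name" and "protocol".
    return [f"{c['protocol']} {c['name']}" for c in context.get_ciphers()]


for suite in enabled_cipher_suites():
    print(suite)
```

The server picks one suite from the intersection of this list and its own configuration, so a suite that either side has disabled can never be negotiated.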
Cipher suites contain four components: a key exchange algorithm, an authentication algorithm, a bulk encryption algorithm, and a message authentication algorithm. For example, the “ECDHE-ECDSA-AES256-GCM-SHA384” cipher suite indicates ECDHE for key exchange, ECDSA for authentication, AES256-GCM for encryption, and SHA384 for message integrity.
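The naming convention in the example can be sketched as a small parser. This is a hypothetical helper, not part of any library, and it assumes the name spells out all four components as the example above does; suite names that omit a component (for instance TLS 1.3 names) are not handled.

```python
from typing import NamedTuple


class CipherSuite(NamedTuple):
    key_exchange: str
    authentication: str
    encryption: str
    mac: str


def parse_cipher_suite(name: str) -> CipherSuite:
    """Split a four-component suite name into its parts.

    Assumes the layout KEYEXCHANGE-AUTH-ENCRYPTION...-MAC, where the
    encryption part may itself contain hyphens (e.g. AES256-GCM).
    """
    parts = name.split("-")
    if len(parts) < 4:
        raise ValueError(f"expected at least four components in {name!r}")
    return CipherSuite(
        key_exchange=parts[0],
        authentication=parts[1],
        encryption="-".join(parts[2:-1]),
        mac=parts[-1],
    )


print(parse_cipher_suite("ECDHE-ECDSA-AES256-GCM-SHA384"))
```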
Open-Source Crypto libraries
In this section, we list the crypto libraries most widely used in data center workloads.
OpenSSL
Repository: https://github.com/openssl/openssl/
OpenSSL is a general-purpose cryptographic library. It is the most used crypto library and is pre-installed on most OS distributions. It implements basic crypto algorithms and supports different hardware architectures.
The OpenSSL version shipped varies by OS distribution; common versions include 1.1.1, 3.0.x, 3.1.x, and 3.2.x. At the time of writing, the current LTS version is 3.0.
BoringSSL
Repository: https://github.com/google/boringssl
BoringSSL is a fork of OpenSSL that is designed to meet Google's needs. It is not intended for general use because there are no guarantees of API or ABI stability.
AWS-LC
Repository: https://github.com/aws/aws-lc
AWS-LC is a general-purpose cryptographic library maintained by the AWS Cryptography team for AWS and their customers. It is based on code from the Google BoringSSL project and the OpenSSL project. AWS-LC adds several optimizations for both x86 and Arm processors.
AArch64cryptolib
Repository: https://github.com/ARM-software/AArch64cryptolib
AArch64cryptolib is a ‘from scratch’ implementation of cryptographic primitives aiming for optimal performance on Arm A-class cores. This library currently supports AES-GCM and AES-CBC optimized code.
OpenSSL also provides AES-GCM and AES-CBC implementations, and AArch64cryptolib performs just as well on Arm Neoverse N1, Arm Neoverse V1, Ampere Altra, and AmpereOne processors.
IPSec MB
Repository: https://gitlab.arm.com/arm-reference-solutions/ipsec-mb
This is the Multi-Buffer Crypto library for IPSec on Arm64 processors. It is based on the intel-ipsec-mb library and provides the SNOW3G and ZUC algorithms on Arm64. These algorithms are widely used in Telco and 5G workloads.
In some packet processing use cases, these crypto libraries are used in conjunction with the DPDK Crypto Poll Mode Driver (PMD). For details, refer to the DPDK cryptography build and tuning guide (https://amperecomputing.com/tuning-guides/dpdk-cryptography-build-and-tuning-guide).
Performance of Crypto libraries on Ampere processors
In this section, we compare the performance of different libraries on Ampere processors.
Ampere Altra family
Ampere Altra family products are designed to meet the requirements of modern Cloud Native Computing environments, with features such as predictable performance and core counts ranging from 32 to 128. These processors are Armv8.2+ ISA compatible.
All the open-source libraries listed above provide good support for the Ampere Altra family of processors.
Let’s start by comparing OpenSSL 3.3.0 and AWS-LC (master code, commit id 9921cd9) on a single core of Altra Max M128-30 using the bssl tool provided in AWS-LC.
Figure 1 and Figure 2 show that AWS-LC provides better performance for both RSA and ECDSA signing and verification.
ECDSA is preferred for asymmetric signing.




Figure 3 and Figure 4 show that AWS-LC is better than OpenSSL 3.3.0 for AES-CTR and AES-GCM with small blocks (16 bytes, 256 bytes, 1350 bytes); OpenSSL 3.3.0 is better for ChaCha20-Poly1305 and AES-GCM with large blocks (8192 bytes, 16384 bytes).
AES is preferred for symmetric encryption and decryption.
AmpereOne family of Processors
AmpereOne processors provide up to 192 cores and are Armv8.6+ ISA compatible. Support for the SHA3, SHA512, and RNG instructions improves cryptographic performance.
OpenSSL added performance optimizations for AmpereOne in an upstream commit that is included starting with version 3.4.0. To get the same benefit with OpenSSL 3.2.x or 3.3.x, a backport is needed; patches are available in the Ampere Computing OpenSSL repository (https://github.com/AmpereComputing/openssl).
The AWS-LC library does not include optimizations that specifically target AmpereOne. For better performance on AmpereOne, refer to the Ampere Computing fork (https://github.com/AmpereComputing/aws-lc/tree/dev-ampereone).
We compare three library builds:
- OpenSSL 3.3.0-opt: OpenSSL 3.3.0 with the optimization for AmpereOne
- AWS-LC: The AWS-LC library with commit id 9921cd9
- AWS-LC-opt: AWS-LC library with AmpereOne optimizations
Figure 5 and Figure 6 show that AWS-LC-opt provides performance equivalent to OpenSSL 3.3.0-opt for RSA signing, and better performance for RSA verification and the ECDSA algorithms.
Figure 7 and Figure 8 show that AWS-LC-opt provides better performance than OpenSSL 3.3.0-opt for AES-GCM and AES-CTR, while OpenSSL 3.3.0-opt performs better with ChaCha20-Poly1305.



Performance Scaling with core count
Per-core cipher performance is critical to fast TLS handshakes in latency-sensitive use cases such as web servers. Similarly, the performance and scalability of ciphers are important at the processor level, especially as you scale out across cores.
In this section, we use the speed test provided by the OpenSSL library to show how performance scales with core count on the AmpereOne A192-32X processor, and we compare its scalability with a processor of a similar class.
The following command tests AES-128-GCM throughput for 1024-byte blocks with different thread counts. “numactl” pins the test to cores 0 through N, so that the number of cores equals the number of openssl threads ($threads).
numactl -C 0-N openssl speed -multi $threads --bytes 1024 --seconds 10 -evp aes-128-gcm
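To sweep several thread counts, the command line can be generated programmatically. The sketch below only builds the argument list; the helper name speed_command is hypothetical, and it assumes that N in the command above is the thread count minus one, so that each thread gets its own core.

```python
import shlex
from typing import List


def speed_command(threads: int, cipher: str = "aes-128-gcm",
                  block_bytes: int = 1024, seconds: int = 10) -> List[str]:
    """Build the numactl + openssl speed command for one thread count."""
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Assumption: pin to cores 0..threads-1, one core per openssl thread.
    cores = "0" if threads == 1 else f"0-{threads - 1}"
    return [
        "numactl", "-C", cores,
        "openssl", "speed",
        "-multi", str(threads),
        "--bytes", str(block_bytes),
        "--seconds", str(seconds),
        "-evp", cipher,
    ]


for count in (1, 2, 4, 8):
    print(shlex.join(speed_command(count)))
```

Each printed line can be pasted into a shell, or the list can be passed directly to subprocess.run on a machine where numactl and openssl are installed.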
For reference, we compared the AmpereOne A192-32X with the AMD Genoa 9654; both processors have 192 hardware threads. On AmpereOne, each thread is a physical core, whereas AMD Genoa has 96 physical cores, each running 2 threads through Simultaneous Multithreading (SMT). With SMT, scaling can degrade once more than 50% of the hardware threads are in use, because the two threads on a core share that core's execution resources.
Figure 9 shows linear performance scaling with core count on the AmpereOne processor. As a result of this linear scaling, the AmpereOne A192-32X delivers 1.37x the throughput of the AMD Genoa 9654, as Figure 10 illustrates.
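One way to quantify "linear scaling" is scaling efficiency: throughput at n threads divided by n times the single-thread throughput, where 1.0 means perfectly linear. The sketch below is a hypothetical helper, and the sample numbers are placeholders chosen for illustration, not measurements from the figures.

```python
from typing import Dict


def scaling_efficiency(throughput: Dict[int, float]) -> Dict[int, float]:
    """Map thread count -> throughput(n) / (n * throughput(1)).

    `throughput` must include an entry for 1 thread; units are arbitrary
    as long as they are the same for every entry.
    """
    base = throughput[1]
    return {n: t / (n * base) for n, t in sorted(throughput.items())}


# Placeholder values for illustration only, not measured results.
sample = {1: 100.0, 2: 200.0, 4: 380.0}
print(scaling_efficiency(sample))
```

Feeding in the per-thread-count throughput reported by the speed test gives a single number per data point that can be compared across processors.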

Summary
From a performance perspective, it is recommended to prefer ECDSA over RSA for digital signatures. ECDSA generally offers better performance and security efficiency, especially with smaller key sizes, which can lead to faster signature generation compared to RSA.
For symmetric encryption and decryption, AES-GCM should be preferred over ChaCha20-Poly1305.
When using cryptographic libraries, it is advisable to use a version of OpenSSL or AWS-LC that includes optimizations for the AmpereOne architecture. These optimizations leverage the new features of AmpereOne processors to enhance performance.
Furthermore, AWS-LC, whose different implementations offer additional performance benefits for most of the algorithms tested, should generally be preferred over OpenSSL.
Finally, cryptographic performance on AmpereOne scales linearly with core count, so the benefit of AmpereOne grows as more cores are utilized.
References
https://www.geeksforgeeks.org/what-is-a-symmetric-encryption/
https://www.geeksforgeeks.org/asymmetric-key-cryptography/
https://www.geeksforgeeks.org/cryptography-hash-functions/
https://amperecomputing.com/tuning-guides/dpdk-cryptography-build-and-tuning-guide
https://amperecomputing.com/products/processors
https://github.com/openssl/openssl
https://github.com/aws/aws-lc/
https://github.com/ARM-software/AArch64cryptolib
https://gitlab.arm.com/arm-reference-solutions/ipsec-mb
https://github.com/AmpereComputing/openssl
https://github.com/AmpereComputing/aws-lc/tree/dev-ampereone

