Here is my latest source code: https://github.com/vDorst/wireguard/tree/mips32r2
It includes the long history of trial and error ;).
It also contains some good ideas, such as reordering the code to reduce data dependencies between instructions,
which makes the code less readable but more efficient.
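
To illustrate the data-dependency idea, here is a minimal C sketch (not the code from the repository). It assumes the standard ChaCha20 quarter round, where every step depends on the result of the previous one. The four column quarter rounds are independent of each other, so their steps can be interleaved to keep dependent instructions apart. The function names column_round_serial and column_round_interleaved are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static inline uint32_t rotl32(uint32_t v, unsigned int n)
{
    return (v << n) | (v >> (32 - n));
}

/* One ChaCha20 quarter round: each step needs the previous result. */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
    x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
    x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
    x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
    x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/* Readable version: the four columns are processed one after another. */
void column_round_serial(uint32_t x[16])
{
    quarter_round(x, 0, 4, 8, 12);
    quarter_round(x, 1, 5, 9, 13);
    quarter_round(x, 2, 6, 10, 14);
    quarter_round(x, 3, 7, 11, 15);
}

/*
 * Interleaved version: the same step is applied to all four columns
 * before moving on, so neighbouring operations do not depend on each
 * other. Column i uses a = x[i], b = x[4+i], c = x[8+i], d = x[12+i].
 */
void column_round_interleaved(uint32_t x[16])
{
    int i;

    for (i = 0; i < 4; i++) x[i] += x[4 + i];
    for (i = 0; i < 4; i++) x[12 + i] = rotl32(x[12 + i] ^ x[i], 16);
    for (i = 0; i < 4; i++) x[8 + i] += x[12 + i];
    for (i = 0; i < 4; i++) x[4 + i] = rotl32(x[4 + i] ^ x[8 + i], 12);
    for (i = 0; i < 4; i++) x[i] += x[4 + i];
    for (i = 0; i < 4; i++) x[12 + i] = rotl32(x[12 + i] ^ x[i], 8);
    for (i = 0; i < 4; i++) x[8 + i] += x[12 + i];
    for (i = 0; i < 4; i++) x[4 + i] = rotl32(x[4 + i] ^ x[8 + i], 7);
}
```

Both functions produce the same state. A C compiler may do this reordering on its own, but in hand-written assembly the instruction order has to be chosen explicitly, which is where the readability cost comes from.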

This is the assembly part https://github.com/vDorst/wireguard/blob/mips32r2/src/crypto/chacha20-mips32r2.S

Created functions:
* asmlinkage void chacha20_keysetup(struct chacha20_ctx *ctx, const u8 key[static 32], const u8 nonce[static 8]);
* asmlinkage void chacha20_generic_block(struct chacha20_ctx *ctx);
* asmlinkage unsigned int poly1305_generic_blocks(struct poly1305_ctx *ctx, const u8 *src, unsigned int srclen, u32 hibit);

poly1305_generic_blocks is fixed in the last commit.

The code is written for big-endian MIPS32r2.
It has some defines based on __ORDER_BIG_ENDIAN__ that enable the endian swap of the data, but it has not been tested on little endian.
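
For readers unfamiliar with why the swap is needed: ChaCha20 and Poly1305 read their input as little-endian 32-bit words, so a big-endian CPU has to reverse the bytes after each load. The C sketch below shows the general idea only. It assumes the compiler provides the GCC/Clang __BYTE_ORDER__ macro and falls back to no swap when it is missing; load_le32 and swab32_sketch are hypothetical helper names, and the assembly in the repository may handle this differently.

```c
#include <stdint.h>
#include <string.h>

/*
 * Reverse the byte order of a 32-bit word. On MIPS32r2 a compiler can
 * typically implement this with the wsbh and rotr instructions.
 */
static inline uint32_t swab32_sketch(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Load a little-endian 32-bit word from a possibly unaligned buffer. */
uint32_t load_le32(const uint8_t *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));   /* native-endian load, alignment safe */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = swab32_sketch(v);       /* only big-endian hosts need the swap */
#endif
    return v;
}
```

With this, the same input bytes decode to the same word value on both big-endian and little-endian builds.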

Todo:
* Adapt the C code to see how fast it runs and to set a benchmark baseline.
* See whether I can optimize the assembly version even more.

Greetings,

René van Dorst.


