Hello everyone!
I'm particularly interested in how Transformers are being applied to image recognition. The attention mechanism seems incredibly powerful.
Has anyone experimented with varying the number of encoder layers or attention heads? I'd love to hear what depth/width combinations worked for you.
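For context, here's roughly the kind of toy setup I've been poking at: just a sketch using torch.nn.TransformerEncoder on fake patch tokens, with the layer and head counts exposed as knobs. It isn't tied to any particular ViT implementation, and the dimensions (192-dim embeddings, 196 patch tokens for a 224x224 image with 16x16 patches) are my own assumptions.

```python
# Minimal sketch for sweeping depth (num_layers) and width-of-attention (num_heads).
# Not a full ViT: no patch embedding, class token, or positional encoding here.
import torch
import torch.nn as nn

def make_encoder(embed_dim=192, num_heads=3, num_layers=6):
    # embed_dim must be divisible by num_heads (192/3 = 64-dim per head)
    layer = nn.TransformerEncoderLayer(
        d_model=embed_dim,
        nhead=num_heads,
        dim_feedforward=embed_dim * 4,
        batch_first=True,
    )
    return nn.TransformerEncoder(layer, num_layers=num_layers)

# Fake batch of 8 images: 196 patch tokens each (14x14 grid), 192-dim embeddings
tokens = torch.randn(8, 196, 192)

for heads, layers in [(3, 6), (6, 12)]:
    enc = make_encoder(num_heads=heads, num_layers=layers)
    out = enc(tokens)
    print(f"heads={heads}, layers={layers} -> {out.shape}")  # [8, 196, 192]
```

The output shape stays the same across configurations, so the real question is how depth and head count trade off in accuracy versus compute once you train on actual image data.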