x-transformers

A simple but complete full-attention transformer with a set of promising experimental features from various papers.

Augmenting Self-attention with Persistent Memory proposes adding learned memory key/values prior to attending. Its authors were able to remove the feedforwards altogether and attain performance similar to the original transformer. I have found that keeping the feedforwards and adding the memory key/values leads to even better performance.

Another paper proposes adding learned tokens, akin to CLS tokens, named memory tokens, that are passed through the attention layers alongside the input tokens.

You can also use the l2-normalized embeddings proposed as part of fixnorm. I have found that this leads to improved convergence when paired with small initialization (proposed by BlinkDL). The small initialization is taken care of as long as l2norm_embed is set to True.
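To make the memory key/value idea concrete, here is a minimal sketch of single-head scaled dot-product attention in which a few extra key/value pairs are prepended before attending. It is written with the Python standard library only and is not the library's implementation. The function names (softmax, attend_with_memory) and the toy numbers are assumptions made for illustration; in a real model the memory key/values would be trained parameters.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend_with_memory(queries, keys, values, mem_keys, mem_values):
    # Hypothetical helper: prepend the (normally learned) memory key/values
    # to the keys/values computed from the input, then attend as usual.
    all_keys = mem_keys + keys
    all_values = mem_values + values
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs = []
    for q in queries:
        # Scaled dot product of the query against every key, memory included.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in all_keys]
        weights = softmax(scores)
        # Weighted sum of the values, one output dimension at a time.
        out = [
            sum(w * v[d] for w, v in zip(weights, all_values))
            for d in range(len(all_values[0]))
        ]
        outputs.append(out)
    return outputs

# Toy example: one query, one input key/value, one memory key/value.
queries = [[0.0, 0.0]]
keys, values = [[1.0, 0.0]], [[2.0, 0.0]]
mem_keys, mem_values = [[0.0, 1.0]], [[0.0, 4.0]]

with_memory = attend_with_memory(queries, keys, values, mem_keys, mem_values)
without_memory = attend_with_memory(queries, keys, values, [], [])
```

Because the memory entries do not depend on the input, every query can attend to them regardless of the sequence content, which is what lets them act as a persistent store.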

Features

Decoder-only (GPT-like)
Encoder-only (BERT-like)
State of the art image classification
Augmenting Self-attention with Persistent Memory
Transformers Without Tears
Root Mean Square Layer Normalization
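
As an illustration of the last item, root mean square layer normalization rescales a vector by its root mean square instead of subtracting the mean and dividing by the standard deviation. The sketch below uses the Python standard library only and is not the library's implementation. The name rms_norm, the optional gain argument, and the placement of eps inside the square root are assumptions made for this example.

```python
import math

def rms_norm(x, gain=None, eps=1e-8):
    # Root mean square of the input vector; eps guards against division by zero.
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    if gain is None:
        # Assumed default: a gain of 1 per dimension (a learned parameter in practice).
        gain = [1.0] * len(x)
    return [g * v / rms for v, g in zip(x, gain)]

normalized = rms_norm([3.0, 4.0])
# After normalization the mean of the squared entries is approximately 1.
mean_square = sum(v * v for v in normalized) / len(normalized)
```

Unlike standard layer normalization, no mean is subtracted, so the direction of the input vector is preserved and only its scale changes.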

License

MIT License

Additional Project Details

Programming Language

Python

Related Categories

Python Machine Learning Software

Registered

2022-08-11
