Amazon interview question

How does multi-head attention work? What is serialization?
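As a starting point for the first question, here is a minimal sketch of multi-head self-attention in plain Python. It assumes the standard formulation: project the input into queries, keys, and values, split the feature dimension into heads, run scaled dot-product attention per head, concatenate the head outputs, and apply an output projection. The function names, the random weights, and the tiny dimensions are all illustrative assumptions, not taken from any particular library, and masking, biases, and batching are left out.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q, k, v):
    """Scaled dot-product attention for one head; returns (output, weights)."""
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights


def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Self-attention over x (seq_len x d_model) using num_heads heads."""
    d_model = len(x[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    # Project the input once, then slice the feature axis into heads.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)

    head_outputs = []
    for h in range(num_heads):
        lo, hi = h * d_head, (h + 1) * d_head
        q_h = [row[lo:hi] for row in q]
        k_h = [row[lo:hi] for row in k]
        v_h = [row[lo:hi] for row in v]
        out_h, _ = attention(q_h, k_h, v_h)
        head_outputs.append(out_h)

    # Concatenate the heads along the feature axis, then mix them with w_o.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(x))]
    return matmul(concat, w_o)


def rand_matrix(rows, cols):
    """Hypothetical helper: random weights standing in for learned parameters."""
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
seq_len, d_model, num_heads = 3, 4, 2
x = rand_matrix(seq_len, d_model)
w_q, w_k, w_v, w_o = (rand_matrix(d_model, d_model) for _ in range(4))
out = multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads)
print(len(out), len(out[0]))  # 3 4
```

Each head attends over the same sequence but sees only its own slice of the projected features, so different heads can learn different attention patterns. The output has the same shape as the input, which is what lets attention layers be stacked.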