Types and Analysis of Position Encoding

Prologue

Sinusoidal PE for Transformer

Permutation equivariance of Multi-Head Self-Attention

Sinusoidal PE for representing order of elements

On the learnability of PE

PE for CNN

CoordConv / Spatial Broadcast Decoder

StyleGAN

Height-driven Attention Networks (HANet)

Zero-padding

Other types of PE for Transformer variants

Absolute PE

Relative PE

Complex PE

No PE — Convolutional context

Related Topics

MLP-based Neural Rendering

PE for representing timestep of iterative network

Hypernetworks

Epilogue

References

Sinusoidal PE for Transformer

PE for CNN

  • An Intriguing Failing of Convolutional Neural Networks and the CoordConv Solution (arXiv:1807.03247)
  • Spatial Broadcast Decoder: A Simple Architecture for Learning Disentangled Representations in VAEs (arXiv:1901.07017)
  • A Style-Based Generator Architecture for Generative Adversarial Networks (arXiv:1812.04948)
  • Positional Encoding as Spatial Inductive Bias in GANs (arXiv:2012.05217)
  • Cars Can’t Fly up in the Sky: Improving Urban-Scene Segmentation via Height-driven Attention Networks (arXiv:2003.05128)
  • How Much Position Information Do Convolutional Neural Networks Encode? (arXiv:2001.08248)

Other types of PE for Transformer variants

  • Self-Attention with Relative Position Representations (arXiv:1803.02155)
  • Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (arXiv:1901.02860)
  • An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (arXiv:2010.11929)
  • Encoding word order in complex embeddings (arXiv:1912.12333)
  • Transformers with convolutional context for ASR (arXiv:1904.11660)
  • wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (arXiv:2006.11477)

Related Topics

  • NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis (arXiv:2003.08934)
  • Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains (arXiv:2006.10739)

Team Deepest
