

Poster

Masked Hard-Attention Transformers Recognize Exactly the Star-Free Languages

Andy Yang · David Chiang · Dana Angluin

East Exhibit Hall A-C #2310
Wed 11 Dec 4:30 p.m. PST — 7:30 p.m. PST

Abstract:

The expressive power of transformers over inputs of unbounded size can be studied through their ability to recognize classes of formal languages. We consider transformer encoders with hard attention (in which all attention is focused on exactly one position) and strict future masking (in which each position only attends to positions strictly to its left), and prove that they are equivalent to linear temporal logic (LTL), which defines exactly the star-free languages. A key technique is the use of Boolean RASP as a convenient intermediate language between transformers and LTL. We then take numerous results known for LTL and apply them to transformers, characterizing how position embeddings, strict masking, and depth increase expressive power.
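To make the setting concrete, here is a minimal illustrative sketch (not the paper's construction or code) of the flavor of computation the abstract describes: a single hard-attention operation with strict future masking, written in a B-RASP-like style, used to recognize the star-free language a*b*. The function names `rightmost_hard_attention` and `recognizes_a_star_b_star` are hypothetical and chosen here for illustration.

```python
# Illustrative sketch of masked hard attention in a B-RASP-like style.
# Hard attention: each position attends to exactly one position.
# Strict future masking: position i may only attend to positions j < i.

from typing import List


def rightmost_hard_attention(
    attend: List[bool],   # attention predicate per position
    value: List[bool],    # value predicate per position
    default: bool = False  # result when no j < i satisfies `attend`
) -> List[bool]:
    """For each i, return value[j] at the rightmost j < i with attend[j] True."""
    out = []
    for i in range(len(attend)):
        js = [j for j in range(i) if attend[j]]  # strictly to the left of i
        out.append(value[js[-1]] if js else default)
    return out


def recognizes_a_star_b_star(w: str) -> bool:
    """Accept strings in a*b* (a star-free language) over the alphabet {a, b}."""
    is_a = [c == "a" for c in w]
    is_b = [c == "b" for c in w]
    # seen_b[i] is True iff some position strictly left of i holds 'b'
    seen_b = rightmost_hard_attention(attend=is_b, value=is_b)
    # Position i violates a*b* iff it holds 'a' and a 'b' occurs before it.
    bad = [a and s for a, s in zip(is_a, seen_b)]
    return not any(bad)


assert recognizes_a_star_b_star("aaabbb")
assert not recognizes_a_star_b_star("abab")
```

The single attention step corresponds to a "since a b has occurred" check, the kind of past-temporal condition that LTL (interpreted over the strictly earlier positions) expresses directly; richer star-free languages would require composing several such operations, which is where depth enters the picture.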
