Fascinating work: “These models hold the promise to have context lengths of millions… or maybe even a billion!” Also shows huge power of new primitives in ML: even if there were *no* more research, remixing of existing ideas would take us a long way…