PaLM-E
Google · March 6, 2023
● active · Closed · encoder decoder · multimodal
Parameters: 562B
Context Window: N/A
Description
Embodied multimodal vision-language model that projects sensor inputs, such as images, into the language embedding space and generates text plans that lower-level robot policies carry out.
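The projection idea in the description can be illustrated with a toy sketch. This is not PaLM-E's actual implementation: the function names (`project`, `splice`), the tiny dimensions, and the plain affine map are all assumptions chosen for readability. It only shows the general pattern of mapping sensor feature vectors to the same width as token embeddings and placing them in the input sequence next to text.

```python
# Toy sketch (hypothetical names and sizes, not PaLM-E's real code):
# map sensor features to the token-embedding width, then splice them
# into a sequence of text-token embeddings.

def project(features, weights, bias):
    """Affine map from one sensor feature vector to one embedding-width vector.

    weights[j] holds the input weights for output dimension j.
    """
    return [
        sum(f * w for f, w in zip(features, row)) + b
        for row, b in zip(weights, bias)
    ]

def splice(text_embeds, sensor_embeds, insert_at):
    """Insert the projected sensor vectors into the text sequence at insert_at."""
    return text_embeds[:insert_at] + sensor_embeds + text_embeds[insert_at:]

# Two text-token embeddings of width 3 (values are made up).
text_embeds = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Two sensor feature vectors of width 2, e.g. outputs of an image encoder.
sensor_feats = [[1.0, 2.0], [3.0, 4.0]]

# Assumed projection parameters: 3 output dimensions, 2 inputs each.
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.5]

sensor_embeds = [project(f, weights, bias) for f in sensor_feats]
sequence = splice(text_embeds, sensor_embeds, insert_at=1)

print(len(sequence))  # 4 vectors: text, sensor, sensor, text
```

After the splice, the language model would see one sequence of same-width vectors and need no separate input path for the sensor data. In a real system the projection would be learned rather than hand-set.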
Key Innovations
Multimodal: Processing multiple types of input (text, images, audio, video) in a single model.
robotics
embodied