Crossmodal Semantic Representations
Recently at GA-CCRi, we have been doing a lot of research in the area of reduced-dimensional semantic embedding models: models where semantically similar objects possess similar representations. In a previous post, Nick discussed how such a model can be learned for relational data; since relationships between entities are explicitly provided in that data, the resulting representations are also able to capture those relationships.

There are also models that learn representations from plain text; the most popular is word2vec by Tomas Mikolov et al., which can train quickly on immensely large corpora to produce state-of-the-art results [1,2]. At a basic level, word2vec learns the representation of a word from the representations of the words around it. Amazingly, although relationships are never explicitly stated in plain text corpora, the model is able to discover latent relationships in the text and capture them in its representations. This capability is often demonstrated through "analogy completion," answering questions of the form "Paris is to France as Berlin is to _" (which can be thought of as the representations effectively learning to capture the "capital of" predicate).
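To make this concrete, here is a minimal sketch of analogy completion using the gensim library and a pretrained set of word2vec-format vectors. The file name is a placeholder, and this is only an illustration of the technique, not code from the original research:

```python
# Minimal sketch: analogy completion with pretrained word2vec vectors.
# Assumes gensim is installed and a word2vec-format file is on disk;
# the file name is a placeholder.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format(
    "pretrained-word2vec-vectors.bin", binary=True
)

# "Paris is to France as Berlin is to _?"
# Vector arithmetic: France - Paris + Berlin should land near Germany.
print(vectors.most_similar(positive=["France", "Berlin"],
                           negative=["Paris"], topn=1))
# e.g. [('Germany', 0.75)]
```

The arithmetic works because, in a well-trained model, the "capital of" relationship is encoded as a roughly constant vector offset between each country and its capital.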