Word Embedding Download


Word2Vec

Chinese Word Vectors 中文词向量

English word vectors

fastText

More pretrained fastText models can be found at: fastText.

The first line of the file contains the number of words in the vocabulary and the dimension of the vectors. Each following line contains a word followed by the components of its vector, as in the default fastText text format. Values are separated by single spaces. Words are ordered by descending frequency. These text models can be loaded in Python with the following code:

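import io

def load_vectors(fname):
    """Load a fastText .vec text file into a dict mapping word -> list of floats."""
    data = {}
    with io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore') as fin:
        # Header line: vocabulary size and vector dimension (not needed below).
        n, d = map(int, fin.readline().split())
        for line in fin:
            tokens = line.rstrip().split(' ')
            # Materialize the values; in Python 3 a bare map() is a one-shot iterator.
            data[tokens[0]] = list(map(float, tokens[1:]))
    return data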
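
Because words are ordered by descending frequency, reading only the first lines of the file gives the most frequent words, which can save a lot of memory with large models. The following is a minimal sketch of that idea, not part of the fastText documentation: the names `load_top_vectors`, `limit`, and `cosine` are hypothetical helpers introduced here for illustration, and the code assumes the text format described above (a header line, then one space-separated word and vector per line).

```python
import io
import math

def load_top_vectors(fname, limit=None):
    """Load at most `limit` vectors from a fastText-style .vec text file.

    Hypothetical helper: relies on the file listing words by descending
    frequency, so the first `limit` lines are the most frequent words.
    """
    vectors = {}
    with io.open(fname, 'r', encoding='utf-8', newline='\n', errors='ignore') as fin:
        # Header line: vocabulary size and vector dimension.
        n, d = map(int, fin.readline().split())
        for i, line in enumerate(fin):
            if limit is not None and i >= limit:
                break
            tokens = line.rstrip().split(' ')
            values = [float(x) for x in tokens[1:]]
            if len(values) != d:
                continue  # skip lines that do not match the declared dimension
            vectors[tokens[0]] = values
    return vectors

def cosine(u, v):
    """Cosine similarity between two equal-length lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

A call such as `load_top_vectors(path, limit=100000)` would then keep only the 100,000 most frequent words, and `cosine(vectors[a], vectors[b])` compares two of them. For heavier use, converting the lists to NumPy arrays is a common choice, but plain lists keep this sketch dependency-free.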

BERT

Other resources

Fast downloads of commonly used NLP models and datasets

Embeddings and datasets that can be loaded with fastNLP


Author: Passerby-W
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY 4.0. Please credit Passerby-W as the source when reposting.