text2vec: Modern Text Mining Framework for R

Fast and memory-friendly tools for text vectorization, topic modeling (LDA, LSA), word embeddings (GloVe), and similarity computation. The package provides a source-agnostic streaming API, which lets researchers analyze document collections that are larger than available RAM. All core functions are parallelized to benefit from multicore machines.
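
As a rough illustration of the vectorization workflow described above, the following minimal sketch builds a sparse document-term matrix from a token iterator. It assumes the package's itoken(), create_vocabulary(), vocab_vectorizer(), and create_dtm() functions; the three-document corpus and the document ids are made up for the example, and a real streaming use would feed the iterator from files or another source instead of an in-memory vector.

```r
library(text2vec)

# Hypothetical toy corpus; stands in for a larger document source.
docs <- c(
  "the quick brown fox",
  "the lazy dog",
  "the quick dog jumps"
)

# itoken() wraps the documents in an iterator that preprocesses and
# tokenizes them chunk by chunk as downstream functions consume it.
it <- itoken(
  docs,
  preprocessor = tolower,
  tokenizer = word_tokenizer,
  ids = paste0("doc", seq_along(docs)),
  progressbar = FALSE
)

# First pass over the iterator: collect the vocabulary and term counts.
vocab <- create_vocabulary(it)

# Second pass: map each document onto the vocabulary, producing a
# sparse document-term matrix (one row per document, one column per term).
vectorizer <- vocab_vectorizer(vocab)
dtm <- create_dtm(it, vectorizer)

print(dim(dtm))
```

The two-pass design (vocabulary first, then the matrix) is one option; it keeps only term statistics and the sparse matrix in memory rather than the tokenized corpus.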

Version: 0.6.4
Depends: R (≥ 3.6.0), methods
Imports: Matrix (≥ 1.5-2), Rcpp (≥ 1.0.3), R6 (≥ 2.3.0), data.table (≥ 1.9.6), rsparse (≥ 0.3.3.4), stringi (≥ 1.1.5), mlapi (≥ 0.1.0), lgr (≥ 0.2), digest (≥ 0.6.8)
LinkingTo: Rcpp, digest (≥ 0.6.8)
Suggests: magrittr, udpipe (≥ 0.6), glmnet, testthat, covr, knitr, rmarkdown, proxy
Published: 2023-11-09
DOI: 10.32614/CRAN.package.text2vec
Author: Dmitriy Selivanov [aut, cre, cph], Manuel Bickel [aut, cph] (Coherence measures for topic models), Qing Wang [aut, cph] (Author of the WarpLDA C++ code)
Maintainer: Dmitriy Selivanov <selivanov.dmitriy at gmail.com>
BugReports: https://github.com/dselivanov/text2vec/issues
License: GPL-2 | GPL-3 | file LICENSE [expanded from: GPL (≥ 2) | file LICENSE]
URL: http://text2vec.org
NeedsCompilation: yes
Materials: README, NEWS
In views: NaturalLanguageProcessing
CRAN checks: text2vec results [issues need fixing before 2025-11-15]

Documentation:

Reference manual: text2vec.html, text2vec.pdf
Vignettes: GloVe Word Embeddings (source, R code)
Analyzing Texts with the text2vec Package (source, R code)

Downloads:

Package source: text2vec_0.6.4.tar.gz
Windows binaries: r-devel: text2vec_0.6.4.zip, r-release: text2vec_0.6.4.zip, r-oldrel: text2vec_0.6.4.zip
macOS binaries: r-release (arm64): text2vec_0.6.4.tgz, r-oldrel (arm64): text2vec_0.6.4.tgz, r-release (x86_64): text2vec_0.6.4.tgz, r-oldrel (x86_64): text2vec_0.6.4.tgz
Old sources: text2vec archive

Reverse dependencies:

Reverse imports: blocking, conText, manydata, NUSS, occupationMeasurement, PsychWordVec, regtools, text2emotion, text2map, textmineR, ttgsea, wactor, wordsalad
Reverse suggests: fdm2id, lime, oolong, polyglotr, sentiment.ai, textrecipes
Reverse enhances: quanteda

Linking:

Please use the canonical form https://CRAN.R-project.org/package=text2vec to link to this page.
