tm.plugin.factiva-package {tm.plugin.factiva}    R Documentation

A plug-in for the tm text mining framework to import articles from Factiva

Description

This package provides a tm Source to create corpora from articles exported from Dow Jones's Factiva content provider as XML or HTML files.

Details

Typical usage is to create a corpus from an XML or HTML file exported from Factiva (here called myFactivaArticles.xml). Setting language=NA allows the language to be set automatically from the information provided by Factiva:

    # Load tm and the plug-in
    library(tm)
    library(tm.plugin.factiva)

    # Import corpus
    source <- FactivaSource("myFactivaArticles.xml")
    corpus <- Corpus(source, readerControl = list(language = NA))

    # See how many articles were imported
    corpus

    # See the contents of the first article and its meta-data
    inspect(corpus[1])
    meta(corpus[[1]])
  

Currently, for HTML exports, only files saved in French are supported. Please send the maintainer examples of Factiva files in your language if you want it to be supported.

See FactivaSource for more details and real examples.
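
As a minimal sketch of the language=NA behaviour described above, the following imports a sample export and tabulates the language recorded for each article. It assumes that the package ships a sample file named reut21578-factiva.xml in its texts directory. The names sample_file, factiva_source and languages are illustrative only.

```r
library(tm)
library(tm.plugin.factiva)

# Assumption: the package installs this sample export under "texts/".
sample_file <- system.file("texts", "reut21578-factiva.xml",
                           package = "tm.plugin.factiva")
stopifnot(nzchar(sample_file))  # stop early if the sample file is not found

# Import the articles, letting the reader take the language from Factiva
factiva_source <- FactivaSource(sample_file)
corpus <- Corpus(factiva_source, readerControl = list(language = NA))

# Collect the language stored in each article's meta-data and count them
languages <- sapply(corpus, function(doc) meta(doc, "language"))
table(unlist(languages))
```

If the table shows an unexpected language code, passing an explicit value such as language = "en" in readerControl instead of NA should override the value read from the file.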

Author(s)

Milan Bouchet-Valat <nalimilan@club.fr>

References

https://global.factiva.com/


[Package tm.plugin.factiva version 1.8.1 Index]