class GoodNews::Scraper

Constants

HOMEPAGEURL

A constant that stores the URL of the homepage.

Public Class Methods

get_articles()

Gets and stores the articles for each topic. Loops through the Topic objects returned by GoodNews::Topic.all (the Topic class variable array @@all) and fetches each topic's page with get_page. For every matching article link, instantiates a new Article object, saves the article's web address and title to it, and pushes it into the Topic object's articles attribute (an array).

# File lib/good_news/scraper.rb, line 28
def self.get_articles
    GoodNews::Topic.all.each do |topic|
        doc = self.get_page(topic.web_addr)
        doc.css("h3.entry-title a").each do |info| 
            new_article = GoodNews::Article.new
            new_article.web_addr = info.attribute("href").value
            new_article.title = info.text
            topic.articles.push(new_article) 
        end
    end
end

get_page(url)

Uses open-uri to fetch the HTML at the given URL and nokogiri to parse it. Returns the parsed page as a Nokogiri document (not an array), which can then be searched using CSS selectors.

# File lib/good_news/scraper.rb, line 7
def self.get_page(url)
    # URI.open is used because the bare Kernel#open no longer accepts URLs as of Ruby 3.0.
    Nokogiri::HTML(URI.open(url))
end

get_topics()

Grabs the topics from the homepage and stores them. Calls the class method get_page with HOMEPAGEURL and saves the result to doc. For each topic link, instantiates a Topic object, stores the topic name and web address in it, and saves it to the Topic class variable @@all using the save method.

# File lib/good_news/scraper.rb, line 14
def self.get_topics
    doc = self.get_page(HOMEPAGEURL)
    doc.css("ul.td-category a").each do |topic|
        new_topic = GoodNews::Topic.new
        new_topic.name = topic.text
        new_topic.web_addr = topic.attribute("href").value
        new_topic.save
    end
end
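
The Topic and Article classes are not shown on this page, so the following is only a rough sketch of the data model that get_topics and get_articles appear to assume. The GoodNewsSketch module, its class bodies, and the sample data are hypothetical stand-ins, not the gem's actual code; the attribute names (name, web_addr, title, articles) and the save and all methods are taken from the calls in the scraper.

```ruby
# Hypothetical stand-ins for GoodNews::Topic and GoodNews::Article.
module GoodNewsSketch
  class Article
    attr_accessor :title, :web_addr
  end

  class Topic
    attr_accessor :name, :web_addr
    attr_reader :articles

    # Class-level registry of every saved topic.
    @@all = []

    def initialize
      @articles = []
    end

    def self.all
      @@all
    end

    # Adds this topic to the registry, as get_topics does for each link.
    def save
      @@all << self
      self
    end
  end
end

# What get_topics does for one topic link (sample values are made up).
topic = GoodNewsSketch::Topic.new
topic.name = "Science"
topic.web_addr = "https://example.com/science"
topic.save

# Stand-in for the [title, href] pairs a CSS search would yield.
scraped = [["First story", "https://example.com/a"],
           ["Second story", "https://example.com/b"]]

# What get_articles does: one Article per link, pushed onto each topic.
GoodNewsSketch::Topic.all.each do |t|
  scraped.each do |title, href|
    article = GoodNewsSketch::Article.new
    article.title = title
    article.web_addr = href
    t.articles.push(article)
  end
end
```

Under this model, get_articles only has work to do after get_topics has saved at least one topic, because it iterates over whatever Topic.all currently holds.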