class GoodNews::Scraper
Constants
- HOMEPAGEURL
Stores the URL of the site's homepage, which get_topics scrapes for topic links.
Public Class Methods
get_articles()
Fetches and stores the articles for each topic. Iterates over every Topic object in Topic.all (the Topic class's @@all array) and parses each topic's page with get_page. For every "h3.entry-title a" link on that page, it instantiates a new Article object, sets its web address and title, and pushes it onto the Topic object's articles attribute (an array).
# File lib/good_news/scraper.rb, line 28
def self.get_articles
  GoodNews::Topic.all.each do |topic|
    doc = self.get_page(topic.web_addr)
    doc.css("h3.entry-title a").each do |info|
      new_article = GoodNews::Article.new
      new_article.web_addr = info.attribute("href").value
      new_article.title = info.text
      topic.articles.push(new_article)
    end
  end
end
get_page(url)
Uses open-uri to fetch the page at url and Nokogiri to parse the HTML. Returns the parsed page as a Nokogiri document object (not an array), which can then be searched with CSS selectors.
# File lib/good_news/scraper.rb, line 7
def self.get_page(url)
  return Nokogiri::HTML(open(url))
end
get_topics()
Fetches the topics and stores them. Parses the homepage (HOMEPAGEURL) with the class method get_page and saves the result to doc. For every "ul.td-category a" link, it instantiates a Topic object, sets its name and web address, and calls save, which adds the Topic object to the Topic class variable @@all.
# File lib/good_news/scraper.rb, line 14
def self.get_topics
  doc = self.get_page(HOMEPAGEURL)
  doc.css("ul.td-category a").each do |topic|
    new_topic = GoodNews::Topic.new
    new_topic.name = topic.text
    new_topic.web_addr = topic.attribute("href").value
    new_topic.save
  end
end
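The scraper depends on GoodNews::Topic and GoodNews::Article, which are not documented on this page. The following is a minimal sketch of the interface the methods above appear to assume (name, web_addr, articles, save, and all). The class bodies and sample values are assumptions for illustration, not the gem's actual implementation.

```ruby
module GoodNews
  # Hypothetical stand-in for the real Article class.
  class Article
    attr_accessor :title, :web_addr
  end

  # Hypothetical stand-in for the real Topic class.
  class Topic
    attr_accessor :name, :web_addr, :articles

    # Class-level registry of every saved topic.
    @@all = []

    def initialize
      @articles = []
    end

    # Adds this topic to the registry.
    def save
      @@all << self
    end

    def self.all
      @@all
    end
  end
end

# Mirrors what get_topics does for one link (sample values are made up).
topic = GoodNews::Topic.new
topic.name = "Science"
topic.web_addr = "https://example.com/science"
topic.save

# Mirrors what get_articles does for one link on that topic's page.
article = GoodNews::Article.new
article.title = "Example headline"
article.web_addr = "https://example.com/science/example"
topic.articles.push(article)
```

Because get_articles iterates over Topic.all, get_topics has to run first; otherwise there are no topics to visit.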