Getting HTML Pages From Hyper

Now that our program is working, let's write a new function that uses the hyper crate to download the HTML page directly, so that we work from a live source. Let's start by adding the use statements that bring hyper's types into scope, along with std::io::Read, which supplies the read_to_string() method we will call on the response.

use hyper::Client;
use hyper::header::Connection;
use std::io::Read;

Downloading Phoronix's Homepage

Now we can implement the actual function. This function, open_phoronix(), creates a Client that will get() the homepage of Phoronix. It stores the response in an aptly named variable, response. Finally, it reads the contents of the response into a String named body and returns body.

fn open_phoronix() -> String {
    let client = Client::new();
    let mut response = client.get("https://www.phoronix.com/").
        header(Connection::close()).send().unwrap();
    let mut body = String::new();
    response.read_to_string(&mut body).unwrap();
    return body;
}
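
The read_to_string() call above comes from the std::io::Read trait, which is why that trait has to be imported. As a minimal sketch of that step in isolation, the helper below reads any Read source into a String and returns a Result instead of calling unwrap(). The name read_body is hypothetical and not part of the program in this chapter; the sketch uses only the standard library and does not touch hyper.

```rust
use std::io::{self, Read};

// Hypothetical helper: read everything from `source` into a String.
// Any type implementing Read works here, including a byte slice.
fn read_body<R: Read>(mut source: R) -> io::Result<String> {
    let mut body = String::new();
    // read_to_string() appends to `body` and fails if the bytes
    // are not valid UTF-8 or if the underlying read fails.
    source.read_to_string(&mut body)?;
    Ok(body)
}
```

Calling read_body("<html></html>".as_bytes()) should yield Ok with the same text, because a byte slice implements Read. Returning a Result leaves the decision about how to handle a failed read to the caller, whereas unwrap() would panic.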

open_phoronix() Usage

Now we just need to make a small change in our Article::get_articles() function so that it gets its input from open_phoronix() instead of open_testing().

impl Article {
    fn get_articles() -> Vec<Article> {
        Document::from_str(&open_phoronix())
            .find(Name("article")).iter()
            .map(|node| Article::new(&node)).collect()
    }
    ...
}

Feel free to comment out open_testing() now that it is no longer required. The full source code should now look as follows:

main.rs Code Example

extern crate hyper;
use hyper::Client;
use hyper::header::Connection;
use std::io::Read;
extern crate select;
use select::document::Document;
use select::predicate::{Class,Name};
use select::node::Node;

fn main() {
    let phoronix_articles = Article::get_articles();
    for article in phoronix_articles.iter().rev() {
        println!("Title:   {}", article.title);
        println!("Link:    https://www.phoronix.com/{}", article.link);
        println!("Details: {}", article.details);
        println!("Summary: {}\n", article.summary);
    }
}

// fn open_testing() -> &'static str {
//     include_str!("phoronix.html")
// }

fn open_phoronix() -> String {
    let client = Client::new();
    let mut response = client.get("https://www.phoronix.com/").
        header(Connection::close()).send().unwrap();
    let mut body = String::new();
    response.read_to_string(&mut body).unwrap();
    return body;
}

struct Article {
    title:   String,
    link:    String,
    details: String,
    summary: String,
}

impl Article {
    fn get_articles() -> Vec<Article> {
        Document::from_str(&open_phoronix()).find(Name("article")).iter()
            .map(|node| Article::new(&node)).collect()
    }
    fn new(node: &Node) -> Article {
        let header = node.find(Name("a")).first().unwrap();
        let mut link = String::from(header.attr("href").unwrap());
        if link.starts_with("/") { assert_eq!(link.remove(0), '/'); }
        // details must be mutable because it may be reassigned below.
        let mut details = node.find(Class("details")).first().unwrap().text();
        if details.contains("Add A Comment") {
            details = details.replace("Add A Comment", "0 Comments");
        }
        let summary = node.find(Name("p")).first().unwrap().text();
        Article { title: header.text(), link: link, details: details, summary: summary }
    }
}
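
The two string cleanups inside Article::new() can also be tried on their own, without downloading anything. The following is a minimal sketch under a few assumptions: the function names clean_link and clean_details are hypothetical, and any sample strings passed to them are made up for illustration rather than taken from the live page.

```rust
// Hypothetical helper: drop a single leading '/' from a link, if present,
// mirroring the starts_with()/remove(0) step in Article::new().
fn clean_link(href: &str) -> String {
    href.strip_prefix('/').unwrap_or(href).to_string()
}

// Hypothetical helper: replace the "Add A Comment" placeholder with
// "0 Comments". Text without the placeholder is returned unchanged.
fn clean_details(details: &str) -> String {
    details.replace("Add A Comment", "0 Comments")
}
```

Because replace() returns an unchanged copy when the pattern is absent, clean_details does not need the contains() check that Article::new() performs first.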