Getting HTML Pages From Hyper
Now that our program is working, let's write a new function that uses the hyper crate to download the HTML page directly, giving us a live source. Start by declaring the hyper crate in the source code and adding the use statements for the types we need, along with the Read trait from the standard library.
extern crate hyper;
use hyper::Client;
use hyper::header::Connection;
use std::io::Read;
Downloading Phoronix's Homepage
Now we can implement the actual function. open_phoronix() creates a Client and uses get() to request the homepage of Phoronix, storing the result in a variable aptly named response. Because response implements Read, we can then read its contents into a String named body and return body.
fn open_phoronix() -> String {
    let client = Client::new();
    let mut response = client.get("https://www.phoronix.com/")
        .header(Connection::close()).send().unwrap();
    let mut body = String::new();
    response.read_to_string(&mut body).unwrap();
    return body;
}
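Since the response is consumed through the standard Read trait, the buffering step can be tried without any network access. The following is only a minimal sketch: read_body() is a hypothetical helper that is not part of the tutorial's code, and it returns the I/O error to the caller instead of calling unwrap().

```rust
use std::io::{self, Read};

/// Hypothetical helper (not part of the tutorial's code): drains any reader
/// into a String and hands back the I/O error instead of panicking.
fn read_body<R: Read>(mut source: R) -> io::Result<String> {
    let mut body = String::new();
    // read_to_string() fails if the bytes are not valid UTF-8.
    source.read_to_string(&mut body)?;
    Ok(body)
}
```

A byte slice implements Read, so read_body("<html></html>".as_bytes()) can stand in for a real response while experimenting.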
Usage
Now we just need to make a small change to our Article::get_articles() function so that it gets its input from open_phoronix() instead of open_testing().
impl Article {
    fn get_articles() -> Vec<Article> {
        Document::from_str(&open_phoronix())
            .find(Name("article")).iter()
            .map(|node| Article::new(&node)).collect()
    }
    ...
}
Feel free to comment out open_testing() now that it is no longer required. The full source code should look as follows:
main.rs
Code Example
extern crate hyper;
use hyper::Client;
use hyper::header::Connection;
use std::io::Read;
extern crate select;
use select::document::Document;
use select::predicate::{Class,Name};
use select::node::Node;
fn main() {
    let phoronix_articles = Article::get_articles();
    for article in phoronix_articles.iter().rev() {
        println!("Title:   {}", article.title);
        println!("Link:    https://www.phoronix.com/{}", article.link);
        println!("Details: {}", article.details);
        println!("Summary: {}\n", article.summary);
    }
}

// fn open_testing() -> &'static str {
//     include_str!("phoronix.html")
// }

fn open_phoronix() -> String {
    let client = Client::new();
    let mut response = client.get("https://www.phoronix.com/")
        .header(Connection::close()).send().unwrap();
    let mut body = String::new();
    response.read_to_string(&mut body).unwrap();
    return body;
}

struct Article {
    title:   String,
    link:    String,
    details: String,
    summary: String,
}
impl Article {
    fn get_articles() -> Vec<Article> {
        Document::from_str(&open_phoronix()).find(Name("article")).iter()
            .map(|node| Article::new(&node)).collect()
    }

    fn new(node: &Node) -> Article {
        let header = node.find(Name("a")).first().unwrap();
        let mut link = String::from(header.attr("href").unwrap());
        // Drop the leading slash so the link can be appended to the base URL.
        if link.starts_with("/") { assert_eq!(link.remove(0), '/'); }
        // details must be mutable because it may be reassigned below.
        let mut details = node.find(Class("details")).first().unwrap().text();
        if details.contains("Add A Comment") {
            details = details.replace("Add A Comment", "0 Comments");
        }
        let summary = node.find(Name("p")).first().unwrap().text();
        Article { title: header.text(), link: link, details: details, summary: summary }
    }
}
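The two string fix-ups inside Article::new() can also be exercised on their own, without hyper or select. This is a rough sketch using only the standard library; normalize_link() and normalize_details() are hypothetical names introduced for illustration and do not appear in the tutorial's source.

```rust
/// Hypothetical helper: drops a single leading '/' so the link can be
/// appended to "https://www.phoronix.com/" without doubling the slash.
fn normalize_link(href: &str) -> String {
    href.strip_prefix('/').unwrap_or(href).to_string()
}

/// Hypothetical helper: rewrites the "Add A Comment" placeholder as
/// "0 Comments" and leaves any other details text untouched.
fn normalize_details(details: &str) -> String {
    details.replace("Add A Comment", "0 Comments")
}
```

Keeping these steps in small functions of their own makes them easy to check against sample strings before running the program against the live page.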