Implementing the Article Struct: Part 2
The only items that we have yet to gather for the Article struct are link, details and summary. Let's implement these items in our existing Article::new() function.
Obtaining Links
The link is also inside the first <a> tag, so we can read its href attribute directly and ignore the text. This can be achieved with:
let mut link = String::from(header.attr("href").unwrap());
Links obtained from Phoronix's page are relative: they don't include the URL of the homepage itself. We will add it manually later when printing the link. However, because some of the links start with / and others don't, prepending a homepage URL that ends in / would sometimes print a double //. We want to remove the leading / if it exists. To do this, we can use this line of code:
if link.starts_with("/") { assert_eq!(link.remove(0), '/'); }
This removes the first character of link if that character is /, and then checks that the character we removed really was a /. The assert_eq!() macro panics with an error if its two arguments differ, so it acts as a sanity check on our code.
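The same cleanup can also be written without mutation or an assertion. The following is only a minimal sketch: strip_leading_slash is a hypothetical helper name that does not appear elsewhere in this tutorial, and it assumes Rust 1.45 or later, where str::strip_prefix is available.

```rust
// Hypothetical helper (not part of the tutorial's code):
// returns the link without a single leading '/', if one is present.
fn strip_leading_slash(link: &str) -> String {
    // `strip_prefix` yields Some(rest) when the prefix matched and None
    // otherwise, so `unwrap_or(link)` falls back to the unchanged input.
    link.strip_prefix('/').unwrap_or(link).to_string()
}
```

Only one leading / is removed, which matches the behavior of the remove(0) approach above.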
Obtaining Details
Now that we have the links taken care of, we need to get the details. The details are contained in an HTML element with the class details, so we can perform a find() for it like so:
let details = node.find(Class("details")).first().unwrap().text();
The above searches for every element with the class details, takes only the first result, unwraps it and returns its text as a String. That's all we need here.
Replace Add A Comment with 0 Comments
There is only one minor detail that we may want to fix here: articles with no comments show Add A Comment in their details, which is easy to misread later. We will replace it with 0 Comments. Change the previous declaration from let details to let mut details and add the following lines beneath it:
if details.contains("Add A Comment") {
    details = details.replace("Add A Comment", "0 Comments");
}
This tells our program to check the contents of details and, if the String contains Add A Comment, to replace it with a new String in which Add A Comment has been replaced with 0 Comments.
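As a side note, str::replace returns an unchanged copy when the pattern is not found, so the contains() check is optional. The sketch below illustrates this with a hypothetical helper named normalize_details, which is not part of the tutorial's code.

```rust
// Hypothetical helper (not part of the tutorial's code):
// rewrites the placeholder shown for articles that have no comments yet.
fn normalize_details(details: &str) -> String {
    // `replace` allocates a new String either way; when the pattern is
    // absent, the result is simply a copy of the input.
    details.replace("Add A Comment", "0 Comments")
}
```

Keeping the contains() check, as the tutorial does, avoids the extra allocation when there is nothing to replace.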
Obtaining Summaries
The last piece of information we want to collect is the summary of each article. Each summary is stored in a single <p> tag. If we needed to collect multiple paragraphs we would write this a bit differently, but we only need one.
let summary = node.find(Name("p")).first().unwrap().text();
Collecting it as an Article
We can finally collect all of this information inside of a single Article by returning the following expression:
Article {
    title: header.text(),
    link: link,
    details: details,
    summary: summary,
}
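When a local variable has the same name as the field it initializes, Rust 1.17 and later allow a shorthand form. The sketch below assumes the Article struct from Part 1 consists of four String fields. build_article is a hypothetical function used only to demonstrate the syntax.

```rust
// Assumed shape of the struct from Part 1: four owned String fields.
struct Article {
    title: String,
    link: String,
    details: String,
    summary: String,
}

// Hypothetical constructor, used only to demonstrate the shorthand.
fn build_article(title: String, link: String, details: String, summary: String) -> Article {
    // `link` here is short for `link: link`, and likewise for the others.
    Article { title, link, details, summary }
}
```

In Article::new(), the title field would still need the explicit form, since it is initialized from header.text() rather than from a variable named title.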
Example
impl Article {
    ...
    fn new(node: &Node) -> Article {
        // The first <a> tag holds both the title text and the link.
        let header = node.find(Name("a")).first().unwrap();
        let mut link = String::from(header.attr("href").unwrap());
        // Drop the leading '/' so that we don't print a double "//" later.
        if link.starts_with("/") { assert_eq!(link.remove(0), '/'); }
        // `details` must be mutable because it may be reassigned below.
        let mut details = node.find(Class("details")).first().unwrap().text();
        if details.contains("Add A Comment") {
            details = details.replace("Add A Comment", "0 Comments");
        }
        let summary = node.find(Name("p")).first().unwrap().text();
        Article {
            title: header.text(),
            link: link,
            details: details,
            summary: summary,
        }
    }
}
Testing Code
Now we only need to make a couple of changes to our main() function so that we can see our new code in action:
fn main() {
    let phoronix_articles = Article::get_articles();
    for article in phoronix_articles {
        println!("Title:   {}", article.title);
        println!("Link:    https://www.phoronix.com/{}", article.link);
        println!("Details: {}", article.details);
        println!("Summary: {}\n", article.summary);
    }
}
Now execute cargo run and see that our new program is almost complete.
Printing in Reverse Chronological Order
You might notice that this output order isn't ideal for a terminal: the newest articles are printed first, so they end up buried above the older articles and scroll out of view. To remedy this and ensure that the newest articles are printed last, we can loop in reverse by combining the rev() function with iter().
fn main() {
    let phoronix_articles = Article::get_articles();
    for article in phoronix_articles.iter().rev() {
        println!("Title:   {}", article.title);
        println!("Link:    https://www.phoronix.com/{}", article.link);
        println!("Details: {}", article.details);
        println!("Summary: {}\n", article.summary);
    }
}
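To see what iter().rev() does in isolation, here is a small sketch on a plain slice. oldest_first is a hypothetical helper, not part of the tutorial's program, and the clone is there only so the example can return an owned Vec.

```rust
// Hypothetical helper (not part of the tutorial's code):
// given items ordered newest-first, returns copies ordered oldest-first.
fn oldest_first<T: Clone>(items: &[T]) -> Vec<T> {
    // `iter()` borrows each element, `rev()` walks them back to front,
    // and `cloned()` turns the borrowed items into owned ones.
    items.iter().rev().cloned().collect()
}
```

Because iter() only borrows, the original collection remains usable after the loop. If it is not needed afterwards, into_iter().rev() would consume it instead and yield owned values without cloning.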