Friday, September 13, 2019

Post #34. Handling Links Part-6 (Finding broken links)


Finding broken links

What are Broken Links?

Broken links are links (URLs) that cannot be reached. The target page may have been moved or removed, or the server may be down or returning an error.

Why should you check broken links?

You should always make sure that there are no broken links on the site, because users should never land on an error page.

Checking links manually is tedious: each web page may contain a large number of links, and the process has to be repeated for every page.

There are a few HTTP status codes that you should know. With these status codes, you can mark a link as either valid or broken. For example, a link that returns 200 (OK) is valid, while 404 (Not Found) means that no page exists at that URL. Similarly, you can check for other status codes such as 400 (Bad Request), 403 (Forbidden), and 422 (Unprocessable Entity). In general, codes in the 4xx range indicate client errors and codes in the 5xx range indicate server errors. For more information about HTTP status codes, see https://www.w3.org/Protocols/HTTP/HTRESP.html
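The status-code rules above can be sketched as a small helper. This is only an illustration: the class name LinkStatus and both method names are made up for this example, and treating every 4xx/5xx code as broken is an assumption you may want to tune for your own site.

```java
class LinkStatus {

    // Hypothetical helper: a link counts as broken when the server answers with a 4xx or 5xx code.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    // Short label for the codes mentioned in the text; any other code falls back to its general class.
    static String describe(int statusCode) {
        switch (statusCode) {
            case 200: return "OK";
            case 400: return "Bad Request";
            case 403: return "Forbidden";
            case 404: return "Not Found";
            case 422: return "Unprocessable Entity";
            default:
                if (statusCode >= 500) return "Server error";
                if (statusCode >= 400) return "Client error";
                if (statusCode >= 300) return "Redirect";
                return "Other";
        }
    }

    public static void main(String[] args) {
        int[] samples = {200, 301, 403, 404, 422, 503};
        for (int code : samples) {
            String verdict = isBroken(code) ? "broken" : "valid";
            System.out.println(code + " " + describe(code) + " -> " + verdict);
        }
    }
}
```

Keeping the decision in one place makes it easy to change later, for example if you decide that 403 responses from pages behind a login should not be reported as broken.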

Steps to Follow:
✓  Collect all the links on the web page based on the <a> tag.
✓  Send an HTTP request to each link and read the HTTP response code.
✓  Decide whether the link is valid or broken based on the HTTP response code.
✓  Repeat this for all the links captured.
✓  Display the total number of valid and broken links.
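One practical detail about the first step: the href values collected from <a> tags are not all checkable. Some anchors have no href at all (the value comes back as null), some use schemes such as mailto: or javascript:, and the same URL often appears several times on a page. A minimal sketch of cleaning the list before sending any requests is shown here. The class name HrefFilter, the method name checkableUrls, and the example.com URLs are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class HrefFilter {

    // Keeps only absolute http/https URLs, dropping nulls, other schemes,
    // and duplicates while preserving the order in which links were first seen.
    static Set<String> checkableUrls(List<String> hrefs) {
        Set<String> result = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue; // anchor without an href attribute
            }
            String trimmed = href.trim();
            if (trimmed.startsWith("http://") || trimmed.startsWith("https://")) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> hrefs = Arrays.asList(
                "http://example.com/",
                null,
                "mailto:someone@example.com",
                "javascript:void(0)",
                "http://example.com/",
                "https://example.com/about");
        System.out.println(checkableUrls(hrefs));
    }
}
```

With Selenium, the input list would be built by calling getAttribute("href") on each WebElement. Checking each distinct URL only once also keeps the run shorter on pages with repeated navigation links.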

package pack4;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import java.util.concurrent.TimeUnit;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;

public class BrokenLinks {
             
    public static int brokenLinks;
    public static int validLinks;

    public static void main(String[] args) {

        System.setProperty("webdriver.gecko.driver", "C:/path/geckodriver.exe");
        //Instantiating FirefoxDriver
        WebDriver driver = new FirefoxDriver();
        try {
            //Maximize the browser
            driver.manage().window().maximize();
            //Implicit wait of up to 10 seconds when locating elements
            driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
            //To launch testingbar.com (driver.get() waits for the page to load, so no extra sleep is needed)
            driver.get("http://www.testingbar.com/");

            //By.tagName("a") collects every anchor element on the current page.
            //findElements returns a list of all matching WebElements, or an empty list if nothing matches
            List<WebElement> links = driver.findElements(By.tagName("a"));
            //To print the total number of links
            System.out.println("Total links are " + links.size());
            //Loop over every link and verify it
            for (WebElement element : links) {
                //The "href" attribute holds the URL of the link (it is null if the anchor has no href)
                String url = element.getAttribute("href");
                //verifyLink(url), defined below, requests the URL and updates the counters
                verifyLink(url);
            }
        } finally {
            //Close the browser even if something above fails
            driver.quit();
        }
        System.out.println("Total broken links found# " + brokenLinks);
        System.out.println("Total valid links found#" + validLinks);
    }
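    // verifyLink(String urlLink) requests the URL, prints the server status, and updates the valid/broken counters.
    public static void verifyLink(String urlLink) {
        // Anchors without an href, and non-HTTP links such as mailto: or javascript:, cannot be checked over HTTP
        if (urlLink == null || !urlLink.startsWith("http")) {
            System.out.println(urlLink + " - skipped (not an HTTP link)");
            return;
        }
        HttpURLConnection httpConn = null;
        // new URL() can throw java.net.MalformedURLException and the request can throw IOException.
        // The try/catch block lets the broken link analysis continue with the next link.
        try {
            // Use the URL class: create a URL object and pass urlLink as the parameter
            URL link = new URL(urlLink);
            // Create a connection using the URL object (i.e., link)
            httpConn = (HttpURLConnection) link.openConnection();
            // Set the connect and read timeouts to 2 seconds each
            httpConn.setConnectTimeout(2000);
            httpConn.setReadTimeout(2000);
            // Connect using the connect method
            httpConn.connect();
            // Use getResponseCode() to get the response code; read it once and reuse it
            int responseCode = httpConn.getResponseCode();
            if (responseCode >= 200 && responseCode < 400) {
                ++validLinks;
            } else {
                // 4xx and 5xx codes (for example 404) mean the link is broken
                ++brokenLinks;
            }
            System.out.println(urlLink + " - " + responseCode + " " + httpConn.getResponseMessage());
        }
        // getResponseCode() throws IOException if an error occurred connecting to the server; count that link as broken too
        catch (Exception e) {
            ++brokenLinks;
            System.out.println(urlLink + " - " + e);
        } finally {
            if (httpConn != null) {
                httpConn.disconnect();
            }
        }
    }

}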

Output:
Total broken links found# 1
Total valid links found#59
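If you want to try the HttpURLConnection part on its own, without a browser or an internet connection, one option is to point it at a throwaway server started from the same program. The sketch below is only an illustration under that assumption: it uses the HttpServer class that ships with the JDK, and the names LocalLinkCheckDemo, statusOf, checkAgainstLocalServer, and the /ok and /missing paths are all invented for this example.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

class LocalLinkCheckDemo {

    // Returns the HTTP status code for the URL, or -1 if the URL is malformed or the request fails.
    static int statusOf(String urlLink) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(urlLink).openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            return conn.getResponseCode();
        } catch (Exception e) {
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    // Starts a local server on a free port where "/ok" answers 200 and "/missing" answers 404,
    // checks both URLs, stops the server, and returns the two status codes in that order.
    static int[] checkAgainstLocalServer() {
        HttpServer server;
        try {
            server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        server.createContext("/ok", exchange -> {
            exchange.sendResponseHeaders(200, -1); // -1 means no response body
            exchange.close();
        });
        server.createContext("/missing", exchange -> {
            exchange.sendResponseHeaders(404, -1);
            exchange.close();
        });
        server.start();
        try {
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            return new int[] { statusOf(base + "/ok"), statusOf(base + "/missing") };
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) {
        int[] codes = checkAgainstLocalServer();
        System.out.println("/ok      -> " + codes[0]);
        System.out.println("/missing -> " + codes[1]);
    }
}
```

Returning the status code instead of updating counters directly keeps the network call separate from the valid/broken decision, which makes both parts easier to test.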
