I was developing a web crawler in PHP. Whenever I crawled a URL to collect the links on that page, I found that most websites use relative paths to link to their internal pages. A browser is smart enough to convert those relative paths to absolute URLs and request the correct page when a link is clicked. PHP, however, does not seem to have any built-in support for converting relative paths to absolute URLs.
Therefore, I decided to write a method to do the job. Here I am sharing my method of converting a relative path to an absolute URL.
The first thing to mention is the base URL, because relative paths are relative to it. If the page contains a <base> tag with its href attribute set, the browser treats that URL as the base URL. Otherwise, the browser treats the URL of the current page as the base URL.
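The rule above can be sketched in a few lines. This is only an illustration, not code from the original post: the function name detectBaseUrl() is hypothetical, it assumes PHP's DOM extension is available, and it does not handle a <base> href that is itself relative.

```php
<?php
// Hypothetical helper: choose the base URL for a page that has already been fetched.
function detectBaseUrl(string $html, string $pageUrl): string
{
    if (trim($html) === '') {
        return $pageUrl; // nothing to parse
    }

    $dom = new DOMDocument();
    // Real-world HTML is often malformed; silence the parser warnings.
    @$dom->loadHTML($html);

    foreach ($dom->getElementsByTagName('base') as $base) {
        $href = trim($base->getAttribute('href'));
        if ($href !== '') {
            return $href; // the first <base> with a non-empty href wins
        }
    }

    return $pageUrl; // no usable <base>: fall back to the page's own URL
}
```

A crawler could call this once per fetched page and reuse the result for every link found on that page.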
First, define a base URL:
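For illustration, a base URL might be set as follows. The example.com address is a placeholder, not the page used in the original post, and the parse_url() call is only there to show which parts of the URL matter later.

```php
<?php
// Placeholder URL; substitute the URL of the page being crawled.
$baseUrl = 'http://www.example.com/blog/articles/post.html';

// parse_url() splits the URL into the components used when resolving links.
$parts = parse_url($baseUrl);
// $parts['scheme'] is 'http'
// $parts['host']   is 'www.example.com'
// $parts['path']   is '/blog/articles/post.html'
```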
So we assume that we are currently on this page and that its URL is our base URL.
With the base URL set, it's time to start defining the method:
Here we pass the relative path as $relativeUrl and the base URL as $baseUrl. I decided to pass every path I encounter while crawling, so the path may already be an absolute one. We have to keep that in mind.
Now here is the body of the method:
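What follows is a minimal, self-contained sketch of such a method, not the original listing. The function name relativeToAbsolute() is an assumption, and the sketch inlines every step rather than calling helper methods. It returns absolute URLs unchanged, as discussed above, and otherwise resolves the path against $baseUrl using parse_url().

```php
<?php
// Sketch only: relativeToAbsolute() is a hypothetical name, not the original listing.
function relativeToAbsolute(string $relativeUrl, string $baseUrl): string
{
    $relativeUrl = trim($relativeUrl);

    // Anything that starts with a scheme (http:, https:, mailto:, ...) is already absolute.
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $relativeUrl)) {
        return $relativeUrl;
    }

    $base = parse_url($baseUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        throw new InvalidArgumentException('Base URL must be absolute: ' . $baseUrl);
    }
    // Site root; user info in the base URL is ignored in this sketch.
    $site = $base['scheme'] . '://' . $base['host']
          . (isset($base['port']) ? ':' . $base['port'] : '');
    $basePath  = $base['path'] ?? '/';
    $baseQuery = isset($base['query']) ? '?' . $base['query'] : '';

    // Empty or fragment-only reference: same page, same query.
    if ($relativeUrl === '' || $relativeUrl[0] === '#') {
        return $site . $basePath . $baseQuery . $relativeUrl;
    }
    // Query-only reference: same page, new query.
    if ($relativeUrl[0] === '?') {
        return $site . $basePath . $relativeUrl;
    }
    // Protocol-relative reference (//host/path): borrow the base scheme.
    if (strpos($relativeUrl, '//') === 0) {
        return $base['scheme'] . ':' . $relativeUrl;
    }

    // Set the query and fragment aside so that only the path is normalised.
    $cut    = strcspn($relativeUrl, '?#');
    $path   = substr($relativeUrl, 0, $cut);
    $suffix = substr($relativeUrl, $cut);

    // Document-relative path: prepend the directory of the base path.
    if ($path[0] !== '/') {
        $path = substr($basePath, 0, strrpos($basePath, '/') + 1) . $path;
    }

    // Collapse "." and ".." segments, never climbing above the site root.
    $segments = explode('/', $path);
    $last     = count($segments) - 1;
    $out      = [];
    foreach ($segments as $i => $segment) {
        if ($segment === '.' || $segment === '..') {
            if ($segment === '..' && count($out) > 1) {
                array_pop($out);
            }
            if ($i === $last) {
                $out[] = ''; // keep the trailing slash for "dir/." and "dir/.."
            }
        } else {
            $out[] = $segment;
        }
    }

    return $site . implode('/', $out) . $suffix;
}
```

With a base URL of http://www.example.com/blog/articles/post.html (a placeholder), passing ../images/logo.png would yield http://www.example.com/blog/images/logo.png, while a link that is already absolute is returned as given.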
In the method you will have noticed calls to two other methods, one of which is onlySitePath(). Here are those methods:
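The helper listings are not reproduced here. Purely as a guess based on its name, onlySitePath() is assumed below to reduce a URL to its site root (scheme, host, and optional port); the actual behaviour of the original helper may differ.

```php
<?php
// Assumed behaviour, inferred only from the name: reduce a URL to "scheme://host[:port]".
function onlySitePath(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return ''; // not an absolute URL, so there is no site root to extract
    }

    $port = isset($parts['port']) ? ':' . $parts['port'] : '';

    return $parts['scheme'] . '://' . $parts['host'] . $port;
}
```

Under that assumption, a root-relative link such as /contact could be resolved by appending it to the value returned for the base URL.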