Internet – Beneath the surface!!
What do you think is the greatest invention of the modern era? The one thing that actually matters in your daily life, the thing you use more than anything else. Vehicles? Computers? Electricity? What about the Internet? We use it every day, probably far more than we realize. From watching videos on YouTube to connecting with people on social media, from googling the meaning of a simple word to finding a research paper, the Internet is the one thing we always turn to. It’s always there to answer our queries.
The history of the Internet goes back to the late 1960s, when ARPA (the Advanced Research Projects Agency), an agency created by the US Department of Defense, developed ARPANET (Advanced Research Projects Agency Network) to share data between its research labs and universities. For those familiar with networking, it was one of the first packet-switching networks, and packet switching is the foundation of modern networking. ARPANET eventually gave way to the commercial Internet that we use today, run by private telecoms. Since then, the advancements in this sector have been so huge that access spread from the military and universities to corporate offices, and it is now in the hands of individuals around the globe. Today, we use it more than ever in the form of the World Wide Web (WWW).
But what if I told you that the thing we use daily, the thing we think we’ve explored so much, is just a small part of what’s actually out there? That what we’ve seen so far is just the tip of an enormous iceberg? Yes. The Internet is more than what we’ve imagined and much more than what we’ve seen so far. In fact, by a commonly cited estimate, the part of the Internet you and I access makes up only about 4% of the total and is what we call ‘The Surface Web.’ The rest is an area called ‘The Deep Web.’ The majority of the population visits the Surface Web, but what is under the surface? Located under the surface, the Deep Web isn’t indexed by search engines such as Google, and parts of it can only be reached with specially configured software, such as the TOR browser.
The Surface Web vs. The Deep Web
The Internet is made up of two pieces: the Surface Web and the Deep Web. The Surface Web is the area of the Internet that the average person visits, such as Facebook, Google, or YouTube. These areas can be accessed using standard software, such as a web browser, and can be found through a search engine. The other area of the Internet is called the Deep Web. The Deep Web is made up of the Dark Web, Deep Web Databases, and much more. Reaching it takes some form of specialized access: a login, a direct link, or, for the Dark Web, dedicated software and additional layers of encryption. The distinction between these two areas of the Internet is very important.
The Surface Web
The Surface Web is the area of the Internet that is indexable by search engines, such as Google, DuckDuckGo, or Yahoo. Other names for this area of the Internet are Visible Web, Lightnet, Indexed Web, Clearnet, or Indexable Web. Just to put it in perspective, estimates put the number of indexed web pages at over 4 billion, and all of them sit within the Surface Web.
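To make "indexable" a bit more concrete: a search engine builds its index by starting from pages it already knows and following links. The sketch below is only an illustration of that idea, not how any real search engine works. The link graph and page names are made up for the example.

```python
from collections import deque

# Hypothetical link graph: each page maps to the pages it links to.
LINKS = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/"],
    "/secret-report": [],  # exists on the server, but no page links to it
}

def crawl(start, links):
    """Return every page reachable from `start` by following links (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

indexed = crawl("/", LINKS)
not_indexed = set(LINKS) - indexed

print("Indexed:", sorted(indexed))
print("Never found by the crawler:", sorted(not_indexed))
```

Everything the crawler can reach by following links ends up in the index. The page that nothing links to is never discovered, even though it is online.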
The Deep Web
The Deep Web is a complex and mysterious area of the Internet. Its content can be used for both legitimate and illegitimate purposes. There is plenty of content available in the Deep Web, such as Deep Web Hidden Services and Deep Web Databases, to name a couple. Some of it, such as Hidden Services, requires special software like TOR to access.
The Deep Web is the area of the Internet that is not indexable by search engines and not linked to from pages on the Surface Web. Other names for this area of the Internet are Deep Net, Hidden Web, or Invisible Web. Commonly cited estimates say this part makes up about 96% of the Internet, and some put it at as much as 500 times the size of the Surface Web. There are many reasons why a web page might not be crawlable. The web page could be password protected, which would prevent a web crawler from accessing it. Another scenario is that the web page may only be accessed a certain number of times before it becomes unavailable; if that threshold is met before a crawler reaches the page, it won’t be crawled. A page also won’t be crawled if the site’s robots.txt file explicitly says not to crawl it. A robots.txt file is located in the root of a website and lets web crawlers know which directories are not allowed to be crawled and which user agents each rule applies to. The last scenario is a page that is simply hidden, meaning it is not linked from any other page of the website. To visit a hidden page, somebody would need previous knowledge of its path. The average Internet user is not going to go looking for the hidden parts of the Deep Web, so deliberate use of those parts tends to draw suspicion.
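Here is a small sketch of the robots.txt case, using Python’s standard `urllib.robotparser` module. The robots.txt content, the bot names (`ExampleBot`, `FriendlyBot`), and the example.com URLs are all made up for illustration; a real site’s rules will look different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, as it might appear at the root of a site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def may_crawl(user_agent, url):
    """Ask whether the parsed rules let this user agent fetch the URL."""
    return parser.can_fetch(user_agent, url)

# Any crawler may fetch public pages, but not the /private/ directory.
print(may_crawl("FriendlyBot", "https://example.com/index.html"))
print(may_crawl("FriendlyBot", "https://example.com/private/report.html"))
# ExampleBot is blocked from the whole site.
print(may_crawl("ExampleBot", "https://example.com/index.html"))
```

Keep in mind that robots.txt is a request, not a lock. A well-behaved crawler follows it, but the pages it lists are still reachable by anyone who knows the address.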
The Dark Web
The Dark Web is an area that resides within the Deep Web. Many people confuse the Deep Web and the Dark Web, thinking they are the same thing. However, this is not the case. The Dark Web is mainly accessed via TOR, using a special browser built on it (the TOR Browser) that allows you to navigate the Dark Web. One well-known use of the Dark Web is in relation to malware. A significant amount of malware uses the Dark Web to communicate with its Command and Control (C&C) servers. An advantage of using the Dark Web for C&C communication is that the traffic is encrypted and relayed, which masks the origin, destination, and payload. This is what makes Hidden Services on the Dark Web so compelling to malware authors.
TOR (The Onion Router)
TOR is a node-based, decentralized anonymity network initially researched by the US Naval Research Lab and since handed over to the TOR Project (visit: https://www.torproject.org). In past years, as much as about 80% of TOR’s budget has come from US government sources. TOR was opened to the public because that’s the only way it could work: an anonymity network only hides its users if many different people use it. TOR works by passing traffic through relay nodes and an exit node. A node is just a computer running the TOR software. The request is wrapped in several layers of encryption; each relay node takes the request, peels off one layer, and hands it to the next node in the route, still encrypted. The exit node peels off the last layer when the request finally reaches the end of the route and makes the call to whatever you’re trying to access, and the response comes back in reverse through the nodes. Hidden websites are similar to standard sites, but their addresses end in .onion and they can only be accessed through a TOR-like network. These site names were traditionally 16 alphanumeric characters (newer addresses are 56 characters long) derived from the service’s cryptographic key rather than chosen for readability, which makes them hard to memorize.
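The "peeling layers" idea can be sketched in a few lines. This is a toy model only: the XOR cipher below is NOT secure, the node names and keys are invented, and the real TOR protocol negotiates keys with each node and is far more involved. The point is just to show how each node can remove exactly one layer and learn only the next hop.

```python
import hashlib

def keystream_xor(key, data):
    """Toy XOR cipher (NOT secure); applying it twice with the same key undoes it."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def wrap(request, route):
    """Client side: add one layer per node, starting with the exit node's layer."""
    packet = request
    next_hop = b"DESTINATION"
    for name, key in reversed(route):
        packet = keystream_xor(key, next_hop + b"|" + packet)
        next_hop = name
    return packet

def peel(packet, key):
    """Node side: remove one layer, revealing the next hop and the inner packet."""
    plain = keystream_xor(key, packet)
    next_hop, _, inner = plain.partition(b"|")
    return next_hop, inner

# Hypothetical route in travel order: (node name, key shared with that node).
ROUTE = [
    (b"relay-1", b"key-relay-1"),
    (b"relay-2", b"key-relay-2"),
    (b"exit", b"key-exit"),
]

packet = wrap(b"GET /index.html", ROUTE)
for name, key in ROUTE:
    next_hop, packet = peel(packet, key)
    print(name.decode(), "forwards to", next_hop.decode())
print("Exit node sends:", packet.decode())
```

Each relay only ever sees who handed it the packet and where to send it next. Only the exit node sees the original request, and it does not know who sent it.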