Google Dorking, also known as Google Hacking, is a technique that uses advanced search queries to uncover information on the internet that is not easily found through typical searches. It leverages Google’s search operators and extensive index to locate specific text strings within search results. Despite the illicit connotations of "hacking," Google Dorking itself is legal, although accessing files found through such searches may not be, and security professionals often use it to identify vulnerabilities in systems.
The purpose of this blog post is to help website operators understand the types of searches that can expose vulnerabilities on their own sites, so they can identify and fix security issues.
Google Dorking involves using advanced search operators in combination with keywords or strings, directing Google’s search algorithm to look for specific information. It can locate files of a particular type, search within a specific website, find keywords in web page titles, or identify pages that link to a particular URL. The core of Google Dorking is exploiting the extensive indexing of webpages by Google.
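To make the idea of combining operators with keywords concrete, here is a minimal Python sketch. The helper names `quote_if_needed` and `build_dork` are hypothetical (they are not part of any Google tooling), and `example.com` is a placeholder domain. The code only assembles a query string; it does not send anything to Google.

```python
def quote_if_needed(value):
    """Wrap multi-word values in double quotes so they are searched as one phrase."""
    if " " in value and not value.startswith('"'):
        return f'"{value}"'
    return value


def build_dork(operators, keywords=()):
    """Join (operator, value) pairs and plain keywords into one query string."""
    parts = [f"{op}:{quote_if_needed(val)}" for op, val in operators]
    parts.extend(quote_if_needed(word) for word in keywords)
    return " ".join(parts)


# example.com is a placeholder; substitute a site you are responsible for.
query = build_dork([("site", "example.com"), ("filetype", "pdf")], ["annual report"])
print(query)  # site:example.com filetype:pdf "annual report"
```

The resulting string can be pasted into the Google search box as-is.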
It’s important to note that all results surfaced by Google Dorking come from publicly accessible documents that Google has found and indexed. If sensitive information appears within these files, the risk was created by the site owner, and it is up to the owner to resolve the issue.
These types of searches, which can be used in combination with one another, are commonly used by SEO professionals and other Google power users for legitimate purposes. They help users dive deeper into Google search results.
A couple of basic examples of Google Advanced Search Operators include:
These operators can be combined in creative ways by unscrupulous humans and bots to find confidential information, details about a website’s infrastructure and more.
For a basic example, a search for intitle:"index of" inurl:ftp can expose open FTP servers. This query could be refined to focus on specific words in the documents, such as intitle:"index of" inurl:ftp intext:confidential.
Another basic example would be a search for filetype:txt inurl:"email.txt" which can expose text lists of email addresses.
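When reviewing a dork, it can help to break it into its operator and value pairs. The following is a rough sketch that uses only Python's standard `re` module. The function name `parse_dork` and the regular expression are illustrative assumptions; they handle the simple `operator:value` and quoted-phrase forms used in the examples above, not every syntax Google accepts.

```python
import re

# Optional "operator:" prefix followed by either a quoted phrase or a bare word.
TOKEN = re.compile(r'(?:(\w+):)?("[^"]*"|\S+)')


def parse_dork(query):
    """Return (operator, value) pairs; operator is None for plain keywords."""
    pairs = []
    for operator, value in TOKEN.findall(query):
        pairs.append((operator or None, value.strip('"')))
    return pairs


print(parse_dork('intitle:"index of" inurl:ftp intext:confidential'))
# [('intitle', 'index of'), ('inurl', 'ftp'), ('intext', 'confidential')]
```

Read this way, the first example asks for pages whose title contains "index of", whose URL contains "ftp", and whose body text contains "confidential".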
When combined with specific site: searches, it is possible to begin finding unintentionally crawled and indexed information on a particular website.
These types of searches can also target the default phrases and paths of specific technologies and content management systems, such as "Index of" inurl:phpmyadmin or "SquirrelMail version" "By the SquirrelMail development Team".
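A site owner can scope queries like these to their own domain and review the results by hand. The sketch below is one possible way to do that: it prefixes each dork with `site:` and builds a standard Google search URL with `urllib.parse.quote_plus`. The `DORK_TEMPLATES` list, the function name `site_audit_urls`, and the `example.com` domain are assumptions for illustration. The code only prints links to open manually in a browser; it does not query Google itself.

```python
from urllib.parse import quote_plus

# A few dorks taken from the examples in this post; extend the list as needed.
DORK_TEMPLATES = [
    'intitle:"index of"',
    'filetype:txt inurl:"email.txt"',
    '"Index of" inurl:phpmyadmin',
]


def site_audit_urls(domain, templates=DORK_TEMPLATES):
    """Return (query, search_url) pairs with every dork restricted to one domain."""
    results = []
    for template in templates:
        query = f"site:{domain} {template}"
        url = "https://www.google.com/search?q=" + quote_plus(query)
        results.append((query, url))
    return results


# example.com is a placeholder; use a domain you own or are authorized to audit.
for query, url in site_audit_urls("example.com"):
    print(query)
    print("  " + url)
```

Any result that appears for your own domain points to a page or file that Google has indexed and that may need to be removed or protected.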
While a powerful tool for legitimate purposes, Google Dorking can reveal sensitive information that is unintentionally public, posing risks of privacy violations and cyber-attacks. It can expose unprotected databases, server credentials, or private documents, potentially leading to data breaches, identity theft, and other cybercrimes. Users must understand the legal and ethical boundaries to avoid infringing on privacy laws or Google's terms of service.
Although automated querying is against Google’s terms of service, there are plenty of bots and automation tools that allow individuals to run large numbers of searches quickly without manually combing through the results. A person could compile a list of hundreds or thousands of common Google Dorks and run them against your site with an automated tool to collect all problematic results in one fell swoop.
Guarding against Google Dorking involves a combination of technical and procedural measures to prevent sensitive information from being easily accessible via search engines. Here are the best practices:
By combining these technical and procedural strategies, organizations can significantly reduce their vulnerability to Google Dorking and enhance their overall cybersecurity posture.
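As one illustration of a technical check, the sketch below walks a local copy of a web root and flags files that dorks such as filetype: commonly target. The extension and file-name lists, the function name `find_risky_files`, and the `/var/www/html` path are all assumptions made for this example. Adjust them to your own environment and treat the output as a starting point for review, not a complete audit.

```python
import os

# Assumed lists of file types and names that rarely belong in a public web root.
RISKY_EXTENSIONS = {".sql", ".bak", ".log", ".old", ".cfg"}
RISKY_NAMES = {".env", "email.txt"}


def find_risky_files(web_root):
    """Return sorted paths (relative to web_root) of files worth reviewing."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(web_root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension in RISKY_EXTENSIONS or name.lower() in RISKY_NAMES:
                full_path = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full_path, web_root))
    return sorted(hits)


if __name__ == "__main__":
    # /var/www/html is a placeholder path; point this at your own document root.
    for path in find_risky_files("/var/www/html"):
        print(path)
```

Files flagged this way are candidates for removal from the public directory or for access controls, before a search engine has a chance to index them.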
Over the years, Google has deprecated many search operators, such as “link:” and “inpostauthor:”. The following table lists Google Search Operators known to work at the time of writing, some of which can be used for Google Dorking.
Search operator | How it works | Example |
“ ” | Locate pages that include specific terms or expressions. | “buttercup the pwny” |
OR | Find content associated with either A or B. | buttercup OR pwny |
\| | Functions identically to OR. | buttercup \| pwny |
AND | Look up content that pertains to both X and Y. | buttercup AND pwny |
- | Identify pages that exclude certain terms or expressions. | buttercup -splunk |
* | Acts as a wildcard that matches any word or phrase. | buttercup * splunk |
( ) | Consolidate several search queries into one. | (buttercup OR pwny) splunk |
define: | Query the meaning of terms or expressions. | define:pony |
cache: | Retrieve the latest stored version of a website. | cache:splunk.com |
filetype: | Look for specific file formats, like PDFs. | splunk filetype:pdf |
ext: | This is synonymous with filetype: | splunk ext:pdf |
site: | Obtain results exclusively from a certain website. | site:splunk.com |
related: | Find websites that are similar to a specific domain. | related:splunk.com |
intitle: | Look for webpages with certain terms in their title tag. | intitle:splunk |
allintitle: | Identify webpages with several terms in their title tag. | allintitle:splunk enterprise |
inurl: | Locate webpages with a specific term in their URL. | inurl:splunk |
allinurl: | Search for webpages that include several terms in their URL. | allinurl:splunk enterprise |
intext: | Find content that contains a specific term. | intext:splunk enterprise |
allintext: | Look for content that contains a combination of terms. | allintext:splunk enterprise |
weather: | Get the current weather forecast for a specific area. | weather:birmingham |
stocks: | Retrieve trading details for a specific stock symbol. | stocks:splk |
map: | Compel Google to present map-based results. | map:birmingham, al |
movie: | Gather details regarding a particular film. | movie:ponies |
in | Translate one measurement unit into another. | 16oz in lb |
before: | Filter results to show only those before a specified date. | splunk before:2018-01-01 |
after: | Filter results to show only those after a specified date. | splunk after:2018-01-01 |
Google Dorking is a nuanced and potent method for information gathering, with applications ranging from cybersecurity to investigative research. It’s critical for organizations to understand which parts of their websites are inadvertently accessible in Google results and to take appropriate measures to mitigate the vulnerabilities and risks created by exposing the wrong files to public search engines.
This posting does not necessarily represent Splunk's position, strategies or opinion.