Subdomain scanner made easy – with Python!

December 8, 2021
in Ethical Hacking

Table of Contents

  • Subdomain scanner with Python? Why?
  • Prerequisites
  • Introduction
  • Auxiliary methods
  • Put it all together and create the Python subdomain scanner
  • Further readings

Subdomain scanner with Python? Why?

Do you want to build your own subdomain scanner with Python? But why?

There are many reasons why you might want to develop your own subdomain scanner.

Maybe you’re a security researcher who wants to find vulnerabilities in websites.
Maybe you’re a penetration tester who needs to assess the security of a client’s website.
Maybe you’re just curious about how websites work and want to learn more about how to find vulnerabilities in them.
Whatever your reason, developing your own subdomain scanner is not especially challenging, and it is a rewarding process. It requires only a bit of knowledge about networking and coding.
The end result can be a useful tool that helps you find vulnerabilities in websites and protect yourself and others from cyber attacks.

Prerequisites

To write our own tool we don’t need much. We just need to:

  • install python3
  • install the “requests” library
  • find a file with a set of possible subdomains
  • get a working connection

If we are on a Kali Linux virtual machine, we probably have everything already, but let’s see the terminal commands for an Ubuntu machine:

sudo apt install python3
sudo apt install python3-pip
pip3 install requests
# optparse ships with the Python standard library, so it needs no installation

The installation of a progress bar is optional:

pip3 install progress

Now that we’ve installed everything we need, let’s get a list of possible subdomains to try. For this tutorial I’ll use subdomains-top1million-5000.txt from this repository: SecListRepo, more precisely at this address:
SecListsFile.
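Before moving on, it can be handy to confirm from Python that the third-party packages are importable. The following is only a small sketch: the helper name missing_modules is made up for this example, and the module names requests and progress are the ones the scanner imports later.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages are installed for this interpreter.
print(missing_modules(["requests", "progress"]))
```

If the list is not empty, re-running the pip3 commands above for the missing names should fix it.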

Introduction

Now that we are ready, let’s define what our program should do. It takes as input the main domain and a file containing the list of subdomains to try, and it can optionally save the output to a file. You can find a few versions of a subdomain scanner online, but they tend to be quite slow, so let’s use threads to do better, with the number of threads passed as a parameter. Let’s see how to use the optparse library to collect the arguments we need.

def get_args():

    parser = optparse.OptionParser()

    parser.add_option('-d', '--main', dest='domain_name',
                        help='The domain name', metavar='DOMAIN_NAME')
    parser.add_option("-i", "--input", dest="input_list",
                  help="read the list from INPUT_FILE", metavar="INPUT_FILE")
    
    parser.add_option("-f", "--file", dest="output_file", default="",
                  help="write report to FILE", metavar="FILE")
    
    parser.add_option("-t", "--threads", type=int, dest='n_threads', help="Set the number of threads", metavar="N_THREADS", default=1)
    return parser.parse_args()

As we can see, the get_args method defines the command-line options we need:

-d the main domain
-i the input file
-f the output file, if we want to save everything to a file
-t the number of threads to launch

Finally, it returns the parsed options (together with any leftover positional arguments), which will be used in the main.
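To experiment with the options without touching the real command line, parse_args() also accepts an explicit list of arguments. The sketch below rebuilds the same four options in a helper called build_parser (a name chosen only for this example) and parses a made-up argument list; example.com and subdomains.txt are placeholder values.

```python
import optparse


def build_parser():
    """Build the same four options described above."""
    parser = optparse.OptionParser()
    parser.add_option("-d", "--main", dest="domain_name",
                      help="The domain name", metavar="DOMAIN_NAME")
    parser.add_option("-i", "--input", dest="input_list",
                      help="read the list from INPUT_FILE", metavar="INPUT_FILE")
    parser.add_option("-f", "--file", dest="output_file", default="",
                      help="write report to FILE", metavar="FILE")
    parser.add_option("-t", "--threads", type="int", dest="n_threads",
                      help="Set the number of threads", metavar="N_THREADS", default=1)
    return parser


# Passing a list to parse_args() bypasses sys.argv.
options, args = build_parser().parse_args(
    ["-d", "example.com", "-i", "subdomains.txt", "-t", "8"]
)

print(options.domain_name)        # example.com
print(options.input_list)         # subdomains.txt
print(options.n_threads)          # 8 (already converted to int)
print(repr(options.output_file))  # '' because -f was not given
```

Since -f was omitted, output_file keeps its empty-string default, which is what the main later uses to decide between printing and saving.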

Auxiliary methods

Before writing the whole main, let’s define the methods and global variables we might need:

q = queue.Queue()
bar = None

active_domains = []
lock = threading.Lock()
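def from_file(filename):
    with open(filename, 'r') as f:
        # Strip whitespace and skip blank lines (e.g. a trailing newline),
        # which would otherwise produce an invalid URL such as "http://.domain"
        subdomains = [line.strip() for line in f if line.strip()]
        return subdomains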

This method reads the list of subdomains from a file and returns it to the caller. Now let’s look at something more interesting:

def check_subdomain(domain, sub):
    subdomain = f"http://{sub.strip()}.{domain}"
    try:
        requests.get(subdomain, timeout=2)
    except requests.exceptions.RequestException:
        # Any request failure (DNS error, refused connection, timeout, invalid URL)
        # means we treat the subdomain as inactive
        return False

    return True

The check_subdomain method takes a domain and a subdomain as arguments. It sends a GET request through the requests library with a timeout of 2 seconds, so a host that doesn’t answer in time counts as a failure.
If the request completes, the method returns True, whatever the HTTP status code; if the request raises one of the handled exceptions, the return value is False.
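The decision logic can be tried without any network access by passing the HTTP call in as a parameter. This is only an illustrative sketch: check_with and fake_get are hypothetical names, and Python’s built-in ConnectionError and TimeoutError stand in for the exceptions raised by requests.

```python
def check_with(get, domain, sub):
    """Same decision logic as check_subdomain, with the HTTP call passed in as `get`."""
    url = f"http://{sub.strip()}.{domain}"
    try:
        get(url, timeout=2)
    except (ConnectionError, TimeoutError):
        return False
    return True


def fake_get(url, timeout):
    """Hypothetical stand-in for requests.get: only www.example.com 'answers'."""
    if url != "http://www.example.com":
        raise ConnectionError(f"cannot reach {url}")
    return "response"


results = {sub: check_with(fake_get, "example.com", sub)
           for sub in ["www", "mail ", "dev"]}
print(results)  # {'www': True, 'mail ': False, 'dev': False}
```

Note that the trailing space in "mail " is removed by strip() before the URL is built, so stray whitespace in the wordlist does not break the request.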

The last auxiliary method is append_if_exists, which inserts a subdomain into a global list of active domains.

It also uses a lock to avoid concurrency errors when several threads append at the same time:

def append_if_exists(host, sub):
    
    if(check_subdomain(host, sub)):
        with lock:
            active_domains.append(f"{sub}.{host}")
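The locking pattern can be seen in isolation with a few threads appending to one shared list. This is a minimal sketch with made-up names (active, record) and no network calls.

```python
import threading

active = []               # shared list, playing the role of active_domains
lock = threading.Lock()


def record(names):
    for name in names:
        with lock:        # only one thread mutates the list at a time
            active.append(name)


threads = [
    threading.Thread(target=record, args=([f"t{n}-{i}" for i in range(100)],))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(active))  # 400: four threads, one hundred names each
```

The with statement releases the lock automatically, even if the code inside it raises an exception.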

Finally we have the get_active method.

def get_active():
    global q
    

    while True:
        i = q.get()

        append_if_exists( domain_name, i)
        
        bar.next()
        q.task_done()

This method runs an endless loop that pulls items from the queue: q.get() blocks while the queue is empty, and the daemon threads are stopped when the main program exits. Since the queue is shared by all threads, we want to avoid race conditions, even if they are unlikely and not very dangerous here, and the queue.Queue class handles that synchronization for us.
Inside the loop, the method takes the next element from the queue, appends the domain if it is active, updates the progress bar and then marks the task as done.
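The worker pattern described above can be run on its own with a stub in place of the network check. In this sketch the KNOWN set is a hypothetical stand-in for real lookups, and task_done() is placed in a finally block (a small variation on get_active) so that q.join() cannot hang if a check raises.

```python
import queue
import threading

q = queue.Queue()
found = []
lock = threading.Lock()
KNOWN = {"www", "mail"}   # hypothetical stand-in for real network checks


def worker():
    while True:
        sub = q.get()             # blocks until an item is available
        try:
            if sub in KNOWN:
                with lock:
                    found.append(sub)
        finally:
            q.task_done()         # always signal completion, even on error


for sub in ["www", "dev", "mail", "test"]:
    q.put(sub)

for _ in range(3):
    threading.Thread(target=worker, daemon=True).start()

q.join()                          # returns once every item has been marked done
print(sorted(found))  # ['mail', 'www']
```

The workers never leave their loop; because they are daemon threads, they simply disappear when the main thread finishes.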

Put it all together and create the Python subdomain scanner

In the main we’ll put everything together and call all the methods defined above, whose behaviour we already know.
The queue will contain all the subdomains, from which the threads will take the next value to check, and active_domains will be the list into which each thread inserts positive results.
In the for loop we create all the threads, set thread.daemon to True (so each thread ends with the main) and make every one of them run the get_active method.
With t.start() we launch each thread, and then we wait for the queue to be emptied with q.join().

We will use a try/except block to be able to stop the scan with CTRL+C without losing the results.
Finally, we decide whether to print the output to the screen or save it to a file.
Having done everything, let’s see the main inside the complete code (it works with a simple copy and paste, for the lazy ones).

import requests
import threading
import time
import queue
from progress.bar import Bar
import optparse

q = queue.Queue()
bar = None

active_domains = []
lock = threading.Lock()

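def from_file(filename):
    with open(filename, 'r') as f:
        # Strip whitespace and skip blank lines (e.g. a trailing newline),
        # which would otherwise produce an invalid URL such as "http://.domain"
        subdomains = [line.strip() for line in f if line.strip()]
        return subdomains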


def check_subdomain(domain, sub):
    subdomain = f"http://{sub.strip()}.{domain}"
    try:
        requests.get(subdomain, timeout=2)
    except requests.exceptions.RequestException:
        # Any request failure (DNS error, refused connection, timeout, invalid URL)
        # means we treat the subdomain as inactive
        return False
    return True

def append_if_exists(host, sub):
    
    if(check_subdomain(host, sub)):
        with lock:
            active_domains.append(f"{sub}.{host}")
        

def get_args():

    parser = optparse.OptionParser()

    parser.add_option('-d', '--main', dest='domain_name',
                        help='The domain name', metavar='DOMAIN_NAME')
    parser.add_option("-i", "--input", dest="input_list",
                  help="read the list from INPUT_FILE", metavar="INPUT_FILE")
    
    parser.add_option("-f", "--file", dest="output_file", default="",
                  help="write report to FILE", metavar="FILE")
    
    parser.add_option("-t", "--threads", type=int, dest='n_threads', help="Set the number of threads", metavar="N_THREADS", default=12)
    return parser.parse_args()

def get_active():
    global q
    

    while True:
        i = q.get()

        append_if_exists( domain_name, i)
        
        bar.next()
        q.task_done()


if __name__ == "__main__":
    

    options, args = get_args()
    for s in from_file(options.input_list):
        q.put(s)
    

    bar = Bar("Subdomain scanning...", max=q.qsize())
    domain_name = options.domain_name


    try:
        pre_time = time.time()
        
        for i in range(options.n_threads):
            t = threading.Thread(target=get_active)
            t.daemon = True
            t.start()
            
        q.join()

    
    except KeyboardInterrupt:
        pass
        
    finally:

        if options.output_file:
            with open(options.output_file, 'w') as f:
                f.write("\n".join(active_domains))
        
        else:
            print("\n")
            for e in active_domains:
                print(e)
        
        print(f"\nFound {len(active_domains)} subdomains")
        print("Executed in %s seconds" % (time.time()-pre_time))

Now let’s suppose we call the file main.py; this is how to use it:

python3 main.py -d <MAIN_DOMAIN> -i <SUBDOMAIN_INPUT_FILE> -f <OUTPUT_FILE> -t <THREAD_NUMBER>

# Example:

python3 main.py -d google.com -i subdomains.txt -f output.txt -t 30
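The CTRL+C handling used in the main can also be tried in isolation. The sketch below simulates the interruption by raising KeyboardInterrupt by hand; the scan function and its interrupt_at parameter are invented for the example and are not part of the scanner.

```python
results = []


def scan(candidates, interrupt_at):
    """Collect names one by one, simulating CTRL+C at a given position."""
    for i, name in enumerate(candidates):
        if i == interrupt_at:
            raise KeyboardInterrupt   # what Python raises when the user presses CTRL+C
        results.append(name)


report = None
try:
    scan(["www", "mail", "dev", "test"], interrupt_at=2)
except KeyboardInterrupt:
    pass                              # swallow the interrupt instead of crashing
finally:
    # Runs whether the scan finished or was interrupted,
    # so partial results are never lost.
    report = "\n".join(results)

print(report)
```

Here the two names collected before the interruption still end up in the report, which is the behaviour the finally block in main.py relies on.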

Further readings

If you found this interesting, I recommend the following articles:

  • How to easily change your Windows Mac Address in Python
  • How to create a host port scanner in Python in just a few lines of code!