Website research tool that shows information about domains (like WHOIS, IP, backlinks, etc.)

Looking for a tool that can:

  • accept a domain name (or host name such as shop.example.com) as input
  • generate a report for the domain/host covering information such as
    • WHOIS
    • IP, IP owner, sites on the same IP
    • check against spam blacklists
    • check against security/malware lists
    • link to Internet Archive records
    • list of backlinks to the host
    • a general site popularity score such as Google PageRank, Alexa Traffic Rank, etc.

All of the above information should be retrieved by the program from the various sources with little or no user action required.

The program must be able to combine all the information into a single report. The report can be in text, document, or HTML format.

The time needed to generate the report is not that important. However, if an estimate of the report generation time is available for a tool, feel free to mention it.

Free or paid are both fine.

A Windows GUI program is preferred, but a hosted service is also acceptable.

It seems that many websites provide this kind of information. Please describe why some of the most common ones are not sufficient for you, so we can help you better.
Could you please clarify a few things: How would you like the report? What do you mean by PageRank? What do you mean by Alexa (its rank number, perhaps)? What do you mean by Internet Archive (how many snapshots it has, perhaps)? What do you mean by backlinks (the number of backlinks)?
@NicolasRaoul, the websites I have seen mostly provide only one or a few of these pieces of information. I have not come across any site that can gather all the information in one step and generate a report. If you know of such a site, please share it in an answer.
@Janekmuric, question updated based on your clarification.
@NicolasRaoul Just one question: what is the maximum acceptable time for generating a report?
@Janekmuric: Me? Did you mean to ping the asker? :-)
Yes, I meant to ask @kenchew
@Janekmuric updated the question to include the time.

1 Answer

I couldn't find any website or program that does what you need, so instead of studying geography I wrote a Python 2.7 program that does it for you.

A big thank you to jgamblin on GitHub for his open-source blacklist-checking script.

from pprint import pprint, pformat
from sys import exit, argv
from socket import gethostbyname, gethostbyaddr
from os.path import exists, isdir
from json import dumps
from time import strftime, time
from re import findall
from urllib2 import urlopen, Request, build_opener
from bs4 import BeautifulSoup as bs
import dns.resolver
from ipwhois import IPWhois

# Code below downloaded from GitHub page:
# https://github.com/jgamblin/isthisipbad/blob/master/isthisipbad.py
# Modified by Janek to fix errors.

def content_test(url, badip):
    """
    Test the content of url's response to see if it contains badip.
        Args:
            url -- the URL to request data from
            badip -- the IP address in question
        Returns:
            Boolean
    """

    try:
        request = Request(url)
        html_content = build_opener().open(request).read()

        matches = findall(badip, html_content)

        return len(matches) == 0
    except Exception:
        return False

bls = ["b.barracudacentral.org", "bl.spamcannibal.org", "bl.spamcop.net",
       "blacklist.woody.ch", "cbl.abuseat.org", "cdl.anti-spam.org.cn",
       "combined.abuse.ch", "combined.rbl.msrbl.net", "db.wpbl.info",
       "dnsbl-1.uceprotect.net", "dnsbl-2.uceprotect.net",
       "dnsbl-3.uceprotect.net", "dnsbl.cyberlogic.net",
       "dnsbl.sorbs.net", "drone.abuse.ch", "drone.abuse.ch",
       "duinv.aupads.org", "dul.dnsbl.sorbs.net", "dul.ru",
       "dyna.spamrats.com", "dynip.rothen.com",
       "http.dnsbl.sorbs.net", "images.rbl.msrbl.net",
       "ips.backscatterer.org", "ix.dnsbl.manitu.net",
       "korea.services.net", "misc.dnsbl.sorbs.net",
       "noptr.spamrats.com", "ohps.dnsbl.net.au", "omrs.dnsbl.net.au",
       "orvedb.aupads.org", "osps.dnsbl.net.au", "osrs.dnsbl.net.au",
       "owfs.dnsbl.net.au", "pbl.spamhaus.org", "phishing.rbl.msrbl.net",
       "probes.dnsbl.net.au", "proxy.bl.gweep.ca", "rbl.interserver.net",
       "rdts.dnsbl.net.au", "relays.bl.gweep.ca", "relays.nether.net",
       "residential.block.transip.nl", "ricn.dnsbl.net.au",
       "rmst.dnsbl.net.au", "smtp.dnsbl.sorbs.net",
       "socks.dnsbl.sorbs.net", "spam.abuse.ch", "spam.dnsbl.sorbs.net",
       "spam.rbl.msrbl.net", "spam.spamrats.com", "spamrbl.imp.ch",
       "t3direct.dnsbl.net.au", "tor.dnsbl.sectoor.de",
       "torserver.tor.dnsbl.sectoor.de", "ubl.lashback.com",
       "ubl.unsubscore.com", "virus.rbl.jp", "virus.rbl.msrbl.net",
       "web.dnsbl.sorbs.net", "wormrbl.imp.ch", "xbl.spamhaus.org",
       "zen.spamhaus.org", "zombie.dnsbl.sorbs.net"]

URLS = [
    #TOR
    ('http://torstatus.blutmagie.de/ip_list_exit.php/Tor_ip_list_EXIT.csv',
     'is not a TOR Exit Node',
     'is a TOR Exit Node',
     False),

    #EmergingThreats
    ('http://rules.emergingthreats.net/blockrules/compromised-ips.txt',
     'is not listed on EmergingThreats',
     'is listed on EmergingThreats',
     True),

    #AlienVault
    ('http://reputation.alienvault.com/reputation.data',
     'is not listed on AlienVault',
     'is listed on AlienVault',
     True),

    #BlocklistDE
    ('http://www.blocklist.de/lists/bruteforcelogin.txt',
     'is not listed on BlocklistDE',
     'is listed on BlocklistDE',
     True),

    #Dragon Research Group - SSH
    ('http://dragonresearchgroup.org/insight/sshpwauth.txt',
     'is not listed on Dragon Research Group - SSH',
     'is listed on Dragon Research Group - SSH',
     True),

    #Dragon Research Group - VNC
    ('http://dragonresearchgroup.org/insight/vncprobe.txt',
     'is not listed on Dragon Research Group - VNC',
     'is listed on Dragon Research Group - VNC',
     True),

    #OpenBLock
    ('http://www.openbl.org/lists/date_all.txt',
     'is not listed on OpenBlock',
     'is listed on OpenBlock',
     True),

    #NoThinkMalware
    ('http://www.nothink.org/blacklist/blacklist_malware_http.txt',
     'is not listed on NoThink Malware',
     'is listed on NoThink Malware',
     True),

    #NoThinkSSH
    ('http://www.nothink.org/blacklist/blacklist_ssh_all.txt',
     'is not listed on NoThink SSH',
     'is listed on NoThink SSH',
     True),

    #Feodo
    ('http://rules.emergingthreats.net/blockrules/compromised-ips.txt',
     'is not listed on Feodo',
     'is listed on Feodo',
     True),

    #antispam.imp.ch
    ('http://antispam.imp.ch/spamlist',
     'is not listed on antispam.imp.ch',
     'is listed on antispam.imp.ch',
     True),

    #dshield
    ('http://www.dshield.org/ipsascii.html?limit=10000',
     'is not listed on dshield',
     'is listed on dshield',
     True),

    #malc0de
    ('http://malc0de.com/bl/IP_Blacklist.txt',
     'is not listed on malc0de',
     'is listed on malc0de',
     True),

    #MalWareBytes
    ('http://hosts-file.net/rss.asp',
     'is not listed on MalWareBytes',
     'is listed on MalWareBytes',
     True)]

def blacklist(badip):
    BAD = 0
    GOOD = 0

    for url, succ, fail, mal in URLS:
        test = content_test(url, badip)
        if test == True:
            GOOD = GOOD + 1
        elif test == False:
            BAD = BAD + 1


    for bl in bls:
        try:
            my_resolver = dns.resolver.Resolver()
            query = '.'.join(reversed(str(badip).split("."))) + "." + bl
            my_resolver.timeout = 5
            my_resolver.lifetime = 5
            answers = my_resolver.query(query, "A")
            answer_txt = my_resolver.query(query, "TXT")
            BAD = BAD + 1

        except dns.resolver.NXDOMAIN:
            GOOD = GOOD + 1

        except dns.resolver.Timeout:
            pass

        except dns.resolver.NoNameservers:
            pass

        except dns.resolver.NoAnswer:
            pass

    return str(BAD) + "/" + str(BAD+GOOD)

# Code ABOVE downloaded from GitHub page:
# https://github.com/jgamblin/isthisipbad/blob/master/isthisipbad.py
# Modified by Janek to fix errors.  

def get_rank(domain_to_query):
    result = {'Global':'', "Country":''}
    url = "http://www.alexa.com/siteinfo/" + domain_to_query
    page = urlopen(url).read()
    soup = bs(page, "html.parser")
    for span in soup.find_all('span'):
        if span.has_attr("class"):
            if "globleRank" in span["class"]:
                for strong in span.find_all("strong"):
                    if strong.has_attr("class"):
                        if "metrics-data" in strong["class"]:
                            result['Global'] = strong.text.replace("\n", "").replace(" ", "")
            if "countryRank" in span["class"]:
                image = span.find_all("img")
                for img in image:
                    if img.has_attr("title"):
                        country = img["title"]
                for strong in span.find_all("strong"):
                    if strong.has_attr("class"):
                        if "metrics-data" in strong["class"]:
                            result["Country"] = country + ": " + strong.text.replace("\n", "").replace(" ", "")
    return result

def parseData(ip, data, whdata, blk, rank, tim, iphost):
    whois = whdata["nets"]
    dnet = data["network"]
    ob = data["objects"]
    curtime = strftime("%d %B %Y %H:%M:%S")

    if curtime[0] == "0":
        curtime = curtime[1:]

    timee = str(tim).split(".")[0] + "." + str(tim).split(".")[1][:2]
    outStr = "WARNING: Data below may be inaccurate\nTarget: " + ip + "\nGenerated: " + curtime + "\nTime taken to generate: " + timee + " seconds" + "\n\n"

    outStr += "IP host: " + iphost + "\n\n"

    outStr += "Blacklist: " + blk + "\n\n"
    outStr += "Archive: http://web.archive.org/web/*/" + iphost + "\n"
    outStr += "Global Alexa rank: " + rank["Global"] + "\n"
    outStr += "Country Alexa rank: "+ rank["Country"] + "\n\n"


    outStr += "Legacy Whois:\n"
    net = 1
    for i in whois:
        outStr += "  Network " + str(net) + ":\n"
        try:
            outStr += "    IP: " + str(whdata["query"]) + "\n"
        except:
            outStr += "    IP: Not found\n"
        try:
            outStr += "    Name: " + str(i["name"]) + "\n"
        except:
            outStr += "    Name: Not found\n"
        try:
            outStr += "    Abuse E-mails: " + str(i["abuse_emails"]) + "\n"
        except:
            outStr += "    Abuse E-mails: Not found\n"
        try:
            outStr += "    Address: " + str(i["address"]) + "\n"
        except:
            outStr += "    Address: Not found\n"
        try:
            outStr += "    Country: " + str(i["country"]) + "\n"
        except:
            outStr += "    Country: Not found\n"
        try:
            outStr += "    City: " + str(i["city"]) + "\n"
        except:
            outStr += "    City: Not found\n"
        try:
            outStr += "    Postal code: " + str(i["postal_code"]) + "\n"
        except:
            outStr += "    Postal code: Not found\n"
        try:
            outStr += "    Created: " + str(i["created"]) + "\n"
        except:
            outStr += "    Created: Not found\n"
        try:
            outStr += "    Description: " + str(i["description"]) + "\n"
        except:
            outStr += "    Description: Not found\n"
        try:
            outStr += "    Handle: " + str(i["handle"]) + "\n"
        except:
            outStr += "    Handle: Not found\n"
        try:
            outStr += "    Misc E-mails: " + str(i["misc_emails"]) + "\n"
        except:
            outStr += "    Misc E-mails: Not found\n"
        try:
            outStr += "    IP range: " + str(i["range"]) + "\n"
        except:
            outStr += "    IP range: Not found\n"
        try:
            outStr += "    State: " + str(i["state"]) + "\n"
        except:
            outStr += "    State: Not found\n"
        try:
            outStr += "    Tech E-mails: " + str(i["tech_emails"]) + "\n"
        except:
            outStr += "    Tech E-mails: Not found\n"
        try:
            outStr += "    Updated: " + str(i["updated"]) + "\n\n"
        except:
            outStr += "    Updated: Not found\n\n"

        net += 1

    outStr += "\nRDAP (HTTP) Whois:\n"

    try:
        outStr += "  Name: " + str(dnet["name"]) + "\n"
    except:
        outStr += "  Name: Not found\n"
    try:
        outStr += "  Start address: " + str(dnet["start_address"]) + "\n"
    except:
        outStr += "  Start address: Not found\n"
    try:
        outStr += "  End address: " + str(dnet["end_address"]) + "\n"
    except:
        outStr += "  End address: Not found\n"
    try:
        outStr += "  IP version: " + str(dnet["ip_version"]) + "\n"
    except:
        outStr += "  IP version: Not found\n"

    outStr += "  Events:\n"

    e = 1
    for i in dnet["events"]:
        outStr += "    Event " + str(e) + ":\n"
        try:
            outStr += "      Action: " + str(i["action"]) + "\n"
        except:
            outStr += "      Action: Not found\n"
        try:
            outStr += "      Actor: " + str(i["actor"]) + "\n"
        except:
            outStr += "      Actor: Not found\n"
        try:
            outStr += "      Timestamp: " + str(i["timestamp"]) + "\n"
        except:
            outStr += "      Timestamp: Not found\n"

        e += 1

    outStr += "\n  Objects:\n"
    for i in ob:
        z = ob[i]["contact"]
        outStr += "    " + str(i) + ":\n"
        try:
            outStr += "      Name: " + str(z["name"]) + "\n"
        except:
            outStr += "      Name: Not found\n"
        try:
            outStr += "      E-mail: " + str(z["email"][0]["value"]) + "\n"
        except:
            outStr += "      E-mail: Not found\n"
        try:
            outStr += "      Kind: " + str(z["kind"]) + "\n"
        except:
            outStr += "      Kind: Not found\n"
        try:
            outStr += "      Phone: " + str(z["phone"][0]["value"]) + "\n"
        except:
            outStr += "      Phone: Not found\n"
        try:
            outStr += "      Title: " + str(z["title"]) + "\n"

        except:
            outStr += "      Title: Not found\n"
        outStr += "      Links: "

        if ob[i]["links"]:
            if not len(ob[i]["links"]) == 0:
                for j in ob[i]["links"]:
                    outStr += str(j).replace("\n", " ")
                    outStr += "  "
        else:
            outStr += "Not found"

        outStr += "\n      Contact: "
        if z["address"]:
            for j in z["address"]:
                outStr += str(j["value"].replace("\n", ", "))

        else:
            outStr += "Not found"
        outStr += "\n\n"
    return outStr

def getData(ip):
    obj = IPWhois(ip)
    results = obj.lookup_rdap(depth=1)
    return results

def whgetData(ip):
    obj = IPWhois(ip)
    results = obj.lookup()
    return results

def showHelp():
    name = argv[0].replace("\\","/").split("/")[-1]
    print("""
Simple website report (Whois) generator
Usage:
    %s website [-json]
            [-file <file>]
            [-raw]

Arguments:
    website       Website IPv4 or domain
    -json         Convert output to json 
    -file         Save output to a file
    -raw          Output full Whois lookup in raw format

""" % name)

if __name__ == "__main__":
    args = argv

    if len(args) < 2:
        print("Too few arguments!")
        showHelp()
        exit(1)

    if len(args) > 5:
        print("Too many arguments!")
        showHelp()
        exit(1)

    allowedArgs = ["-file", "-json", "-raw"]
    noLast = ["-file"]

    b = 2
    for i in args[2:]:
        if i not in allowedArgs and args[b-1] not in noLast:
            print("Invalid option (%s)" % i)
            showHelp()
            exit(1)
        b += 1

    for i in args[2:]:
        if args.count(i) > 1 and i in allowedArgs:
            print("Option appearing more than once (%s)" % i)
            showHelp()
            exit(1)

    if args[-1] in noLast:
        print("Option has no arguments (%s)" % args[-1])
        showHelp()
        exit(1)

    if "-json" in args and "-raw" in args:
        print("Can't use -json and -raw at the same time")
        showHelp()
        exit(1)

    rawIP = args[1]
    if rawIP.replace(".","").isdigit() == True:
        ip = rawIP
    else:
        ip = gethostbyname(rawIP)

    if "-file" in args:
        filename = args[args.index("-file") + 1]
        if exists(filename) == True:
            print("File already exists (%s)" % filename) 
            showHelp()
            exit(1)

        directory = "/".join(filename.replace("\\","/").split("/")[0:-1])

        if len(directory) > 0 and isdir(directory) == False:
            print("Directory does not exist (%s)" % directory) 
            showHelp()
            exit(1)

        fileOn = True
    else:
        fileOn = False

    t1 = time()
    data = getData(ip)
    whdata = whgetData(ip)
    blk = blacklist(ip)

    if rawIP.replace(".","").isdigit() == True:
        try:
            iphost = gethostbyaddr(ip)[0]
        except:
            iphost = "Not found"
    else:
        iphost = rawIP

    rank = get_rank(iphost)
    t2 = time()

    t = t2-t1
    if "-file" not in args:
        if "-json" in args:
            all = {"http_whois":data, "legacy_whois":whdata, "blacklist":blk, "rank":rank, "time":t}
            print(dumps(all, ensure_ascii=False) + "\n")
            exit(0)

        elif "-raw" in args:
            all = {"http_whois":data, "legacy_whois":whdata, "blacklist":blk, "rank":rank, "time":t}
            pprint(all)
            exit(0)

        else:
            parsed = parseData(ip, data, whdata, blk, rank, t, iphost)
            print(parsed)
            exit(0)

    else:
        if "-json" in args:
            all = {"http_whois":data, "legacy_whois":whdata, "blacklist":blk, "rank":rank, "time":t}
            forWrite = dumps(all, ensure_ascii=False) + "\n"
            f = open(filename,"w")
            f.write(forWrite)
            f.close()
            print("File created!")
            exit(0)

        elif "-raw" in args:
            all = {"http_whois":data, "legacy_whois":whdata, "blacklist":blk, "rank":rank, "time":t}
            forWrite = pformat(all)
            f = open(filename,"w")
            f.write(forWrite)
            f.close()
            print("File created!")
            exit(0)

        else:
            parsed = parseData(ip, data, whdata, blk, rank, t, iphost)
            forWrite = parsed
            f = open(filename,"w")
            f.write(forWrite)
            f.close()
            print("File created!")
            exit(0)

What to do:

  1. Download Python 2.7.x
  2. Install it (during installation, tick the "Add python.exe to Path" checkbox)
  3. Copy the code above and paste it into a file called yournamehere.py
  4. Now you can run the program from cmd with the following command:

python yournamehere.py www.google.com

NOTE: You need to install the following libraries for the code to work: dnspython, bs4, ipwhois

To install them, open cmd and type, line by line:

pip install dnspython
pip install ipwhois
pip install bs4
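For reference, the DNSBL checks in `blacklist()` work by reversing the IPv4 octets and resolving the result under each blacklist zone: an A record means the IP is listed, NXDOMAIN means it is not. A minimal standalone sketch using only the standard library (the zone name here is just one example from the list above):

```python
import socket

def dnsbl_query_name(ip, zone):
    # Reverse the IPv4 octets and append the blacklist zone, e.g.
    # 127.0.0.2 checked against zen.spamhaus.org -> "2.0.0.127.zen.spamhaus.org"
    return '.'.join(reversed(ip.split('.'))) + '.' + zone

def is_listed(ip, zone):
    # Any A record under the zone means the IP is listed;
    # NXDOMAIN (raised as socket.gaierror) means it is not.
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False

print(dnsbl_query_name('127.0.0.2', 'zen.spamhaus.org'))
# prints: 2.0.0.127.zen.spamhaus.org
```

The full script uses dnspython instead of `socket` so it can set timeouts and also fetch the TXT record explaining the listing.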

Help:

Simple website report (Whois) generator
Usage:
        netreport.py website [-json]
                        [-file <file>]
                        [-raw]

Arguments:
        website           Website IPv4 or domain
        -json             Convert output to json
        -file             Save output to a file
        -raw              Output full Whois lookup in raw format

If you want the full output report formatted as a Python dictionary, use `-raw`.

If you want the output formatted as JSON, use `-json`.

`-raw` and `-json` cannot be used at the same time.

If you want to save the output to a file, use `-file`.

Examples:

python yournamehere.py www.google.com
python yournamehere.py www.google.com -json
python yournamehere.py www.google.com -file myreport.txt
python yournamehere.py www.google.com -raw -file myfile.txt
python yournamehere.py www.google.com -file C:\Users\Jan\Desktop\file.txt
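If you save a report with `-json` and want to post-process it, the top-level keys match the dictionary the script builds: `http_whois`, `legacy_whois`, `blacklist`, `rank`, and `time`. A small sketch (the sample values below are made up, not real output):

```python
import json

# Hypothetical sample of the script's -json output (values are invented).
sample = ('{"blacklist": "2/75", '
          '"rank": {"Global": "1", "Country": "US: 1"}, '
          '"time": 31.2}')

report = json.loads(sample)
bad, total = report["blacklist"].split("/")
print("Listed on %s of %s blacklists" % (bad, total))
# prints: Listed on 2 of 75 blacklists
```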

Speed:

It's pretty slow. For www.google.com it takes ~30 seconds to generate the report. My internet is also slow: 4 Mbit/s down, 0.5 Mbit/s up (yes, very slow).

Not quite what I need, but definitely an upvote for the (huge!) effort of putting together the code and the answer. Thanks!
Well, that's all I've got. Anyway, I'm not sure what you're looking for, because the program does everything on your list except backlinks. I have to study geography now. (And this is the best website I could find that does what you need, but it's missing a lot of information: centralops.net/co/DomainDossier.aspx)
Thank you for taking the time to put together the answer. Much appreciated.