nitrxgen
MD5 Database - Information

This page will try to explain as much as possible regarding the MD5 Database tool.


Terms of Service

This service provides direct access to a database whose specific purpose is educational and informational, with the exception of recovering data that is your own. Under no circumstances is this service to be used as a means to commit illegal activities or to gain unauthorised access to data or services that are not your own. By using this service and the data gained from it, you agree to accept full responsibility for all misconduct and/or illegal activities, and you agree not to hold this service, this website, or myself to blame for any such misconduct and/or illegal activities.

You are free to implement the API (provided below) in your applications and/or scripts; however, it is preferable that you keep the request rate to something modest, such as 3 to 5 requests per second. Excessive request rates over a prolonged duration may result in the service being temporarily disabled. If such behaviour continues, it may be necessary to introduce API keys to control access to the API. Keep in mind that the API usage may change in the future, so please check this article occasionally for updates. There will be plenty of notice.
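As a rough illustration of staying within the suggested rate, a client could space its requests out with a small throttle. This is only a sketch: the `Throttle` class and its parameters are hypothetical names and not part of the service, and the clock and sleep functions are injectable purely so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Space calls at least 1/rate seconds apart (hypothetical helper)."""

    def __init__(self, rate_per_second=3.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        """Block until the next request is allowed, then reserve a slot."""
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval
```

Calling `wait()` on a shared `Throttle` instance immediately before each request should keep a script at or below the chosen rate.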

I, the creator of www.nitrxgen.net and the MD5 Database, do NOT operate beyond the scope of this website. I do NOT associate with any other website, script(s), or any other kind of executable file(s). I strongly recommend you DO NOT use other websites, scripts, or executables that claim to associate with www.nitrxgen.net in any way, as I have zero control over them and they may potentially be malicious.

This service is intended to be 100% free (at no cost) for the public to use. Under no circumstance should any person utilise this free service in exchange for any financial gains. This includes but is not limited to selling software which heavily relies on this service. DO NOT BUY THESE PROGRAMS.


Dictionaries

I can't remember where I got most of the dictionaries that are used. There are over 3,000 individual dictionaries (filtered for unique entries only), covering various languages and encoding types. Most of them have been processed by applying rules. A rule is just a set of instructions to alter a word (by appending, prepending, substituting, replacing, repeating, reversing, etc.).
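To give a feel for what a rule does, here is a minimal sketch with a handful of made-up rules. The specific transformations are illustrative assumptions only; the actual rule set used for the database is not documented here.

```python
def apply_rules(word):
    """Return candidate passwords derived from one dictionary word."""
    rules = [
        lambda w: w + "1",               # append a character
        lambda w: "!" + w,               # prepend a character
        lambda w: w.replace("a", "@"),   # substitute one character for another
        lambda w: w + w,                 # repeat the word
        lambda w: w[::-1],               # reverse the word
        lambda w: w.capitalize(),        # uppercase the first letter
    ]
    seen = set()
    candidates = []
    for rule in rules:
        candidate = rule(word)
        # Keep unique results only, and skip rules that changed nothing.
        if candidate != word and candidate not in seen:
            seen.add(candidate)
            candidates.append(candidate)
    return candidates


print(apply_rules("admin"))
# ['admin1', '!admin', '@dmin', 'adminadmin', 'nimda', 'Admin']
```

With tens of thousands of rules instead of six, each input word fans out into a correspondingly large number of candidates.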

Many of the larger dictionaries are too large to apply a healthy list of rules to, because the sheer volume of generated passwords would cause a problem. About 60,000 to 70,000 rules were applied to the smaller dictionaries, which means a single word ends up creating that many new words.

Lots of logs and Internet documents have been added to the list. I plan to create a web crawler that will sniff around for new words and excerpts of new sentences.

I would like to give a special thanks to the websites: hashes.org (RIP), hashkiller.co.uk, and hashmob.net for their regularly updated wordlists and continued support from their communities.

Bruteforcing

As you should be aware by now, a bruteforce is an attack on a hash that tries every single combination of a given set of characters (the character set) at a given password length. This produces a finite number of candidate passwords to generate and compare against the queried hash. By varying the character set and the password length, you can choose a range that is hopefully effective without taking too long to process.
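The idea can be sketched in a few lines. This is an illustration only, not how the database itself is generated; the function name and parameters are hypothetical.

```python
import hashlib
from itertools import product


def bruteforce_md5(target_hex, charset, length):
    """Try every string of exactly `length` characters drawn from `charset`."""
    target = target_hex.lower()
    for combo in product(charset, repeat=length):
        candidate = "".join(combo).encode("latin-1")
        if hashlib.md5(candidate).hexdigest() == target:
            return candidate
    return None  # exhausted the whole range without a match
```

The number of candidates tried is at most `len(charset) ** length`, which is why both the character set and the length have to be chosen carefully.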

Because my service is focused on providing an instant password upon query, one of my goals is to provide a kind of instant bruteforce, which can only be done by precomputing and storing the necessary data. Storing bruteforces is generally discouraged for the plain and simple reason that it takes up copious amounts of storage for what would otherwise be a few minutes of processing. I'm still willing to make that sacrifice to provide an instant service, however.

Below is a table which defines the character sets used throughout this document:

Shorthand  Length  Character Set                      Notes
l          26      abcdefghijklmnopqrstuvwxyz         The lowercase basic Latin alphabet.
u          26      ABCDEFGHIJKLMNOPQRSTUVWXYZ         The uppercase basic Latin alphabet.
d          10      0123456789                         The common Hindu-Arabic numerals.
s          33       !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~  Common symbols and punctuation (includes a space).
f          36      l d                                A mix of the lowercase alphabet and numerals.
m          52      l u                                A mix of the lower- and uppercase basic Latin alphabet.
a          95      l u d s                            All of the above combined into a single character set.
b          256     0x00 to 0xFF                       Every possible byte, or every 8-bit combination.
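For anyone wanting to reproduce these character sets in a script, they can be built from Python's standard `string` module. The `CHARSETS` dictionary name is just an illustrative choice; the contents follow the shorthand definitions above.

```python
import string

CHARSETS = {
    "l": string.ascii_lowercase.encode("ascii"),
    "u": string.ascii_uppercase.encode("ascii"),
    "d": string.digits.encode("ascii"),
    "s": (" " + string.punctuation).encode("ascii"),  # space + 32 symbols
}
CHARSETS["f"] = CHARSETS["l"] + CHARSETS["d"]
CHARSETS["m"] = CHARSETS["l"] + CHARSETS["u"]
CHARSETS["a"] = CHARSETS["l"] + CHARSETS["u"] + CHARSETS["d"] + CHARSETS["s"]
CHARSETS["b"] = bytes(range(256))

for name, chars in CHARSETS.items():
    print(name, len(chars))
```

The printed lengths should match the Length column: 26, 26, 10, 33, 36, 52, 95 and 256.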

Now that's out of the way, this table describes the bruteforce ranges that are stored to disk:

Pattern                    Length  Candidates         Notes
b                          1       256                Online. Any byte value to 1 byte in length.
bb                         2       65,536             Online. Any byte value to 2 bytes in length.
bbb                        3       16,777,216         Online. Any byte value to 3 bytes in length.
bbbb                       4       4,294,967,296      Online. Any byte value to 4 bytes in length.
bbbbb                      5       1,099,511,627,776  Coming soon. Any byte value to 5 bytes in length.
aaaaa                      5       7,737,809,375      Online. All printable bytes to 5 bytes in length.
aaaaaa                     6       735,091,890,625    Offline. All printable bytes to 6 bytes in length.
fffffff                    7       78,364,164,096     Online. All lowercase alphanumerics to 7 bytes in length.
uffffff                    7       56,596,340,736     Coming soon.
udddddd                    7       26,000,000         Online. 1 uppercase letter + 6 digits.
ulddddd                    7       67,600,000         Online. 1 uppercase + 1 lowercase letter + 5 digits.
ulldddd                    7       175,760,000        Online. 1 uppercase + 2 lowercase letters + 4 digits.
ulllddd                    7       456,976,000        Online. 1 uppercase + 3 lowercase letters + 3 digits.
ulllldd                    7       1,188,137,600      Online. 1 uppercase + 4 lowercase letters + 2 digits.
ullllld                    7       3,089,157,760      Online. 1 uppercase + 5 lowercase letters + 1 digit.
ullllll                    7       8,031,810,176      Online. 1 uppercase letter + 6 lowercase letters.
dddddddd                   8       100,000,000        Online. All numeric values to 8 bytes in length.
mddddddd                   8       520,000,000        Online. 1 letter (mixed case) + 7 digits.
mldddddd                   8       1,352,000,000      Online. 1 letter (mixed case) + 1 lowercase letter + 6 digits.
mllddddd                   8       3,515,200,000      Online. 1 letter (mixed case) + 2 lowercase letters + 5 digits.
mllldddd                   8       9,139,520,000      Online. 1 letter (mixed case) + 3 lowercase letters + 4 digits.
mllllddd                   8       23,762,752,000     Online. 1 letter (mixed case) + 4 lowercase letters + 3 digits.
llllllll                   8       208,827,064,576    Online. All lowercase letters to 8 bytes in length.
ffffffff                   8       2,821,109,907,456  Future! All lower alphanumeric values of 8 bytes in length.
ddddddddd                  9       1,000,000,000      Online. All numeric values to 9 bytes in length.
mdddddddd                  9       5,200,000,000      Online. 1 letter (mixed case) + 8 digits.
mlddddddd                  9       13,520,000,000     Online. 1 letter (mixed case) + 1 lowercase letter + 7 digits.
dddddddddd                 10      10,000,000,000     Online. All numeric values to 10 bytes in length.
ddddddddddd                11      100,000,000,000    Coming soon. All numeric values to 11 bytes in length.
dddddddddddd               12      1,000,000,000,000  Coming soon. All numeric values to 12 bytes in length.
0.0.0.0 - 255.255.255.255  7 - 15  4,294,967,296      Online. All dot-decimal IPv4 addresses.
00000000 - FFFFFFFF        8       4,294,967,296      Online. All uppercase hexadecimal (excluding all duplicates).
aaaaaaaa - ffffffff        8       1,679,616          Online. All lowercase hexadecimal (excluding all duplicates).
Total                      *       1,124,069,266,720  The grand total count of the bruteforce ranges (non-dictionary).

This is—to the best of my knowledge—correct and up to date.
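The candidate counts for the shorthand patterns are simply the product of the character set sizes at each position. A small sketch to reproduce them (the `SIZES` mapping and `candidates` function are illustrative names, with sizes taken from the character set table):

```python
from math import prod

# Character set sizes taken from the shorthand table.
SIZES = {"l": 26, "u": 26, "d": 10, "s": 33, "f": 36, "m": 52, "a": 95, "b": 256}


def candidates(pattern):
    """Number of passwords covered by a pattern such as 'ulldddd'."""
    return prod(SIZES[ch] for ch in pattern)


for pattern in ("bbbb", "aaaaa", "fffffff", "ulldddd", "mllllddd"):
    print(f"{pattern:<10} {candidates(pattern):>18,}")
```

For example, `ulldddd` works out as 26 x 26 x 26 x 10 x 10 x 10 x 10 = 175,760,000.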


API: Application Programming Interface Reference

Due to the nature of this service, it may be preferable to some to have a more direct method to query the database. To address this, I've developed this API. The API allows you to implement your queries with ease in your scripts and applications. I hope to please all audiences by offering a large range of output formats. More information coming soon.

Using the API is incredibly simple: append the MD5 hash to the end of the main page URL, as shown below.

The simplest API call would be this:
http://www.nitrxgen.net/md5db/dca57be223efc2741bc98adce0ec5141

Usage: If there is no result, expect a blank page. If the result contains non-printable bytes (control codes, Unicode text, etc.), the result will be displayed in hexadecimal, wrapped between $HEX[ and ]. Remember to check for a positive MD5 hash match before carrying out the hexadecimal conversion! Any invalid input will simply be ignored and return a blank page. There is no HTML character escaping either, as the content is delivered as text/plain.
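A minimal client sketch for this call, using only the Python standard library. The function names and the timeout value are my own assumptions for illustration; only the base URL and the "append the hash" convention come from this page. Validating the hash locally avoids sending input the service would ignore anyway.

```python
import re
import urllib.request

BASE_URL = "http://www.nitrxgen.net/md5db/"  # from the example call above


def build_url(md5_hex, extension=""):
    """Build a query URL; `extension` is optional, e.g. '' or 'txt'."""
    md5_hex = md5_hex.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{32}", md5_hex):
        raise ValueError("expected 32 hexadecimal characters")
    suffix = f".{extension}" if extension else ""
    return BASE_URL + md5_hex + suffix


def fetch_plain(md5_hex, timeout=10):
    """Return the raw response body as bytes; empty means no result."""
    with urllib.request.urlopen(build_url(md5_hex), timeout=timeout) as resp:
        return resp.read()
```

Please remember the request rate guidance in the Terms of Service when calling something like `fetch_plain` in a loop.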

If you require other data formatting, please read further.

  • JSON / JavaScript Object Notation (.json):

    The output structure of the JSON extension was updated on 04/01/2017 to include more information and to be more uniform with the other extensions. Anyone who previously used this extension will notice that it no longer works in their scripts, so some minor changes will be needed. Sorry for any inconvenience caused. No further changes are currently scheduled.

    The JSON output conforms to RFC 4627 and supports outputting of all binary bytes.

    Example usage: http://www.nitrxgen.net/md5db/dca57be223efc2741bc98adce0ec5141.json

    Example result:

    {
      "result": {
        "found":   true,
        "hash":    "dca57be223efc2741bc98adce0ec5141",
        "pass":    "imtheadmin",
        "hexpass": "696d74686561646d696e",
        "hits":    99999,
        "sysmsg":  "",
        "credit": {
          "name": "nitrxgen",
          "link": "http:\/\/www.nitrxgen.net\/md5db\/"
        }
      }
    }

    Please note that the data returned by actual use of the API will all be on a single line. This document shows a "pretty" version of an example result so that it is more human-readable and you can see how to implement it in your code.

  • Plain Text (.txt or unspecified/unknown extensions):

    This one is the simplest of them all. If the password is found, that will be the only data in the document. No metadata, no bulk, nothing but the password.

    It will display passwords as-is as long as they contain only the 95 printable characters: 0-9, a-z, A-Z, and a selection of single-byte keyboard symbols and punctuation. If the password contains characters outside of this character set, the data is converted to hexadecimal and sandwiched between $HEX[ and ].

    Please remember to check results for $HEX[ and ] and, where present, convert the enclosed data from hexadecimal to binary.

    For example, hello will simply be returned as hello. However, hello\n (with a line feed byte appended to it) will result in $HEX[68656c6c6f0a]. It is also worth noting that passwords containing non-Latin or extended Latin alphabets will trigger this conversion too, since they are encoded using bytes that are considered unprintable.

    The decision for this hexadecimal conversion comes from hash cracking activities where storing lines of found passwords proved difficult without any conversion of control codes or unprintable bytes which may cause problems.

    If no result is found or an error occurred then expect a blank document.

    Example usage: http://www.nitrxgen.net/md5db/dca57be223efc2741bc98adce0ec5141.txt

  • YAML (.yaml or .yml):

    This format goes into a lot of detail that I'm not prepared to write about here. Feel free to read about it on yaml.org. There should be enough detail there to implement it in your own application.

    All the keys are always present and anything unavailable will be set to null instead of being dynamically excluded. The pass and passhex have been designed to "fold" or word wrap YAML-style to accommodate lengthy values.

    In all honesty, this is the first time I've ever used or implemented YAML so please let me know if you have any problems with it. The current implementation tests as valid where the test vector tries every single byte from 0x00 to 0xFF and the validator seems to be at peace with it.

    Example usage: http://www.nitrxgen.net/md5db/dca57be223efc2741bc98adce0ec5141.yaml

    More information about this encoding can be found on Wikipedia: https://en.wikipedia.org/wiki/YAML.
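Pulling the JSON and plain-text notes above together, a client might handle responses roughly as follows. This is a sketch under assumptions: the function names are hypothetical, the JSON fields are those shown in the example result, the shape of a not-found JSON response ("found" set to false) is assumed, and the plain-text body is assumed to arrive without a trailing newline. Both helpers re-hash the password before trusting it, and the plain-text one tests the literal text first because a password could itself look like $HEX[...].

```python
import hashlib
import json


def parse_json_result(text):
    """Return the password bytes from a .json response, or None if not found."""
    result = json.loads(text)["result"]
    if not result.get("found"):
        return None
    password = bytes.fromhex(result["hexpass"])
    # Confirm the password really hashes to the hash that was reported.
    if hashlib.md5(password).hexdigest() != result["hash"].lower():
        raise ValueError("returned password does not match the hash")
    return password


def decode_plain(body, md5_hex):
    """Turn a plain-text response body (bytes) into the password bytes."""
    if not body:
        return None  # blank document: no result or an error
    wanted = md5_hex.lower()
    # Check for a positive match on the literal text before converting.
    if hashlib.md5(body).hexdigest() == wanted:
        return body
    if body.startswith(b"$HEX[") and body.endswith(b"]"):
        decoded = bytes.fromhex(body[5:-1].decode("ascii"))
        if hashlib.md5(decoded).hexdigest() == wanted:
            return decoded
    raise ValueError("response does not match the requested hash")
```

With the example from the plain-text section, `decode_plain(b"$HEX[68656c6c6f0a]", ...)` given the MD5 of hello plus a line feed yields the six bytes of `hello\n`.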


System Messages

System messages are reported back to the user in both the HTTP header responses and, if applicable, in the response text.

SERVICE_OFFLINE = Sadly, this server is not a high-end professional setup and I must maintain it myself whenever possible, which often means downtime for this service. If this flag is encountered, please STOP making requests, as all subsequent requests will not return a result. Check back in a few minutes or hours; there is no point in continuing to request multiple hashes per second while this flag is present. You may continue to make requests as normal when this flag disappears.

FOUND_NOT_MD5 = Although the system may indicate there was no found result (no text, or "found" set as false), this flag indicates the hash was indeed found but it is not an MD5 hash. It is a hash of a different algorithm, potentially truncated from algorithms with longer fixed-length outputs. Credit information will still be given since people still worked hard to eliminate the hash from the unfound list.

ALGO_* = If the hash is found to be a non-MD5 hash, this flag is present to identify which algorithm it really is. The plain-text result will not be shown, to deter users from submitting any algorithm they want and cluttering up my service. This is an MD5-only service.

POTENTIAL_SALT = This flag indicates the result of your requested hash matches a pattern recognised as a salt (or pepper). Essentially, this means the result provided may not be the final password you require, despite the result 100% matching the hash. Some software prepends or appends additional data to users' passwords before storing them in an attempt to battle services like mine. If this flag is present, please take notice of the "candidates" data in the YAML and JSON formats, as the service will attempt to provide un-obfuscated results; however, this may vary.
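A script could react to these flags along the following lines. This is a loose sketch: it assumes the flag names appear verbatim somewhere in the message text (for instance in the "sysmsg" field of the JSON output), which is not spelled out on this page, and the function name and return labels are hypothetical.

```python
def next_action(sysmsg):
    """Map a system message string to a suggested client reaction."""
    if "SERVICE_OFFLINE" in sysmsg:
        return "stop"              # pause all requests and retry much later
    if "FOUND_NOT_MD5" in sysmsg or "ALGO_" in sysmsg:
        return "not_md5"           # hash is known but is another algorithm
    if "POTENTIAL_SALT" in sysmsg:
        return "check_candidates"  # inspect the "candidates" data as well
    return "ok"
```

Checking SERVICE_OFFLINE first matters most, since continuing to send requests while it is set achieves nothing.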


Frequently Asked Questions
  • “Can I buy your database?”

    No. The database is on the order of several terabytes and stored in an unconventional format. Preparing and storing a second copy of the database for use by the general public (paid or not) would be very inconvenient for me and probably very bandwidth-intensive.

  • “How are you able to store and operate this site for free?”

    Mostly just my generosity. The cost mostly goes towards the drives used to store the data and the GPUs used to crack the unfound hashes. They are mostly one-off costs. Running costs aren't that high but there is a donation page if anyone ever wishes to help me expand the database quicker.

  • “Why is the service telling me my hash is different to MD5?”

    This service is designed only for MD5 hashes. The list of unsuccessful hashes tends to build up over time, so a team of users works hard to crack as many of these as possible. Although it is strictly an MD5-only service, a lot of the hashes stored are discovered to be non-MD5 hashes. This is determined by processing the list of unsuccessful hashes as other similar algorithms. These findings are used to exclude them from the list in an effort to reduce the amount of wasted resources.

  • “So it was found under a different algorithm, can you still tell me what it was?”

    No. Although the password was found (which is how the hash's real algorithm was determined), this service remains an MD5-only service until further notice. Allowing users to submit whatever non-MD5 junk they have and still providing them a service would convey the idea that it's OK to submit all kinds of algorithms and still have success. That just increases the workload on those of us who are continually trying to create effective means of cracking passwords with lists we trust to be a single algorithm but actually are not. In the end, my answer is a resounding no, simply as a type of punishment for not knowing what you're doing.

  • “Why won't you support other algorithms?”

    The MD5 algorithm is by far the most common algorithm in use to this day. There are still services out there that salt and pepper their hashes but MD5 is still the parent algorithm. Even supporting MD5 only, the demand through the API is very high at all times. I suspect if I were to support another algorithm the demand would only increase. Lastly, disk storage would need to double since all the data would need to be stored in a completely unique way.

  • “What data do you store with submitted hashes?”

    No information about the client submitting the hash is stored. Only statistic-related information is kept, such as timestamps of the first and last times it was submitted and the timestamp of when it was actually cracked if it wasn't immediately found (and the user who cracked it), and finally the number of times it was submitted.

  • “Will the database grow larger?”

    As of March 2021, I cannot say for certain. It is an interest of mine to keep the service up and running but sadly my interests have very much changed from this project. I do not know if I am prepared to spend the time, energy, and money to create larger databases. I would like to say yes to provide a continually better service but this is a non-profit project so there isn't much of an incentive to do so right now.

  • “What database engine are you running?”

    There is no engine exactly. It's a basic lookup database of my own making which basically serves to store as much data as possible but take as little time as possible. This is done by splitting said database into millions of organised pieces so that only a tiny chunk of it needs to be read and processed at any one time. It's primitive, but it works.

  • “My password is found in your database. Can you remove it?”

    No. This service does not work like that. The databases are built with an allocated amount of data per part. You're asking me to remove your 1 password from a thing that contains trillions of passwords. Trillions. Think of the number one (your single password); good, now add twelve zeros to the end of it. No chance. Finally, if your password was found here, your password is weak and you have a problem you need to address, not hide. Change your password to something stronger right away.

  • “The database gave me another hash-lookalike result. What's this?”

    As said many times around this page, this is an MD5-only service. This may also include double-MD5 hashes where you will be given the first MD5 hash, so you may wish to pass the resulting hash back into the database to get the real password. If this doesn't work, either you just need to wait for that first hash to be found or the first hash is not MD5; maybe SHA1 then MD5, or similar.

  • “Can you briefly explain the history of this project?”

    Sure. I started in 2007 when I first learned about the MD5 algorithm. The community I was a part of took a bit of interest in 'reversing' these hashes and they actually produced a PHP script that would brute force a hash. I was new to PHP at the time and I played around with it. Soon after, I discovered there was a whole new world behind all of this. There are buzzing communities built specifically for cracking hashes (MD5 was the most popular choice back then). I found myself a home on Plain-Text.org in mid 2008 and became a participating member by late 2008 by use of a 100 GiB word list with about 4.5 billion passwords. A couple of years later, the project slowly died as the website was no longer hosted, though it remained on IRC for a while. Now I'm on my own. I soon had better equipment and rebuilt an 895 GiB database (compressed) with 130 billion passwords by early 2013. Things only got better by early 2015 when I produced a 2.7 TiB database with 502 billion passwords! By the beginning of 2017, I decided to add another bruteforce range (all printable bytes to 6 bytes in length) to the service, increasing the count by (not to) 735 billion.
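Returning to the hash-lookalike question above: passing a result back into the database can be automated. The sketch below assumes that a double-MD5 hash is the MD5 of the lowercase hexadecimal digest of the inner hash, and that `lookup` is any function of your own that returns the result text or None; both the function name and that interface are hypothetical. A real password that happens to be 32 hexadecimal characters would be misread as a hash by this check.

```python
import hashlib
import re

HASH_LIKE = re.compile(r"[0-9a-f]{32}")


def resolve_chain(lookup, md5_hex, max_depth=3):
    """Re-query while the result itself looks like an MD5 hash."""
    current = md5_hex.lower()
    for _ in range(max_depth):
        result = lookup(current)
        if result is None:
            return None          # not found (yet), or the inner hash is not MD5
        if not HASH_LIKE.fullmatch(result):
            return result        # an ordinary password
        current = result         # feed the inner hash back in
    return None


# Stand-in for the real service: a tiny local table.
inner = hashlib.md5(b"secret").hexdigest()
outer = hashlib.md5(inner.encode("ascii")).hexdigest()
table = {outer: inner, inner: "secret"}
print(resolve_chain(table.get, outer))  # prints "secret"
```

The `max_depth` limit keeps a script from looping indefinitely and from generating more requests than necessary.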


© Copyright 2008-2024: Nitrxgen, all rights reserved.