Commit 477f102

Add Password Strength Checker script with ML and NN (DhanushNehru#349)
* Add password strength checker source code
* Update README.md
* Fix README.md
* Fix typo in model README.md
1 parent 88ee0ad commit 477f102

11 files changed, +262 -0 lines changed

Password Strength Checker/README.md

+27
@@ -0,0 +1,27 @@
# Password Strength Checker

## Description
A password strength checker that uses machine learning to classify the strength of passwords. It prompts for a password on the command line and reports a strength class from 0 (weak) to 2 (strong).

## Features
- Classifies password strength into three categories: weak (0), moderate (1), and strong (2).

## Installation
1. Clone the repository:
   ```bash
   git clone https://github.com/DhanushNehru/Python-Scripts
   cd "Python-Scripts/Password Strength Checker"
   ```

2. Create and activate a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
   ```

3. Install the required packages:
   ```bash
   pip install -r requirements.txt
   ```

## Usage
To run the password strength checker:
```bash
python main.py
```

Password Strength Checker/main.py

+8
@@ -0,0 +1,8 @@
from model.model import predict  # import model

def main():
    password_to_test = input("Enter a password to check its strength: ")  # get password from terminal
    predicted_class = int(predict(password_to_test))  # evaluate password strength
    print(f"Password strength classification: {predicted_class} / 2")  # output 0 - weak, 1 - moderate, or 2 - strong

if __name__ == "__main__":
    main()
+15
@@ -0,0 +1,15 @@
# Password Strength Classification Model

## Overview
This model evaluates the strength of passwords using machine learning. It classifies each input password as weak, medium, or strong, so that users can tell when a stronger password is needed.

## Model Architecture
- **Input Layer**: The model accepts eight numeric features derived from the password: length, lowercase, uppercase, digit and special-character counts, entropy, and counts of repeated and sequential characters.
- **Dense Layers**: Two dense layers (128 and 64 units) with ReLU activation process the input features.
- **Output Layer**: A final softmax layer with three units outputs class probabilities; the most probable class is reported as the password strength (weak - 0, medium - 1, strong - 2).

## Training
- The model is trained on a labeled dataset of passwords classified by strength (`model/passwords.csv`), for 50 epochs with the Adam optimizer and categorical cross-entropy loss.

## Future improvements
- Feature engineering should add columns that capture whether a password contains commonly used passwords (e.g. 'password') or commonly used words, so that model training can take them into account.
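The future improvement listed above could start from a simple lookup-based feature. The following is a minimal sketch and not part of this commit: the names `COMMON_PASSWORDS`, `COMMON_WORDS`, and `common_password_features` are hypothetical, and the tiny word lists are placeholders that a real implementation would replace with a much larger list.

```python
# Hypothetical placeholder lists; a real feature would load a much larger list.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}
COMMON_WORDS = {"love", "admin", "welcome", "dragon"}


def common_password_features(password):
    """Return two extra feature columns for a single password."""
    lowered = password.lower()
    return {
        # 1 if the whole password is a known common password, else 0
        "is_common_password": int(lowered in COMMON_PASSWORDS),
        # number of listed passwords or words that appear as substrings
        "common_substring_count": sum(
            term in lowered for term in COMMON_PASSWORDS | COMMON_WORDS
        ),
    }


print(common_password_features("Password123"))
print(common_password_features("x7#Lq9!vT"))
```

Adding columns like these would also mean widening the model's input from 8 to 10 features and refitting the scaler, since both are built around the current eight columns.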
Binary file not shown.
+68
@@ -0,0 +1,68 @@
# disable debugging messages
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings("ignore", category=DeprecationWarning)
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
from silence_tensorflow import silence_tensorflow
silence_tensorflow("WARNING")

import pandas as pd
import pickle

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from model.utils.functions import calculate_entropy, count_repeats, count_sequential
from model.utils.preprocessing import run_preprocessing
from model.utils.training import run_training

# run preprocessing and training
# run_preprocessing()  # uncomment to run preprocessing
# run_training()  # uncomment to train the model

def prepare_input(password):  # function to prepare input features from password
    # create a dataframe for a single input
    data = {
        'length': [len(password)],  # calculate password length
        'lowercase_count': [sum(c.islower() for c in password)],  # count lowercase characters
        'uppercase_count': [sum(c.isupper() for c in password)],  # count uppercase characters
        'digit_count': [sum(c.isdigit() for c in password)],  # count digits
        'special_count': [sum(not c.isalnum() for c in password)],  # count special characters
        'entropy': [calculate_entropy(password)],  # calculate entropy
        'repetitive_count': [count_repeats(password)],  # count repetitive characters
        'sequential_count': [count_sequential(password)]  # count sequential characters
    }

    with open('model/scaler.pkl', 'rb') as file:  # load the fitted scaler from file
        scaler = pickle.load(file)

    # convert to dataframe
    input_df = pd.DataFrame(data)

    # normalize using the previously fitted scaler
    normalized_input = scaler.transform(input_df)

    return pd.DataFrame(normalized_input, columns=input_df.columns)  # return normalized input as dataframe

def predict(password):  # function to predict password strength
    # rebuild the architecture used in training
    model = Sequential()  # create a sequential model
    model.add(Dense(128, activation='relu', input_shape=(8,)))  # add input layer with 128 neurons
    model.add(Dense(64, activation='relu'))  # add hidden layer with 64 neurons
    model.add(Dense(3, activation='softmax'))  # add output layer with softmax activation

    # load trained weights
    model.load_weights('model/deep_learning_model.h5')  # load weights from the trained model file

    # prepare the input
    input_features = prepare_input(password)  # prepare input features

    # make the prediction
    prediction = model.predict(input_features, verbose=0)  # predict using the model
    predicted_class = prediction.argmax(axis=-1)  # get the predicted class index (array of shape (1,))

    return int(predicted_class[0])  # return the predicted class as a plain int
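`predict` reduces the softmax output to a class index with `argmax`. The sketch below reproduces that step in plain Python for one made-up probability vector and maps the index to the labels named in the `main.py` comment (0 - weak, 1 - moderate, 2 - strong). `STRENGTH_LABELS` and `label_from_probabilities` are illustrative names, not part of the commit.

```python
STRENGTH_LABELS = ("weak", "moderate", "strong")  # index -> label, per the comment in main.py


def label_from_probabilities(probabilities):
    """Pick the most probable class and return (index, label)."""
    if len(probabilities) != len(STRENGTH_LABELS):
        raise ValueError("expected one probability per strength class")
    # argmax: index of the largest probability (the first one wins on ties)
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index, STRENGTH_LABELS[index]


# Made-up softmax output for a single password
print(label_from_probabilities([0.05, 0.15, 0.80]))  # (2, 'strong')
```

Printing the label alongside the index would make the command-line output easier to read than a bare `2 / 2`.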
851 Bytes
Binary file not shown.
@@ -0,0 +1,18 @@
import numpy as np

def calculate_entropy(password):  # function to calculate the entropy of a password
    if len(password) == 0:  # check if the password is empty
        return 0  # return 0 for empty passwords
    characters = np.array(list(password))  # convert password to a numpy array of characters
    unique, counts = np.unique(characters, return_counts=True)  # get unique characters and their counts
    probabilities = counts / len(password)  # calculate the probability of each character
    entropy = -np.sum(probabilities * np.log2(probabilities))  # compute the Shannon entropy in bits per character
    return entropy  # return the calculated entropy

def count_repeats(password):  # function to count consecutive repeated characters in the password
    return sum(password[i] == password[i + 1] for i in range(len(password) - 1))  # count adjacent pairs of equal characters

def count_sequential(password):  # function to count sequential characters in the password
    sequences = [''.join(chr(i) for i in range(start, start + 3)) for start in range(ord('a'), ord('z') - 1)]  # generate sequences of 3 lowercase letters ('abc' to 'xyz')
    sequences += [''.join(str(i) for i in range(start, start + 3)) for start in range(8)]  # generate sequences of 3 digits ('012' to '789')
    return sum(1 for seq in sequences if seq in password)  # count how many of the sequences are in the password
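`calculate_entropy` computes the Shannon entropy of the password's character distribution, H = -Σ p_i log2(p_i), in bits per character. A standard-library sketch of the same formula (the name `shannon_entropy` is illustrative) makes the behaviour easy to check by hand:

```python
import math
from collections import Counter


def shannon_entropy(password):
    """Shannon entropy of the character distribution, in bits per character."""
    if not password:
        return 0.0
    total = len(password)
    # p = count / total for each distinct character; H = sum(p * log2(1 / p))
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(password).values()
    )


print(shannon_entropy("aaaa"))  # 0.0 (one distinct character)
print(shannon_entropy("aabb"))  # 1.0 (two equally likely characters)
print(shannon_entropy("abcd"))  # 2.0 (four equally likely characters)
```

The value measures character variety only: `"abcd"` scores higher than `"aaaa"` even though it is a trivial sequence, which is the gap that `count_sequential` and `count_repeats` are meant to cover.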
@@ -0,0 +1,32 @@
import pandas as pd
import pickle

from model.utils.functions import calculate_entropy, count_repeats, count_sequential
from sklearn.preprocessing import StandardScaler

def run_preprocessing():
    # import data
    dataframe = pd.read_csv('model/passwords.csv', on_bad_lines='skip')  # read csv data file
    dataframe = dataframe.dropna()  # remove rows with empty values
    dataframe = dataframe.drop_duplicates(subset='password')  # remove duplicates

    # add new columns
    dataframe['length'] = dataframe['password'].str.len()  # column for password length
    dataframe['lowercase_count'] = dataframe['password'].apply(lambda x: sum(c.islower() for c in x))  # column for amount of lowercase characters
    dataframe['uppercase_count'] = dataframe['password'].apply(lambda x: sum(c.isupper() for c in x))  # column for amount of uppercase characters
    dataframe['digit_count'] = dataframe['password'].apply(lambda x: sum(c.isdigit() for c in x))  # column for amount of digits
    dataframe['special_count'] = dataframe['password'].apply(lambda x: sum(not c.isalnum() for c in x))  # column for amount of special characters
    dataframe['entropy'] = dataframe['password'].apply(calculate_entropy)  # column for entropy
    dataframe['repetitive_count'] = dataframe['password'].apply(count_repeats)  # column for amount of repetitive characters
    dataframe['sequential_count'] = dataframe['password'].apply(count_sequential)  # column for amount of sequential characters

    scaler = StandardScaler()  # use standard scaler because there is a gaussian distribution in passwords.csv
    numerical_features = ['length', 'lowercase_count', 'uppercase_count', 'digit_count', 'special_count', 'entropy', 'repetitive_count', 'sequential_count']
    dataframe[numerical_features] = scaler.fit_transform(dataframe[numerical_features])

    # save scaler model for future use
    with open('model/scaler.pkl', 'wb') as file:
        pickle.dump(scaler, file)

    # save preprocessed data
    dataframe.to_csv('model/output.csv', index=False, header=True)
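`StandardScaler` rescales each feature column to zero mean and unit variance using statistics learned in `fit`, and the pickled scaler reapplies those same statistics at prediction time. A minimal standard-library sketch of that fit/transform split for a single column follows; `fit_column` and `transform_value` are hypothetical helpers and the password lengths are made-up data.

```python
import statistics


def fit_column(values):
    """Learn the mean and population standard deviation of one feature column."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (ddof=0), as StandardScaler uses
    return mean, std or 1.0  # fall back to 1.0 for constant columns to avoid dividing by zero


def transform_value(value, mean, std):
    """Apply z = (x - mean) / std using previously fitted statistics."""
    return (value - mean) / std


training_lengths = [6, 8, 10, 12]  # made-up training data
mean, std = fit_column(training_lengths)
print(mean)  # 9.0
print(transform_value(9, mean, std))  # 0.0
print(transform_value(20, mean, std))  # well above the training mean
```

This is why the scaler is saved to `model/scaler.pkl`: a new password must be scaled with the training mean and standard deviation, not with statistics computed from the single input.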
@@ -0,0 +1,47 @@
# disable debugging messages
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings("ignore", category=DeprecationWarning)
import os
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
from silence_tensorflow import silence_tensorflow
silence_tensorflow("WARNING")

import pandas as pd

from sklearn.model_selection import train_test_split
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical


def run_training():  # function to run the training process
    dataframe = pd.read_csv('model/output.csv')  # load the processed data from output.csv

    # split the data into features and target variable
    X = dataframe[['length', 'lowercase_count', 'uppercase_count', 'digit_count', 'special_count', 'entropy', 'repetitive_count', 'sequential_count']]  # feature columns
    y = dataframe['strength']  # target variable

    # convert target variable to categorical
    y = to_categorical(y)  # convert labels to categorical format for multi-class classification

    # split into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)  # 80-20 split

    # initialize the model
    model = Sequential()  # create a sequential model
    model.add(Dense(128, activation='relu', input_shape=(X_train.shape[1],)))  # add input layer with 128 neurons
    model.add(Dense(64, activation='relu'))  # add hidden layer with 64 neurons
    model.add(Dense(y.shape[1], activation='softmax'))  # add output layer with softmax activation

    # compile the model
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])  # compile the model with adam optimizer

    # train the model
    model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)  # fit the model on training data

    # save the model to a file
    model.save('model/deep_learning_model.h5')  # save the trained model
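`to_categorical` turns each integer strength label into a one-hot vector so that it can be compared with the three softmax outputs under categorical cross-entropy. The sketch below shows both steps in plain Python for a single sample. `one_hot` and `categorical_cross_entropy` are illustrative helpers, the probability vectors are made up, and Keras additionally averages the loss over a batch.

```python
import math


def one_hot(label, num_classes=3):
    """Encode an integer class label as a one-hot list, e.g. 2 -> [0.0, 0.0, 1.0]."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    return [1.0 if index == label else 0.0 for index in range(num_classes)]


def categorical_cross_entropy(target, predicted):
    """Loss for one sample: -sum(t * log(p)) over the classes."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


target = one_hot(2)  # a password labeled "strong"
print(target)  # [0.0, 0.0, 1.0]
print(categorical_cross_entropy(target, [0.05, 0.15, 0.80]))  # -log(0.80)
print(categorical_cross_entropy(target, [0.80, 0.15, 0.05]))  # -log(0.05), a larger loss
```

A confident correct prediction gives a loss near zero, while a confident wrong one is penalized heavily, which is the signal the Adam optimizer minimizes during `model.fit`.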
+46
@@ -0,0 +1,46 @@
absl-py==2.1.0
astunparse==1.6.3
certifi==2024.8.30
charset-normalizer==3.4.0
flatbuffers==24.3.25
gast==0.6.0
google-pasta==0.2.0
grpcio==1.67.0
h5py==3.12.1
idna==3.10
joblib==1.4.2
keras==3.6.0
libclang==18.1.1
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe==3.0.1
mdurl==0.1.2
ml-dtypes==0.4.1
namex==0.0.8
numpy==1.26.4
opt_einsum==3.4.0
optree==0.13.0
packaging==24.1
pandas==2.2.3
protobuf==4.25.5
Pygments==2.18.0
python-dateutil==2.9.0.post0
pytz==2024.2
requests==2.32.3
rich==13.9.2
scikit-learn==1.5.2
scipy==1.14.1
setuptools==75.2.0
silence_tensorflow==1.2.2
six==1.16.0
tensorboard==2.17.1
tensorboard-data-server==0.7.2
tensorflow-cpu==2.17.0
termcolor==2.5.0
threadpoolctl==3.5.0
typing_extensions==4.12.2
tzdata==2024.2
urllib3==2.2.3
Werkzeug==3.0.4
wheel==0.44.0
wrapt==1.16.0

README.md

+1
@@ -101,6 +101,7 @@ More information on contributing and the general code of conduct for discussion
 | OTP Verification | [OTP Verification](https://github.com/DhanushNehru/Python-Scripts/tree/master/OTP%20%20Verify) | An OTP Verification Checker. |
 | Password Generator | [Password Generator](https://github.com/DhanushNehru/Python-Scripts/tree/master/Password%20Generator) | Generates a random password. |
 | Password Manager | [Password Manager](https://github.com/nem5345/Python-Scripts/tree/master/Password%20Manager) | Generate and interact with a password manager. |
+| Password Strength Checker | [Password Strength Checker](https://github.com/nem5345/Python-Scripts/tree/master/Password%20Strength%20Checker) | Evaluates how strong a given password is. |
 | PDF Merger | [PDF Merger](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20Merger) | Merges multiple PDF files into a single PDF, with options for output location and custom order. |
 | PDF to Audio | [PDF to Audio](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20to%20Audio) | Converts PDF to audio. |
 | PDF to Text | [PDF to text](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20to%20text) | Converts PDF to text. |
