Add Password Strength Checker script with ML and NN (DhanushNehru#349)

iHaz32 · web-flow · commit 477f1022231e · 2024-10-21T18:45:03.000+06:00
* Add password strength checker source code

* Update README.md

* Fix README.md

* Fix typo in model README.md
diff --git a/Password Strength Checker/README.md b/Password Strength Checker/README.md
@@ -0,0 +1,31 @@
+# Password Strength Checker
+
+## Description
+A password strength checker that uses machine learning to classify the strength of passwords. It prompts for a password in the terminal and prints a strength class: 0 (weak), 1 (moderate), or 2 (strong).
+
+## Features
+- Classifies password strength into three categories: weak, moderate, and strong.
+
+## Installation
+1. Clone the repository:
+   ```bash
+   git clone https://github.com/DhanushNehru/Python-Scripts
+   cd "Python-Scripts/Password Strength Checker"
+   ```
+
+2. Create and activate a virtual environment:
+   ```bash
+   python3 -m venv venv
+   source venv/bin/activate  # On Windows use `venv\Scripts\activate`
+   ```
+
+3. Install the required packages:
+   ```bash
+   pip install -r requirements.txt
+   ```
+
+## Usage
+To run the password strength checker:
+   ```bash
+   python main.py
+   ```
diff --git a/Password Strength Checker/main.py b/Password Strength Checker/main.py
@@ -0,0 +1,9 @@
+from model.model import predict   # import model
+
+def main():
+    password_to_test = input("Enter a password to check its strength: ")   # get password from terminal
+    predicted_class = int(predict(password_to_test))   # evaluate password strength
+    print(f"Password strength classification: {predicted_class} / 2")   # output 0 - weak, 1 - moderate, or 2 - strong
+
+if __name__ == "__main__":
+    main()
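The script prints only the numeric class. A minimal sketch of how that index could be turned into a readable label follows; `STRENGTH_LABELS` and `describe_strength` are hypothetical names that do not exist in this PR, and the label text is taken from the comment in `main.py`.

```python
# Hypothetical mapping, not part of this PR.
# Labels follow the comment in main.py: 0 - weak, 1 - moderate, 2 - strong.
STRENGTH_LABELS = {0: "weak", 1: "moderate", 2: "strong"}


def describe_strength(predicted_class: int) -> str:
    """Return a readable label for a class index, or 'unknown' if out of range."""
    return STRENGTH_LABELS.get(int(predicted_class), "unknown")


if __name__ == "__main__":
    for index in range(3):
        print(f"{index} / 2 -> {describe_strength(index)}")
```

Falling back to `"unknown"` rather than raising keeps the command-line output usable if the model is ever retrained with a different number of classes.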
diff --git a/Password Strength Checker/model/README.md b/Password Strength Checker/model/README.md
@@ -0,0 +1,15 @@
+# Password Strength Classification Model
+
+## Overview
+This model is designed to evaluate the strength of passwords using machine learning techniques. It analyzes input passwords and classifies them based on their strength, providing feedback for users to create stronger passwords.
+
+## Model Architecture
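+- **Input Layer**: The model accepts eight numeric features extracted from the password: length, lowercase, uppercase, digit, and special-character counts, entropy, repeated-character count, and sequential-character count.
+- **Dense Layers**: Two dense layers with 128 and 64 units, both using ReLU activation, process the input features.
+- **Output Layer**: A three-unit softmax layer outputs class probabilities; the most probable class is the password strength (weak - 0, moderate - 1, strong - 2).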
+
+## Training
+- The model is trained on a labeled dataset of passwords classified by strength.
+
+## Future improvements
+- In feature engineering, add columns that count commonly used passwords (e.g. 'password') or commonly used words contained in the input, and take them into account during model training.
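The future improvement described above could be sketched roughly as follows. The word list and the function name `common_word_count` are hypothetical and only illustrate the idea; a real feature would need a much larger list. Adding such a column would also mean regenerating `scaler.pkl` and retraining, since `predict` hard-codes `input_shape=(8,)`.

```python
# Hypothetical word list for illustration only; not part of this PR.
COMMON_WORDS = ("password", "qwerty", "letmein", "admin", "welcome")


def common_word_count(password: str) -> int:
    """Count how many of the listed common words appear in the password.

    The match is case-insensitive and each word is counted at most once.
    """
    lowered = password.lower()
    return sum(1 for word in COMMON_WORDS if word in lowered)


if __name__ == "__main__":
    for candidate in ("Password123", "AdminWelcome!", "x9$Lm2"):
        print(candidate, common_word_count(candidate))
```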
diff --git a/Password Strength Checker/model/deep_learning_model.h5 b/Password Strength Checker/model/deep_learning_model.h5
diff --git a/Password Strength Checker/model/model.py b/Password Strength Checker/model/model.py
@@ -0,0 +1,68 @@
+# disable debugging messages
+def warn(*args, **kwargs):
+    pass  
+import warnings   
+warnings.warn = warn   
+warnings.filterwarnings("ignore", category=DeprecationWarning)  
+import os
+os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0' 
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
+from silence_tensorflow import silence_tensorflow
+silence_tensorflow("WARNING")
+
+import pandas as pd
+import pickle
+
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
+from model.utils.functions import calculate_entropy, count_repeats, count_sequential
+from model.utils.preprocessing import run_preprocessing
+from model.utils.training import run_training
+
+# run preprocessing and training
+# run_preprocessing()  # uncomment to run preprocessing
+# run_training()  # uncomment to train the model
+
+def prepare_input(password):  # function to prepare input features from password
+    # create a dataframe for a single input
+    data = {
+        'length': [len(password)],  # calculate password length
+        'lowercase_count': [sum(c.islower() for c in password)],  # count lowercase characters
+        'uppercase_count': [sum(c.isupper() for c in password)],  # count uppercase characters
+        'digit_count': [sum(c.isdigit() for c in password)],  # count digits
+        'special_count': [sum(not c.isalnum() for c in password)],  # count special characters
+        'entropy': [calculate_entropy(password)],  # calculate entropy
+        'repetitive_count': [count_repeats(password)],  # count repetitive characters
+        'sequential_count': [count_sequential(password)]  # count sequential characters
+    }
+
+    with open('model/scaler.pkl', 'rb') as file:  # load the fitted scaler from file
+        scaler = pickle.load(file)
+    
+    # convert to dataframe
+    input_df = pd.DataFrame(data)
+    
+    # normalize using the previously fitted scaler
+    normalized_input = scaler.transform(input_df)
+    
+    return pd.DataFrame(normalized_input, columns=input_df.columns)  # return normalized input as dataframe
+
+def predict(password):  # function to predict password strength
+    # load the model
+    model = Sequential()  # create a sequential model
+    model.add(Dense(128, activation='relu', input_shape=(8,)))  # add input layer with 128 neurons
+    model.add(Dense(64, activation='relu'))  # add hidden layer with 64 neurons
+    model.add(Dense(3, activation='softmax'))  # add output layer with softmax activation
+
+    # load trained weights
+    model.load_weights('model/deep_learning_model.h5')  # load weights from the trained model file
+
+    # prepare the input
+    password_to_test = password  # assign password to test
+    input_features = prepare_input(password_to_test)  # prepare input features
+
+    # make the prediction
+    prediction = model.predict(input_features, verbose=0)  # predict using the model
+    predicted_class = int(prediction.argmax(axis=-1)[0])  # index of the most probable class for the single input row
+
+    return predicted_class  # return the predicted class as a plain int (0, 1, or 2)
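The last step of `predict` picks the class with the highest softmax probability. A dependency-free sketch of that selection is shown below; the probabilities are made up for illustration and the helper name `argmax` is an assumption, standing in for the NumPy method used in the PR.

```python
def argmax(values):
    """Return the index of the largest value; the first index wins on ties."""
    return max(range(len(values)), key=lambda i: values[i])


if __name__ == "__main__":
    # Made-up softmax output for one password: P(weak), P(moderate), P(strong).
    probabilities = [0.08, 0.71, 0.21]
    print(argmax(probabilities))
```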
diff --git a/Password Strength Checker/model/scaler.pkl b/Password Strength Checker/model/scaler.pkl
diff --git a/Password Strength Checker/model/utils/functions.py b/Password Strength Checker/model/utils/functions.py
@@ -0,0 +1,18 @@
+import numpy as np
+
+def calculate_entropy(password):   # function to calculate the entropy of a password
+    if len(password) == 0:   # check if the password is empty
+        return 0   # return 0 for empty passwords
+    char_counts = np.array(list(password))   # convert password to a numpy array
+    unique, counts = np.unique(char_counts, return_counts=True)   # get unique characters and their counts
+    probabilities = counts / len(password)   # calculate the probability of each character
+    entropy = -np.sum(probabilities * np.log2(probabilities))   # compute the entropy using the probabilities
+    return entropy  # return the calculated entropy
+
+def count_repeats(password):   # function to count consecutive repeated characters in the password
+    return sum(password[i] == password[i + 1] for i in range(len(password) - 1))   # sum the repeated characters
+
+def count_sequential(password):   # function to count sequential characters in the password
+    sequences = [''.join(chr(i) for i in range(start, start + 3)) for start in range(ord('a'), ord('z') - 1)]   # generate sequences of 3 lowercase letters
+    sequences += [''.join(str(i) for i in range(start, start + 3)) for start in range(8)]   # generate sequences of 3 digits ('012' through '789'); range(10) would also produce '8910' and '91011'
+    return sum(1 for seq in sequences if seq in password)   # count how many of the sequences are in the password
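To make the three feature helpers easier to check by hand, here is a standard-library sketch of the same ideas. The function names are hypothetical, and this version generates digit runs only for starts 0 through 7 so that every run has exactly three digits. Like the code in the PR, it matches lowercase letter runs only and counts each distinct run once.

```python
import math
from collections import Counter


def shannon_entropy(password: str) -> float:
    """Shannon entropy in bits per character, from character frequencies."""
    if not password:
        return 0.0
    total = len(password)
    return -sum((n / total) * math.log2(n / total) for n in Counter(password).values())


def adjacent_repeats(password: str) -> int:
    """Number of positions where a character equals the one right after it."""
    return sum(a == b for a, b in zip(password, password[1:]))


def ascending_runs(password: str) -> int:
    """Number of distinct 3-character ascending runs present in the password."""
    letters = ["".join(chr(c + k) for k in range(3)) for c in range(ord("a"), ord("x") + 1)]
    digits = ["".join(str(d + k) for k in range(3)) for d in range(8)]
    return sum(seq in password for seq in letters + digits)


if __name__ == "__main__":
    for candidate in ("abcd", "aabbb", "abc123"):
        print(candidate, shannon_entropy(candidate), adjacent_repeats(candidate), ascending_runs(candidate))
```

Four distinct characters give 2 bits per character, `"aabbb"` has three adjacent repeats, and `"abc123"` contains two ascending runs.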
diff --git a/Password Strength Checker/model/utils/preprocessing.py b/Password Strength Checker/model/utils/preprocessing.py
@@ -0,0 +1,32 @@
+import pandas as pd
+import pickle
+
+from model.utils.functions import calculate_entropy, count_repeats, count_sequential
+from sklearn.preprocessing import StandardScaler
+
+def run_preprocessing():
+    # import data
+    dataframe = pd.read_csv('model/passwords.csv', on_bad_lines='skip')   # read csv data file
+    dataframe = dataframe.dropna()   # remove rows with empty values
+    dataframe = dataframe.drop_duplicates(subset='password')   # remove duplicates
+
+    # add new columns
+    dataframe['length'] = dataframe['password'].str.len()   # column for password length
+    dataframe['lowercase_count'] = dataframe['password'].apply(lambda x: sum(c.islower() for c in x))   # column for amount of lowercase characters
+    dataframe['uppercase_count'] = dataframe['password'].apply(lambda x: sum(c.isupper() for c in x))   # column for amount of uppercase characters
+    dataframe['digit_count'] = dataframe['password'].apply(lambda x: sum(c.isdigit() for c in x))   # column for amount of digits
+    dataframe['special_count'] = dataframe['password'].apply(lambda x: sum(not c.isalnum() for c in x))   # column for amount of special characters
+    dataframe['entropy'] = dataframe['password'].apply(calculate_entropy)  # column for entropy
+    dataframe['repetitive_count'] = dataframe['password'].apply(count_repeats)  # column for amount of repetitive characters
+    dataframe['sequential_count'] = dataframe['password'].apply(count_sequential)  # column for amount of sequential characters
+
+    scaler = StandardScaler()   # use standard scaler because there is a gaussian distribution in passwords.csv
+    numerical_features = ['length', 'lowercase_count', 'uppercase_count', 'digit_count', 'special_count', 'entropy', 'repetitive_count', 'sequential_count']
+    dataframe[numerical_features] = scaler.fit_transform(dataframe[numerical_features])
+
+    # save scaler model for future use
+    with open('model/scaler.pkl', 'wb') as file:
+        pickle.dump(scaler, file)
+
+    # save preprocessed data
+    dataframe.to_csv('model/output.csv', index=False, header=True)
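`run_preprocessing` fits a `StandardScaler` on the training data and pickles it so that `prepare_input` can apply the same mean and scale at prediction time. The arithmetic behind that can be sketched without scikit-learn as below. The helper names and sample values are hypothetical; the population standard deviation and the fallback to a scale of 1 for constant columns are meant to mirror `StandardScaler`'s behaviour.

```python
import math


def fit_standardizer(values):
    """Return (mean, std) for a column, using the population standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0:
        std = 1.0  # constant column: avoid division by zero
    return mean, std


def standardize(value, mean, std):
    """Apply previously fitted statistics to a new value."""
    return (value - mean) / std


if __name__ == "__main__":
    train_lengths = [6, 8, 10, 12]  # made-up password lengths
    mean, std = fit_standardizer(train_lengths)
    # A new password of length 14 is scaled with the training statistics.
    print(standardize(14, mean, std))
```

The key point is that the statistics come from the training data only; a password seen at prediction time is transformed with them, never used to refit.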
diff --git a/Password Strength Checker/model/utils/training.py b/Password Strength Checker/model/utils/training.py
@@ -0,0 +1,47 @@
+# disable debugging messages
+def warn(*args, **kwargs):   
+    pass  
+import warnings   
+warnings.warn = warn  
+warnings.filterwarnings("ignore", category=DeprecationWarning)  
+import os
+os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0' 
+os.environ['TF_CPP_MIN_LOG_LEVEL'] = '1'
+from silence_tensorflow import silence_tensorflow
+silence_tensorflow("WARNING")
+
+import pandas as pd
+
+from sklearn.model_selection import train_test_split
+from tensorflow.keras.models import Sequential
+from tensorflow.keras.layers import Dense
+from tensorflow.keras.utils import to_categorical
+
+
+def run_training():   # function to run the training process
+    dataframe = pd.read_csv('model/output.csv')  # load the processed data from output.csv
+
+    # split the data into features and target variable
+    X = dataframe[['length', 'lowercase_count', 'uppercase_count', 'digit_count', 'special_count', 'entropy', 'repetitive_count', 'sequential_count']]  # feature columns
+    y = dataframe['strength']  # target variable
+
+    # convert target variable to categorical
+    y = to_categorical(y)  # convert labels to categorical format for multi-class classification
+
+    # split into training and test sets
+    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)  # 80-20 split
+
+    # initialize the model
+    model = Sequential()  # create a sequential model
+    model.add(Dense(128, activation='relu', input_shape=(X_train.shape[1],)))  # add input layer with 128 neurons
+    model.add(Dense(64, activation='relu'))  # add hidden layer with 64 neurons
+    model.add(Dense(y.shape[1], activation='softmax'))  # add output layer with softmax activation
+
+    # compile the model
+    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])  # compile the model with adam optimizer
+
+    # train the model
+    model.fit(X_train, y_train, epochs=50, batch_size=32, validation_split=0.2)  # fit the model on training data
+
+    # save the model to a file
+    model.save('model/deep_learning_model.h5')  # save the trained model
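`run_training` holds out 20% of the rows with `train_test_split` and then passes `validation_split=0.2` to `model.fit`, so roughly 64% of the data is used for fitting. The sketch below works through that arithmetic; `split_sizes` is a hypothetical helper that uses integer rounding for clarity, so the libraries' own counts may differ by a sample. In the code shown in this diff, `X_test` and `y_test` are created but never evaluated.

```python
def split_sizes(n_samples: int, test_percent: int = 20, validation_percent: int = 20):
    """Approximate row counts for a test split followed by a validation split."""
    n_test = n_samples * test_percent // 100
    n_train_total = n_samples - n_test
    n_validation = n_train_total * validation_percent // 100
    n_fit = n_train_total - n_validation
    return {"fit": n_fit, "validation": n_validation, "test": n_test}


if __name__ == "__main__":
    print(split_sizes(1000))
```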
diff --git a/Password Strength Checker/requirements.txt b/Password Strength Checker/requirements.txt
@@ -0,0 +1,46 @@
+absl-py==2.1.0
+astunparse==1.6.3
+certifi==2024.8.30
+charset-normalizer==3.4.0
+flatbuffers==24.3.25
+gast==0.6.0
+google-pasta==0.2.0
+grpcio==1.67.0
+h5py==3.12.1
+idna==3.10
+joblib==1.4.2
+keras==3.6.0
+libclang==18.1.1
+Markdown==3.7
+markdown-it-py==3.0.0
+MarkupSafe==3.0.1
+mdurl==0.1.2
+ml-dtypes==0.4.1
+namex==0.0.8
+numpy==1.26.4
+opt_einsum==3.4.0
+optree==0.13.0
+packaging==24.1
+pandas==2.2.3
+protobuf==4.25.5
+Pygments==2.18.0
+python-dateutil==2.9.0.post0
+pytz==2024.2
+requests==2.32.3
+rich==13.9.2
+scikit-learn==1.5.2
+scipy==1.14.1
+setuptools==75.2.0
+silence_tensorflow==1.2.2
+six==1.16.0
+tensorboard==2.17.1
+tensorboard-data-server==0.7.2
+tensorflow-cpu==2.17.0
+termcolor==2.5.0
+threadpoolctl==3.5.0
+typing_extensions==4.12.2
+tzdata==2024.2
+urllib3==2.2.3
+Werkzeug==3.0.4
+wheel==0.44.0
+wrapt==1.16.0
diff --git a/README.md b/README.md
@@ -101,6 +101,7 @@ More information on contributing and the general code of conduct for discussion
 | OTP Verification                     | [OTP Verification](https://github.com/DhanushNehru/Python-Scripts/tree/master/OTP%20%20Verify)                                                | An OTP Verification Checker.                                                                                        |
 | Password Generator                   | [Password Generator](https://github.com/DhanushNehru/Python-Scripts/tree/master/Password%20Generator)                                         | Generates a random password.                                                                                        |
 | Password Manager                     | [Password Manager](https://github.com/nem5345/Python-Scripts/tree/master/Password%20Manager)                                                  | Generate and interact with a password manager.                                                                      |
+| Password Strength Checker            | [Password Strength Checker](https://github.com/DhanushNehru/Python-Scripts/tree/master/Password%20Strength%20Checker)                         | Evaluates how strong a given password is.                                                                           |
 | PDF Merger                           | [PDF Merger](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20Merger)                                                         |Merges multiple PDF files into a single PDF, with options for output location and custom order.|
 | PDF to Audio                         | [PDF to Audio](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20to%20Audio)                                                   | Converts PDF to audio.                                                                                              |
 | PDF to Text                         | [PDF to text](https://github.com/DhanushNehru/Python-Scripts/tree/master/PDF%20to%20text)                                                   | Converts PDF to text.                                                                                              |