larymak · larymak · Feb 8, 2023 · Jan 26, 2023 · Jan 26, 2023 · Jan 26, 2023
diff --git a/PYTHON APPS/PDF-Text-Extractor/README.md b/PYTHON APPS/PDF-Text-Extractor/README.md
@@ -0,0 +1,77 @@
+# PDF-Text-Extractor
+This GUI application lets you extract the text from PDF files. The project is built using the PyPDF2 library for extracting text from PDFs and the tkinter library for creating the GUI.
+
+## Getting Started
+To run the project, you will need to have Python and pip installed on your system.
+
+### Installation
+1. Clone or download the repository to your local machine.
+
+   ```
+   git clone https://github.com/SamAddy/PDF-Extract-Text.git
+   ```
+
+2. Enter the working directory.
+
+   ```
+   cd PDF-Extract-Text
+   ```
+
+3. Use pip to install the required libraries.
+
+   ```
+   pip install -r requirements.txt
+   ```
+
+### Usage
+1. Run the app using the following command:
+
+   ```
+   python app.py
+   ```
+
+2. A GUI window will appear with a button for selecting the PDF file you want to extract text from.
+
+3. Once you have selected the file, the text will be extracted and displayed in the text box. 
+
+4. You can also save the text to a file by clicking the 'Save' button.
+
+<!--
+<p align="center">
+<img src="https://github.com/SamAddy/PDF-Extract-Text/blob/main/Stage1.png" width=50% alt="Browse file"/>
+<img src="https://github.com/SamAddy/PDF-Extract-Text/blob/main/Stage2.png" width=50% alt="Display extracted text">
+</p>
+
+
+<p align="center">
+![Browse file](https://github.com/SamAddy/PDF-Extract-Text/blob/main/Stage1.png)
+![Display text in textbox](https://github.com/SamAddy/PDF-Extract-Text/blob/main/Stage2.png)
+</p>
+-->
+
+<table align="center">
+  <tr>
+    <td>
+      <img src="https://github.com/SamAddy/PDF-Extract-Text/blob/main/Stage1.png" alt="image1" width="400"/>
+    </td>
+    <td>
+      <img src="https://github.com/SamAddy/PDF-Extract-Text/blob/main/Stage2.png" alt="image2" width="400"/>
+    </td>
+  </tr>
+</table>
+
+
+
+### Note 
+Please keep in mind that not all PDFs are created equal: some store their text as images (for example, scanned pages) or in other formats that PyPDF2 may not be able to extract.
+
+### Built With
+ * [Python](https://www.python.org/) - The programming language used.
+ * [PyPDF2](https://pypi.org/project/PyPDF2/) - A library for extracting text from PDF files.
+ * [Tkinter](https://docs.python.org/3/library/tk.html) - A library for creating GUIs in Python.
+
+### Contributing 
+Contributions are absolutely welcome. If you have an idea for an improvement, please open an issue or submit a pull request.
+
+### Acknowledgement
+* Inspiration: [Mariya Sha](https://github.com/MariyaSha/PDFextract_text)
diff --git a/PYTHON APPS/PDF-Text-Extractor/Stage1.png b/PYTHON APPS/PDF-Text-Extractor/Stage1.png
diff --git a/PYTHON APPS/PDF-Text-Extractor/Stage2.png b/PYTHON APPS/PDF-Text-Extractor/Stage2.png
diff --git a/PYTHON APPS/PDF-Text-Extractor/app.py b/PYTHON APPS/PDF-Text-Extractor/app.py
@@ -0,0 +1,60 @@
+import tkinter as tk
+import PyPDF2
+from PIL import Image, ImageTk
+from tkinter.filedialog import askopenfile
+
+root = tk.Tk()
+root.title('PDF to TEXT')
+root.iconphoto(False, tk.PhotoImage(file='./logo.png'))  # iconphoto accepts PNG; iconbitmap expects .ico/.xbm
+root.resizable(False, False)
+
+
+canvas = tk.Canvas(root, width=600, height=400)
+canvas.grid(columnspan=3, rowspan=3)
+
+# Insert logo into the window
+logo = Image.open('logo2.png')
+logo = ImageTk.PhotoImage(logo)
+logo_label = tk.Label(image=logo)
+logo_label.image = logo
+logo_label.grid(column=1, row=0)
+
+# instructions
+instructions = tk.Label(root, text='Select a PDF file on your device to extract all its text.', font='calibre')
+instructions.grid(columnspan=3, column=0, row=1)
+
+# Get the PDF file on device
+browse_text = tk.StringVar()
+browse_btn = tk.Button(root, textvariable=browse_text, command=lambda: open_file(), font='calibre', bg='red', width=15, height=2)
+browse_text.set('Browse')
+browse_btn.grid(column=1, row=2)
+
+canvas = tk.Canvas(root, width=600, height=200)
+canvas.grid(columnspan=3, rowspan=3)
+
+
+def open_file():
+    browse_text.set('On it...')
+    # Ask the user for a PDF and open it in binary mode; returns None if the dialog is cancelled
+    file = askopenfile(parent=root, mode='rb', title='Choose a file', filetypes=[('PDF file', '*.pdf')])
+    text = ""
+
+    if file:
+        read_pdf = PyPDF2.PdfReader(file)
+        for page in read_pdf.pages:
+            text += page.extract_text() or ""
+
+        text_box = tk.Text(root, height=10, width=50, padx=15, pady=15)
+        text_box.insert('1.0', text)
+        text_box.tag_config('center', justify='center')
+        text_box.tag_add('center', '1.0', 'end')
+        text_box.grid(column=1, row=3)
+
+    browse_text.set('Browse')
+
+
+def convert_to_docx():
+    pass
+
+
+root.mainloop()
diff --git a/PYTHON APPS/PDF-Text-Extractor/logo2.png b/PYTHON APPS/PDF-Text-Extractor/logo2.png
diff --git a/PYTHON APPS/PDF-Text-Extractor/random_text.pdf b/PYTHON APPS/PDF-Text-Extractor/random_text.pdf
diff --git a/PYTHON APPS/PDF-Text-Extractor/requirements.txt b/PYTHON APPS/PDF-Text-Extractor/requirements.txt
diff --git a/README.md b/README.md
@@ -115,3 +115,4 @@ guide [HERE](https://github.com/larymak/Python-project-Scripts/blob/main/CONTRIB
 | 64    | [Umbrella Reminder](https://github.com/larymak/Python-project-Scripts/tree/main/TIME%20SCRIPTS/Umbrella%20Reminder)                                   | [Edula Vinay Kumar Reddy](https://github.com/vinayedula)    |
 | 65    | [Image to PDF](https://github.com/larymak/Python-project-Scripts/tree/main/IMAGES%20%26%20PHOTO%20SCRIPTS/Image%20to%20PDF)                       | [Vedant Chainani](https://github.com/Envoy-VC)              |
 | 66    | [KeyLogger](https://github.com/larymak/Python-project-Scripts/tree/main/OTHERS/KeyLogger)                                                         | [Akhil](https://github.com/akhil-chagarlamudi)              |
+| 67    | [PDF Text Extractor](https://github.com/SamAddy/Python-project-Scripts/tree/main/PYTHON%20APPS/PDF-Text-Extractor)                                                         | [Samuel Addison](https://github.com/SamAddy)              |
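
The README note about PDFs whose text is stored as images can be made visible to the user by separating extraction from the GUI and tracking pages that yield no text. The following is a minimal sketch, not part of this PR: `collect_text` and `extract_from_path` are hypothetical names, and the only PyPDF2 calls assumed are `PdfReader`, `.pages`, and `extract_text()`, which `app.py` already uses.

```python
from typing import Iterable, List, Tuple


def collect_text(pages: Iterable) -> Tuple[str, List[int]]:
    """Join the text of every page and report pages that yielded none.

    `pages` is any iterable of objects exposing `extract_text()`, such as
    `PyPDF2.PdfReader(...).pages`. Returns the combined text and the
    1-based numbers of pages with no extractable text.
    """
    chunks = []
    empty_pages = []
    for number, page in enumerate(pages, start=1):
        page_text = page.extract_text() or ""
        if not page_text.strip():
            # Possibly a scanned or image-only page (an assumption, not a certainty)
            empty_pages.append(number)
        chunks.append(page_text)
    return "\n".join(chunks), empty_pages


def extract_from_path(path: str) -> Tuple[str, List[int]]:
    """Hypothetical wrapper; imports PyPDF2 only when it is called."""
    import PyPDF2  # same library app.py uses

    with open(path, "rb") as handle:
        # Extraction happens while the file is still open
        return collect_text(PyPDF2.PdfReader(handle).pages)
```

With a helper like this, `open_file()` could show a short warning when the returned page list is non-empty instead of displaying an empty text box.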