Naan Mudhalvan is a pioneering initiative by the Government of Tamil Nadu, India, aimed at enhancing the employability of students through industry-aligned training programs. The program collaborates with leading technology companies and educational institutions to provide cutting-edge training in various domains, including Artificial Intelligence, Machine Learning, and Speech Recognition.
GUVI (Grab Your Vernacular Imprint) is an ed-tech platform that provides technical education in vernacular languages. As a partner in the Naan Mudhalvan program, GUVI offers specialized courses in emerging technologies, including its Speech Recognition course, which covers both theoretical concepts and practical implementations.
This repository contains three major projects developed as part of the Naan Mudhalvan Speech Recognition course:
- Audio Data Preprocessing and Augmentation (nm-unit1/)
  - Comprehensive audio data preparation pipeline
  - Format conversion and standardization
  - Data augmentation techniques
  - Metadata management
- Accent-Aware Speech Recognition (nm-unit4/)
  - Advanced ASR system with accent adaptation
  - Deep learning-based implementation
  - Support for multiple English accents
  - Performance evaluation and metrics
- Call Clarity Monitor (nm-unit5/)
  - Real-time call quality monitoring system
  - Speech recognition and analysis
  - Content safety monitoring
  - Text analytics integration
  - Django-based backend implementation
nm-projects-speech-recognition/
├── nm-unit1/ # Audio Preprocessing Project
│ ├── Speech-to-Text Transcription System.ipynb
│ ├── README.md
│ ├── LICENSE
│ ├── CONTRIBUTING.md
│ └── requirements.txt
│
├── nm-unit4/ # Accent-Aware ASR Project
│ ├── Accent-Aware Speech Recognition.ipynb
│ ├── README.md
│ ├── LICENSE
│ ├── CONTRIBUTING.md
│ └── requirements.txt
│
├── nm-unit5/ # Call Clarity Monitor
│ ├── call-clarity-monitor/ # Frontend application
│ ├── call_clarity_backend/ # Django backend
│ ├── analyzer/ # Analysis modules
│ ├── media/ # Media storage
│ ├── documentation.md # Detailed documentation
│ ├── about.md # Project information
│ ├── requirements.txt # Dependencies
│ └── manage.py # Django management script
│
└── README.md # This file
The Naan Mudhalvan Speech Recognition course, in collaboration with GUVI, covers:
- Fundamentals of Speech Processing
  - Audio signal processing
  - Feature extraction
  - Data preprocessing techniques
- Machine Learning for Speech
  - Deep learning architectures
  - Model training and optimization
  - Performance evaluation
- Practical Implementation
  - Real-world applications
  - Industry best practices
  - Project-based learning
  - Full-stack development
  - System integration
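To make the "Performance evaluation" item concrete, here is a minimal sketch of word error rate (WER), a metric commonly reported for speech recognition systems. This README does not state which metrics the course or the projects actually use, so WER is an assumption here, and the function name `word_error_rate` is hypothetical rather than taken from the notebooks.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative only: WER is assumed as the metric; it is not confirmed
    by this repository.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # matching i reference words against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("turn the lights on", "turn lights on")` returns `0.25`: one deleted word out of four reference words. Note that WER can exceed 1.0 when the hypothesis contains many inserted words.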
- Clone the repository:

      git clone https://github.com/itzi-vignesh/nm-projects-speech-recognition.git
      cd nm-projects-speech-recognition

- Install dependencies for each project:

      # For Audio Preprocessing Project
      cd nm-unit1
      pip install -r requirements.txt

      # For Accent-Aware ASR Project
      cd ../nm-unit4
      pip install -r requirements.txt

      # For Call Clarity Monitor
      cd ../nm-unit5
      pip install -r requirements.txt

- Follow the individual project READMEs for detailed setup and usage instructions.
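Each project has its own requirements.txt, and this README does not say whether the three dependency sets are compatible with one another. One option, sketched below, is to give each project its own virtual environment. The `.venv` location and the helper names `plan_install` and `run_plan` are assumptions for illustration and are not part of the repository.

```python
import subprocess
import sys
from pathlib import Path

# Project directories as listed in this README's directory tree.
PROJECTS = ("nm-unit1", "nm-unit4", "nm-unit5")


def plan_install(repo_root: Path) -> list[list[str]]:
    """Build the commands that create one virtual environment per project
    and install that project's requirements into it."""
    commands = []
    for name in PROJECTS:
        project = repo_root / name
        requirements = project / "requirements.txt"
        if not requirements.is_file():
            continue  # skip projects that are missing or have no requirements file

        venv_dir = project / ".venv"  # hypothetical location, not defined by the repo
        bin_dir = "Scripts" if sys.platform == "win32" else "bin"
        venv_python = venv_dir / bin_dir / "python"

        commands.append([sys.executable, "-m", "venv", str(venv_dir)])
        commands.append(
            [str(venv_python), "-m", "pip", "install", "-r", str(requirements)]
        )
    return commands


def run_plan(commands: list[list[str]]) -> None:
    """Execute the planned commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

From the repository root, `plan_install(Path("."))` returns the command list so it can be inspected first, and `run_plan(plan_install(Path(".")))` executes it. Keeping planning and execution separate is a deliberate choice here so that nothing is installed until the commands have been reviewed.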
We welcome contributions! Please see the CONTRIBUTING.md file for guidelines on how to contribute to this project.
This project is licensed under the MIT License - see the LICENSE file for details.
- Naan Mudhalvan Program for providing the learning opportunity
- GUVI for their comprehensive course content and support
- The Common Voice dataset, used as training data
- Open-source community for various tools and libraries
- Django framework and its contributors
- OpenAI and other API providers
For questions and feedback, please open an issue in the GitHub repository or contact:
- GitHub: itzi-vignesh
- Project Repository: nm-projects-speech-recognition