Deep Learning OCR-NLP based AI Project
Identify ICD code from doctor-patient conversation using AI/ML models
- Developed a natural language processing (NLP) and machine learning model to predict ICD codes from medical records supplied as documents (pdf, doc, docx), images (png, jpeg, tiff), and audio (doctor-patient conversations in .wav format).
- Converted sample doctor notes, nurse notes, and laboratory reports in different formats (pdf, doc, png, .wav) into text.
- Converted audio to text using speech recognition and generated a medical report in doc format from the extracted entities.
- Extracted text from images using Optical Character Recognition (OCR).
- Extracted entities from the text using Named-Entity Recognition (NER) and a rule-based approach.
- Generated a medical description from the extracted entities and the conditions applied.
- Predicted ICD codes by measuring the cosine similarity between the generated medical description and the official description of each ICD code, then selecting the code with the highest similarity score.
- Developed a web application where a user can upload a medical report in different formats (audio/image/document) and the corresponding ICD code is generated and displayed.
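
The similarity-based prediction step can be illustrated with a minimal sketch. It is not the project's actual code: the project description does not say how text was vectorized, so plain word-count vectors are assumed here, and the names `ICD_DESCRIPTIONS`, `tokenize`, `cosine_similarity`, and `predict_icd` are hypothetical. The three ICD-10-CM entries are a tiny sample standing in for the full code table.

```python
import math
import re
from collections import Counter

# Tiny illustrative sample of ICD-10-CM codes (assumption: a real system
# would load the complete code table instead of this hard-coded dict).
ICD_DESCRIPTIONS = {
    "E11.9": "Type 2 diabetes mellitus without complications",
    "I10": "Essential (primary) hypertension",
    "J45.909": "Unspecified asthma, uncomplicated",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def predict_icd(description, icd_descriptions=ICD_DESCRIPTIONS):
    """Return (code, score) for the ICD description most similar to the input."""
    query = Counter(tokenize(description))
    scores = {
        code: cosine_similarity(query, Counter(tokenize(text)))
        for code, text in icd_descriptions.items()
    }
    best = max(scores, key=scores.get)  # highest similarity, not distance
    return best, scores[best]


if __name__ == "__main__":
    code, score = predict_icd(
        "Patient has type 2 diabetes mellitus with no complications"
    )
    print(code, round(score, 3))
```

Word counts keep the sketch dependency-free; TF-IDF weights or sentence embeddings could replace them without changing the selection rule, which always picks the code with the largest cosine similarity.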