Ayurveda – Data Science and Omics

Dr T Saketh Ram

Research Officer (Ay.), CCRAS-National Institute
of Indian Medical Heritage, Hyderabad, Telangana

In an era of advanced computers and increasing computational power, Artificial Intelligence (AI), Big Data and Machine Learning (ML) are the new buzzwords. These technologies are changing how we experience the world, and many of them have become part of daily life. Medical science has not remained untouched by them. Big Data refers to data that is difficult to process with unaided human effort or conventional computational power owing to its size, the speed at which it is generated, its variety and its often unstructured nature. The increasing number of gadgets and computational devices is generating a huge volume of data, and dealing with such data needs greater computational power and advanced technologies such as Artificial Intelligence. Artificial Intelligence refers to computational algorithms that function in ways similar to human intelligence and help automate decision-making to a great extent, so that minimal human intervention is required. Artificial Intelligence, when combined with Machine Learning, can help in better decision-making.

Adopting changes in technology is essential for the sustenance of any form of science. In today’s scenario, where computing power and the use of computers in medicine are increasing exponentially, Ayurveda should utilize these technologies to provide better patient care. At present, most of the data in the Ayush systems is in analogue form. Converting this data from analogue to digital form is therefore the primary requisite.

What is Data Science?
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract or extrapolate knowledge and insights from noisy, structured and unstructured data, and applies that knowledge across a broad range of application domains. Data science is related to data mining, machine learning and big data. Alongside the empirical, theoretical and computational paradigms of the scientific process, data science is considered the fourth paradigm.

What is Omics?
The suffix “omics” denotes an emerging group of interdisciplinary fields that analyze the structure and function of the whole makeup of a given biological system at different levels, including the gene level (“genomics”), the protein level (“proteomics”) and the metabolic level (“metabolomics”).


Data Sources for Ayurveda

The data generated so far, and being generated now, is in the following four forms:

1. Oral Traditions:

This is the oldest and still living form of data, available from folk healers, traditional vaidyas and other culture bearers. A few attempts have been made to record such material in audiovisual format, and the work is in progress. Although this information varies across languages and dialects, the subject of interest remains health care for humans, animals, plants and even the environment. There is great scope in this segment to transcribe such content using suitable tools and convert it into interpretable data.

2. Written documents (Manuscripts, Archives etc.):

Ayurveda is a codified traditional medical system, and much of its literature is in the form of well-written texts. The texts are held in institutional libraries or in the custody of individuals. Following the digitization drive of the National Mission for Manuscripts (NAMAMI), we can now trace two forms of such collections:

a. Unattended medical manuscripts which need to be digitized: For works taken up hereafter, this collection offers scope for employing advanced imaging techniques along with dynamic data analytical tools.

b. Already digitized and awaiting further processing: Much work has been done to digitize medical manuscripts under various projects such as NAMAMI and with funds provided by Ayush to various organizations. Huge repositories of such images/PDFs are available, stored redundantly in various organizations across India. These can now be processed further with advanced image processing and AI tools to learn patterns of knowledge transfer and pedagogy, and to develop new insights from them.

3. Print Form:

At present, much of the data generated in the Ayurveda sector is in print form, lying idle in books, journals, reports, proceedings etc. scattered across the country and elsewhere in the world. Very little of it is in digital form. For example, data from the Survey of Tribal areas and villages of India for medicinal plant wealth has accumulated over a period of 40 years, yet we have no effective mechanism to skim through it and bring out tangible output.

4. Dynamic data in print/semi-digitized/ completely electronic form:

Some of the key sources of this data are:
1. Data generated from various IT initiatives of the Ministry of Ayush such as the Ayush Research Portal, the National Ayush Morbidity and Standardized Terminologies Electronic (NAMASTE) Portal, the Ayush Hospital Management Information System (AHMIS) and other initiatives built from time to time,
2. Traditional Knowledge Digital Library (TKDL) by CSIR,
3. Indian Medicinal Plants, Phytochemistry And Therapeutics (IMPPAT) by Areejit Samal, The Institute of Mathematical Sciences (IMSc),
4. Encyclopedia on Indian Medicinal Plants by ENVIS, TDU-FRLHT, Bengaluru,
5. Government of India National Health Portal, Central Bureau of Health Intelligence portals, National Health Authority Portal, National Resource Centre for Electronic Health Records, CDAC, Pune, Government of India data portal, India Data Portal by ISB, World Health Organization portals etc.,
6. Numerous published books, reports and monographs in print/semi-digitized form, including data collected from research projects such as the Tribal Health Care Research Project, the Swasthya Rakshan Programme and other outreach activities.


Applications in Ayurveda, other traditional medicine streams

1. Smart Electronic Health Records:
AI-enabled electronic health records can simplify data collection and help ensure the correctness of data. When coupled with technological assistance such as handwriting recognition and Optical Character Recognition (OCR), they can increase the speed of data entry and generate a large amount of computer-analyzable data. MATLAB’s ML handwriting recognition technologies and Google’s Cloud Vision API for optical character recognition are just two examples of innovations in this area.

Clinical Data Mining:
a. Processing big data of aggregate morbidity statistics from the National Ayush Morbidity and Standardized Terminologies Electronic (NAMASTE) Portal, patient-specific real-time records from the Ayush Hospital Management Information System (AHMIS) by Ayush Grid, Ministry of Ayush, and records from other systems by private vendors will provide many insights into the reasons for encounter (RFE) and treatment trends in the domain of Ayush.
b. Further, such systems will make it possible to analyze concepts like the Prakriti of patients against the recorded parameters and disease patterns, bringing to light underlying patterns that can support the fundamental principles of these systems.

2. Disease Outbreak Prediction
a. The approach mentioned above can be used to compare disease patterns across seasons and so test and establish the relation between seasonal dosha variations and disease patterns (predictive medicine).
b. It can also be used to predict outbreaks of communicable diseases from climatic conditions and so help prevent mortality.
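
The season-versus-disease comparison described above can be sketched as a simple cross-tabulation. The example below is only a minimal illustration in Python: the encounter records are made up, the diagnosis labels are placeholders, and the function names are hypothetical. It counts encounters for each season and diagnosis pair and computes the Pearson chi-square statistic as a rough measure of association. With counts this small the statistic has no inferential value; real use would need large, coded morbidity data.

```python
from collections import Counter

# Made-up encounter records for illustration: (season, diagnosis label).
records = [
    ("Varsha", "fever"), ("Varsha", "fever"), ("Varsha", "joint pain"),
    ("Sharad", "skin rash"), ("Sharad", "skin rash"), ("Sharad", "fever"),
    ("Hemanta", "joint pain"), ("Hemanta", "joint pain"), ("Hemanta", "fever"),
]

def contingency(records):
    """Count encounters for every (season, diagnosis) pair."""
    return Counter(records)

def chi_square(records):
    """Pearson chi-square statistic for independence of season and diagnosis."""
    counts = contingency(records)
    row_totals = Counter(season for season, _ in records)
    col_totals = Counter(dx for _, dx in records)
    n = len(records)
    stat = 0.0
    for season, row_total in row_totals.items():
        for dx, col_total in col_totals.items():
            # Expected count if season and diagnosis were unrelated.
            expected = row_total * col_total / n
            observed = counts.get((season, dx), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

table = contingency(records)
statistic = chi_square(records)
print(table[("Varsha", "fever")], round(statistic, 2))
```

A larger statistic suggests that the mix of diagnoses differs between seasons more than chance alone would explain, which is the kind of pattern the seasonal analysis looks for.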

3. Disease Identification/Diagnosis:
Disease diagnosis is one of the foremost applications of Machine Learning algorithms. Computer systems like IBM Watson and Google’s DeepMind are using ML in combination with Big Data analysis and AI to do cutting-edge research in medicine. These technologies also aim to develop personalized medicine through analysis of individual health data and predictive analysis. This may be useful in substantiating the concepts of holistic healthcare, which is the hallmark of all Ayush systems. Some other possible fields of application include:
a. Artificial Intelligence-based algorithms can be used to develop better Decision Support Systems based on Ayush parameters. Adding machine learning allows their performance to improve over time on the basis of real-time data.
b. Medical image recognition: This is useful especially in the diagnosis of skin disorders, in tracking colour changes on the body in diseases like vitiligo to assess their progress, in urine analysis using traditional methods etc. Such images, when analyzed with the help of Artificial Intelligence algorithms, can help develop objective evidence for diagnosis and for monitoring the effect of treatment. SkinVision (skin cancer detection through image processing using a mobile phone) is an example of such an application.

4. Drug Discovery
Knowledge Discovery by Data Mining of Traditional Repositories:
a. Pattern Recognition: Identification of patterns of drug use from the vast literature base of various Ayush systems for new drug development and also understanding the logic of formulations. This can be done by techniques such as frequency analysis, correlation analysis, complex network analysis, and cluster analysis.
b. Analysis of genetic information of medicinal plants may bring to light similarities in properties between plants of different species/genera with similar Ayush parameters (e.g. Rasa, Guna, Virya etc. in Ayurveda).
c. Docking/Simulation studies done using High-Performance Computing can help in understanding the mechanism of drug action and in computer-aided development of new drugs from Traditional Knowledge, fast-tracking drug development for emerging diseases.
d. Pathways of drug action can be identified using ML algorithms, which is highly useful in systems like Ayush where drugs have numerous active compounds that act through multiple pathways in the body.
e. Artificial chemical sensors can be used to assess parameters such as taste and smell in various compounds, and so to identify parameters such as Rasa, Guna, Virya etc. through comparative analysis across drugs.
f. Evidence-based drug development through a reverse pharmacology approach can be done by analyzing drug usage and intervention outcomes in big data from medical records.
g. Comparative study of the usage of the same drug in different systems of medicine such as Ayurveda, Unani, Siddha and Homeopathy.
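
The frequency analysis and network analysis named under pattern recognition above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the formulation IDs (F1 to F4) and their ingredient lists are toy data, not taken from any classical text, and the function names are hypothetical. The first function counts how many formulations each ingredient appears in; the second counts how often two ingredients occur together, which gives the edge weights of an ingredient co-occurrence network.

```python
from collections import Counter
from itertools import combinations

# Toy data for illustration only: formulation ID -> ingredient list.
formulations = {
    "F1": ["Haritaki", "Bibhitaki", "Amalaki"],
    "F2": ["Shunthi", "Maricha", "Pippali"],
    "F3": ["Haritaki", "Amalaki", "Pippali"],
    "F4": ["Amalaki", "Pippali", "Shunthi"],
}

def ingredient_frequency(formulations):
    """Number of formulations in which each ingredient appears."""
    return Counter(
        ingredient
        for ingredients in formulations.values()
        for ingredient in set(ingredients)
    )

def co_occurrence(formulations):
    """Number of formulations shared by each unordered ingredient pair."""
    pairs = Counter()
    for ingredients in formulations.values():
        # Sorting makes (A, B) and (B, A) count as the same pair.
        pairs.update(combinations(sorted(set(ingredients)), 2))
    return pairs

freq = ingredient_frequency(formulations)
pairs = co_occurrence(formulations)
print(freq.most_common(2))
print(pairs.most_common(3))
```

On a real corpus, the pair counts could be passed to a graph or clustering library to look for groups of ingredients that are habitually combined.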

5. Clinical Research in Ayurveda
The points mentioned under the above headings can also be used from a research perspective for data collection and analysis, and thus for generating evidence for the fundamental principles of Ayush systems. Apart from these, the use of big data and machine learning can advance evidence-based medicine in Ayush systems by assessing the outcomes of interventions against various parameters of assessment (e.g. changes in symptoms after Panchakarma).

6. Medical technology/imaging etc.
AI- and ML-enabled technologies for assessing various Ayush parameters, such as Nadi (signal processing) and tongue, urine and skin examination (mainly image processing), can improve clinical diagnosis and research in Ayush.

7. Personalized Treatment/Behavioral Modification
Data from IoT-based devices can be integrated to assess real-time variations in various parameters and analyze them on the basis of the fundamental principles of Ayurveda. The development of advanced sensors coupled with wearable gadgets has made it possible to collect health-related data from patients as well as healthy individuals in real time; when analyzed, this data can bring to light various patterns related to health and disease. Parameters such as pulse, amount of sweating, body temperature variations and sleep patterns can be monitored in real time for this purpose. Analyzed over the long term, this data can be used for the early detection of diseases and thus reduce the disease burden through preventive interventions.
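
As a minimal sketch of how long-term wearable data might be screened for early warning signs, the Python example below flags a reading that departs sharply from a person's own recent baseline. The pulse values are hypothetical, and the function name, the 7-reading window and the threshold of 3 standard deviations are assumptions chosen for illustration, not clinically validated settings.

```python
from statistics import mean, stdev

def flag_deviations(readings, window=7, threshold=3.0):
    """Return indices of readings that deviate from the trailing baseline.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        # A flat baseline (sigma == 0) gives no scale to compare against.
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged

# Hypothetical daily resting pulse values (beats per minute) from a wearable.
daily_pulse = [72, 71, 73, 72, 70, 72, 71, 72, 73, 88, 72, 71]
alerts = flag_deviations(daily_pulse)
print(alerts)
```

Comparing each person against their own baseline, rather than a population norm, fits the individualized approach described here; a practical system would also need to handle missing readings and slow drifts.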

8. Development of virtual assistants for patient support and medical education using technologies like chatbots.

9. Development of speech recognition and text-to-speech and advanced word search technologies for the Sanskrit language with the help of AI and Deep Learning:
This would help in building advanced data mining tools for the analysis of classical Ayurveda literature, which in turn would support education, research (by developing theoretical evidence for drug development) etc.


Present scenario from Ayush Perspective

As of now, very few AI-related applications are in use in the Ayurveda sector, such as Nadi Tarangini (a diagnostic instrument for Nadi Pareeksha) and the Jiva Android App (a trend-based diagnostic tool), and most of these are in the private sector. There is also very little awareness at present of the possibilities of AI in the Ayush sector. The Ministry of Ayush has initiated centralized data collection through initiatives such as the NAMASTE Portal and AYUSH-HMIS, but as of now no AI technology has been used in these.

 Challenges in implementation:

  1. Diversified data (lack of common standards for data) and informal scattered databases leading to a lack of centralized data.
  2. Lack of a sufficient number of trained and oriented human resources in Ayush Sector having good knowledge of IT.
  3. Infrastructure, Hardware and Software related challenges.
  4. Lack of dedicated financial resources and sustainable recurring expenditure.
  5. Lack of assessment of Cost-Benefit Analysis and utility of these applications in the Ayurveda Sector. 


  1. Partnership/Collaboration with institutes already in the field:

Since technologies change rapidly, partnerships with organizations which use the latest available technologies help to harness the best out of technology.

  2. Brainstorming sessions involving technology experts to identify focus areas.

Brainstorming sessions involving experts in AI, ML and Big Data along with domain experts will help in evolving specific projects.

The key areas for brainstorming include:

The Development of Databases
● Digitization of Triskandhakosha to develop Hetu, Linga and Oushadha Databases with the inclusion of Morbidity/Symptom/Drug Codes
● Development of Medicinal plant and Mineral (Dravyaguna and Rasashastra) databases including single drugs and formulations.
● Database of formulations.
● Database of books and classical Ayurveda literature.
● Epidemiological and demographic database including data from health-related surveys, outreach programmes etc. with special focus on etiological factors and their relation to disease. 
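
One possible shape for the linked Hetu, Linga and Oushadha databases is sketched below in Python. It is only an assumption about how such records might be organized, not a proposed standard: the class names, field names, codes and labels are all placeholders invented for illustration and do not correspond to real NAMASTE morbidity codes or drug codes.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class CodedTerm:
    """A term with a machine-readable code. Codes here are placeholders."""
    code: str
    label: str

@dataclass
class TriskandhaEntry:
    """One condition linked to its causes, symptoms and drugs."""
    condition: CodedTerm
    hetu: list[CodedTerm] = field(default_factory=list)      # causes
    linga: list[CodedTerm] = field(default_factory=list)     # signs and symptoms
    oushadha: list[CodedTerm] = field(default_factory=list)  # drugs

def conditions_with_symptom(entries, symptom_code):
    """Labels of all conditions whose Linga list contains the given code."""
    return [
        entry.condition.label
        for entry in entries
        if any(term.code == symptom_code for term in entry.linga)
    ]

# Placeholder records; every code and label is invented for illustration.
symptom_1 = CodedTerm("L-001", "symptom 1")
symptom_2 = CodedTerm("L-002", "symptom 2")
entries = [
    TriskandhaEntry(CodedTerm("M-001", "condition A"), linga=[symptom_1]),
    TriskandhaEntry(CodedTerm("M-002", "condition B"), linga=[symptom_2]),
    TriskandhaEntry(CodedTerm("M-003", "condition C"), linga=[symptom_1, symptom_2]),
]
matches = conditions_with_symptom(entries, "L-001")
print(matches)
```

Storing codes alongside labels is what would allow the same entries to be joined with coded morbidity statistics and drug databases later.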

Verification and Validation of existing databases
Verification and Validation of
● Existing databases such as Ayush Research Portal including
o Safety Database
o Efficacy Database
● Prakriti proforma,
and their integration with DHARA, PubMed, Google etc.

Identification of Software Professionals and Companies for testing and Validation.
Developing specific projects to meet the requirements in Ayurveda and other Ayush systems.


The projects may be divided into various phases such as:
1. Immediate
a. Identification of areas of work,
b. Identification of experts &
c. Validation of existing databases

2. Short-term
a. Development of new databases
b. Validation of new databases &
c. Preliminary testing of developed AI-based systems

3. Long-term projects.
The possible outcomes include:
1. Data science, Omics-assisted Decision Support systems (including Diagnostic support, prognosis assessment and management suggestions).
2. Epidemiological and demographic insights from analysis of the collected data.
3. Data science, omics-assisted Drug designing and Drug development taking leads from Ayurveda.
4. Promoting centralized data collection in all formats, including images, radiography reports, signals etc., in formats standardized for each data type, e.g. standardized camera specifications for images and fixed-length video recordings in Ayush HMIS systems.
5. Pooling of information/resources on all analogue/digital assets and information repositories already created under Ayush at various levels, and preparing the ecosystem to be big data compatible and AI and Omics ready.
6. Use of open-source AI and Omics libraries and algorithms for generating Proofs of Concept.
7. Generating awareness of the need for Data Science and Omics in Ayurveda.