How nference.ai leverages Ignite for distributed analytics in the bioinformatics domain
At nference, we have many use cases that involve heavy real-time computation on large volumes of multi-omics data (genomics, proteomics, etc.). A single user request might require statistical computation over hundreds of gigabytes of numerical data. One example is finding similar proteins, which translates to computing the cosine similarity of one vector against 10 million others. We needed a horizontally scalable framework that would let us define different statistical analyses and execute them on terabytes of numerical data in real time, without moving the data. We have built a lightweight framework on top of Ignite that lets developers quickly create their own functions (e.g., cosine similarity) to run on existing datasets, and also upload their own datasets. We rely on Ignite's colocated processing and thick clients to achieve this. Internally, we built the framework to satisfy the use cases of one team. We plan to migrate real-time distributed compute use cases from other teams to it.
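To make the similar-proteins example concrete, here is a minimal sketch of the general pattern: score vectors where they are stored, return only a small top-k list per partition, and merge those lists on the caller. It is plain Python and does not use the Ignite API or reflect the framework's actual code; the names `local_top_k` and `merge_top_k`, the toy vectors, and the protein keys are all hypothetical.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def local_top_k(query, partition, k):
    """Meant to run where the data lives: score one partition, keep the best k."""
    scored = ((cosine_similarity(query, vec), key) for key, vec in partition.items())
    return heapq.nlargest(k, scored)


def merge_top_k(partial_results, k):
    """Meant to run on the caller: combine per-partition lists into a global top k."""
    return heapq.nlargest(k, (item for part in partial_results for item in part))


# Hypothetical toy data: two dicts standing in for partitions held on two nodes.
partitions = [
    {"P1": [1.0, 0.0], "P2": [0.0, 1.0]},
    {"P3": [1.0, 1.0], "P4": [-1.0, 0.0]},
]
query = [1.0, 0.0]

# In a distributed setting each local_top_k call would execute on the node that
# owns the partition; here they simply run in a loop.
partials = [local_top_k(query, p, k=2) for p in partitions]
top = merge_top_k(partials, k=2)
```

Only the query vector and the short per-partition result lists cross the network in this pattern, which is why the large vector set never has to move.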
Rohit Jain
My educational background is in NLP and engineering, and I have published three papers in top NLP conferences. I spent around 2.5 years working in a cutting-edge R&D lab in India before joining nference.ai as one of its early employees.