September 23, 2022

Keshav Pingali, Co-founder & CEO of Katana Graph

Keshav Pingali is the CEO and co-founder of Katana Graph, the AI-powered Graph Intelligence Platform providing faster, deeper, and more accurate insights on massive and complex data. Keshav holds the W.A. "Tex" Moncrief Chair of Computing at the University of Texas at Austin, and is a Fellow of the ACM, IEEE, and AAAS. He received his Master's in Electrical Engineering and his Doctorate in Science from MIT. He previously held the India Chair of Computer Science at Cornell University.

Julian: Thank you everyone for being on the Behind Company Lines podcast. Today, we have Keshav Pingali, co-founder and CEO of Katana Graph. Katana Graph is the AI-powered graph intelligence platform providing faster, deeper, and more accurate insights on massive and complex data. Keshav, thank you so much for being on the show.

I'm really excited to dive into your background and your experience, and really just to jump right on in: what were you doing before you started the company?

Keshav: Great pleasure to be here, Julian. Thanks so much for inviting me to your podcast. So I'm a professor at the University of Texas at Austin in the computer science department, and my area of research is high-performance computing. About 10 years ago we decided to start applying techniques from high-performance computing to graph computing, which at that time was a somewhat underserved area. So we did a DARPA project [00:01:00] with BAE Systems. They wanted to build a system for real-time intrusion detection in computer networks. So where bad guys are trying to break into computer networks, you want to catch them as quickly as possible.

And the way that BAE wanted to do it was by building a graph to capture all the activities in the system and then mine that graph for suspicious patterns. So they tried a bunch of commercial graph systems. They were not responsive enough. They approached us, we built a system for them that DARPA really liked, and at the end of the project, we were the top of the five performers, as DARPA calls them.

And so DARPA contacted me and said, you know, if you want to do a startup, we'll support you. We'll introduce you to the various three-letter agencies in Washington and so on. So that's how Katana Graph got started.

Julian: Amazing. Describe the technology a little bit more. What is a graph, when you describe it, and how does it capture those trying to penetrate someone's system? [00:02:00]

Keshav: Yeah. So that's a great question. Usually when people think about big data, they think in terms of relational data. So relational data you can think of as big tables, and you can access the tables by rows or by columns. And there are query languages like SQL that have been around for 50 years that people use in order to query the data. But more and more of the data sets that are being generated now are what we call unstructured or irregular data.

A lot of them can be represented usefully in terms of what are known as graphs. So even though graphs are a mathematical concept and might seem very complicated, the reality is all of us use graphs all the time, even if we are not aware that what we are using is a graph. So if you get on a plane and you open up the in-flight magazine of the airline, you'll see an airline map.

So the route map is a very simple example of a [00:03:00] graph. The cities or the airports are represented by dots, so we call them nodes. And then if there's a direct flight between two airports, you'll see an edge, a line connecting those two cities. So there are nodes, and then there are edges, and then there are properties on the nodes and edges.

And then given that kind of graph data, you can say, well, what's the minimum number of flights I need to take to get from Austin to the Bay Area, for example? So you can trace the edges of that graph and answer that question. Now, airline route maps are a very simple example of a graph. I'll give you a more complex example of a graph that we are currently involved in building.
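The route-map question (the fewest flights from Austin to the Bay Area) can be sketched as a breadth-first search over the graph. This is only an illustration: the airports and routes below are made up, and `min_flights` is a hypothetical helper, not anything from Katana Graph.

```python
from collections import deque

# Hypothetical route map: each airport (node) maps to the airports
# it has a direct flight to (edges). The routes are invented.
ROUTES = {
    "AUS": ["DFW", "DEN"],
    "DFW": ["AUS", "SFO", "DEN"],
    "DEN": ["AUS", "DFW", "SJC"],
    "SFO": ["DFW"],
    "SJC": ["DEN"],
}

def min_flights(routes, origin, destinations):
    """Return the fewest direct flights from origin to any airport in
    destinations, or None if none of them can be reached."""
    destinations = set(destinations)
    if origin in destinations:
        return 0
    seen = {origin}
    queue = deque([(origin, 0)])
    while queue:
        airport, hops = queue.popleft()
        for nxt in routes.get(airport, []):
            if nxt in seen:
                continue
            if nxt in destinations:
                return hops + 1  # BFS reaches the nearest airports first
            seen.add(nxt)
            queue.append((nxt, hops + 1))
    return None

# Treat two Bay Area airports as acceptable destinations.
print(min_flights(ROUTES, "AUS", {"SFO", "SJC"}))  # prints 2
```

Breadth-first search explores airports in order of how many flights away they are, so the first time it touches a destination, that hop count is the minimum.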

So we work with AbbVie, which is one of the biggest pharma companies in the world, and may well be the biggest pharma company in the world. And like all other pharma companies, they're building what they call knowledge graphs. So in knowledge graphs, [00:04:00] you're taking all the medical knowledge that is known and representing it inside the computer.

And so in that knowledge graph, there are nodes that represent diseases, treatments, biologically active compounds, papers, authors, and so on. And then if there is a paper written by certain authors on a particular disease with a certain treatment, you can add edges between all of those entities in order to represent that information.

So in general, a graph represents entities and relationships between entities, which are represented by edges. So graphs, it turns out, are ubiquitous. They show up in just about every vertical. We're also heavily involved in FinTech, for example, and there the nodes represent transactions, the people doing those transactions, whether the transaction was a debit or a credit, and so on.

And again, given all of the transaction data, for [00:05:00] example, from an online payments processor, which we are working on, you can represent all of that transaction data as a graph. And then you can look for suspicious transactions, like, for example, fraud rings and so on. So the idea is to represent all of these data sets as graphs, entities and connections between entities, and then look for patterns.
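One very simplified way to picture "looking for patterns" in transaction data is to treat accounts as nodes and transfers as directed edges, then search for a cycle in which money flows back to where it started. The sketch below assumes that simplification; real fraud-ring detection involves far richer patterns, and `find_ring` and the account names are hypothetical.

```python
def find_ring(transfers):
    """Return one cycle of accounts in a directed transfer graph as a
    list, or None if the graph has no cycle."""
    graph = {}
    for sender, receiver in transfers:
        graph.setdefault(sender, []).append(receiver)
        graph.setdefault(receiver, [])

    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in graph}
    path = []  # accounts on the current depth-first search path

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in graph[node]:
            if state[nxt] == IN_PROGRESS:
                # nxt is already on the current path, so we closed a loop
                return path[path.index(nxt):]
            if state[nxt] == UNVISITED:
                ring = visit(nxt)
                if ring is not None:
                    return ring
        state[node] = DONE
        path.pop()
        return None

    for node in graph:
        if state[node] == UNVISITED:
            ring = visit(node)
            if ring is not None:
                return ring
    return None

# Made-up transfers: a -> b -> c -> a forms a ring; d is a bystander.
transfers = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
print(find_ring(transfers))  # prints ['a', 'b', 'c']
```

The recursive search is fine for a small illustration; a graph with millions of transactions would need an iterative or distributed approach.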

Julian: Yeah. That's so fascinating. So it sounds like different companies, in different industries, utilize this information and this database a little bit differently. And my question is, what are some of the ways that they use this? For finance, it sounds like it's to alert on any transactions that don't seem real or that seem fraudulent.

But in terms of pharmaceuticals and healthcare, that's extremely fascinating, because the connection of information is so important when either treating certain illnesses or diseases or trying to come up with a different compound that treats a specific disease, [00:06:00] virus, or illness.

How are they using this information with Katana Graph? What are the different purposes they're using it for?

Keshav: Yeah, so there are a whole lot of use cases within medical and pharma. So we work with companies in the area of precision medicine, and I'll tell you a little bit about that.

Then there's what's called drug hypothesis generation, so that's another big area. And also what I call molecular property prediction, right? So that's another growing area. So let me start with the last one, because I think that's the easiest one in which to understand where graphs, and graph AI in particular, come in.

So the problem is the following one. If you're a pharmaceutical company, your IP really is the molecules that you have. So you're also synthesizing new molecules all the time. And what you want to do is to be able to predict the properties of these new molecules [00:07:00] without having to do a whole lot of expensive tests in the lab.

Okay. So the way to think about the problem is I give you this big database in which you have the molecules, and then all the known molecules I will label. Let's say I'm just interested in whether they're poisonous to people or not. So you can imagine a label on each of those molecules, red or green.

Okay. And now given a few million molecules with known properties like this, if I give you a new molecule, can you predict whether it's going to be poisonous? So that doesn't seem to have very much to do with graphs, because the simple way of thinking about it, or trying to solve the problem, is to say, well, let me look at the molecular structure of this new molecule, find the known molecules which are closest in structure to this one, and then I'll look at their properties and just use that as my prediction. But that's basically eliminating a lot of information that might come from molecules that [00:08:00] are not very similar, but nevertheless provide useful information. So one of the things that our guys did recently, which we're very proud of, is we worked with AbbVie.

We used our graph AI technology, and to cut a long story short, we basically built a graph in which the entities are the molecules that are in the database, and the edges represent various similarity metrics between the molecules. And then we train our graph AI system so that when you give it the new molecule, it makes a prediction based on all the molecules that are there in the database. That's the way to think about it.

And so we are able to do far more accurate predictions than previously known methods. So that's an example.
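To make the idea of predicting from the whole similarity graph a little more concrete, here is a toy sketch. It uses plain label propagation, which is a much simpler technique than the graph neural networks discussed in the interview and is swapped in only because it fits in a few lines. The molecule names, similarity weights, and the `predict_toxicity` helper are all invented for illustration.

```python
def predict_toxicity(edges, known, query, rounds=20):
    """Estimate a toxicity score in [0, 1] for `query` by propagating
    known labels over a weighted similarity graph.

    edges: (molecule_a, molecule_b, similarity) with similarity > 0
    known: molecule -> 1.0 (toxic) or 0.0 (safe)
    """
    neighbors = {}
    for a, b, weight in edges:
        neighbors.setdefault(a, []).append((b, weight))
        neighbors.setdefault(b, []).append((a, weight))

    # Labeled molecules keep their label; unlabeled ones start undecided.
    scores = {m: known.get(m, 0.5) for m in neighbors}
    for _ in range(rounds):
        updated = {}
        for m, nbrs in neighbors.items():
            if m in known:
                updated[m] = known[m]
            else:
                total = sum(w for _, w in nbrs)
                # Similarity-weighted average of the neighbors' scores
                updated[m] = sum(scores[n] * w for n, w in nbrs) / total
        scores = updated
    return scores[query]

# "new" is directly similar to a toxic molecule (m1) and to an unlabeled
# molecule (u), which in turn is strongly similar to a safe one (m2).
edges = [("new", "m1", 0.5), ("new", "u", 0.5), ("u", "m2", 1.0)]
known = {"m1": 1.0, "m2": 0.0}
print(round(predict_toxicity(edges, known, "new"), 3))
```

Note that `m2` is not a direct neighbor of `new`, yet it still pulls the score down through `u`. That is the point made above: molecules that are not the closest match can still contribute information.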

Julian: Yeah, thank you for the explanation. It's so fascinating how different companies can utilize this information and the technology to accomplish specific tasks and specific milestones within their own industry.

What did people do before technology like [00:09:00] Katana Graph?

Keshav: Well, so one of the places where we are finding a lot of traction is in replacing what people were doing earlier, which is going to wet labs and actually doing all of those experiments in order to figure out, for example, the properties of this new molecule.

So with this new graph AI inference technique, you basically cut short a lot of that experimentation. So some of these problems you could solve before Katana Graph and graph technologies; on the other hand, it would be very expensive and human-intensive, you know, that's the other thing, because you need people to go in there and start doing those experiments and so on.

So what this allows you to do is to basically cut a lot of those cycles short, and that's very important because time to market is essential in all industries, as you know.

Julian: Incredible. I don't know if you have this figure on you, but how much do you speed up the [00:10:00] time to market for a lot of companies and the products that you're helping or supporting them to build?

Keshav: Yeah, I don't have that information, because a lot of that is very closely guarded information that's held by these companies. So what they do is they bring us in and say, here is a problem where we think your technology can help us. Here are the metrics for success; show us what you can do. And if we show them that, well, then we move forward to licensing and so on. But we never really see the end-to-end drug pipeline, for example, because that's a closely guarded secret. They don't want their competitors to know.

Julian: Yeah. How, how much is involved in this onboarding process? You know, a company comes to you or you come to a company looking to solve a specific problem and you, it sounds like a lot of technology is built around that specific problem.

What's the process like and how long does that typically take.[00:11:00] 

Keshav: Yeah. That's a great question, Julian. And the answer that I need to give you is: it all depends. I know you want a number like two months or something, but it's very complicated, and I'll tell you why. So the first level of complication comes from whether the people that we are talking to within the company understand graphs and graph technologies and what graph AI can do for them, or whether they're still in a very exploratory mode: we've heard about graphs, we don't know very much about them, can you educate us, and so on. So obviously with the first category of customers, things move much faster, because they're already sophisticated about graph technologies.

They have a very specific need in mind right now. The second thing is, once we get to the point where they understand what we can do, what graphs can do for them, and so on, then we need to look at how much effort it's [00:12:00] going to take for us to provide them with a solution for what they want.

Right. And at the point where we are, we are working very closely with a few lighthouse customers in different verticals. And the idea is to basically prove the technology in these verticals before we open up the floodgates and go to many more customers. And so in this learning process, we essentially figure out what it takes to deliver a product successfully in the health and life sciences area, or in the FinTech area, or in the information security area, which are the three verticals we are focusing on. And then once we have proven ourselves with these lighthouse accounts, that's when we open up the floodgates. We do all of these things in such a way that we can then replicate whatever we've done with other companies in these verticals.


Julian: What's the hardest part with a technology that seems very technical, [00:13:00] is involved within the company's system, and also has a lot of moving parts, to make sure that you accomplish whatever task is set in front of you and solve whatever problem you're working to solve with the company?

What's the hardest part for you or for your company? What's something that's so technical?

Keshav: So Julian, you're going to be amused by this, but the hardest part has really nothing to do with our technology. It has to do with security. Because we work on big data sets, and ultimately this is customer data, and, you know, data is money. Data is gold, as you know.

Yeah. And even if customers are not directly using that data to make money, it could still be very confidential information. Yeah. So medical records, for example, well, there are all these HIPAA regulations and so on that you need to worry about if you're a [00:14:00] bank or an online payments processor. Well then the transactions, again, there's a lot of confidential information there.

And so just navigating the shoals of the confidentiality problem takes up a substantial amount of time. We can show them, when we get started, our results on publicly available data sets, for example, but ultimately they want to know what we can do with their data sets, because the performance that we get might be very different on their data sets. But they can't give us the data sets directly. And so they have to sanitize that data and then throw that over the fence to us. And then we deal with that and we give them results back, and they have to translate that into their actual data. So it's just navigating all of that, and bringing up our system inside their firewalls and all their security and everything once we get past the POC stage.

So it's those sorts of things that really take a [00:15:00] tremendous amount of time. But it's inevitable in the data world, so we factor that in when we are figuring out the contract.

Julian: Amazing. Yeah, that makes sense. For a lot of startups that I talk to, the golden rule as they grow is to focus on one problem and be hyperfocused on that specific problem until you move to different use cases.

But it seems like with Katana Graph, this graph system can be used for so many different industries. It can be used for pharmaceuticals and finance, and I'm assuming it could be used time and time again for life sciences research as well. How do you view the problem? Do you view the problem as tackling big data and how to understand it, or do you view it as how to enable companies to solve problems within their organization and utilize it as a tool?

How do you view the problem that you're solving?

Keshav: Well, we work with companies in both modes. Okay. So with some companies we [00:16:00] actually build solutions for them. And so then the data science, so to speak, is being done by our guys within Katana, in collaboration with people within the company. Now that turns out to be a useful thing to do if the company that we are dealing with doesn't have the expertise within the company.

They have the domain expertise, right? And so they educate us about the domain, the needs of that domain, how fast, for example, the SLAs have to be, and, you know, things like that. But we do the solution within Katana, and that's obviously more hands-on.

Right. White-glove treatment. Then the other kind of customers we have want to do the data science themselves, and they want a platform that's very easy to use for large graph data sets. And so one of the things that people find attractive about the Katana platform [00:17:00] is that it's unlike these siloed databases, where the data sits within your database, and if you want to do any computation with the data, you have to create a new cluster, move the data there, move the results back, and so on. We are a compute platform. So what that means is we give you a Jupyter notebook, and then you can just use Python in order to create a cluster with some number of machines. And then after that, everything is done through Python.

So you can load the data, you can clean the data, you can do analytics on the data, you can query the data, the AI machine learning: everything is done through the Jupyter notebook Python interface. And so you're not aware that it's a scale-out platform. So we've scaled out to 256 machines. We could probably go even more than that, but it's very expensive to do that on the cloud now.

So all of that is hidden from the user. You don't really know how many machines are being used except when you first create that cluster. So [00:18:00] all of the sharding of the data, the communication, all of that is taken care of automatically by our system. So people find that attractive, because then they can do their data science internally within the company.

So we have those kinds of customers as well.
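For readers unfamiliar with sharding, the sketch below shows one generic way a graph's edges could be split across machines: hash each edge's source node and use the result to pick a machine. This is a textbook scheme chosen for illustration; the interview does not describe how Katana's platform actually partitions data, and `shard_of` and `partition_edges` are hypothetical names.

```python
import zlib

def shard_of(node_id, num_machines):
    """Pick a machine for a node with a stable hash (CRC32), so every
    process agrees on the placement."""
    return zlib.crc32(node_id.encode("utf-8")) % num_machines

def partition_edges(edges, num_machines):
    """Group edges by the machine that owns each edge's source node."""
    shards = [[] for _ in range(num_machines)]
    for src, dst in edges:
        shards[shard_of(src, num_machines)].append((src, dst))
    return shards

# Made-up edge list split across a hypothetical 4-machine cluster.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
for machine, shard in enumerate(partition_edges(edges, 4)):
    print(machine, shard)
```

With this scheme all outgoing edges of a node land on the same machine, but an edge whose destination lives elsewhere requires communication between machines. That placement and communication work is the kind of detail a scale-out platform hides from the notebook user.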

Julian: Amazing, amazing. Going back to how it was conceived: you were a professor, you were working on a project, and then you enrolled in this competition and ended up being one of the finalists and the top performer. How has that transition from teaching to running a startup been for you?

Keshav: Well, it's been a barrel of fun. So I've been an academic most of my life, and this is my first startup. My co-founder Chris Rossbach, who's the CTO, has done many startups, as I was telling you. And of course we have lots of people within the company with experience in big companies and startups and so on, like our CBO, Farshid Sabet.

For me, I think the most [00:19:00] interesting thing has been seeing the ideas and the prototypes that we built as research projects get transitioned into products, right? Because for a prototype, you know, as long as you can do a good demo for DARPA, it's a success. You're the top of the competition. You did the demos, they're happy, you shake hands, you get your money, and you move on.

Whereas a product, obviously, has to be much more robust. It has to be reliable. There's a lot of QA that has to be done. So it's basically that sort of battle-hardening of software that has been a learning experience for me, and interestingly different from what I have done before.

Although all our QA people are veterans in the industry, so they know how to get all of this done. And then the other interesting thing is, as an academic, I'm used to getting money from the National Science Foundation or DARPA. And so you write a [00:20:00] proposal, you make it look good, and, you know, you get the money.

So I've been fairly successful during my career in getting lots of money. Going to VCs is an entirely different experience, right? Because it's not a question of showing them nice PowerPoint slides and a good proposal or something. It's like, well, what's your timeline? What's your ARR? What is the road to profitability?

You know, there are a lot of very concrete, nuts-and-bolts things that one has to pay attention to. So it's a very different conversation with VCs. Now, I'm very fortunate, because the chairman of our board is Lip-Bu Tan, who's one of the major investors in the Valley. And so he has been a wonderful mentor for me.

And he's on the boards of SoftBank, Intel, and so on. So very well connected. So that has opened a lot of doors for me, which would've been difficult otherwise.

Julian: Amazing. I love the transition from what you were doing to what you're doing now. You're obviously finding a lot of success and having a really [00:21:00] fun time learning, having these new experiences, taking some of your knowledge, learning some new knowledge, and utilizing your network.

I think it's undervalued how much a founder's network actually plays a role in their success. It's huge. And it takes an equal amount of attention to foster that community around you, to provide more support and more future doors to open.

Tell us a little bit about the traction you're seeing, what types of industries you're in, where Katana Graph is now, and what it has grown to be over the years.

Keshav: So we are almost a hundred people at this point. And we are headquartered in Austin, and that's where I'm speaking to you from.

We also have offices in the Bay Area, New York City, and Denver, and we have team members in India and Poland. So we're really all over the world. And the goal is to grow to about 150 or so in another year. [00:22:00] As for where we are in terms of the verticals, as I was telling you earlier, we are focusing on three.

Because when I got started, Lip-Bu told me the most difficult thing you'll have to do as a CEO is to learn to say no to opportunity. And I remember that every time some interesting opportunity comes my way in a vertical that we aren't really in yet, like, you know, supply chain optimization. So there are lots of uses of graphs over there.

We get approached by big companies all the time to see whether we can do something for them there. But right now we've said we are not ready for them yet, because I want to make sure we get traction in these three verticals, establish ourselves as market leaders, and duplicate the successes we've had with our lighthouse accounts in medical pharma, InfoSec, and FinTech.

And as we were talking about earlier, you know, each of those areas is so huge, [00:23:00] with so many big companies with deep pockets in medical pharma and so on, that just one of those verticals could actually keep us very busy for a very long time. And so it's that kind of discipline that's very hard to get, but I've disciplined myself.

And so that's where we are, those three verticals, lighthouse accounts, and then replicating that success with other accounts. 

Julian: Amazing. I love the growth, and it seems like, yeah, your plate is full, especially with the industries you're in. But I'm curious, and I like to ask startups this because I think it tells you how you view the landscape in front of you: what are some of the biggest risks that your company faces?

Keshav: Well, the biggest risk I have is actually not a technical one, because we have a wonderful team. And everybody who talks to my team goes away super impressed and saying, wow, where did you find these guys? You know, make sure you keep them. So I think we have a great team and we have a great esprit de corps.

[00:24:00] So people tell me that the attrition rate in the industry is like 15 to 20%. For us it's about one to 2%. So it's a lot less. And I think it's because we've created this sense of accomplishing something worthwhile, and of treating everybody as team members and so on. So I always tell my guys, there's no chain of command.

There's a chain of responsibility within Katana. So keeping my team together in these times, when, you know, all these big companies are offering enormous salaries, that's a challenge every day. But we are meeting that challenge. The second big challenge, and this is completely out of my control, is the macroeconomic situation.

So there's a lot of uncertainty. We are probably in a recession, although nobody wants to admit it. And the situation in Europe looks like it's going to worsen during winter, when their energy supplies are rather fragile at this point. And so if Europe goes into a depression, and there's also trouble [00:25:00] in China because of their real estate market, well, you know, we are all interconnected, and it looks like there's a storm coming.

And so I've been spending a lot of my time basically making sure that I can weather this storm, which people are predicting may last a year and a half to two years.


Julian: Yeah. Yeah. What are some ways that you're weathering the storm or some strategies that you wanna implement?

Keshav: Well, so we are doing everything we can to reduce costs, right? And that basically means, in our case, we spend a lot of money in the cloud, because we are cloud native on two clouds, right? Google and AWS. And so I've been spending a lot of time basically auditing everything that we are doing on the clouds, and making sure that there's no redundancy in the tests and experiments we are doing, and asking people, do you really need to do this on the cloud, or can you wait, and, you know, things like that. [00:26:00] So basically watching money, making sure that we are not wasting any of it. You know, just because we have deep-pocketed investors, it doesn't mean that we can go off and spend money anywhere we like. So watching expenses, and extending the runway that way, is very important.

And then the other is making sure that we deliver value to our customers so that they stay with us. That stickiness is very important, because if they don't have confidence in us, then they'll go and work with somebody else. And so I tell my guys, you know, the customer is number one. Even if you have to spend more money on cloud right now, that's fine.

But the customer is number one, keeping them happy. So it's a bunch of things. It's no one thing that you do, but you really have to pay attention to everything.

Julian: No, you echo a lot of things that other founders have focused on when building a company and creating something that's [00:27:00] extremely viable and an overall good product, which is tightening the screws on the process, making sure it's efficient, and then making sure that the customer's happy.

That's your main supply of income and revenue, but also that word-of-mouth spread when you do look for more opportunity is so much more valuable than any marketing or sales campaign. It's really the work you do that compounds on itself.

Keshav: You're exactly right.

Julian: What's your long-term vision for Katana Graph?

Keshav: Well, we want to be the graph AI company. Because we think there are two things in there that, well, there are actually three things that we do that are the wave of the future, if not the present, even. So one is big data.

Everybody says this is the age of big data, and so that's the space we are in. As I was telling your listeners earlier, more and more of the data that's being generated is unstructured graph data. So we are in a growing segment [00:28:00] of the data space, as opposed to traditional companies in the relational space.

So that's where graph comes in. And then the third thing is, we believe this whole AI machine learning revolution is here to stay and is going to accelerate. And, you know, we are gonna see it everywhere. Traditionally, when you had big data, as we were saying earlier, you could just query it, essentially to see what happened in the past.

So it's what people call descriptive analytics. But the new wave is to take all of that data, build models out of it, like these graph neural networks, graph convolutional networks, and so on, and then use them to do predictive analytics, to say something about what might happen in the future. Or going back to that molecule example that we took, right?

If I have a database with a bunch of molecules, and for each molecule I know whether it's toxic or not, and I have a new molecule, well, the database is not gonna help you, because you haven't seen that molecule before. You can look it up if it's one of the [00:29:00] molecules you've already seen, but if it's a new molecule, then the database is not going to help you. But if you can build a predictive model using the kind of graph neural networks we do, then you can actually make a prediction about that molecule.

So we think that kind of graph AI machine learning is just on the verge of taking off. So machine learning and AI have obviously revolutionized a whole lot of fields, like, for example, image recognition and text translation, natural language translation, and so on. And we think graph AI machine learning is the next big wave in AI machine learning.

So we are leveraging those three technologies. And so we want to be the company that people think of when they say, we need fast, efficient insights from very large-scale graph data; who do we turn to?

Julian: Amazing. I'm so excited to see what comes of the big data industry. Well, it's an industry, but everyone seems to [00:30:00] have data that they need to query and understand, and then build models off of, and then build solutions from that.

And it's so exciting to see how it's gonna work with humans to not only connect information, but predict certain outcomes without having to do the amount of work that a human must do through experimentation. So it's extremely exciting to see what that's gonna lead to and how quickly that's going to accelerate certain tasks and certain outcomes.

One thing I always like to ask my founders, to not only give me some research material and some homework, but our audience as well, is: what books or people influenced you?

Keshav: Well, I think the book that's influenced me the most is called the Bhagavad Gita. It's probably the most important text in Hinduism. I grew up in India, and my grandmother, my mother, all of them used to read the Gita every day. And a few years ago I became [00:31:00] interested in learning about the Bhagavad Gita.

I started reading it, and that's been a very big influence lately in my life, because the central message of the Bhagavad Gita is that you should always do what in Sanskrit is called your dharma. So it's a word that's now entered the English language as well, and it could be translated as your duty, right?

And so doing your duty to the best extent possible is the best way that any human being can live. That's the central message of the Gita. And I could tell you the entire story, but it would take us another five hours or something. But that's the message that Vishnu, or Krishna, one of our gods, imparted to a warrior prince who did not want to fight his own relatives.

So Krishna tells him, you have to do your dharma. That is the greatest thing that any person can be expected to live up to. And so I think that [00:32:00] is the message that I've tried to internalize and base everything that I do on. I have certain duties as CEO. I have a duty to my customers. I have a dharma to my investors.

I have a dharma to my team members, and doing that dharma as well as I possibly can is what my life right now is all about.

Julian: Incredible. I love that. It seems not only noble, but simple to focus on, and it seems like it breeds the right outcomes that you're looking forward to.

And it's selfless in nature, which is hard to find with a lot of the different inspirational pieces that people use. But I love the sentiment. Well, thank you so much, Keshav, for being on the show. I know we're at time. But before we end, I always like to give my guests a chance to let us know how to support their business.

Where can we find Katana Graph? Where can we support you? Where can we learn more about graph computing? [00:33:00] Give us your LinkedIns, your Twitters, all that information, so we can support you outside of this.

Keshav: So you can go to our website, which has a lot of information about Katana. So it's just Katana

So that's very easy. Katana is one. And you'll find a lot of white papers, descriptions of our system, use cases, all of that over there. We're also on LinkedIn. So you go to company slash Katana dash graph, and that's our LinkedIn page. And you can always contact me. My email is caping and I'd be happy to talk with your listeners if they reach out.

Julian: Love it. Well, thank you so much, Keshav, for being on the show. I'm really excited to have this posted for our audience, and for them to learn not only about your background and your experience, but what Katana Graph is doing now and in the future. So thank you again for being on the show.

Keshav: It's been a pleasure, Julian.

Thanks again for inviting me and [00:34:00] thanks to all your listeners. 

Julian: Of course.
