March 14, 2023

Episode 200: Patricia Thaine, Co-Founder & CEO of Private AI

Patricia Thaine is the Co-Founder & CEO of Private AI, a Microsoft-backed startup. She is also a Computer Science PhD Candidate at the University of Toronto (on leave) and a Vector Institute alumna. Her R&D work is focused on privacy-preserving natural language processing, with a focus on applied cryptography and re-identification risk.

Patricia is a recipient of the NSERC Postgraduate Scholarship, the RBC Graduate Fellowship, the Beatrice “Trixie” Worsley Graduate Scholarship in Computer Science, and the Ontario Graduate Scholarship. She has a decade of research and software development experience, including at the McGill Language Development Lab, the University of Toronto’s Computational Linguistics Lab, the University of Toronto’s Department of Linguistics, and the Public Health Agency of Canada.

Julian: Hey everyone. Thank you so much for joining the Behind Company Lines podcast. Today we have Patricia Thaine, co-founder and CEO of Private AI. Private AI provides cutting-edge machine learning models that are the best in the world at finding, redacting, and generating synthetic PII within semi-structured and unstructured data sets across 47 different languages.

Patricia, I'm so excited to chat with you. As I mentioned before the show, not only is AI becoming more popular, but people are starting to think more about privacy and how information's used. And not only within the US or even North America, but globally, everyone's kind of thinking about these problems and looking for creative ways to solve them, or to find, redact, or generate information.

But before we get into all that with Private AI, what were you doing before you started the company?

Patricia: Thank you so much for having me, Julian. Really excited to be here. So before I started the company I was doing a PhD, which is on hold. It's about privacy-preserving natural language and spoken language processing, with applied cryptography within there.

Julian: Yeah. And describe language in regards to how it is communicated, or how it's tracked, and how digital software and computing can, I guess, identify it, but also identify it within a certain rule structure of language. How do those mechanics work?

Patricia: How the mechanics of identifying language, or understanding language, work? Yeah. So initially it was pretty hard-coded in a lot of cases, or you had to have things like trigrams as language models. Very, very initial stages. I don't know if you know about n-gram models, but essentially you calculate the probability, or the statistical likelihood, of one word coming up after another.

And you can play around with that. But basically it's based on what these models have seen before in the past, and you can think of it basically as a table. Move forward to now: what you've got is large neural networks that are doing that pattern matching. And that's why they need to be trained on a huge amount of data, so that the appropriate parameters and weights are set on the neural networks that are being used to train these language models.

I don't know how technical your audience is or how deep
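Editor's note: the "table" view of an n-gram model that Patricia describes can be illustrated with a minimal Python sketch. It builds a bigram table (n = 2) from a tiny made-up corpus; the function name `train_bigram_table` and the example sentences are assumptions chosen for illustration, not anything taken from Private AI's systems.

```python
from collections import Counter, defaultdict

def train_bigram_table(sentences):
    """Count how often each word follows another, then turn counts into probabilities."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    table = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        table[prev] = {word: c / total for word, c in followers.items()}
    return table

# Hypothetical toy corpus; a real model would be estimated from far more text.
corpus = [
    "privacy matters to users",
    "privacy matters to companies",
    "privacy laws matter",
]
table = train_bigram_table(corpus)
print(table["privacy"])  # probabilities of each word seen after "privacy"
```

The model can only score word pairs it has already seen, which is the limitation that the large neural networks mentioned above are meant to soften.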

Julian: No, that was a great explanation. And we're seeing so many of these models being trained, and the data sets are getting larger and larger and it's getting more sophisticated.

And it is so fascinating to see where the direction of this information, or this technology, is gonna go. Before we get into speculation, describe what inspired you to start Private AI. What was the catalyst to have you start looking at, say, personal information and privacy, and thinking around that type of industry and information?

Patricia: Mm-hmm. Yeah. So prior to 2019, if you wanted to identify personally identifiable information, you needed to basically create regular expressions and combine those with models that didn't work so well for identifying PII. Some popular ones, for example, were long short-term memory models, LSTMs.

And they worked much better than, for example, the n-gram model that I mentioned, but still not as good as the transformer models that we have. So in 2019, it was just at the cusp of those transformer models getting good enough to solve this problem efficiently. And essentially nobody in the world was doing unstructured data de-identification or PII detection within unstructured data in any reliable format.

And a lot of companies were solving this problem in-house. And if you look at other privacy-enhancing technologies, a lot of the time they have to be very services-based. So you need to build it for a particular problem and you can't create a SaaS business out of it. So we decided to make privacy more accessible to developers by building the product that we did.

Julian: So where is this information, identifying this information in unstructured data, where is this data coming from? And what is the information that's being inputted, the personally identifiable information, where is that being, I guess, inputted by? Is it by a user? Is it by a company?

Where is this information being inputted, and what are the data sets that you're finding it in?

Patricia: Yeah, it's a lot of different use cases. So I'll go through the use cases and then go over where we get a bit of the data that we do. So in terms of use cases, we're being used for, for example, automatic speech recognition, to allow for redaction of transcripts so that you could share the transcripts within an organization or with a third party, or to understand how well a conversation went.

Things like that. And you need that to be GDPR or CCPA or PCI or HIPAA compliant, for example. In other instances we're used to remove the personal information so you could train a chatbot with that data without worrying about that neural network memorizing the personal information and then spewing it out in production, as has happened in the past with a Korean love bot that started spewing out whole names and addresses and so on of its users, because they weren't worried about this problem.

And we're also used as a risk assessment tool. So if you've got a data lake, you need to understand what kind of personally identifiable information is in that data lake in order to know what to do with the data in the first place and what regulations you need to comply with.

And a lot of companies have no idea what's in their data lakes, because it's documents, it's data dumps, it's pictures within Excel files that contain checks. So it's a big question mark, and that's 80 to 90% of the data that's collected out there. Does that answer your... Oh, where we get the data from?

Yeah. So in part from our customers, so they provide us with some of the data that we use to train our model. In part, we use synthetic data. In part we go and get data on our own from a variety of different sources.

Julian: Yeah. And when customers come to you, is it because they've had, say, some kind of bad actor come in and take data?

Or is it more preventative measures? Are they thinking ahead because of previous companies not having thought of that and having information kind of spewed out? Where do the customers come from?

Patricia: It's both. Yeah. I love it when people think proactively, and a lot of the time they have to now, with the regulations that are out there.

But we can help in both scenarios.

Julian: Yeah. And thinking about it, I feel like the burden of a lot of companies similar to yours, where they have a tool that can be used in so many different use cases: how do you think about targeting your market? Or are you defining your market beyond an industry?

How do you define what customers to capture, and who to invest in, in terms of building technology around them? Because if you do it in too many directions, then it's hard to move forward progressively, or within milestones, in one direction. How do you know which customers you are going to be most effective with, and say no to the others?

How do you think about that?

Patricia: Yeah. That's a great question. So really our core product is pretty generalizable across multiple industries and across multiple use cases without us having to change too much of it, because of all the data that we've used to train it. When it comes to choosing what other features to build, the way that we look at it is: what features and what other products can we build that can make use of our core product and be generalizable also across verticals?

So we do a lot of product discovery chats within insurance, within banking, within pharmaceutical, within conversational AI, to see across the board, are there things that everybody needs? And the answer's actually often yes when it comes to privacy.

Julian: Yeah. And thinking about privacy, especially with Web3 kind of doing this opt-in, in terms of having your personal information, then being able to own that and extract it with your wallet.

How do you think about companies with that accessibility to the privacy, but also what changes are coming with privacy where users are having a little bit more ownership, that you've seen? Is that impacting you at all, or is that not necessarily within your wheelhouse?

Patricia: More privacy that users... So not as much as I'd like to see, hopefully more down the line. Some good things coming out are some consumer products where you can request that companies delete your data. You can go and scope out, using their products, who has what information about you. You can request it and then you can ask for the deletion.

So that's kind of cool. There's always the regulations around cookies. That's pretty big. Annoying but useful. There's, I mean, the way that Apple changed the manner in which advertisers can actually see user data, and that, as you saw, vastly affected Meta's stock. Hopefully more coming.

We'll see how it goes.

Julian: Yeah. How complex is it to add, say, I know the technology works over 47 different languages. I guess, how complex is it to add another language? Is it like adding a whole new type of dataset?

Is it training a model in a completely different way? What are the complexities of adding, or having the capability to do this across multiple languages?

Patricia: It's a lot of data quality handling and making sure that we have enough data that encompasses all of the languages that we support. So to give you an idea, we support about 50 entity types and we have to keep track of 47 different languages.

So that's 50 times 47 different entity types that we need to always be aware of, to make sure that we're continuously improving the performance.

Julian: Yeah. And as a founder, thinking about your journey, doing very research-heavy work, I know you've been on a few boards and been around technology and the impact of AI, especially on privacy and information.

As a founder, did you ever find yourself moving in this direction? Was this always gonna be your end-all in terms of building a company, and building technology that helps a bunch of other organizations enable themselves to not only track this information, but also redact it and synthesize it and do all the things that you're doing?

Was this always on your horizon, or what pushed you in this direction?

Patricia: I did, for multiple years, always wanna start a company. So since my master's, I was aiming to start a company that would scale. I started my PhD on acoustic forensics, so who's speaking in a recording, what kind of educational background they have, and so on.

And when you combine that information with automatic speech recognition, it can really improve the ASR systems and it could tailor-make a system. However, there are two sides to that coin that are very complex. One, if you have access to that data, that's huge user privacy concerns.

Number two, you often can't get access to that data because of the user privacy concerns. So that's when I started looking at how to combine privacy with natural language and spoken language processing.

Julian: Yeah. And thinking about one thing, I don't know if this is a silly question or not, but I read the words synthetic PII, which made me think: how is this built, or how is this information generated?

Is it generated by outside sources? Is it generated by your company, to train models a little bit more effectively? What is synthetic PII, and how is it different from, I guess, user PII?

Patricia: So, synthetic PII is PII that's fake and generated based on context.

So it still makes sense within the context and looks very natural. And you'll want to do that when you're training a machine learning model with that data, for example, because you want it to look as natural as possible so it doesn't affect your production model. How it differs from PII that a user generates?

Well, one, this is much more secure because it's not actual PII. And how it differs from, say, a redaction model, where it's replacing a name with name one, name two, and so on, is that, for one, you otherwise have that data accuracy problem, or the model accuracy problem, if you're training the model with the redacted data.

But also, if anything is missed, it's very difficult to tell the original data from the fake data. So it adds that extra layer of privacy.
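Editor's note: the contrast between marker-based redaction and synthetic replacement can be sketched in a few lines of Python. This is a toy under stated assumptions, not Private AI's method: the detector is a single regular expression for email addresses, the `[EMAIL_n]` marker format and the names `redact`, `synthesize`, and `FAKE_EMAILS` are hypothetical, and the fake values are picked at random rather than generated from context as described above.

```python
import random
import re

# Toy detector: a regular expression for email addresses only (an assumption
# made for illustration; real PII detection covers many more entity types).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

# Hypothetical pool of fake replacement values.
FAKE_EMAILS = ["alex@example.com", "sam@example.org", "kim@example.net"]

def redact(text):
    """Replace each detected email with a numbered marker."""
    count = 0

    def marker(match):
        nonlocal count
        count += 1
        return f"[EMAIL_{count}]"

    return EMAIL.sub(marker, text)

def synthesize(text, rng):
    """Replace each detected email with a fake but natural-looking one."""
    return EMAIL.sub(lambda match: rng.choice(FAKE_EMAILS), text)

original = "Email jane.doe@corp.io or bob@mail.com for details"
redacted = redact(original)
synthetic = synthesize(original, random.Random(0))
print(redacted)   # markers make it obvious where something was removed
print(synthetic)  # fake values read like ordinary text
```

In the redacted output, any real address the detector missed would stand out next to the markers; in the synthetic output it would sit among fake addresses that look just like it, which is the extra layer of privacy mentioned above.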

Julian: Yeah. Thinking about Private AI and where you are today, tell us a little bit about the traction. How many people, or how many companies, are using the product and platform?

And also, what's exciting about the recent years of growth, and what's coming up this year? What do you have in motion that you're particularly excited about in terms of growth and the increase in terms of maybe customers or abilities?

Tell us a little bit about the traction and where you're heading.

Patricia: We have several dozen customers using our product now, anywhere from pre-seed startups all the way to multi-billion-dollar public companies. Where we're seeing a lot of traction now is in enterprise. And one of our Series A goals is to move towards Europe.

So we've been hiring teams in Europe as well.

Julian: Yeah. And what are some of the biggest challenges that Private AI faces today?

Patricia: Some of the biggest challenges, I guess it's always product prioritization. There's so much need for different products in the privacy space because it's so nascent, and so many things that we can actually tackle.

And it's very difficult to assess what to prioritize. So we recently made a really interesting product hire. We've also been doing a lot of product prioritization and product discovery on our own. There's a steering committee, for example, that we run, which was a great suggestion by a mentor early on.

We get a bunch of our customers and prospects together once or twice a year, hopefully once a quarter eventually, and tell them: this is our product roadmap. This is what we're considering. What are your thoughts on how you would prioritize this? And we ask them a bunch of questions around what pain points they're feeling when it comes to privacy.

But this works for, of course, any company. And it's also a great way to have current customers interact with prospects as well, and have a good community for people to meet and network.

Julian: Yeah. If everything goes well, what's the long-term vision for Private AI?

Patricia: Our goal is to become the privacy layer for software.

So we started out specifically thinking people are gonna want to use our tech directly when the data gets collected on the edge. So in browser, on iPhone, on Android, and so on. And so we built MVPs for each of these. Ultimately we were too early on the market for those, so we focused on on-prem and private cloud.

But down the line, it's my dream to have privacy technology applied as soon as the data is collected.

Julian: Yeah. I like this next section. I call it my founder FAQ. So I'm gonna jump in and ask you some rapid questions and we'll see where we can get. So first question is, what's particularly hard about your job?

Patricia: Constantly shifting. There's always a brain rewiring that has to happen at every major shift within the company. And it's always figuring out, what is my job? You hire somebody to do something that you were doing last month, and now what do you do?

Julian: Yeah, that's definitely often the case.

It's like you delegate, then you're like, okay, I guess I find a new direction or something else that I forgot to pick up, that you dust off, and maybe take that direction. Thinking about the fundraising process, a lot of founders, I don't wanna say struggle, but the goal is to find investors and mentors and people who will share the vision and share the product, but also share the timeline in terms of where you expect the product to be at certain milestones, and when.

How do you go through the process of identifying the right investors, the right people, the right community around your product? And also, when you do find those individuals, how do you set the expectations in terms of the balance between how much they're involved in the business versus empowering you as a leader?

Patricia: Yeah, I'm gonna say I've been so lucky with the investors that we have. I really, really love our investors. So one, it's talking to a lot of people. Two, it's not accepting, even if you're in a tight spot, money from somebody who you think you wouldn't work well together with.

And that's really tough when you're in a tough spot, especially when you're starting up really early on. I mean, it has to be people who really believe in you and your vision and your team, and understand it. And in our case in particular, we look a lot at the ethics of the investors, because our tech, the way that we envision it and the way that it is now, can on a dime be turned into surveillance tech, and that is not something that we want for our company.

So we need to be really, really careful as to who we bring on our board and who we bring on as investors.

Julian: Yeah. And when you think about founders being prepared for those conversations, and even when you're going into a pitch meeting, things like that, what are some ways that you've found are best to prepare for those conversations, and also set your company up with a foundation so that investors not only see the maturity, but also see the trajectory of where the company's gonna go?

What are some things that you've done that are extremely helpful that other founders can use when they're entering these conversations?

Patricia: Personally, I like telling a story without a deck. And that can be a little bit tough when it comes to fundraising. But, for example, if you go out for lunch with an investor or you meet face-to-face, the deck doesn't necessarily come up as the first thing.

Maybe it's something that happens afterwards. So I make sure we're on the same page, that they understand how the company works, and then I often bring up the deck afterwards to showcase: this is the trajectory of the finances, these are the things that we built. I show the web demo and show them how well it looks. But it's about painting the picture, either with or without a deck, about where your vision is going, and making sure that that vision is big enough for them to be interested in it.

Julian: Yeah, that's, yeah. If that's helpful. No, it's super helpful. A lot of founders talk about the co-founder relationship, and how there are some things, in terms of setting expectations or identifying responsibilities or knowing exactly each person's job function, that are helpful in terms of having a strong co-founder relationship, but also being long-term successful.

What's something that you and your co-founder do that really solidifies not only your relationship, but also your responsibility towards the company versus individually? What are some things that you do that other co-founders can implement in their conversations or in the structure of their business, that you've seen be extremely helpful in driving Private AI in a certain direction?

Patricia: So, my co-founder and I have very complementary skill sets, and that has helped a lot in maintaining what roles are going to be whose. My co-founder is one of the first people to build traffic sign recognition models from scratch that were deployed in millions of cars around the world.

So his expertise is in really bringing machine learning, state-of-the-art research, into a product that runs efficiently and reliably. He used to work on projects where if you make one mistake, bam, that's a million dollars down the drain for the company for that mistake.

So really, really careful on quality. So his expertise is in building production products. My expertise is on the research and privacy. His expertise is on the how do you build technical teams. My expertise is on the how do you get funding, and what else does the company really need to move forward, as a result of a lot of the discussions that I have with peers and investors, what I read, and so on.

This is my first time really doing this, so there's still a lot of learning happening. Part of the reason why our relationship works well: we both have very thick skins and we tell it like it is. We don't let each other's comments get to us and we're just very honest with each other.

And that might not work as well if one of us was a little bit more sensitive when it came to feedback.

Julian: Yeah. No, definitely, the communication piece and the feedback piece is very intricate. It's almost like a dance, because one misstep and it kind of messes up the flow.

But it's awesome to hear that you two have the choreography, if you will, just to stay within that analogy. The choreography of what each other does and how it influences and impacts the company. As a CEO, what's something that you spend a lot of time doing that you didn't expect to spend as much time on going into it?

Patricia: Interesting question.

I guess I kind of expected doing a little bit of everything. I guess very early on I didn't expect, very naively, quite how much of a sales role this was. But it is very much a sales role. And our VP of Sales has taught me a lot with regards to how to be an effective salesperson.

Another thing I didn't expect to spend quite so much time on is finding the right people for jobs that they say that they are good at. You look at their CVs and they look like they should be good, and then it turns out that not everybody is good at what they say they do. So I guess that's more from a contractor's perspective. Our employees are wonderful.

Julian: Yeah. What's something that you wish you could spend more time on?

Patricia: Oh, I love writing content and producing content, and it's always helpful for the company when I do, I think. So I wish I had more time for that.

Julian: Yeah. If you were to wave a magic wand, what's one thing you wish your company had now versus down the line?

Or could have but doesn't necessarily have yet, in terms of timing? What's something that you could wave a wand and wish that your company could have now?

Patricia: Hmm.

Something that I wish we could have now.

I mean, being on a Gartner Magic Quadrant would be really great.

Julian: Yeah, everybody's fighting for those lists. My next question was, what's something that you've learned as a founder that you wish you'd learned earlier on in your journey?

Patricia: Something I wish I had learned earlier on as a founder: when to cut. I guess when not to extend the inevitable. If somebody is not working well in a particular role, it's better for everybody if you just don't extend.

Julian: Sure. Yeah. Knowing when that expiration is, is difficult. Especially, I feel like founders are generally optimistic people.

And so we kind of hope that things work out, but sometimes it's like, cut losses and move on. I always like to ask this question because I love how founders extract knowledge out of anything they read or any type of interaction they have.

But whether it's early in your career or now, what books or people have influenced you the most?

Patricia: What books or people have influenced me the most? I guess one person who has influenced me a lot is my dad. He would always make me question my job aspirations. One example: at one point I said, I think I'll become X. I don't wanna name the job because I don't want anybody to feel insulted.

And he said, you could do that, but you'll never be your own boss. And it's things like that that I think were very helpful in making me understand what the trade-offs were of different career paths.

Julian: Yeah, yeah, yeah.

Patricia: Yeah, lots of people have been very, very influential.

I don't know how much time you want me to spend on that.

Julian: I love that. Any materials, even if it's not, say, business-related or within your sector? If not, no worries, but any material that you see as extremely influential on you even today.

Patricia: Sure. So material, this is more broad.

I did a

Julian: Liberal, kind of within the books. Yeah, yeah, yeah.

Patricia: Yeah. I did a liberal arts degree in CEGEP. CEGEP is this degree that you could get between high school and university, in Quebec only. And you're basically taught by PhDs who are focused on just teaching rather than research.

And that liberal arts degree made all the difference in the sense of giving me a wider worldview, a historical understanding, and made me really good at writing. And I think writing skills are vastly underestimated.

Julian: Yeah. Especially when developing content and thinking about sales tracks and that whole process.

It's very much built in that you must almost be a great communicator. But the practicing of that, I think, is extremely important, the foundation of that. I always like to ask founders this: where do you go to find good talent? Where do you go to find the right people for your company?

And how do you know that they're the right people?

Patricia: Hmm. Network. Usually network. So people that I know, that I've spoken to in the past, or that I know from their peers are really good at what they do. They're well respected and admired. Aside from that, we've had good luck on AngelList for developers, for example.

Beyond network, we've had good luck with just plain job applicants on our site on occasion as well. But oftentimes they heard about us through their networks. So, yeah.

Julian: Get around that way.

Yeah. I love that. And last little bit before we get into the plugs, which is knowing about where to find you and where to support and all that: is there anything I didn't ask you that I should have, or that you would have liked to answer?

Patricia: I guess there's one question of yours that I didn't fully answer yet, which is what actual material I recommend. I really enjoyed the book called How the World Really Works by Vaclav Smil, who's an academic in the Prairies in Canada.

His focus is on the history of energy, and the history of energy comes in agriculture and everything that humans have done in order to make things more efficient. In this book, it talks about what it might really take to fix climate change, and how it's an underestimated problem, and all of the different things that are pollutants.

And the way that that book frames the problem is really interesting in terms of how you think about framing other problems.

Julian: I love that. I love to ask those questions because, I mean, it not only selfishly gives me some more material to go research, but also our audience really gets to understand, like I said earlier, where founders extract knowledge. I think it's always so impactful, and founders have this really unique way to do so.

But last little bit. I know we're at the end of the episode, and I would love to have you share with the audience: where can we find you? Where can we find Private AI? Tell us your websites, your LinkedIns, your plugs. Where can we support the product, but also support you as a founder?

Patricia: Thank you so much, Julian. So, happy to chat with people on LinkedIn. You can look up Patricia Thaine; there are not many of that name. You can also follow us on LinkedIn. I think our Twitter's still active as well. And you can try our web demo at demo.private-ai.com.

Julian: Amazing. Patricia, it was so exciting and fascinating to not only learn about your journey as a founder, but also about PII and Private AI and that whole space, and how these different AI models are being trained in more and more sophisticated ways, but also in ways that really impact other businesses and their customers as well. It's always exciting to hear.

And also the knowledge that you shared, I always appreciate, and I know my audience does. But I hope you enjoyed yourself on the show, and thank you so much for joining.

Patricia: Definitely did. Thank you so much, Julian.
