The Changing World of Data Breaches with Troy Hunt

Easy Prey

5 months ago

“You can put your data in one place and it can traverse the internet and end up in places you don’t expect.” - Troy Hunt

Everyone who’s on the internet is exposed to risk. Every time you set up a new account or app, you create another place where your data can be hacked or breached. This makes using a different password for each account crucial.

Today’s guest is Troy Hunt. Troy is a Pluralsight author and instructor, Microsoft Regional Director, and Most Valuable Professional specializing in online security and cloud development. He is a conference speaker and runs workshops with organizations on how to build more secure software. He is also the creator of the popular data breach aggregation service known as Have I Been Pwned.

“I am critical of organizations that are not receptive to security reports.” - Troy Hunt

Show Notes:

[1:26] – Troy shares what he currently does and the work he is known for.
[2:57] – You can put your data in one place and it can traverse the internet and end up in places you don’t expect.
[6:54] – There’s a challenge in running a site that has millions of queries at any time.
[9:25] – Troy shares some of the accomplishments of Have I Been Pwned.
[13:32] – Does he experience a lot of malicious traffic? He used to. Troy explains how he has managed this.
[18:14] – Have I Been Pwned has been around for a while and began as a lot of manual labor for Troy.
[23:10] – It is crucial for organizations to be receptive to security reports.
[25:09] – In a lot of ransoms, data of specific groups of people are used as threats.
[27:56] – Troy lists some of the things that happen on the back end of running a site like Have I Been Pwned.
[30:36] – Cloud services have been an amazing advancement in technology, but they open up more points of entry.
[33:35] – There is a hierarchy of multi-factor authentication. Troy discusses the current strategies that are best practice.
[35:45] – For users, what is the second-factor authentication you can manage to use?
[37:27] – There are different risk levels to different things. What do you actually need to carefully protect and what level of inconvenience are you willing to bear?
[39:59] – Troy shares how his parents have been impacted by confusing technology. What is the right technology for a demographic?
[43:15] – Some data is more important than other pieces of information.
[45:33] – Some data is also more or less important to different individuals.
[46:54] – For those managing and discussing data breaches, we also need to be aware that there are pieces of data that could be important to one person but not to others.
[48:24] – Unfortunately, data breaches haven’t gotten less common and aren’t really getting better.

“There are different risk levels to different things. What do I actually need to carefully protect and what level of inconvenience am I willing to bear?” - Troy Hunt

Thanks for joining us on Easy Prey. Be sure to subscribe to our podcast on iTunes and leave a nice review.

Links and Resources:

Transcript:

Troy, thanks so much for coming on the Easy Prey Podcast today.

Hey, Chris. Thank you very much for having me back.

I'm glad to have you back. It's been over three years since the last time you've been on the podcast.

Not much has happened since then, hasn't it?

No, nothing. No worldwide pandemic, no hundred million record data breaches. Nothing like that's happened in the past three years.

Yeah, it's all the world stuff. I'm sure we both have a lot of personal stories from that period as well.

That we do. For those that don't know who you are who are listening, what is it that you do?

I’ve got to give that some thought. I'm well known for the Have I Been Pwned data breach service, but I do a lot of public speaking. I do travel again now, which is nice, in speaking events. I'm doing a lot of training. I haven't done as much blogging lately, because I've been very busy. I do some advisory things.

Honestly, like I say to people as the honest truth, I get up each morning, I just look at what's in my inbox, and I just figure it out from there. Other than things that are scheduled, like this talk we're having now or travel events, it is literally just winging it.

Let's go with Have I Been Pwned. Why did you start that? Was it just some sort of, “Hey, I found this large data set, and I want to start playing with it”? Or did you have this vision behind starting a service?

It was definitely not a vision. I guess it's like a lot of pet projects that someone builds in their spare time. They have no expectation of it doing anything. And if I did, I wouldn't have given it such a stupid name. It actually feels like it was just the other day.

It was after the Adobe data breach, and I'd been blogging a bunch about data breaches and some of the patterns we can see. I find it really fascinating. The data breaches peel back the covers of what is behind the veneer of the websites that we all use. You get to look at it, and go, “Oh, they did that thing.” You look at this data, and those people are in there.

I'm in there, too. I was in Adobe twice, once with my work address and once with my personal address. I thought that it was interesting being there twice. Also, I thought it was interesting because as far as I knew, I didn't give Adobe my data. I know what Photoshop is, but as far as I know, I don't really use Adobe products, but I use Macromedia products.

Adobe bought Macromedia, so my data flowed through into their database. I thought that was interesting, too. The fact that you can put your data in one place, and then it traverses around the Internet or the various entities, acquisitions, and things, and it ends up somewhere you didn't expect.

That's pretty crazy. You exploring that was what started the service?

Yeah, it was that. Also, I was in a corporate job working for Pfizer. Everyone knows who Pfizer is now, because of the stuff that's happened since we last spoke. But at the time, they were one of the largest companies in the world, sixth largest company when I joined. Most people didn't know who they were, so I had to be like, “Hey, do you know what Viagra is?” I was like, “We make Viagra and a whole bunch of other drugs.”

I was working there. I had to go through this transition period, which so many people in technology do, which is where they say, “Hey, you're a software developer. You seem to be good at that. If you want your career to progress, you’ve got to stop doing that. You've got to become a manager or something,” and I hated that. It was miserable.

I wanted a pet project. I also wanted in the new role I was in to transition this organization into more cloud-first development practices, particularly things around Platform-as-a-Service models, such as we had in Microsoft Azure. I wanted to build some stuff, but I wasn't really primarily a developer anymore.

I went, “OK, well, I'll just go home and build some stuff. Hey, there's a good data set. Let's just whack an interface on top of that. I can play with Table storage and some of the modern PaaS paradigms and build a project.” That was a large part of why Have I Been Pwned was born.

That is hilarious. I love it when people's pet projects become something that's useful to society and helpful, because you weren't thinking, “How can I make a boatload of money off this darn thing? How can I monetize it?”

No, and honestly, no one's more surprised than me that I'm here talking about it. It's going to be 10 years in December, the 10th birthday. No one's more surprised than me that I'm here talking about it. It has become not just a useful service, but a valuable service that has given me a lifestyle. That is by no means the way it began.

I think looking back on it, if it hadn't begun that way, I don't think it would have been as successful as what it was. I was just like, “How do I make this as useful and as accessible to everyone as possible?” That's what got it known, got the traction, and built it into something that it is today.

I think when people went there, it's like, “Hey, there's no advertising here. They're not trying to sell me anything. They’re not trying to push me anything. Oh, I tend to trust people a little bit more when they're not advertising things.”

That's what's been really, I don't know if challenging is the right word, but something I've approached really, really cautiously in more recent years. It's like, “How do I do things that do generate revenue for it without ramming it down people's throats?”

The 1Password product placement that's been there, I think, since 2017-2018, is very subtle. It’s very relevant. There are no trackers or privacy-invading things. It is the best possible thing, I firmly believe, that you can do, even preferably before a data breach, but especially once you've discovered a data breach. So trying to find ways to make the service equally more valuable but also generate money to pay for the thing, that's been a very delicate process.

Let's go down a rabbit hole, but we won't talk numbers. You're probably more inclined to talk numbers than I am. That's one of the challenges with running a service like that, that has millions and millions and millions of queries, that it's not the thing that you can spin up on a shared box at your host, Azure, and it actually functions.

It kind of is. I guess it's a question of things like cost and performance. It's an interesting set of sometimes competing objectives. How do I make it faster, make it scalable? How do I make it cost effective? Now's actually an interesting time to talk about this because one of the things I did very early on that was a great decision was to use Azure Table storage, which is just a very, very simple storage construct, really just a key value pair lookup.

This was one of the things that attracted me to building the product in the first place. I was like, “I wonder if I put 155 million records in there, which is what I started with, will it actually be fast? And how much will it cost?” It was super fast. It was really, really fast. It was costing, I think, $20 a month or something stupid like that for the storage.
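As a rough illustration of what a key-value pair lookup means here, the sketch below models a table as a dictionary addressed by a partition key and a row key. Splitting an email address into a domain (partition) and an alias (row) is an assumption made for illustration, as are all of the function names. It is plain Python, not the Azure SDK.

```python
from typing import Dict, List, Tuple

# In-memory stand-in for a key-value table: (partition_key, row_key) -> breach names.
Table = Dict[Tuple[str, str], List[str]]


def split_email(email: str) -> Tuple[str, str]:
    """Return (domain, alias) so the domain can act as the partition key."""
    alias, _, domain = email.strip().lower().rpartition("@")
    if not alias or not domain:
        raise ValueError(f"not an email address: {email!r}")
    return domain, alias


def record_breach(table: Table, email: str, breach: str) -> None:
    """Add one breach name to the entry for this address."""
    table.setdefault(split_email(email), []).append(breach)


def lookup(table: Table, email: str) -> List[str]:
    """Point lookup by exact key; no scan over other records is needed."""
    return table.get(split_email(email), [])


table: Table = {}
record_breach(table, "someone@example.com", "Adobe")
record_breach(table, "Someone@Example.com", "Stratfor")
print(lookup(table, "someone@example.com"))
print(lookup(table, "nobody@example.com"))
```

Because every query is an exact-key read, the cost of a lookup in a design like this does not depend on how many other records the table holds.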

I was curious. Over the course of time, how will it perform? Now, there are a total of nearly 13 billion breached accounts. When I say breached accounts, my personal email address has got 30 of those accounts. These are instances of an email address in a breach.

I did a very manual backup the other day, which leads me to where we're going next. I found there are about five-and-a-half billion unique email addresses. We've gone from 155 million to five-and-a-half billion. What's been really cool is that when you look at those metrics of things like performance, it just has not changed at all.

The only time I ever see any changes is if I'm absolutely blasting it with data because a new breach is getting loaded, and I haven't seen problems there for a long time. The cost has gone up a bit, but it hasn't gone up even remotely commensurately to the volume or the usage. I'm just fascinated about how well that lasted. I think I've been lucky, if I'm honest.

It sounds like you are. I know at some point, you switched to using, I think, Cloudflare on the front end to cache and manage the front end?

Yeah. Cloudflare has been amazing for many, many different reasons. In no particular order, they've given me a whole bunch of services to make it available to people for free. Things like the Pwned Passwords service, where we can do password lookups. We're approaching a billion passwords there, and we do about six billion queries a month. They host all of that for free.

Normally, I'd say our cache hit ratio was around about 99.9999%. We found the other day, for the last month out of 6.02 billion requests, there were zero requests at the origin. We are now at all the zeros. It's just 100%, and they give that away for free. They've been very, very helpful that way.
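To put those two cache hit ratios side by side, here is a small worked calculation. The function name is hypothetical, and the only inputs are the figures quoted above. `Decimal` is used so the tiny miss fraction is not distorted by floating-point rounding.

```python
from decimal import Decimal


def origin_requests(total_requests: int, cache_hit_percent: str) -> Decimal:
    """Requests that miss the cache and therefore reach the origin."""
    miss_fraction = (Decimal(100) - Decimal(cache_hit_percent)) / Decimal(100)
    return total_requests * miss_fraction


monthly_requests = 6_020_000_000  # the 6.02 billion figure quoted above

print(origin_requests(monthly_requests, "99.9999"))
print(origin_requests(monthly_requests, "100"))
```

At 99.9999%, roughly six thousand of those 6.02 billion requests would still reach the origin in a month; at 100%, none do.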

They make a really awesome product as well just in terms of being a reverse proxy, being able to massively cache, but also block things, create some very clever rules, and run a lot of code on the edge. You push a Cloudflare Worker and you're running within seconds on 300-plus different edge nodes around the world.

Part of the challenge for me has been I've got all this Azure stuff there. That does a lot of things really well, but there are some things it does really badly. How do I now augment this with Cloudflare, so that the things Cloudflare does really well fill the gaps of the things Azure does badly?

A good example is egress bandwidth. It's really, really expensive from Azure. Azure does that very badly. You can do that very well with Cloudflare if you can cache things and not have to go to the origin. Things like their API management tools.

Even if people are making way, way, way too many requests over their quota, you still have to service the request. But now, you get to block those at Cloudflare and not go to the origin. It's always been a little bit of, “How do we dice with the technology to make everything work as well as it can?”

Have you shared how many queries you're getting against the platform per hour, day, or month?

Yeah, I normally do. Let's have a look. Good question. I haven't looked lately. We can do this in real time. I would be surprised. The whole question of queries is one that becomes a little bit tricky because it's like, OK, querying what? I know that I have 200,000-300,000 unique visits a day. That's a combination of individuals, and then consumers of the API, two-factor authentication. It's really good so far on the way.

We'll get there.

Because it's Windows with a biometric device, I can two-factor my way in with my fingerprint. As soon as I do that, someone's like […].

I was about to say, just don't hold it in front of your camera.

I do a regular radio piece here in my local city. Yesterday, I went in there and they're like, “So tell us about this thing where you plug your phone into a charging port in an airport and they take over your phone.” It's like, “No, it's not really what happens.” Some researchers once did a little bit of fingerprints. They've got some gummy bears, and they watched James Bond, but it's not what we really have to worry about.

For Have I Been Pwned, keeping in mind that today is a Monday on your end. We have 216,000 unique visitors to the haveibeenpwned.com domain. Normally, we see some number of hundreds of thousands of queries to a combination of the API and the front page. Of course, there are domain searches and other things as well. Most of the time these days, I just watch some dashboards to see if anything ticks in a direction I don't think it was meant to tick.

I know exactly what you mean by that.

It does occasionally, and then I build a rule. I broke something the other day. I looked at my dashboard much, much later after I had numerous tickets logged for it. I was like, “Oh, yeah, look at the big spike of errors.” I was trying to look for that.

I did that.

You can still see the origins of the pet project are still very clearly there.

Do you get a lot of malicious traffic? You have a lot of API traffic, so I hesitate to say non-human traffic, but let's call it malicious traffic or unauthenticated, unapproved traffic.

Yeah. It's almost easier. I think malicious is a good word, actually, because whether or not it's via APIs, scraping the front of the page, or whatever it may be. I used to, and there are a few different things that I did that really changed that.

What I used to see is when I first launched this (it was December 4th, 2013), it looked like it does today. There's a front page, input box, you put your email address in, and it would make an async request to an API with that email address and return results. I just put the website out there not expecting it to do anything. It was surprisingly popular and then I went, “OK, well, I should document the API, and then people can start to use that to do good things.” I did that, and people started to use it to do bad things.

What I meant by bad is I would just see patterns that made me feel like the data wasn't being used in the best interest of the individuals that were being searched for, which is the whole paradox of running a service in the first place. So mass enumeration in a sequential order of email addresses on different domains.

I went, “OK, well, I'll put in a rate limit so that if you make more than one request every 15 milliseconds, you'll get an HTTP 429 with a Retry-After header, and you will slow down because that's the semantic content of that response.” People would get a 429, and then they'd just keep hammering it, hammering it, and hammering it. Then they'd go through, and then they'd get a whole range of different IP addresses.
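For readers unfamiliar with what a well-behaved client does with that response, here is a minimal sketch, assuming the standard `Retry-After` header in either its seconds form or its HTTP-date form. The function name and the one-second fallback are hypothetical and are not taken from the Have I Been Pwned API.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay_seconds(status: int, headers: Mapping[str, str],
                        now: Optional[datetime] = None) -> float:
    """Seconds a polite client should wait before retrying (0.0 if no wait)."""
    if status != 429:
        return 0.0
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after", "").strip()
    if not value:
        return 1.0  # hypothetical fallback when the server gives no hint
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "Retry-After: 2"
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return 1.0
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (retry_at - now).total_seconds())


print(retry_delay_seconds(429, {"Retry-After": "2"}))
```

A client would sleep for the returned number of seconds before sending the next request, which is the slowing down the 429 response asks for.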

That was the point where I actually went to Cloudflare. I went, “I'll put it behind Cloudflare because then I can start doing rate limiting at the edge,” because they were just hammering, hammering, and hammering the service, and my poor little Azure PaaS thing just couldn't keep up, and I'd lose requests.

The problem with rate limiting at the edge is if you rate-limit by IP address, you still have the problem of people going and getting lots of IP addresses. You don't want to just stop bots per se because bots can be good. If someone builds an application to go and check the corporate email addresses once a week or something, that's a good bot.

Then in 2019, which would have been before we last spoke, I ended up just introducing a charge. If you want to query the API, it's $3.50 a month. You will get an API key. The key is the thing that you need to send with each request. It doesn't matter if they're all from the same IP, different IP, whatever, but the key is the rate-limited thing. And the abuse disappeared entirely.
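The idea of making the key, rather than the IP address, the rate-limited thing can be sketched as below. This is an illustrative in-memory limiter, not the service's actual implementation; the class name, the 1.5-second interval, and the key strings are all hypothetical.

```python
import math
from typing import Dict, Tuple


class KeyedRateLimiter:
    """Allow one request per `min_interval` seconds for each API key."""

    def __init__(self, min_interval: float) -> None:
        self.min_interval = min_interval
        self._last_allowed: Dict[str, float] = {}

    def check(self, api_key: str, now: float) -> Tuple[int, Dict[str, str]]:
        """Return (HTTP status, response headers) for a request at time `now`.

        The caller's IP address is deliberately not an input, so rotating
        addresses does not buy any extra requests.
        """
        last = self._last_allowed.get(api_key)
        if last is not None and now - last < self.min_interval:
            wait = self.min_interval - (now - last)
            # Retry-After is expressed in whole seconds, so round up.
            return 429, {"Retry-After": str(math.ceil(wait))}
        self._last_allowed[api_key] = now
        return 200, {}


limiter = KeyedRateLimiter(min_interval=1.5)  # hypothetical interval
print(limiter.check("key-A", now=0.0))  # first request for this key
print(limiter.check("key-A", now=0.5))  # same key again, too soon
print(limiter.check("key-B", now=0.5))  # a different key has its own budget
```

A production version would keep this state somewhere shared and would also reject requests that carry no valid key at all.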

The entire abuse against that API completely disappeared. It's remained that way to this day because you have to stump up a card to go and buy the thing. This is why I was actually a little bit supportive of the Elon Musk stuff the other day on Twitter. He's like, “Look, we're going to trial new users in New Zealand and Philippines needing a card, and it's a dollar a year.” I can see the path that takes you there, and then there's a whole other debacle with that.

Anyway, that solved that problem. What tends to happen is you push the malicious stuff in a different direction. There was still an API behind the front page, and I had all sorts of controls on there to try and limit abuse, but people were then abusing the non-documented API, not the one that was now meant to be consumed.

It was only earlier this year that I started using Cloudflare Turnstile, which goes through and issues a challenge to the browser. It all happens invisibly. Challenge to a browser; make sure that you are a legitimate user. If you're a legitimate user, you get to call the API.

The next time you call it, you have to have another challenge. It's all happening invisibly for more than 90% of people, and then a small fraction end up on a page that does a challenge. I was like, “OK, we've moved the problem; now we're going to solve this one.” As far as I know, that hasn't just moved the problem somewhere else.

You haven't found where it's moved to, or they haven't found a way to poke at something else?

The thing that you normally see, I'm looking at my charts at the moment, and I can see the volume of requests to APIs. If it suddenly really, really ramps up when it's a rate-limited API, then I know that something funny is going on, but that just doesn't seem to have been happening, which is good.

That's a good thing. Taking a step backwards towards the data breach side of the conversation, which we have gleefully jumped over. Where did you get your first data set, how has your methodology changed, and where does your data come from?

I think what changed is when I launched, it was Adobe. I think there were Stratfor, Gawker, and a few other little ones. They were all in fairly broad circulation. I had to go, grab the data, load that in, and it was manual labor on my behalf. Right after it launched, people started popping up and just going, “Hey, I've got the whatever data set just here.” To this day, nearly 10 years on, that's what happens.

On average, every single day, I'd say on average is 1.-something times per day, I'll get an email. I'll wake up, and what will I get today? I've got a whole bunch of weird stuff like that, where someone will go, “Hey”—without throwing anyone under the bus—it could be anything from, “Here's a new disclosure of a breach. This one's linked me to a URL that has got the entire dump. I need to go and have a look at that.”

Apparently, it's a German site. I don't know what it is. I just get stuff like this every single day now. It's a firehose of breaches. Honestly, it's much more than what I can even handle.

Imagine the firehose, and I have to filter that down to just what I can deal with. They go into the service, and then eventually I get around to some of the others. Sometimes when it's a little bit quieter like it has been in the last few days, I've loaded breaches from 2019, because I’ve finally gotten around to dealing with them.

When people are saying, “Hey, I've got this new data set.” You're saying 1.1 a day; is it one or 1.1 new ones a day? Let's say a new data set becomes available. Do you have 10 people say, “Hey, I've got this new data set”? Or is it pretty much one for one?

Sometimes it is, particularly if it appears in public hacking forums. There are a bunch of people lurking there that are always supportive, without ever being paid at all. Often, I'll get up. It's always more impactful when you get up because you've had eight hours overnight for something to happen, and then there'll be three different messages. “Hey, have you all seen this one?” There are links to that.

Other times, it's just an individual that's come across data. This gets really awkward as well, but sometimes it's an individual who's taken the data. I was like, “Crap, now I’ve got to tell them that they really should go to their room and think about what they've done,” because normally, it's children that have done this as well. If not children, then very young adults. There are conversations there as well.

That's interesting. I was wondering if they stole the data, why are they giving it to you? Is it that there's this altruistic, “I have stolen the data because I want the world to know that this company is not practicing good hygiene”? Or did they steal it because they wanted credit card numbers? They've got that, and now they want to feel good about themselves so they give it to you without the credit card numbers?

It's different motives. Just as you were saying this, I was thinking. I won't name this brand. It's a brand that you know very well. It’s a very large American company. Someone has a website here that they've stood up just to publish this data. They're like, “[Name of company] was breached due to multiple security oversights, including but not limited to,” and then they go on about all this stuff.

They've got an entire employee directory that they've dumped and about 16 gigabytes worth of documents as well. They've just dumped the whole lot up there. This person just appears to be cranky about the security posture of this organization. They've failed to train employees on popular, known phishing techniques, ultimately leading to the compromise of the victims' employee accounts and all these things that are wrong with them.

I think that the motives could be anything from, for all we know, this person tried to shake them down for money, and they didn't get money. Now they're throwing their toys and they just say, “Hey, here's all the data,” which we see all the time with ransomware as well. I think sometimes it is altruistic in that they want to see security better, and they feel like companies don't take it seriously enough.

Part of what pains me with all of this is that sometimes they're right. Companies just don't take it seriously enough, and the thing that gets their attention is to get the data. It shouldn't be that way, but a lot of that falls back on the companies, too.

Companies just don't take it seriously enough, and the thing that gets their attention is to get the data. -Troy Hunt

But then I'm sympathetic to the companies because they get so many people reaching out the whole time saying, “Hey, I've found vulnerabilities.” It's a beg bounty. It's a good term, beg bounties, not a bug bounty. I got one of these a couple of days ago. They say, “Yeah, I found vulnerabilities in your site, how much will you pay?” Then you start having the discussion and it's like, “Your SPF records are too messy.” That's like, “Come on, man.”

I'm going to go on a rant. I get those almost every day on some property that I own. “I have found a vulnerability in your website. You aren't publishing SPF records the way I want them published.” One, that's not a problem with my website. Number two, number three, number four. “You want me to pay for that, and it took you three seconds to run it through the MX toolbox.”

Yup, and this is exactly what they're doing. They're using tools like that. The thing that's just such a kludgy mess. I am critical of organizations that are not receptive to security reports. I push them to do things like use a security.txt file and list their contact details, which I do on my services.

People go and find that, run free tools, and give you rubbish reports. Then you have to go through, and you need to take it seriously because I keep telling you, you should. But then, you know it's going to be rubbish. It's a hard problem. I get to the point, every time I have one of these discussions, I'm like, “Well, it's a good industry to be in, because it's tough like this.”

Job security. Have the contents of what you're seeing in the data breaches changed over time?

I'd say that probably the most obvious thing that's changed is ransomware. It has gotten so big in probably the last five years. -Troy Hunt

I'd say that probably the most obvious thing that's changed is ransomware. It has gotten so big in probably the last five years. But I feel particularly the last few years, it just really, really ramped up. The prevalence of ransomware crews running dark web websites, where they're just like, “Here's all the data.” We’ve seen this over and over again.

Australia had a really, really big, I guess, public incident last year, where our largest private health insurer literally had something approaching half of the country exposed in this breach. The ransomware crews were like, “Pay the money or we're going to dump it all” (I think they wanted to live on a million dollars or something). Of course, they started dumping it all. The press was just covering it like nothing else.

What the ransomware crew did was, “We're going to start by dumping the list of people who've had abortions, and then we're going to dump the list of people who have drug addictions,” and just maximum, maximum impact. Of course, a lot of what's being ransomed is documents. This one here, the US company I've been taking a look at, it's just troves and troves of emails, Word documents, PDFs, that could be everything from invoices to corporate communications. It used to be that the data breaches were .SQL files and .CSVs. I think now there's a lot more of the entire crown jewels of the organization in there.

From the database dumps of non-documents, non-emails, and things like that, customer records, are you seeing more fields in those dumps?

I don't know that that's really changed in any material way. I think perhaps the question to ask is, are we building our applications any differently today than what we were 10 years ago at the start? How are we collecting different fields? We're storing our passwords a bit differently. I'm going to say there's less MD5, although even yesterday, I loaded a data breach with unsalted MD5.

I think perhaps the question to ask is, are we building our applications any differently today than what we were 10 years ago at the start? -Troy Hunt

There are more of the stronger adaptive hashing algorithms, which is great, but it's still a field with the password and then everything else in clear text. There's always everything else in clear text. That hasn't changed. Possibly a few more instances of having things like Stripe IDs instead of payment records because there are other providers you can use. We've certainly seen more issues around things like compromises of third parties.
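To make the contrast with a bare MD5 field concrete, here is a minimal sketch of salted password hashing with a tunable work factor. PBKDF2 is swapped in only because it ships in Python's standard library; bcrypt, scrypt, and Argon2 are the options more usually meant by adaptive hashing. The iteration counts and function names are assumptions for illustration, not recommendations from the interview.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 600_000  # hypothetical work factor; raise it as hardware gets faster


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, int, bytes]:
    """Return (salt, iterations, digest) to store instead of the password."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    """Recompute the digest with the stored parameters and compare safely."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


# A small iteration count keeps this demonstration fast; it is not a safe setting.
salt, iterations, digest = hash_password("correct horse", iterations=10_000)
print(verify_password("correct horse", salt, iterations, digest))
print(verify_password("wrong guess", salt, iterations, digest))
```

Storing the iteration count next to the salt and digest is what lets the cost be raised later without invalidating existing records. It does nothing for the other fields in the row, which is the point made above.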

In fact, I've lamented quite a bit on Twitter lately, and I very often see disclosure notices. It's like, “Hey, there's a data breach. It wasn't us. It was a compromise of a third party.” “All right, who was it?” “It was a third party.” “Who did you give my data to?” “You agreed to the terms and conditions. We could give it to whoever we want.” “I didn't read that. OK.”

Yeah. Who's going to read 85 pages of security disclosure?

I have terms and conditions on Have I Been Pwned. I expect that one or two people will read them, and then that will be it. They're just going to be like, “Yep, fine.” Because that's just what we do.

I would like to see whether we have to regulate this or whether we just say, “Look, this is just common decency. I would like to see disclosure of who the third party was.” What we often see is there'll be some compromise, and you'll get half-a-dozen different organizations going, “Compromised third party, unnamed,” and then one will go, “Yeah, now it was these guys.” You're like, “OK, well.”

OK, so all of you are pointing back to this one guy, even though you didn't say?

We make assumptions then, but this is part of the fabric of the way we build applications now. You have your chatbot service here, your payment services there, and then your ticketing service there. For us, we recently did a whole bunch of very boring privacy and legal stuff at Have I Been Pwned.

The lawyers went, “Look, you have to go through and list every single party you provide the data to.” It's like, “OK. Azure and Cloudflare, obviously, but then SendGrid, because we send emails, and Zendesk because we have a support system, and you might enter your data in there.”

It's obvious. I'll tell you what. Let's say Zendesk had a compromise and data that our customers put in there, I'm going to be the first person to be like, “No, it's Zendesk.” I don't care. I'm going to be honest and transparent about it.

I use Zendesk as well for a number of projects, and I'm surprised at people's willingness to just throw out all sorts of personal information in a support ticket to someone they've never met before. I'm just appalled by, “Oh, yeah. Here's my Social Security number.” I'm like, “Delete, delete, delete, delete, delete.”

There's a whole other rabbit hole there about why Social Security numbers are such a secret that you give. That's an American problem. We just have driver's licenses; we have the same issue with it.

We saw a really good example of this just last week where Okta had an incident. In the Okta support system, they had customers that include 1Password and Cloudflare, where they had provided HAR files from the browser that had material for authentication tokens in it. The HAR files had everything that you needed to hijack the session, which is fascinating. Maybe that wasn't explicitly like, “Here's my Social Security number,” but you've still taken a secret, you've put it into a ticketing system, and then it's been compromised.
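A HAR file is a JSON record of a browser's requests and responses, headers and cookies included, which is why it can carry live session material. The sketch below shows one way to blank the obvious fields before a file is attached to a ticket. It is a minimal illustration with a hypothetical function name and header list, not a complete sanitizer and not a tool mentioned in the interview.

```python
import copy
import json
from typing import Any, Dict

# Illustrative list only; real traffic may carry tokens in other headers too.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie"}


def redact_har(har: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a HAR document with session material blanked out."""
    cleaned = copy.deepcopy(har)
    for entry in cleaned.get("log", {}).get("entries", []):
        for side in ("request", "response"):
            message = entry.get(side, {})
            for header in message.get("headers", []):
                if header.get("name", "").lower() in SENSITIVE_HEADERS:
                    header["value"] = "REDACTED"
            for cookie in message.get("cookies", []):
                cookie["value"] = "REDACTED"
    return cleaned


sample = {"log": {"entries": [{
    "request": {
        "headers": [{"name": "Cookie", "value": "sid=abc123"},
                    {"name": "Accept", "value": "text/html"}],
        "cookies": [{"name": "sid", "value": "abc123"}],
    },
    "response": {"headers": [{"name": "Set-Cookie", "value": "sid=def456"}],
                 "cookies": []},
}]}}

print(json.dumps(redact_har(sample), indent=2))
```

Tokens can also sit in URLs, query strings, and request or response bodies, so a file cleaned this way still deserves a manual review before it is shared.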

I would hopefully hear if Zendesk got compromised. But if someone got into my little store of Zendesk and my tickets, how would I ever know, unless there's a public disclosure?

You'd get an email from me.

Actually, I probably would. Hopefully, I wouldn't be like, “Who are you and why are you pestering me?”

You're talking about this push for cloud services. Do you think that will result in more of these, “Hey, this random undisclosed third party has had a breach, and now rather than it being my data on my premises, it's now hundreds of clients that were using the platform”?

I think cloud in general has massively increased our attack surface, for want of a better word. I'm hesitant to say it's made us more vulnerable, because I think in many times, cloud services can do things a lot better than what you can do in your own application.

But it has made many, many more points of entry, and that could be anything. There are so many data breaches at Have I Been Pwned because people have used easily accessible, well-priced storage constructs, and they haven't put a password on that. There's a lot of MongoDB, Elasticsearch, S3s, and all this sort of thing.

It's a little bit the same with using other third-party services. Let's say it's Zendesk. If you're using Zendesk, and you don't have multi-factor authentication, you're reusing your password, you've now got a great big attack surface sitting there that could be very easily accessible and could contain some very sensitive things.

It's also scary and messy. It's great job security. Where do you see the future going, emerging technologies, strategies, things like that that might shift how we look at data breaches, how they're happening, or how we secure against them?

We're definitely making good progress away from passwords. I feel like we've really started to gather momentum, maybe even just in the last year. Things like passkeys, for example. Passkeys are referred to much more frequently. I was listening to the radio for normal people the other day, and they were talking about passkeys. I think that's always a good sign. Even things like, I wouldn't say ubiquity, but a greater acceptance of things like universal two-factor.

My father somehow manages to keep resetting his Gmail password because he's trying to log onto some model boat site or something, and he's confused between what's a Google account and what's the account for the website. I'm his security contact, so I get these notifications saying, “Hey, your password has just been reset.” I'm always like, “Hey, dad. Was this you?”

The other day, I was like, “I'm taking away all your privileges. I'm turning on Google advanced authentication or Google advanced protection. I've got you two YubiKeys. I'm going to enroll them both. The only way you can ever log back in other than the device I log you in is with these keys. Now, this is not going to happen again. Also, if you're overseas and you lose your device, you need to access your email, you're screwed.”

Yeah, or you lose the YubiKey.

Yeah, but I have the YubiKeys.

OK.

I have two, and I have them in different locations. I protect them very, very carefully. There are account recovery processes. I think this is one of the interesting things about using U2F like this. It works really, really well when you have access to everything, and then by design it's really, really hard but not impossible when you don't have access to them.

Have your views and thoughts on 2FA shifted over the years? I don't remember where our discussion was three years ago on SMS versus Authenticator app, versus fingerprint, retinal, hand, or physical token.

I read about this probably before we spoke last time about two-factor and the hierarchy of SMS, soft tokens, and U2F. I don't think we see many hard tokens generating numbers these days. I think there's a lot to unpack there.

First of all, SMS 2FA is not terrible. Having no 2FA is terrible. Adding a second factor is greater. I think it's quite interesting. I often see people say, “Using SMS for 2FA is even worse than not having 2FA at all.” I'm like, “No, it's not. You can't do math. It's science.”

Call it 1.5FA if you must, but it is always adding something. I think what people often mean is very, very frequently you see that once there is a phone number attached to an account, that can be used as a single factor for recovery, which is a concerning thing, but that's a bit of a kludgy mess on behalf of the provider.

Using a phone number is always a step forward. For many people, particularly less technical people, that's probably the best way to do it because if they end up with an Authenticator app, it's only just recently Google Authenticator started actually allowing you to back that up and restore it to another device. What happens if you lose your Authenticator app? U2F keys, you've got a financial burden, and then you have to have it on you. You need to know what you're doing with that.

I think all of those are very, very relevant. Things like passkeys are a really good step forward, because we can start to move away from this dependency on terrible passwords, but then you're syncing them somehow. If someone gets access to the device where you're syncing them, then that's a different problem. You mentioned biometrics. I think biometrics as a second factor like the way I just logged onto Cloudflare is awesome.

I have to be at this machine using this device. This is not James Bond. Someone's not going to gummy bear my fingers into a new prosthetic and log on from the other side of the world. They're fantastic. I biometric into all of my things now. I've got an iPhone, iPad, new laptop, PC with an external reader. It's great because I can also authenticate without anyone seeing the secret, which is great.

Yeah, I like options. For me, it's always what is the second-factor authentication mechanism that this person can tolerate doing without it destroying their life?

Even then, I think it depends. To give an example, we keep talking about Zendesk. One of the things that really bugs me with Zendesk is they want you to 2FA in very regularly. When I think about my Twitter—to me, Twitter is enormously valuable, because there are so many DMs with very sensitive discussions—I two-factor into that only with security keys. I tweaked all this the other day—no soft tokens, anything like that, security keys.

How often do I have to enter that security key? It's really every time I set up a new device, which in the scheme of things is not that frequent. Zendesk, very frequently, I have to enter the second factor. It's the same device, same IP address, same browser heuristics and all this, but I have to enter the second factor.

OK, well, then that makes more sense to have a soft token in my password manager because then I can take that barrier of friction down to basically zero. Otherwise, it's like, “OK, now I'm going to pull out my phone. I'm going to get the Authy app. I'm going to get the token,” and it's just painful. I think we've all got to find where our sweet spot of usability and security is as well.
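For readers curious what a "soft token" actually computes, here is a minimal sketch of the time-based one-time password scheme (TOTP, RFC 6238, built on HOTP from RFC 4226) that authenticator apps and password managers commonly implement. The function names `hotp` and `totp` are illustrative, and a real app would also handle Base32 secrets and clock drift, which this sketch assumes away.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HOTP (RFC 4226): HMAC-SHA1 over an 8-byte big-endian counter."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte picks a 4-byte window.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """TOTP (RFC 6238): HOTP where the counter is the current 30-second window."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step), digits)


if __name__ == "__main__":
    # Shared secret taken from the RFC test vectors, not a real account secret.
    print(totp(b"12345678901234567890"))
```

Because the code depends only on a shared secret and the clock, a password manager that stores the secret can fill in the code automatically, which is where the near-zero friction comes from.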

Understanding, for me, the more sensitive, the more financially valuable, I'm willing to go through more pain and more hassle to keep that secure than I am something else. I won't say what that something else is.

It's almost like a risk-based assessment. There are different risk levels to different things. What do I actually need to really, really carefully protect? What level of inconvenience am I willing to bear? I think the U2F keys are a perfect example.

Short of a Genesis Market-style thing where someone is actually selling your browser fingerprints and your cookies post-auth, you're just not getting into an account that has U2F enabled without actually having the U2F key. I love that. I've just got a process for the way I manage my keys and the keys of my family, because I have family members using it as well.

I don't want to get too into the weeds with family. Have you had these types of discussions with kids? Are they as security-conscious as you are, or more or less security acceptance?

Yeah. My son's 14, and my daughter's 11. They've grown up seeing what I do but also traveling a lot with me, coming to events, being on stage themselves. I guess they get to live in that world.

They have a pretty good understanding of it. They've had password managers since forever. They're all part of my 1Password family account. We have lots of shared vaults between my wife and the kids. My wife and I have a shared vault with my son, and we have a separate shared vault with my daughter, and then we have a family shared one, and then she and I have one. We can all exchange secrets and things backwards and forwards as we need to.

For them, that's very normal, that workflow. When we come off air, I'll tell you a fun story, actually, about the way I've used U2F keys, but that's not for rebroadcasting for reasons you'll hear about soon.

They're starting to understand that, and they understand things like why we need multi-factor. Because they're already in that password manager ecosystem, it's very easy for them to set up one-time passwords in 1Password by adding that in for the various sites they use. I think there's this benefit for them being, that term we often use—the digital natives—where they've just grown up with this. That's been their life.

The flip side would be parents, where there's this constant pushback, and I don't want to understand the technology, let alone do I actually understand it sometimes.

It's interesting. My parents are of a demographic. They're in their mid-seventies, where obviously they didn't grow up with this. It's something that they've had to learn over time. My father was a pilot, so he was used to being very technical and in control of, I guess, complex machinery. He grasped it reasonably well.

Going back to the theme of the Gmail story the other day, he came over and I was like, “Just help me understand what you did the other day.” He's gone to this place he orders his model boat batteries or whatever from, and it's like, log in with Facebook, log in with Google, log in with email. I think what he's done is he's gone login with email, and then he's entered all these Gmail credentials into it. I looked at it and I went, “That is confusing. It is really confusing.”

If you're not a digital native like our kids, and not someone who got to spend all their adulthood working with technology, well, it's confusing for me, so it must be for you. So I think it's about finding the right technology fit. My mother often just writes passwords in a password book, which is a physical book. That's fine so long as they're unique.

Her threat actor is someone who can break into the house. The person who wants to break in the house wants to take your TV, not the notebook in your drawer. I think we've got to find what is the right technology for the right demographic, and what's something that they can use and be comfortable with as well, because you also don't want people to just be completely lost every time they go to buy a battery for a boat.

You want their online experience to be a positive online experience and not so difficult that they're just like, “Forget it. I'm just not going to buy a battery for my boat. I'm not going to enjoy my boating anymore because it's not worth it anymore.”

No. I do wonder if there's an industry. I've thought about this for a very long time, actually. If we've done ourselves a bit of a disservice by being able to have social provider logins, because it is super confusing, and you do delegate a lot of visibility to third parties. But then if you create an account for every single website, people are using that same lousy password absolutely everywhere as well, so you've got a different problem.

Do you find yourself creating fewer accounts because of what you do?

No. I've got a thousand accounts. There are so many accounts, but I do have a strong and unique password for every single account. Particularly given the pandemic situation, but also just societally, we buy so much stuff online these days. We're always buying stuff online, and we're always creating more accounts.
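As a rough illustration of what "strong and unique" means in practice, the sketch below generates a random password per account the way a password manager's generator might. The `generate_password` name, the character set, and the default length of 24 are assumptions chosen for the example, not a description of any particular product.

```python
import secrets
import string

# Illustrative character set: letters, digits, and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Build a password from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    # One fresh password per account, so a breach of one site exposes no other.
    for site in ("weather-station-shop.example", "model-boats.example"):
        print(site, generate_password())
```

The point is less the exact recipe than that nobody has to remember the result: the manager stores it, so every account can get its own.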

Just yesterday, I wanted to get a solar radiation sensor for a weather station. There's a website that sells those. “OK, I guess I need another account.” There's another wild, wacky password that's gone into there. One other account, one other place that I might have my name and password, which I don't care about because it's strong and unique, but also my home address, my phone number, or the last four digits of my card or something exposed. But am I not going to buy stuff? No, that's just a risk we take on board.

Do you find yourself providing less optional informational fields when creating accounts?

Yeah. You don't often have a lot of options with fields. I also think we need to have a little bit of a discussion about what is actually the risk of some of the data classes we provide as well. I feel that people get incommensurately concerned about their email address, their home address, and their phone number.

I know they feel very personal, but remember, most of that is stuff that we've had in phone books for decades. I regularly get people send me a message where I've gone out the back and it looks awesome, so I've taken a photo. I've put it on Twitter, and then some well-intentioned person will DM me or say something like, “Look, you just took this photo. I've looked at the angle of the sun, how high the tide was, and the buildings. I think I've managed to triangulate your address.” I'm like, “Why don't you just use the phone book, man? Why are you making life so hard as if it's some massive secret?”

I think that that's just one of these cases where we get incommensurately worried about the exposure of a piece of data that doesn't really have real-world impact most of the time. Of course, there are exceptions. People will find your address if they want to.

That's one of those things that's probably on some public record anyway that's not even behind any authentication whatsoever.

In Australia, you're in a phone directory by default. If you have any business dealings and you're a director of a company, which I certainly am, you're in the book there. There are just lots of other. You could easily figure out the schools my kids go to. You could easily follow me. It'd be so easy to find this information if you wanted to. If you're more worried about people just casually discovering it, well, I'm less worried about the intent of people casually discovering it.

I wonder if that's a good spot to stop.

It's not as bad as it seems.

But I think there is a point to that where we shouldn't panic about the data that's out there. The password to your retirement account, yeah, you should be concerned about that. If it's something that all your neighbors know about you anyway, less so.

I think also a fair comment to make here is that in many ways, we have the luxury of it not mattering. To me, it doesn't matter too much. If someone discovers my address, it also doesn't matter too much that they discover my gender or my sexuality because they're not things that are particularly sensitive to me. But I recognize that there are people out there for whom that could be very damaging. I'm reticent to say it doesn't matter at all. I do think that in the vast majority of cases, the real-world impact is extraordinarily limited, but it can be very important to some people.

I guess that makes sense. You were talking early about medical record data breaches. That is, in many cases, extremely personal. If I broke my ankle, I don't really care if people know whether I broke my ankle or not. But if I got a procedure that may or may not be socially acceptable or that I may not want people to know about, then that becomes highly sensitive, highly charged. You just can't say all medical data, that's no big deal if you break your ankle, because that's not always what it is.

That's just why it's generally classified as sensitive personal information. If I think back through my medical history, I can't think of anything where if it was all public, it would matter too much. Again, for many people, it would. It could be deeply personal, deeply sensitive things there. That's certainly something which is very different for other people.

I guess that's where we as people in the field talking about data breaches have to be careful that we don't minimize any particular field for someone. It could be important. If someone is being stalked, for example, their address is very relevant.

Yeah, it is. I guess my hesitation there is that we're trying to filter out and almost cast a judgment ourselves on whether or not something should actually be important to them, because there are plenty of times where someone absolutely loses their mind over the exposure of a piece of data that actually has no impact on them. They're not a persecuted minority or anything like that. They're just pissed because someone exposed their data, which is also OK in a way.

Which is also not always a bad thing to be upset about. This company that should have been doing a better job wasn't doing a better job.

Do you see a trend of things getting better?

No.

I was hoping for a nice positive end to this.

I see things just changing. The password situation is a good example. There are more times where we're using bcrypt for our passwords than MD5, so it's harder to crack passwords. I mentioned Genesis Market before (it was taken down by the FBI and friends earlier this year) because it's getting harder to brute force or credential stuff your way past accounts that might have strong, unique passwords or might have multi-factor authentication.
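To make the MD5-versus-bcrypt point concrete, here is a small sketch contrasting a fast unsalted hash with a salted, deliberately slow one. bcrypt is not in the Python standard library, so this example swaps in PBKDF2-HMAC-SHA256 from `hashlib` as a stand-in with the same general properties (per-password salt, tunable cost). The function names and the iteration count are assumptions for illustration.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # illustrative cost factor; tune for your hardware


def md5_hash(password: str) -> str:
    """Fast and unsalted: identical passwords always give identical hashes."""
    return hashlib.md5(password.encode()).hexdigest()


def slow_hash(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Salted, slow hash (PBKDF2 here as a stand-in for bcrypt)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, derived


def verify(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    _, candidate = slow_hash(password, salt)
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    print(md5_hash("password"))  # same output every run, easy to look up
    salt, stored = slow_hash("password")
    print(verify("password", salt, stored), verify("guess", salt, stored))
```

The slow hash costs an attacker the same work per guess per account, whereas precomputed tables and fast hardware make unsalted MD5 cheap to crack in bulk.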

What if we've got malware that can just take the browser fingerprints and the cookies post-auth? OK, it just moved the field along. We've seen things like multi-factor authentication bombing, where people just get prompted over and over and over and over again to accept the login. Eventually they're like, “Just make it go away.”

All right, we changed the model. Now you've got to enter the number that's shown on the app before you can authenticate yourself. I think we keep changing the playing field, but I don't think that if we look at it holistically, it's any better today than what it was the last time we spoke.
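The number-matching model described here can be sketched in a few lines. This is a hypothetical outline rather than any vendor's implementation: `new_challenge` and `approve` are made-up names, and a real system would also bind the challenge to a session and expire it quickly.

```python
import hmac
import secrets


def new_challenge() -> str:
    """Pick the two-digit number shown on the login screen."""
    return f"{secrets.randbelow(100):02d}"


def approve(challenge: str, typed_in_app: str) -> bool:
    """Approve only if the user typed the number shown on the login screen."""
    return hmac.compare_digest(challenge.encode(), typed_in_app.strip().encode())


if __name__ == "__main__":
    shown = new_challenge()
    print("Login screen shows:", shown)
    print("Approved:", approve(shown, shown))
```

Someone being bombed with prompts cannot simply tap "accept", because they never see the number on the attacker's login screen.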

We just need to do better. As consumers and businesses, those just need to start trending and we hope those are trending in the right way.

We will, and then the bad guys will do better, and then we'll do better. That's a good industry to be in. We’re back to there again.

Before we stop recording, any parting advice in addition to the use of password manager, use two-factor authentication? What else would you add to that list?

I think what people need to appreciate is just how low the bar is for most of the adverse security incidents that people suffer from. I use an IRL analogy. We had a situation earlier this year where we were doing some renovations in our house. We had to leave our car out the front of the house. Due to a series of unfortuitous events, we left the car at the front, and we also couldn't close the gate into our house.

I went down in the morning. I got in the car and I was like, “Why is the glove box open?” It's just stuff all over the seat. “Also, why didn't I have to unlock the car when I got in?” I immediately pulled the security footage out.

Of course, someone has driven past and gotten out. They've opened the door to our unlocked car, so how often do I do that? They've gone in there. They've taken my pocket knife out of the glove box. They've spent five minutes trying to figure out how to open it, because it was hard, and then they've seen the open gate. With the knife in hand, they walked through the open gate into our house. I'm just like, “Oh, I'm such a freaking idiot.”

There were good reasons. There wasn't a good reason why the car was unlocked. I screwed that up, but we just had new tiles laid. The gate wouldn't close because they're a little bit high. All these things happen.

We've got this WhatsApp group for our community here, and you keep seeing people report petty theft. Every single time, it's the low-hanging fruit. Someone got dropped off by a car service from the airport the other day, and the car service left the car running while he took their bags inside. You should be able to do this, but if you do, that's low-hanging fruit.

A lady ran inside to drop off a library book, left the car running out the front. There you go. What could we do differently? Lock your car, put it inside the garage. You must have a garage here. Close your doors.

We have a safe for the keys. So if anyone gets into the house, they don't get the keys. Really, having a password manager and multi-factor authentication is that. The vast majority of the adverse security incidents that individuals experience just disappear, and it's so, so simple. It's just like locking your door.

I like that. There is a certain level of hope of, “Look, if you do these very simple things, you've eliminated, let’s say, 95%, 90% of what could happen to you.”

Totally. I don't think it's that hard.

That is an encouraging stopping point.

That's the positive note. If you're doing those things, you're almost certainly fine, and there's just a little bit left after that.

Awesome. Thank you so much for coming on the podcast today.

Thanks, mate. Thanks for having me back.