Hacker News

Rels•6y

Authentication and the Have I Been Pwned API troyhunt.com

124 comments

zaroth•6y
All this seems to be hinting more than ever, that the time to provide these results directly and exclusively to the email address being queried is approaching.
Why is this API being abused? Because it provides valuable information—which took a significant amount of effort to curate—about an email address.
The list of services which have lost my (hashed or not) password at some point ever in the past eventually turns into a list of every service I’ve ever subscribed to.
Whether or not it’s possible to scrape that information together, is it really something that should be available to pull over an API for a million emails a month?
Note this is very different information than the password breach count, which gives you an approximate count of how many times a given password has been breached, and works as a proxy for password strength without disclosing any PII.
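For context on the password breach count mentioned above: the Pwned Passwords range lookup sends only the first five hex characters of the password's SHA-1 hash and matches the remaining suffix locally. A minimal offline sketch in Python, assuming that prefix/suffix scheme; the function names are illustrative, the sample response is made up, and no network call is made:

```python
import hashlib


def split_sha1(password: str, prefix_len: int = 5) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase hex SHA-1 of the password."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]


def count_from_range_response(suffix: str, response_text: str) -> int:
    """Find our suffix among 'SUFFIX:COUNT' lines; 0 means not listed."""
    for line in response_text.splitlines():
        candidate, _, count = line.partition(":")
        if candidate.strip().upper() == suffix:
            return int(count)
    return 0


prefix, suffix = split_sha1("password")
# Only `prefix` would leave the machine; 16**5 = 1,048,576 possible prefixes.
buckets = 16 ** len(prefix)

# Made-up response body: one unrelated suffix plus ours with an invented count.
sample_response = f"{'0' * 35}:1\r\n{suffix}:42\r\n"
print(prefix, buckets, count_from_range_response(suffix, sample_response))
```

The five-character prefix is what narrows a queried password to roughly one bucket in a million, which is the discriminator discussed further down the thread.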
- diminoten•6y
  Sorry, but the cat is out of the bag. HIBP is evening the playing field, making the data less valuable to those who have the skills to collect it.
  It's the same thing as responsible/full disclosure; by making this information available to anyone (publish a vulnerability), you greatly reduce the power of those who have the skills to collect it anyway (the person who found the 0day).
  So yes, this information needs to be available, or it'll only be some people who have it, not none, and those few people who do have it will be 10x stronger than they are now.
  This is the old Antisec debate all over again, let's skip to the part where we end up agreeing generally that disclosure is better, okay? No need to relive 2009 or whatever.
  - nwah1•6y
    "Disclosure" could mean many things. The idea of providing the info directly via email to the affected user seems to adequately disclose things to the relevant parties.
    Are there additional benefits of the public api that on balance benefit the public more than attackers?
    - diminoten•6y
      Yeah, the availability of the data being common rather than rare, so the skill of collecting that data doesn't create a power structure where only the hackers/skilled users have power.
      Imagine it being $500/month to access HIBP, because that's the alternative, not some, "everyone agrees to only use this info for good".
      - jtbayly•6y
        Explain to me how anybody besides myself can use info about my leaked account for something good or useful.
        I can’t think of an example.
        Therefore, having that info cost more is better. Having it cost a lot more is a lot better. (I’m assuming I can still get access for free by having it provided directly to my email address.)
        diminoten•6y
        What? No, you're not understanding. Even if no one but you could use this info legitimately, the fact that it's widely available depowers the people who have the skills to collect it (specifically, people who want to do you harm).
        By virtue of the fact that this info is widespread, you have no choice but to take actions to protect yourself from this information. That means the information becomes useless.
        You are, in a way, being shamed into acting, through public disclosure. So no, having that info cost more is not better, it's worse.
        Furthermore, it is not an option to only let you have this information. That ship sailed when the breaches happened. You don't get access to this information for free, you don't get to control the dissemination of this information, you are powerless. You're acting like HIBP is the only way people can find this info out; it's not. That $500 price tag is just for you. People who are more skilled than you or I at collecting this info get it for free, and that's never going away.
        jtbayly•6y
        You can’t have it both ways. Either it’s widely available, or it isn’t.
        If it’s already widely available then HIBP doesn’t accomplish anything. (It doesn’t anyway, since it doesn’t “shame” anybody except people who are already signed up, who only need and get their own info.) If it isn’t widely available then HIBP is helping people who are bad at collecting and using this information to do so.
        We accept that from bug reports only because of the other benefits that come from releasing the info.
        diminoten•6y
        You're not getting that the alternative is much worse.
        Your data is out there. Period. The end. You don't have control over that. All you're doing is trying to re-establish control over data you already lost.
        The question now is, do you want it only in the hands of people who want to harm you, or do you want it in the hands of both people who want to harm you as well as people who want to help you?
        You seem to only want bad guys to have your data. That's weird.
        jtbayly•6y
        Thanks for the explanation. I get your point now. I did not find BFDM’s proposed benefits from white hats having access to be compelling. So what I’m struggling with is simply the idea that anybody could do something good with my data. If only bad can be done, then the fewer people spreading the data around, the better. Your presupposition is that some people will do good with it if they have access that currently only bad people have. Can you give an example of one or more of those good things?
        diminoten•6y
        1Password tells you which of your passwords have been part of a breach. Many other companies will suspend the accounts of anyone whose login information was leaked as part of another site's breach.
        Other websites won't allow you to use a password that's listed as a common password from the aggregated passwords in breaches.
        Lots of studies have been done on password frequency, such as the top 100 most common passwords and what security people can do about their repeated use.
        Based on your question however, I'm concerned you don't actually get my point. You're being forced into action, exactly how companies are forced into action, by the availability of this information. You have to change your password if it's easily available to anyone who uses this API and who has your email address, you no longer get to pretend it's not a big deal.
        zaroth•6y
        > 1Password tells you...
        This is software acting as an agent of the affected user. 1Password could be authorized by the email holder to gain access to the API without making the information public.
        > Other websites won't allow you to use...
        This and the following example in your comment are discussing the breached password API, which is a completely different API that I specifically mentioned up-front as not compromising any PII.
        I take zero issue with providing an API to see counts of how many times a password has shown up on breach lists, although I wouldn't use the API myself on any of my own passwords, because it leaks a 1-in-1-million discriminator to the actual password you are querying.
        diminoten•6y
        You don't get to take issue with any of this. Your information was already stolen! You have no say, the end.
        jtbayly•6y
        So your fallback position is that it is perfectly legitimate to traffic in stolen PII. Got it.
        Well, I take issue with that.
        diminoten•6y
        Yes, in some cases it's perfectly legitimate to "traffic" (terrible word choice) in stolen PII, that is correct.
        And my "fallback" position is that it's better this way than the other way, where it's actually being trafficked, rather than your hyperbolic assertion that it is now.
        zaroth•6y
        Now apply this same thinking to the Equifax breach and millions of credit reports.
        Now apply this same thinking to the OPM breach and security clearances.
        Now apply this same thinking to medical records breaches.
        I don't want anyone to have my data, but I resign myself to the fact that data security is not, and never will be, perfect. That does not mean I resign myself to the fact that all my personal data should be freely available to the entire world via a well-documented REST API.
        bfdm•6y
        A service provider could check the API for the signup email and if previously compromised could challenge the signup with additional CAPTCHA steps to detect bot activity. They could check email+PW entered against leaked pairs and prevent you from registering with a known-compromised PW.
        Your bank could check emails attached to customer accounts and work with affected customers to ensure their bank account access is secure.
        Your employer could check for leaks of accounts using corporate domains. They could check leaked passwords against known last 5 to see if there are active threats.
- jtbayly•6y
  You’ve convinced me. I didn’t know anybody could look up my info. I only want it for myself.
  Only thing is, there are a couple of old email addresses I used to use that I don’t have access to anymore. I guess I just need to shrug at that at this point.
  - eli•6y
    The bad guys have access to it either way. That's the whole point: this is data that already leaked.
    - jtbayly•6y
      As discussed elsewhere in this thread, there are real benefits provided to bad guys by allowing them to look up this information about anybody in a central location.
      - ReverseCold•6y
        There are plenty of other similar services that you can find that cheaply do the same thing AND provide you with the leaked hashes/emails.
        HIBP doing the same thing with less friction for people who are trying to learn about security is probably fine in comparison.
      - eli•6y
        This just feels like another iteration of the Full Disclosure debate.
        zaroth•6y
        I’ve never heard Full Disclosure concepts applied to serving stolen PII in an API.
        The reason is that the purpose of full disclosure is to shame the vendor into ensuring the patch is made, and to warn the user base that the attack is possible, while disclosing a flaw in a commercial product.
        In this case, we are not effectively doing either naming or shaming by publishing actual email addresses, rather than just user counts and the type of hashing that was performed.
        And at the same time the information being “bartered” is private user information and not merely identifying a flaw in a commercial product.
        I fail to see how an API into the HIBP database can be justified under the concept of full-disclosure. Particularly when the service could have been implemented as an email report to the queried email address.
        jtbayly•6y
        You left out the biggest part of full disclosure in my opinion. The reason for full disclosure is because those who are affected by a security flaw in a product they are using have a right to know about the dangers of that piece of software.
        But once I put that down in writing I discovered you are right about the difference in this instance.
        The people who have the right to know about the flaw in this instance are those whose accounts were compromised. Giving it to the general public is to further victimize them, rather than help them protect themselves.
        Ao7bei3s•6y
        I don't think that's the most important part. Rather:
        Full disclosure can also protect previously unaffected / potential future customers, by warning them of companies that have been so lax with their security that they've been breached.
        So to achieve a comparable upside to full disclosure, HIBP needs to also make aggregate data publicly available. Which they do:
        https://haveibeenpwned.com/PwnedWebsites
        jtbayly•6y
        Interesting. Good point. I'll have to think about that.
        sneak•6y
        It feels that way, but there is definitely a different utility value in “searchable by the whole world” and “leaked in obscure formats in small nonpublic forums”.
        Troy has absolutely added value here, although 100% of the data is all “public” from having been leaked already.
        Searching over data that was publicly available some time in the past (but isn’t now) is also a value, sort of like time-shifting of the publicness of the data...
- massaman_yams•6y
  Bulk emailing notifications to all affected addresses would be a deliverability nightmare, and would require manual intervention at most ISPs to prevent these messages from being blocked, which said ISPs may or may not be willing to do.
  Just think of the number of clueless users who would mark such a notification as spam, and the number of old, dead addresses, some of which are now spamtraps.
  edit: clarify bulk vs. individual notifications
  - dmix•6y
    That's a service Have I Been Pwned has been offering for years...?
    - massaman_yams•6y
      For single addresses that specifically request it, which is both fine and hugely different from bulk notifications to any/all addresses observed in a breach, which is what I was referring to.
      But I realize the wording in the original post is a little ambiguous; I had read "provide ... directly" as implying "push", but that may not be the case, and if so my comment above is not relevant.
theandrewbailey•6y
> Making an authenticated call is a piece of cake, you just add an hibp-api-key header as follows:
> GET https://haveibeenpwned.com/api/v3/breachedaccount/test@examp...
> hibp-api-key: [your key]
Wouldn't the standard Authorization: Bearer <key> header be more compliant?
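A minimal sketch of the quoted call, written in Python with only the standard library. It builds the request without sending it. The account, the key, and the client name in the User-Agent header are placeholders chosen for illustration, not values from the post:

```python
from urllib.parse import quote
from urllib.request import Request

API_ROOT = "https://haveibeenpwned.com/api/v3"


def build_breach_request(account: str, api_key: str) -> Request:
    """Build (but do not send) an authenticated breachedaccount request."""
    url = f"{API_ROOT}/breachedaccount/{quote(account, safe='')}"
    headers = {
        "hibp-api-key": api_key,  # custom header from the quoted post
        "User-Agent": "example-hibp-client",  # hypothetical client name
    }
    return Request(url, headers=headers, method="GET")


# Placeholder account and key, for illustration only.
request = build_breach_request("account@example.com", "your-key-here")
print(request.get_method(), request.full_url)
print(request.header_items())
```

Note that urllib normalizes header names when storing them, so the key is read back as `Hibp-api-key`; HTTP header names are case-insensitive, so this should make no difference to the server.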
- floatingatoll•6y
  See also elsethread about "not a token" — but, also:
  > There's a couple of these and they're largely due to me trying to make sure I get this feature out as early as possible and continue to run things on a shoestring cost wise
  Using the Authorization header can cause significant problems with both clients and servers, and also might unintentionally permit browsers to directly query the server if they can be convinced to provide a bearer token.
  Using a custom HTTP header sidesteps both client and server issues altogether and closes the door on browsers direct-querying the API, which could be considered a positive by the site operator.
  - theandrewbailey•6y
    I'm not sure how browsers using the API would be a concern. Someone paid for the key, so it should be up to them to use it how they please (within rate limits).
    - floatingatoll•6y
      Allowing browsers to query directly would break the terms of engagement specified by the site operator, who specified that a proxy shall be used to concentrate end-user requests for a given paid key. That’s their right as service operator. I can construct plausible scenarios why this is a sensible choice, but the underlying point is that they clearly regardless made that choice after thinking it through.
      - akerl_•6y
        That’s not a requirement specified anywhere. The “Protecting the API Key“ section talks about using a proxy specifically in the context of client-side applications (think of things like 1Password that integrate w/ HIBP), where embedding the API key into the app is obviously undesirable. In those cases, using a proxy allows managing the request volume and injecting the API key.
        That same section of the document describes other scenarios, like a hosted service or a CLI tool, that do not involve a proxy service.
        floatingatoll•6y
        I look forward to clarification someday from the operator - but that custom header will still block non-extension browser-side calls in v3, and I bet the ACAO header isn’t present to allow it either.
- pionar•6y
  No, because it's not a bearer token.
  Edit for clarity: A bearer token [0] is a concept for OAuth. This is not OAuth.
  [0] https://tools.ietf.org/html/rfc6750#section-1.2
  - Androider•6y
    OAuth doesn't have a monopoly on bearer tokens. And it is literally the definition of a bearer token: you shall know the messenger who presents this token, a concept as old as history itself.
    - y4mi•6y
      Should every OS which uses windows be able to call itself Windows, because windows are a quite old thing as well?
      Like it or not, there is an rfc for this and using it for anything else would be code smell at best
      - patmorgan23•6y
        > Should every OS which uses windows be able to call itself Windows, because windows are a quite old thing as well?
        > Like it or not, there is an rfc for this and using it for anything else would be code smell at best
        No but every OS that uses windows can call them windows....
        y4mi•6y
        I guess they should be able to call them windows.
        Can you link to any tool which uses bearer tokens and doesn't grant them through oauth2?
        Or if it's internal, please explain how the token is obtained.
        I haven't seen any to date but I guess I could be wrong
        X-Istence•6y
        Github will happily hand you an access token by visiting "https://github.com/settings/tokens".
        These are bearer tokens, in that the bearer gets granted access by that token alone.
        You happen to send it along via HTTP Basic authentication instead of as an Authorization: Bearer header, but it is a bearer token all the same.
        No OAuth2 flow required.
        X-Istence•6y
        Any service that uses API keys is basically handing out bearer tokens. Whoever holds that API key can make requests to the service; it grants you access.
        OJFord•6y
        > Can you link to any tool which uses bearer tokens and doesn't grant them through oauth2?
        Yes: https://www.pelion.com/docs/device-management/current/integr...
        (I know I've seen and used many others, but Pelion comes first to mind because I used to work on it.)
        Androider•6y
        It's incredibly common. See Stripe for example https://stripe.com/docs/api/authentication
        Authorization: Bearer <API Key>.
  - theandrewbailey•6y
    JWT uses Authorization: Bearer, too.
    https://jwt.io/introduction/
jawns•6y
I wish the post made more clear, ideally right at the top, that the new fee applies only to third-party apps that access the HIBP API, not to end users whose email addresses are being checked against the API. You have to read through the post a bit before that becomes clear.
Individual users who just want to figure out whether they've been pwned will not have to pony up the cash. They can still visit https://haveibeenpwned.com and get that information for free.
- pixelbath•6y
  Perhaps it could be made more clear, but from the post I thought it was very apparent he was only talking about API abuse; most of the introductory text was concerning rate-limiting.
- nathan_f77•6y
  It would also be great to emphasize that this only applies to the HIBP API, and the Pwned Passwords API will still be free. (It's mentioned about half-way through the article.)
  - Glyptodon•6y
    I completely missed this because of skimming. Almost jumped the gun on subscribing. Use the pwned password API a lot. (I use the email-based one not at all.)
  - jrochkind1•6y
    Hm, I didn't actually realize there was a separate Pwned Passwords API. Having trouble finding docs on it (could be because I'm a horrible googler).
    - bpye•6y
      Pwned Passwords is detailed towards the bottom of the API page - https://haveibeenpwned.com/API/v3
- badrabbit•6y
  Domain-wide breach searches for a domain you control still appear to work for free as well.
- denzil_correa•6y
  Bury the lede.
JoshTriplett•6y
> Late last year after seeing a similar pattern with a well-known hosting provider, I reached out to them to try and better understand what was going on. I provided a bunch of IP addresses which they promptly investigated and reported back to me on
I'd love to know how to get a hosting provider to actually answer such requests. (I hope the answer isn't just "be high profile". I'm hoping the answer is more like "know the right people to contact or the right phrasing to get through first-line support".)
I've reached out to hosting providers before, providing clear logs of malicious activity, and either gotten no answer, or occasionally gotten a rote "prove it came from us" that would trivially have been answered by actually reading the logs.
(Examples of such logs include SSH brute-forcing attempts, HTTP logs showing attempts to exploit web-app security holes, and spam headers showing the IP that contacted my provider's mail server.)
I've mostly stopped even trying, due to the near-zero response rate.
In an ideal world, I'd love to see reports like this lead to "we can confirm and we've shut down outbound traffic from that system until it gets fixed".
- coderholic•6y
  How are you contacting them? If you use the correct abuse contact you'll usually get a response. We (IPinfo.io) are adding abuse contact info to our API within the next week or so (see https://twitter.com/ipinfoio/status/1138901541937602560) - let me know if you'd like early access.
  - JoshTriplett•6y
    Typically via abuse contacts or abuse forms.
    The only type of service providers I've ever had useful responses from are email/mailing-list service providers, many of which will very quickly investigate and terminate spammers.
novaleaf•6y
I feel his pain.
I run a SaaS with what I think is a pretty generous free tier (PhantomJsCloud dot com), and yeah, I have numerous people from all over the world doing their best to shit all over it:
- switching IP addresses every request to circumvent "demo user" rate limiting
- creating upwards of 100 fake accounts to get free credits ($0.05/day each account)
- embedding API calls into their webpages so their users' IP addresses are used for "demo user" credits
- API driven credit cards and hijinks around that.
- using url shorteners to circumvent blacklisted domains
I'm not sure if it's a case of people being incapable of paying by credit card, or if their ethics just allow stealing anything that's not bolted down?
I don't mind people signing up with a burner email address, but unfortunately most of these abusers are too. I am going to be banning all throwaway email accounts soon. And if that doesn't work (which it probably won't) I'm going to have to kill my free tier.
- peterwwillis•6y
  Can you do what the big cloud providers do, and demand a "real" phone number be verified for sign-up? Not impossible to beat, but more costly. Or maybe there's a market for paying customers somewhere between your free and paid tiers?
  - novaleaf•6y
    My lowest paid tier is USD$10/mth. As my target audience are developers, I think it's hard to believe that any of them would really be unable to pay that, yet still gain value from my service.
    Maybe I'm just a peace loving hippy but I'm rather shocked at the levels of abuse I see. I do want to enable paypal, just in case it's a lack-of-credit/debit card issue.
ksahin•6y
"After 4 and a bit years, by far and away the most popular method with an uptake of more than 90% is versioning via the URL. So that's all V3 supports. I don't care about the philosophical arguments to the contrary, I care about working software and in this case, the people have well and truly spoken. I don't want to have to maintain code and provide support for something people barely use when there's a perfectly viable alternative."
Well said!
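To make the trade-off in the quote concrete, here is a small Python sketch contrasting the two places a version could live: a path segment (what V3 uses) and a query parameter (the alternative raised in the replies). The URLs, host, and function names are hypothetical, and this is not how HIBP itself routes requests:

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def version_from_path(url: str) -> Optional[int]:
    """Return N from the first path segment shaped like 'vN', if any."""
    for segment in urlsplit(url).path.split("/"):
        match = re.fullmatch(r"v(\d+)", segment)
        if match:
            return int(match.group(1))
    return None


def version_from_query(url: str) -> Optional[int]:
    """Return N from a 'v=N' query parameter, wherever it appears."""
    values = parse_qs(urlsplit(url).query).get("v")
    if values and values[0].isdigit():
        return int(values[0])
    return None


# Hypothetical URLs on a placeholder host.
print(version_from_path("https://api.example.com/api/v3/things/42"))
print(version_from_query("https://api.example.com/api/things/42?sort=asc&v=3"))
```

Both approaches are easy in application code; the difference tends to show up in infrastructure that matches on path prefixes only, which is the point several replies make.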
- mehrdadn•6y
  Funny thing is here I am wondering why he didn't pass a query parameter instead of altering the path or adding a header to version the API... does anyone know? It has the advantage of being clickable while not implying the resource is different.
  - floatingatoll•6y
    One reason could be constructed by example, as:
    <Location /v3>
    vs.
    <LocationMatch ?[.*&]v=3(&|$)>
    Which is to say that, depending on the application's coincidental design and structural choices over time, managing versions at /v1 /v2 /v3 might well be vastly easier for the "shoestring budget" operator than at /?v=1 /?v=2 /?v=3.
    - mehrdadn•6y
      It seems unlikely considering the other 3 were more drastically different and yet seen as pretty equally easy.
  - Ayesh•6y
    API versioning with query parameters is often an implementer nightmare.
    - mehrdadn•6y
      Why..? Does it break through too many abstraction layers?
  - novaleaf•6y
    one benefit of putting version in the path is it makes it easier for loadbalancers to direct traffic. like v3 could be served from different servers than v2
    - mehrdadn•6y
      Why can't they do that with the query parameters?
      - novaleaf•6y
        maybe he can, but I know that google cloud's loadbalancer doesn't let you.
elamje•6y
I wonder if this actually has more to do with trying to sell HIBP than with abuse. He just announced that he was selling HIBP a month or two ago. Presumably, if he can get people to pay a nominal fee now for access to the API, it makes HIBP much more valuable to a potential acquirer. If you can prove people are willing to pay $.01/month for a subscription, you can assume (as a potential acquirer) that they would pay $.02/month in the future. Much harder to sell something that is completely free because of the risk that monetization completely fails later.
In previous blog posts he mentions that he gets 99.x% cache hits on Cloudflare, then also has a cache on his Azure service. He is sponsored by Cloudflare and Microsoft and doesn’t pay for the service unless something has changed since a few months ago. If that is still true, I don’t fully buy that he is actually spending money on Microsoft api hits as the post claims.
But, I like Troy and HIBP, so maybe I’m just too much of a skeptic :-)
skybrian•6y
Very understandable, and also yet another example of why we can't have nice services on the Internet. Traffic from bad actors pushes anyone offering an API in a similar direction, or discontinuing it altogether.
birdman3131•6y
I find it ironic that a site dedicated to seeing if you have been compromised has no method of changing your API key if it is compromised.
- incidentnormal•6y
  Even though he explained why (it is likely a forthcoming feature), I did enjoy this comment.
londons_explore•6y
Who bruteforce scrapes the HIBP API across many IP addresses when they could just download the original leaked username & password databases?
There's even a torrent file of all of them, which I won't link here...
- sleavey•6y
  Maybe spammers check if an email address is legitimate by checking HIBP. A pretty significant fraction of legitimate email addresses probably do show up in at least one list.
- rolltiide•6y
  Torrent file Of ALL leaks?
  I usually only see some
  And when people ask about a latest leak, others disingenuously reply “just check YOUR email on HIBP what kind of person needs the database”
  - necovek•6y
    If you run a web service and want to proactively expire breached passwords, you need to have the full list of plain-text passwords to hash them with the algorithm you are using (and use the same salt if you are doing that too).
- abathur•6y
  The compromised servers might be doing some primary work to which these queries are incidental, rather than for the purpose of scraping the database.
  In such a case, the API may be saving them from needing to build infrastructure to accumulate the database and either distribute slices of the data or host their own API for their distributed software to use.
  While the database may be valuable, they'd still have to invest a lot of time and some amount of money, face the same need to secure their API against exploitation by others, leave a stronger footprint leading back to themselves, and have to depend on a service that is more likely to get flagged as a sure sign of suspicious activity than HIBP...
- floatingatoll•6y
  Why download anything when you can simply query a public endpoint for free?
yjftsjthsd-h•6y
Obvious next concern: Will bad actors just scrape the website? Putting authentication and payments in front of that rather defeats the entire point, and without that you're back to rate limiting which is exactly what has just been declared as a failed approach.
- abathur•6y
  Probably.
  But you can justify a significantly more restrictive rate limit for a website form intended for individual mortal humans to check their own personal email addresses for breaches.
  The API has to support request frequencies for legitimate usage that are obviously exploitable at a sufficiently small scale to attract a few exploiters...
- ec109685•6y
  Or scrape websites that provide a proxy to the API (e.g. the cloudflare worker he described).
- lightedman•6y
  "Will bad actors just scrape the website?"
  That's already been happening. Many simply use HIBP as a starting point to pwning someone's online accounts. Now, Troy is just going to attempt to really profit off of the actions of those bad actors.
zxcvbn4038•6y
Adding authentication so you know who is using your service is reasonable, but not sure why author is complaining about 1.2M requests per day, that is only 14 requests per second on average.
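The arithmetic here is easy to check, and it is worth setting next to the peak figure of 14k per minute cited in a reply below. A small Python sketch using only the two numbers quoted in this thread:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(count: float, window_seconds: float) -> float:
    """Convert a request count over a time window into requests per second."""
    return count / window_seconds


average_rps = per_second(1_200_000, SECONDS_PER_DAY)  # 1.2M requests per day
peak_rps = per_second(14_000, 60)  # 14k requests per minute

print(f"average: {average_rps:.1f} req/s")
print(f"peak:    {peak_rps:.1f} req/s")
print(f"peak is {peak_rps / average_rps:.1f}x the average")
```

So the daily average and the per-minute peak differ by more than an order of magnitude, which is why the two commenters arrive at such different per-second numbers.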
- floatingatoll•6y
  They consider those requests to be "bad actors". It's not necessarily about the volume of traffic, it's that they are compromised VPSes configured to perform unknown malicious activity that takes advantage of a free endpoint in support of unknown malicious intent. See also "Why do bad actors abuse this endpoint?" discussion elsethread: https://news.ycombinator.com/item?id=20480230
  - wolco•6y
    Wouldn't most api traffic come from vps's regardless of the intent?
    - floatingatoll•6y
      The article notes that the VPS providers indicated that those top API traffic consumers were all a specific cron.php on compromised VPSes, so while in theory your statement is true, in reality the issue here was maliciously compromised VPSes, not VPSes in general.
- mtmail•6y
  Near the top of the article it says peak 14k per minute (233 per second) and it sounds like demand is ever growing.
w8rbt•6y
I obtain the SHA1 hashes published by HIBP, load them into a bloom filter and use that for checks. It's super fast (constant time lookups) and avoids a network dependency/third party service. Here's working Go code:
https://github.com/w8rbt/bp
Edit: This is solely for password vetting during account creation and password reset (which will remain free/no-cost in the API).
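For readers who want the idea without the linked Go code, here is a minimal Bloom filter sketch in Python. It is not the linked repository's implementation: the class, the sizing, and the double-hashing scheme are assumptions chosen for illustration, and a real deployment would size the filter for the full hash list. A Bloom filter can report false positives but never false negatives, so a "not present" answer is reliable.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Fixed-size Bloom filter using double hashing over a SHA-256 digest."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1, the form the published hash list uses."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


# Stand-in for loading the published hash list: three well-known passwords.
breached = BloomFilter(num_bits=1 << 16, num_hashes=4)
for known_bad in ("password", "123456", "qwerty"):
    breached.add(sha1_hex(known_bad))

print(sha1_hex("password") in breached)
print(sha1_hex("correct horse battery staple") in breached)
```

Each lookup touches a fixed number of bit positions regardless of how many hashes were loaded, which is where the constant-time behaviour comes from.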
sucrose•6y
Why are bad actors abusing the API? What benefit does it give them to just be able to check for leaked data on e-mail addresses? Especially when it doesn't actually provide the leaked data...
- HereBeBeasties•6y
  Doesn't take much imagination to find a use.
  Assume I find Anna's email address as part of a breach somewhere.
  Hello Anna,
  We value transparency and honesty highly at $p0wn3d_company. To that end, we're sorry to have to tell you that our systems were compromised by an unknown hacker recently. Although we believe that no personal data has been stolen, we are working with Government agencies and expert security consultants to determine the full extent of the breach.
  As a precaution we are asking our customers to change their passwords, which you can do by clicking on >this link here to a website that looks like ours but is actually owned by a hacker<.
  Etc.
- birdman3131•6y
  AFAIK, from looking myself up on the website before, it tells you which breaches to go hunt down for the actual info. Knowing they need to go hunt down SpecificWebsite.com's March 2017 breach is way more specific than trying to have a database of all breaches.
- geddy•6y
  Perhaps they hammer it inefficiently or simply too often, possibly without even realizing it?
  - smacktoward•6y
    Never underestimate the potential impact of stupid people in large numbers.
sroussey•6y
Makes sense. I was writing an email to Troy that he can post about how to set custom user agent in Electron and Cordova, as the defaults fail. Guess it won’t be needed.
Aeolun•6y
I don’t use this API myself, so it doesn’t really affect me, but this somehow feels like one of the last purely good things was lost.
Daviey•6y
Next step, premium access without rate limit?
w3rhn2j34oh5o•6y
Boom, now Troy is monetizing stolen data. Unethical and illegal.
- mfkp•6y
  It does cost money to run a service like this. He's historically had sponsors, but you can't expect someone to run a high traffic service for free forever.
  - w3rhn2j34oh5o•6y
    And the local pawn shop has expenses too. Just because they have to pay rent and electricity costs does not make selling a stolen item legal.
    - mfkp•6y
      There's a difference - he's not selling the leaked passwords. He's selling the information that a password has been leaked for a certain account. You can't buy stolen passwords from the site, so it's perfectly legal.
      - w3rhn2j34oh5o•6y
        I don't think it is that clear -- he is selling access to a data set containing PII (email addresses or account names). It's stolen data. One can make a case that free and open access to this data set is a common good; however, once money is involved, one is conducting business with data that one did not legally obtain. It is not 'perfectly legal'.
        - Aeolun•6y
          I don’t think the API ever returns that information. You need to already have the email address to be queried.
- penagwin•6y
  He gives a cost breakdown showing that he's almost guaranteed to lose money on it. Azure is charging him $3.50 per 1 million calls to rate-limit/charge people for using the API. He's charging $3.50. Consider that Stripe will be taking another 35 cents or so... let's just say that if this was a monetization method, it's not a very good one.
  - w3rhn2j34oh5o•6y
    He can try to justify it however he likes -- it's selling stolen goods. Just because you sell stolen goods under their value, or under your costs to provide them, does not suddenly make it OK.
    - sho•6y
      Oh come on. It is plainly evident that he is not "selling stolen goods". He is providing a valuable public service merely checking whether people are caught up in those "stolen goods" and the context in which it occurred. In these changes, he is seeking only to recover his costs and reduce the time he spends administrating it, which he has done at absolutely no charge for 5+ years.
      You are quite simply wrong and you should just admit it, rather than repeating your ludicrous claims in ever more hysterical terms.
- sucrose•6y
  I don't see it as him monetizing the stolen data, but the mere existence of it.
  - wolco•6y
    No getting around that he sells access to stolen data.
    He didn't originally steal it. He collected the illegal dumps and runs a service on top of that data.
    There is nothing wrong with selling stolen data provided someone else dumped it first.
    - w3rhn2j34oh5o•6y
      So fencing stolen goods is not illegal if someone else stole it? I don't think so. Stolen is stolen -- nobody has any right to sell it.
DINKDINK•6y
All the ways congestion controls are implemented on the web lead to a cognitively infantilizing UX, privacy violations, and even "skynet" enabling[1] (hyperbolic, but nothing is stopping it from happening).
"Are you really human? What's: 3 x 9"
"Can you click on images of buses?, hmmmm don't believe you're human still, can you click images of stores, hmmm now bikes, hmmm now vehicles, oh I didn't mean all vehicles I just meant autos and not motorcycles, here quick copy this token, oh it expired? Too bad. How about you click on images of buses for me..."
"Sorry, browsers that protect your privacy and location aren't allowed. We only allow users who are willing to deanonymize themselves."
"Well we all know /those people/ who come from /that place/ are antisocial users"
"Here's your IP addresses back. Oh yeah, sorry about blacklisting them"
This is a comment about the meta issue Troy faces. If costs are Rube Goldberg'ed to create a facade of "free", it's not actually free (even if user data isn't being sold). E.g., a median-wage (10e3 USD/year) world worker spending 20 seconds solving a captcha has an opportunity cost of 0.03 USD[2]. Furthermore, having to solve congestion issues by requiring closed/inaccessible, poorly programmable payment methods (credit cards) sucks too. Additionally, a congestion goal like "I'd rather low-demand users have free access and high-demand users have expensive access" isn't met by a flat rate (which a "keep it low cost" mantra is incentivized to keep low). There is market demand for: if your demands on my service are x, I'll give you back the $3.50, but if you consume y resources, you have to pay z.
Wouldn't it be great if there was a way machines could own money, send it over a layer-2 network, that was open, cheaper than credit cards, faster than L1 bitcoin, and get your money refunded if you didn't demand excessive server resources, all while not using game-able "good users come from here" privacy violating algos?
This is why micropayments using layer-2 bitcoin on the Lightning Network have significantly valuable, latent economic-coordination implications. Micropayments aren't about paying for 1/1000 of a peanut. They're about obviating all the engineering, social, and product costs incurred in dealing with marginal-value/marginal-cost issues. BAD: The marginal cost of anti-DoS countermeasures can always be above the marginal value of deploying them ("listen folks, it costs too much to keep this service running, we'll have to shut it down"). UNSTOPPABLE: If a price is put on service requests (Services on Demand)[3], the marginal value will never be below the marginal cost ("I can keep this AED locator map service running because I know a spamming request will incur costs above my production costs").
In a future where L2 Bitcoin payment/Lightning client infrastructure is prevalent, gone will be the days of annoying, productivity-draining captchas and attribute-discriminating access. Troy could charge a 0.01 USD "bond" payment per request (which he could give back quickly and at no cost to a low-demand user). That means the 14e3 requests/min for 3 hours would have cost the high-demand user about 25,000 USD[4].
0.01USD refundable payment for honest users.
$25,000 USD penalty for high-demand "spammer"
[1] https://i.redd.it/pb5nggw3rulz.jpg
[2] 20/60/60 * 5
[3] https://medium.com/@soddiraju/the-not-so-micro-potential-for...
[4] 14e3 * .01 * 60 * 3
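The arithmetic in footnotes [2] and [4] can be checked with a short Python sketch. The figures come from the comment above. The hourly wage of 5 USD is the value footnote [2] uses, which appears to assume roughly 2,000 working hours per year. The function and constant names are made up for illustration.

```python
# Figures from the comment above. HOURLY_WAGE_USD is the value used in
# footnote [2]; it assumes 10e3 USD/year over roughly 2,000 working hours.
HOURLY_WAGE_USD = 5
CAPTCHA_SECONDS = 20
BOND_USD = 0.01
REQUESTS_PER_MINUTE = 14e3
ATTACK_HOURS = 3


def captcha_opportunity_cost(seconds: float, hourly_wage: float) -> float:
    """Footnote [2]: seconds converted to hours, times the hourly wage."""
    return seconds / 60 / 60 * hourly_wage


def total_bond(requests_per_minute: float, bond: float, hours: float) -> float:
    """Footnote [4]: one bond per request, for every minute of the burst."""
    return requests_per_minute * bond * 60 * hours


if __name__ == "__main__":
    print(round(captcha_opportunity_cost(CAPTCHA_SECONDS, HOURLY_WAGE_USD), 3))
    print(round(total_bond(REQUESTS_PER_MINUTE, BOND_USD, ATTACK_HOURS)))
```

The first figure comes out just under 0.03 USD and the second at 25,200 USD, which matches the rounded numbers quoted in the comment.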
- skybrian•6y
  That would only solve paying for services if you are an amoral service provider and don't care where the money really comes from as long as you get paid.
  It doesn't do anything for people who don't want their services used by bad actors, which is increasingly the case these days - see all the people concerned about privacy and how big tech companies use their data. It's not going to help for anything social where you are trying to promote pro-social usage and discourage anti-social usage, however you define it.
  Those concerns inevitably lead to things like "know your customer" and supply-chain policing. You can still build nice services, but not anonymous ones.
  The issues are pretty much the same as TOR. Some people are willing to run TOR nodes because the good outweighs the bad, others get squeamish about child pornography and say: no thanks.
  And that's why it's an API. If the "Have I Been Pwned" database were harmless and there were no concerns about bad actors, it would be a torrent, not a service.
  - DINKDINK•6y
    >It doesn't do anything for people who don't want their services used by bad actors, which is increasingly the case these days
    My comment illustrates precisely how such an incentive structure denies high-resource demand users.
    >That would only solve paying for services if you are an amoral service provider and don't care where the money really comes from as long as you get paid.
    This makes no sense to me, sorry. Are you claiming that anyone who accepts cash payments is amoral because a euro/dollar bill could be stolen, and equivalently that people who accept bitcoin payments are amoral because they don't surveil their customers' financial history?
    - skybrian•6y
      By "amoral" I don't mean immoral, I mean you don't care what anyone does and you're happy not knowing the consequences of your actions.
      Depending on what you're providing, maybe that's fine. In the open source world, we give away code all the time, to everyone. Most public reading material is fine.
      But services differ and for some services of interest to bad actors, many people are concerned about the consequences when they do business.
- gen3•6y
  Why do something that is so complicated and time-consuming to implement when charging $3.50 is good enough? It's easier for him, as he can use ready-made tools, and it's easier for me because I don't have to add all this extra overhead (and money) to a project. It's just $3.50 and a header.
  - DINKDINK•6y
    "Why would anyone put film in their camera, take a picture, have it developed, scan it, email it, all so that I can print it on a dot matrix? That's so complicated; I could just put it in a manilla envelope and send it through the postal service. Why do I need a new way to send information?"
    If you want to continue using legacy technology, that's fine. If you're not comfortable with your bits being in a computer, that's fine. But it'll be slower, more expensive, and less transnational etc.
    - gen3•6y
      > But it'll be slower, more expensive, and less transnational etc
      That's the issue with your idea though. With the current status quo, all I need is a valid credit card and the ability to type: curl -H "hibp-api-key: <whatever the key is>" https://haveibeenpwned.com/api/v3/breachedaccount/test@examp...
      I can have that done in less than 60 seconds, it fits his threat model, and it can be done in any language with a TCP lib.
      Your idea isn't bad, it just does not fit the problem.
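To illustrate the "any language" point, here is a rough Python equivalent of the curl command above, using only the standard library. It is a sketch rather than official client code. The account, key, and user-agent values are placeholders, and `build_breach_request` is a made-up helper name. The explicit user agent is an assumption based on the remark elsewhere in the thread that default user agents fail.

```python
import urllib.parse
import urllib.request

# Base URL as it appears in the curl command above.
API_BASE = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_breach_request(
    account: str, api_key: str, user_agent: str
) -> urllib.request.Request:
    """Build (but do not send) a GET request for one account's breaches."""
    # Percent-encode the account so characters like '@' are safe in the path.
    url = API_BASE + urllib.parse.quote(account, safe="")
    return urllib.request.Request(
        url,
        headers={
            "hibp-api-key": api_key,   # header name taken from the curl command
            "user-agent": user_agent,  # assumed necessary; see the lead-in
        },
    )


if __name__ == "__main__":
    # Placeholder values only; a real call needs a real, paid-for key.
    request = build_breach_request(
        "someone@example.invalid", "<your key>", "my-hibp-checker"
    )
    print(request.full_url)
    # urllib.request.urlopen(request) would actually perform the call.
```

Building the request separately from sending it keeps the example runnable without a key. The single extra header is the only change a client has to make.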
Nullabillity•6y
Looks like AgileBits is getting scared.
dustinmoris•6y
> One thing I want to be crystal clear about here is that the $3.50 fee is no way an attempt to monetise something I always wanted to provide for free.
If this were true, then all revenue made from that $3.50 would get donated to a worthy cause, not donated into Troy's own pocket. I am not saying that he shouldn't monetise it, but please let's be honest about it.
> The point is that the $3.50 number is pretty much bang on the mark for the cost of providing the service.
The cost of the service is the actual final bill which has to be paid for this service, taking into account all the free credits Troy gets as a Microsoft Regional Director, free credits for hugely advertising Azure at every occasion, free credits from Cloudflare for constantly advertising for them, the tax which he doesn't pay as a registered company, etc., divided by the actual number of customers who use the API. This cost could be much more, or significantly less, than $3.50. If Troy wanted to be more transparent then he could, but given that he is very secretive and very selective about the bits of information he shares around all of this, my guess is the cost is much less than what Troy makes everyone believe.
Overall I don't think it is ethical to monetise a service which is built on stolen data. There is a good chance that Troy holds data on me, my parents, my sister, wife and lots of other people whose data has been breached over the years and who have no idea who Troy is, what the heck HIBP is, or how to contest this or ask Troy to remove their data from his service, yet it's being used for monetisation.
There was never consent from anyone to hoard our data. It's stolen, and just because stolen data is easily discoverable on the internet doesn't make it alright to actively search for, store and monetise that data. It's still stolen and should get deleted from everywhere.
- jtbayly•6y
  This is such a clearly useful, legitimate service. You cannot tell the bad guys to delete your data. The next best thing is to be alerted when your data is found in a bad guy’s trove.
  - wolco•6y
    Just because you have a legitimate reason doesn't mean everyone does.
    There are no bad guys, just self-serving people.
    - jtbayly•6y
      As I've said elsewhere in this thread, I've come to realize that giving other people the ability to mine my data is definitely different than what I understood this service to be about. However, it's apparently what he does. And now he charges for it. I definitely see a problem here.
      Of course, if there's no bad guys, I guess you don't see any problem. That's one weird point of view.
  - dustinmoris•6y
    It's not that clear cut unfortunately.
    What do you really know about Troy and his service? Really just what he wants you to know.
    For example, Troy stores extremely valuable information about millions of people without their consent. A lesbian woman in an Arab country, who might have had her credentials breached on a gay forum, who also has a gambling addiction and had her password breached on a gambling website and on another dating website for prostitution services, might not want some Aussie guy selling all that information about her to anyone who pays him money. There is nothing, absolutely nothing ethical about this!
    My sister does not know anything about Troy, I showed her his Twitter profile and the first things which stood out to her:
    - Old man
    - Orange skin like Trump
    - Loves to show off outdated cars
    - Occasionally making snarky comments about Indians, Indonesians and other Asian people, always suggesting that anything illegal is coming from those countries
    - Constantly tries to validate himself by bragging about something expensive he's recently bought
    - Very capitalistic and money focused individual
    It's not great optics for people who have never seen his blog. His blog is just marketing at the end of the day. There is no regulation, no actual organisation or anyone who can be held accountable for gross mishandling of the data.
    It's just an old Aussie guy who stores hoards of stolen data on his private laptop and in his private cloud and sells it to other people who clearly gain benefit from collecting that data from his service.
    There is trust and there is naivety, and in this case anyone who doesn't find it slightly dodgy is simply naive. Sorry, but that is the reality.
    - bbatsell•6y
      I am trying to assume good faith, but I confess to being incredibly confused by this post. I just skimmed the last two months of Troy’s tweets (which constitutes quite a few — he is prolific) and couldn’t find a _single_ one that matches up with any of your summations. Would you mind showing your work?
      - gjm11•6y
        I'm wondering whether (deliberately or accidentally) dustinmoris pulled up the wrong Twitter profile or something. No part of his description seems to me like it has any connection with reality.
    - Ayesh•6y
      This is definitely the most grateful comment I have seen on HN thus far... Time out.