Unless you have been living in a cave on Mars with your eyes shut and your fingers in your ears for the past few weeks, you have probably heard something about a data breach at Australian telecommunications giant Optus.
Despite all this, Optus (seemingly) got somewhat lucky. After attracting a lot of attention from the authorities and the public at large, the self-purported ‘hacker’ who stole the data ended up apparently choosing not to leak any more than the 10,000 or so accounts they had posted first. There’s every possibility that those data are still out there somewhere and could come back to haunt Optus yet, though.
While there does not seem to be a definitive pronouncement on the security vulnerability which enabled the data leak, the hacker is reported to have said it was as simple as an unauthenticated Optus REST API endpoint — which appears to be backed up by an anonymous insider at Optus and (implicitly) the Australian Minister for Home Affairs.
As a cyber security consultancy, we work with Australian and New Zealand organisations every day to prevent data breaches like this one.
As security mistakes go, the vulnerability reported to have enabled the attack leans toward the more embarrassing side of the scale. If reports are true, Optus has effectively exposed customer data on an endpoint available to the entire internet.
Being only interested bystanders, we at Cosive have no greater insight into the specific details of the incident than what is reported publicly. As an information security business that develops and markets web applications of our own, however, we put time, effort and our collective decades of secure development experience into ensuring that we don’t leave our REST API endpoints inappropriately exposed to the world.
Our best guess (and that is all it is) about why there was supposedly no authentication requirement for the endpoint is quite simple. Most likely, someone deliberately exposed the endpoint to make testing it easier while they were first developing it, and then forgot to turn authentication back on once they were done. Sadly, due to the fallibility of humans, this is both all too easy to do, and all too common in practice.
An important point to emphasise here is that this breach should not be construed as entirely the fault of one developer — doubly so if they are relatively junior.
Rather, incorrectly exposed API endpoints strongly suggest wider organisational failures, the responsibility for which rests primarily with management.
Without knowing the precise details of what happened, nobody can say for sure what led to the exposure, nor what could have been done differently to prevent it, but it almost certainly is not as simple as “that one developer didn’t do their job.”
Your first reaction in the wake of a data breach should be to investigate the process leading to the failure, not find a scapegoat.
At a firm the size of Optus, well-organised – and appropriately resourced – development and operational processes can ensure that one developer having an off-day doesn’t cause a major security headache.
While it is plausible that a developer will forget to (re)secure an endpoint once they finish their development work, there are multiple practical steps you can take to catch or mitigate the problem.
These steps won’t necessarily stop someone masquerading as a valid user, but at least then any subsequent data breach will require a modicum of technical sophistication.
Moreover, these steps can be used together, making the gaps an attacker can slip through ever smaller. The more of these steps you implement, the more depth you give to your defence against improper access to API endpoints.
This list focuses on web application REST APIs, but most of the advice should translate directly to other approaches. Some of the measures described below are old wisdom (see the OWASP Top 10, for instance) but are still not implemented everywhere. See how you rate!
Use deny-by-default as your team’s default philosophy. ALWAYS.
This single step is usually quite straightforward and extremely effective. Simply put, every private API endpoint should be required to accept and respond to requests only from properly authenticated users.
Attempts to make any sort of request to an endpoint without providing valid identification should receive a 401 HTTP status code response, even for endpoints that do not exist (to prevent an attacker from discovering extant endpoints by checking for 401 vs 404 status codes).
The only exceptions are those endpoints that service unauthenticated users specifically. In practice, this will typically mean only the user authentication endpoint itself, and maybe account creation and password reset endpoints if users are permitted to do so themselves.
This configuration means that any intended unauthenticated access to an endpoint will require an explicit declaration in the codebase. Any developers examining the endpoint later on will see that the endpoint is exposed to the world, and can query this if it doesn’t make sense.
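To make the idea concrete, here is a minimal, framework-free sketch of deny-by-default routing. Everything in it (the `route` decorator, the `public` flag, the `dispatch` function and the example paths) is hypothetical and exists only for illustration; a real framework will have its own mechanism and its own names.

```python
from typing import Callable, Dict, Optional, Set, Tuple

Handler = Callable[[], str]

ROUTES: Dict[str, Handler] = {}
PUBLIC_PATHS: Set[str] = set()


def route(path: str, public: bool = False):
    """Register a handler; it requires authentication unless public=True."""
    def register(handler: Handler) -> Handler:
        ROUTES[path] = handler
        if public:
            PUBLIC_PATHS.add(path)
        return handler
    return register


@route("/login", public=True)  # explicit exemption, easy to spot in review
def login() -> str:
    return "login form"


@route("/customers")  # no flag given, so authentication is required
def customers() -> str:
    return "customer records"


def dispatch(path: str, user: Optional[str]) -> Tuple[int, str]:
    """Return the (status code, body) an incoming request would receive."""
    # The authentication check runs before the route lookup, so anonymous
    # callers get 401 for unknown paths too and cannot map the API.
    if path not in PUBLIC_PATHS and user is None:
        return 401, ""
    handler = ROUTES.get(path)
    if handler is None:
        return 404, ""
    return 200, handler()
```

In this sketch, forgetting to think about authentication leaves an endpoint locked, not open. The only way to expose one is to write `public=True` next to it, which is exactly the kind of declaration a later reader can question.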
An objection sometimes raised to this approach is that it makes using your REST API more difficult because if authentication fails your users still can’t access the endpoints. Frankly, such an objection is nonsensical. If access to a resource should be restricted, then all potential users must at least prove their identity (for the API’s purposes). After all, there can be no authorisation without authentication.
This seems likely to come up only when your users have written their own code to interact with your API, but have not used authentication properly. The fact that a bug in your system hid a bug in their system doesn’t mean there isn’t a bug in their system. Two bugs do not make a correct behaviour.
Every good web application development framework includes a configuration option for a deny-by-default approach, and its documentation usually shows clearly how to use it.
If your framework of choice doesn’t provide such an option, then you shouldn’t use it for any application that deals with sensitive or restricted data. Sadly, this does mean that the hot new up-and-coming framework, possibly from that cool new language, may not be appropriate for the project. You probably shouldn’t think of a framework as production-ready if it can’t handle this requirement, anyway.
If developers need to disable authentication for testing, explicitly tie this to your development tools. Don’t use variables or commented out code that can sneak into production.
A developer may want to switch off authentication requirements when first developing a new endpoint. Having to ensure that a valid JWT, cookie or similar credential is in place before experimenting with REST calls slows down development and can be frustrating. Switching off access controls for that one endpoint in the codebase is absolutely not the best way to handle this, however. It is much too easy to forget to reinstate the requirement, and the exposed code then finds its way to production.
Many web application frameworks provide a facility for tying behaviour to the environment an instance runs in. For example, out-of-the-box, modern versions of Microsoft’s ASP.NET framework include an environment setting read at startup that declares whether a given instance is running in development, staging or production mode. A developer can alter configuration so that authentication requirements are not imposed in development mode without running the risk of altering anything in production.
Alternatively, many compiled languages include conditional compilation facilities to change how a program is built depending on whether it was built in debug or release mode (or equivalent). Since you won’t run a debug build on your production environment, you can switch off authentication in debug builds if needed. Using flags on startup (even if they are just environment variables) is preferable since they permit easier configuration, but conditional compilation is still better than risking inappropriately exposed endpoints in production.
At first glance, one may think that the idea of allowing a change to this configuration while the application is running is a good thing. After all, once a developer has confirmed that their new endpoint works without authentication, they will want to confirm it works with authentication. It’s not a good idea.
If your developers can do it, then potentially so can an attacker.
Permitting such a change after startup would mean that an outsider only needs to gain access to that one switch in order to get the keys to the entire kingdom. This seems unlikely, but it would be better not to make it possible in the first place.
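As a rough illustration of startup-only flags, the sketch below reads two environment variables once and stores the result in an immutable object. The variable names `APP_ENV` and `APP_DISABLE_AUTH` and the `Settings` class are assumptions made up for this example, not part of any particular framework.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class Settings:
    environment: str
    require_auth: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build the settings once, at startup, from hypothetical environment variables."""
    # A missing APP_ENV is treated as production, the safest assumption.
    environment = env.get("APP_ENV", "production")
    wants_no_auth = env.get("APP_DISABLE_AUTH") == "1"
    if wants_no_auth and environment != "development":
        # Refuse to start at all: a loud failure beats an open endpoint.
        raise RuntimeError("authentication can only be disabled in development")
    return Settings(environment=environment, require_auth=not wants_no_auth)
```

Two design choices matter here. The default is the locked-down one, so a forgotten variable cannot open anything, and there is no code path that flips `require_auth` while the process is running.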
Always use peer code reviews. Coach reviewers to be thorough and understand the full context of even a one line change.
Code reviews should be standard practice in almost every software development team.
Code review might have prevented the Optus data breach. One would hope that improperly exposed endpoints would be noticed during a code review, especially when using the deny-by-default approach for authentication. For an endpoint to accidentally be exposed under this system, a developer would have to create an explicit manual exemption to authentication requirements, which should be fairly obvious to other developers reviewing the code. The reviewing developer would bring these exemptions up during the review unless it is abundantly clear from the context why they are in place.
Even when you do have code reviews as standard, make sure your reviewers fully understand the context and implication of the change beyond just the lines of code changed. If reviewers just rubber stamp the PR with their approval, nothing of value is added and worse, people have a false sense of the level of verification performed.
No solo merging to major branches! Use repository branch rules to save us from our worst temptations.
Repository branch rules that require one or two people other than the author to approve the code are essential. If others have reviewed the code AND it has passed through the necessary battery of automated tests and checks, then the code can be merged.
The temptation to merge or approve one’s own code is just too strong, even for the best of us.
Code reviews at Cosive often contain comments on the perceived possibility of a security issue. Sometimes those comments do indeed catch potential security issues before they get any further, while other times they help ensure that team members are on the same page. Both are extremely valuable benefits.
Create simple security checklists developers must work through before merging code.
Perhaps your developers, by and large, don’t tend to think about security. Or maybe authentication and authorisation requirements are far down the list of things they think about when reviewing code. If that’s the case, it would be worthwhile to consider instituting code reviews focused specifically on security into the development process.
These reviews can be as simple as a checklist of open-ended questions about security that developers must work through and answer before considering a pull request truly ready to merge.
One of the first questions for a web application should be along the lines of “how do you know that all endpoints are secured with appropriate authentication and authorisation requirements?”. The reviewer should explain why they can have a high degree of confidence that this is true.
If developers ever modify code in production at will, you don’t really know what’s actually running at any time.
“Hmm, we’re hitting an error in production that I can’t replicate in the test environment. I’ll just SSH into the production host really quickly and comment out this line of code to test a theory...”
If developers can change code in production without any controls in place, you cannot guarantee exactly what code is running in production. Any other good controls you have in place, like code reviews, are also trivially side-stepped. The “quick experiment in production” doesn’t get undone at the end, and the production codebase is left faulty or vulnerable. Even if it really does fix an issue and not introduce a new one, will the fix be back-propagated to the code repository so that all future deployments use it?
Such ‘experiments’ will (probably) only happen with web applications written in interpreted languages. Still, considering the popularity of Django, Ruby on Rails and Laravel, this sort of thing is possible in many cases, and a terrible temptation when it is.
Automatic, continual testing helps make sure the system running today still meets requirements even after passing through twenty sets of hands.
Most applications, web or otherwise, will have accumulated a significant battery of tests by the time they are ready to be put out into the world with sensitive data behind them. These tests won’t be just unit tests, but should also include integration and end-to-end tests.
You can put these tests to good use by using them to ensure your endpoints don’t permit access they shouldn’t.
This measure consists simply of sending requests to your various endpoints, and confirming that you receive the expected response in each instance. For example, an unauthenticated user should see a 401 HTTP status code; an authenticated user without the appropriate authorisation should see a 403 HTTP status code; and an appropriately authenticated and authorised user should see a response somewhere in the 200 range.
Tools such as Postman or Insomnia come with built-in support for API testing, and if you already use one you should consider expanding your test suite to cover authentication testing. Alternatively, behaviour-driven testing (at least the Gherkin/Cucumber family) can potentially be utilised effectively for this purpose. Ultimately though, with a little ingenuity, you should be able to use your chosen end-to-end testing framework for this purpose.
We use this measure internally and it has saved us from accidentally putting a new release out with an exposed endpoint at least once, despite everything else we do to stop this from happening. Automated testing doesn’t necessarily prevent unknown issues from escaping into production, but it can do a great job of helping you catch potential problems you know about, such as a lack of an authentication requirement, before the rest of the world can see them.
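One way to picture such a test is as a table of (endpoint, credentials, expected status) rows that is replayed against the application on every build. The sketch below is only an illustration: `fake_app`, the tokens and the `/admin/report` path are invented stand-ins, and in practice `send` would be your testing framework’s HTTP client.

```python
from typing import Callable, List, Optional, Tuple

# One row per check: (path, token, expected HTTP status code)
Case = Tuple[str, Optional[str], int]


def find_failures(send: Callable[[str, Optional[str]], int],
                  cases: List[Case]) -> List[str]:
    """Replay every case and describe each one whose status code is wrong."""
    failures = []
    for path, token, expected in cases:
        actual = send(path, token)
        if actual != expected:
            failures.append(
                f"{path} with token {token!r}: expected {expected}, got {actual}"
            )
    return failures


# Hypothetical stand-in for the application under test.
ROLES = {"admin-token": "admin", "user-token": "user"}


def fake_app(path: str, token: Optional[str]) -> int:
    role = ROLES.get(token)
    if role is None:
        return 401  # no valid credentials supplied
    if path == "/admin/report" and role != "admin":
        return 403  # authenticated, but not authorised
    return 200


CASES: List[Case] = [
    ("/admin/report", None, 401),
    ("/admin/report", "user-token", 403),
    ("/admin/report", "admin-token", 200),
]
```

A build would then fail whenever `find_failures(fake_app, CASES)` returns a non-empty list. Adding a row for each new endpoint is cheap, and the table doubles as a readable statement of who is meant to reach what.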
Routinely simulate external users in your production system to ensure it’s still running as intended where it counts.
Testing your endpoints using test suites is one thing, but you’ll also want assurance that your system has been deployed in production with all your carefully planned security controls still intact.
For this, you’ll also want to test from the standpoint of an external user (or attacker) by:
Monitoring like this can help you detect when a code or configuration change has slipped past all your other measures and left your systems exposed. There are both SaaS services and open-source software that can help here.
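A very small version of this kind of outside-in check can be written with the Python standard library alone. The sketch below assumes you maintain a list of private URLs yourself; the function names are made up for illustration, and a real monitoring service would add scheduling, alerting and retries.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def http_status(url: str) -> int:
    """Send a GET request with no credentials and return the status code."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx and 5xx responses; the code is what we want.
        return error.code


def exposed_endpoints(urls: Iterable[str],
                      get_status: Callable[[str], int] = http_status) -> List[str]:
    """Return the URLs that did not answer an anonymous request with 401."""
    return [url for url in urls if get_status(url) != 401]
```

Run on a schedule from a machine outside your own network, any URL this returns deserves an immediate look. Connection failures are deliberately left to raise, since an unreachable production endpoint is worth knowing about too.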
Know when application activity looks highly abnormal and take automated action.
It seems that the so-called hacker in the Optus case was able to scrape the reams of data involved simply by making repeated requests to the same exposed endpoint. Assuming the figure reported in the media of 10 million exposed records is accurate, and that each record was collected via a separate request to the endpoint, the hacker would have made 10 million successful requests to the endpoint.
At a rate of one record per second, extracting all of them would take 10 million seconds.
That is roughly 16.5 weeks of continuous polling.
It seems doubtful that the scraping was going on for that long, so the scraper most likely accessed records much faster than that.
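The arithmetic is easy to reproduce, and to repeat for faster request rates. The rates below are illustrative assumptions only; the actual rate used in the incident has not been reported.

```python
RECORDS = 10_000_000
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800


def weeks_to_scrape(requests_per_second: float) -> float:
    """Weeks of continuous polling needed at one record per request."""
    return RECORDS / requests_per_second / SECONDS_PER_WEEK


for rate in (1, 10, 100):
    print(f"{rate:>3} req/s: {weeks_to_scrape(rate):.2f} weeks")
# prints:
#   1 req/s: 16.53 weeks
#  10 req/s: 1.65 weeks
# 100 req/s: 0.17 weeks
```

At 100 requests per second the whole data set is gone in a little over a day, which is why the request rate an endpoint tolerates matters so much.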
A limit of one request per second will not be sensible in all circumstances. The appropriate limit will always be context-dependent, but it is always worth contemplating what a reasonable request rate from a given user will be and whether there should be a limit.
The data exfiltration was apparently first detected because of the significant spike in request volume associated with the scraping. This suggests that the security controls and the security operations team at Optus were effective at detecting the spike in requests, but modern computers can work at a much faster rate than humans.
In the minutes – or more likely hours – between the start of a scraping attempt and the initial response from the SOC, an attacker can harvest a significant volume of information. Rate limiting doesn’t stop data exfiltration, but it serves to bring the speed of computers back towards that of regular humans, and might significantly reduce the number of customers whose data leak.
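For readers who have not implemented rate limiting before, the sketch below shows one common technique, the token bucket. It is an assumption that this particular algorithm suits your application, and in practice you would more likely configure the limiter built into your framework, API gateway or WAF than write your own.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts, but cap the sustained request rate."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate        # tokens added per second
        self.burst = burst      # maximum number of tokens held
        self.clock = clock      # injectable so the limiter can be tested
        self.tokens = float(burst)
        self.updated = clock()

    def allow(self) -> bool:
        """Spend one token if available; False means reject the request."""
        now = self.clock()
        # Refill according to the time elapsed, never beyond the burst size.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

An application would keep one bucket per user or API key and answer a rejected request with a 429 HTTP status code. With `rate=1` and `burst=2`, a client can make two requests back to back and is then held to one per second.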
If your user base comes from limited, known network locations, restricting access via IP ranges is a highly effective option you should use.
If (and it is a big if) you can be fairly certain that your customers will only attempt to access your endpoints from a small range of fixed IP addresses, then you could configure your application or WAF to respond only to requests originating from those IP addresses.
This is another application of deny-by-default, but not as broadly applicable as authentication controls. It works even better if you run separate instances for different customers since it nearly guarantees that someone cannot somehow leverage access to one instance to gain access to another.
Naturally, this option is not a luxury that a very public application like a telecom’s customer portal can afford.
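Where an allow-list does fit, the check itself is short. The sketch below uses Python’s standard `ipaddress` module; the networks listed are reserved documentation ranges standing in for your customers’ real addresses, and `is_allowed` is a hypothetical helper name.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); substitute real ones.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),    # e.g. a customer office range
    ipaddress.ip_network("198.51.100.17/32"),  # e.g. a single fixed host
]


def is_allowed(client_ip: str) -> bool:
    """Deny by default: only addresses inside a listed network pass."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # unparseable input is rejected, not trusted
    return any(address in network for network in ALLOWED_NETWORKS)
```

If the application sits behind a proxy or load balancer, take care that the address being checked is the genuine client address and not a header the client can set for itself.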
Lock down access to your system management interfaces or highly sensitive applications via VPN access.
No, not Proton or Mullvad. You can use VPNs as they were initially intended: to help control access to corporate networks. This approach makes the most sense for internal-only applications, where requiring all requests to such an application to originate from within the corporate network will once again severely reduce the ability of any random person on the wider internet to access restricted resources or exploit web application vulnerabilities.
Bear in mind that it’s not just human users who can use a VPN. You can also issue credentials to your web applications, providing a secure tunnel between internal systems.
Bear in mind, too, that vulnerabilities in VPN concentrators have been exploited in high-profile and damaging incidents, so be sure that your choice of VPN improves rather than weakens your security posture.
Know the full scope of your API surface via self-documenting REST endpoints.
These days just about every web application development framework worth using supports producing an OpenAPI specification of your REST endpoints automatically. This functionality is either built into the framework or made possible through the use of an extension, plugin or third-party library.
Such a specification has a multitude of uses, one of which is that it documents the authentication requirements in place on those endpoints. Generating the specification does not in-and-of-itself secure your application, but it provides a semi-structured document to inspect.
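Because the specification is structured, that inspection can be automated. The sketch below assumes an OpenAPI 3 document already loaded into a Python dictionary (for instance with `json.load`) and lists every operation that declares no security requirement. The function name is invented for this example, and the result only reflects what the specification declares, which may differ from what the running application enforces.

```python
from typing import Any, Dict, List

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def unauthenticated_operations(spec: Dict[str, Any]) -> List[str]:
    """List operations in an OpenAPI 3 document that permit anonymous access."""
    default_security = spec.get("security", [])
    open_operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            # An operation-level "security" overrides the top-level default.
            security = operation.get("security", default_security)
            # An empty list, or an empty requirement {}, means no auth needed.
            if not security or any(req == {} for req in security):
                open_operations.append(f"{method.upper()} {path}")
    return open_operations
```

Comparing this list against a short, reviewed list of endpoints that are meant to be public turns the specification into a regression check: any new entry is either a deliberate decision or a mistake.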
Train your developers to consider an attacker’s perspective. Write your code accordingly.
Many developers are self-taught, and even those with university degrees relevant to software development often receive little or no instruction in application security. This can leave security as something of an “unknown-unknown” for developers.
Unknown-unknowns are the most problematic unknowns because people are unaware there is a problem they need to address until some hacker (un)helpfully points it out.
A little security training can go a long way here. While you should seek to tailor the training provided to the technical stack used, the bigger benefit of training for information security in software development is in making security something that developers consider much more. Going through training also helps make developers aware of gaps in their knowledge and understanding.
Even a small amount of training can make developers much more cognisant of bad ideas (such as trying to create their own cryptography library) and shifts security matters into the “known-unknowns” category — meaning that developers will now understand there is probably an issue and can seek assistance accordingly.
Get some skilled, fresh eyes on your application to find problems before attackers do.
Getting a penetration test done isn’t cheap, and is probably only useful once you have a reasonably stable system and API in place. It can be well worth the money, though.
Skilled penetration testers are specialists in finding ways to do things they aren’t meant to be able to do. They know where typical weaknesses and gaps are and will check those as part of their normal process. It’s highly likely that a good red team will find open endpoints leaking sensitive information, assuming that the endpoint wasn’t somehow excluded from the scope of the engagement.
Penetration testing might have prevented the Optus data breach.
As an aside, make sure you address not just the specific problems the testers’ report identifies, but also the underlying causes of those problems. Otherwise, you might find that a skilled attacker can make small tweaks to their process to get their attack working again. Penetration testers sometimes give public talks about their frustration at clients not responding appropriately to their reports. Don’t be one of those clients.
Never mix data from different environments, ESPECIALLY production.
After the drafting of this blog post began, further (vague) suggestions emerged that the exposed Optus API endpoint was, in fact, on a test system connected to the production database(s).
The thinking goes that, because it was “only a test environment”, developers probably didn’t see a need to worry about its security. Which might be appropriate in the right context. Clearly, this was not that context.
Any environment connected to production data sources is a production environment, whether it is intended to be or not.
What is the appropriate context not to worry about security? It’s a relatively rare one. There must be no sensitive data whatsoever involved. The environment must not be accessible to the outside world. There must be no ability for changes and updates made inside the environment to propagate out of it.
Nobody should ever be able to think “I’ll just do that in the test environment since it’ll be quicker to see the result of it than going through the rigmarole to go via prod”. If anybody not directly involved in the testing workflow would notice the environment disappearing entirely, it’s not a testing environment. If it’s not a testing environment, all normal security measures are mandatory. If in doubt, play it safe, and treat it like a production environment.
So there you have it. A range of behaviours, procedures and technical measures you can implement to give your web application the best chance of not exposing itself inappropriately to the internet at large.
If for some strange reason you can only choose one of these then you should probably look to the deny-by-default approach.
It can stop many low-sophistication attacks by itself, and help to make more sophisticated attacks more difficult to carry out. Ideally, this would be your starting point, not your finish line, though.
If you are struggling with any of the measures listed here, Cosive can help.
Some should be as simple as reading the documentation for your web application framework and toggling some configuration settings, but others are more involved, requiring extra development work, standing up and integrating new infrastructure, or hiring external experts.
The toughest nuts to crack will fundamentally require a change to your thinking and culture on software development practices – we can help with that too.
We offer secure development services working alongside your teams, a variety of training sessions, secure code reviews, penetration testing and general security consulting. Between us, we have experience with many mainstream technical stacks (and some more unusual ones, too). We even have experience running an internal VPN through our own Bastion host in the cloud, if that is something you are keen on setting up.
Written by James Cooper. Reviewed by Chris Horsley and Sid Odgers.