Monday, February 19, 2024

Mobile devices and security

Generally, passwords are a better form of security than biometrics, for a few well-known reasons: passwords can be changed, cannot be clandestinely observed, are harder to fake, and cannot be taken from someone unwillingly (eg: via government force, although one could quibble about extortion as a workaround there). A good password, used to access a well-designed secure system, is probably the strongest known single factor for secure access at present (with multi-factor authentication which includes a password as the "gold standard").

Unfortunately, entering complex passwords is generally arduous and tedious, and doubly so on mobile devices. And yet, I tend to prefer using a mobile device for accessing most secure sites and systems, with that preference generally only increasing as the nominal security requirements increase. That seems counter-intuitive at first glance, but in this case the devil is in the details.

I value "smart security"; that is, security deployed in a way which increases protection while minimizing the negative impact on the user experience, and where the friction added is proportional to the value of the data being protected. For example, I use complex and unique passwords for sites which store data I consider valuable (financial institutions, sensitive PII aggregation sites, etc.), and I tend to re-use passwords on sites which either don't hold valuable information, or where I believe the security practices to be suspect (eg: if they do something which demonstrates a fundamental ignorance of security, such as requiring secondary passwords based on easily knowable data, aka "security questions"). I don't mind entering my complex passwords when the entry is used judiciously, to guard sensitive actions, and the app/site is otherwise respectful of the potential annoyance factor.

Conversely, I get aggravated with apps and sites which do stupid things which do nothing to raise the bar for security, but constantly annoy users with security checks and policies. Things like time-based password expiration, time-based authentication expiration (especially with short timeouts), repeated password entry (which trains users to type in passwords without thinking about the context), authentication workflows where the data flow is not easily discernible (looking at most OAuth implementations here), etc. demonstrate either an ignorance of what constitutes "net good" security, or a contempt for the user experience, or both. These types of apps and sites are degrading the security experience, and ultimately negatively impacting security for everyone.

Mobile OSes mitigate this somewhat by providing built-in mechanisms to downgrade authentication from password to biometrics in many cases, and thus help compensate for the often miserable user experience propagated by the "security stupid" apps and sites. By caching passwords on the device, and allowing biometric authentication to populate them into forms, mobile devices "downgrade" the app/site security to a single factor (ie: the device), but generally upgrade the user experience (because although biometrics are less secure, they are generally "easy"). Thus, by using a mobile device to access an app/site with poor fundamental security design, the downsides can be largely mitigated, at the expense of nominal security in general. This is a trade-off I'm generally willing to make, and I suspect I'm not alone in this regard.
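The trade described above can be modeled concretely. The sketch below is purely illustrative (the class and method names are my own invention, not any real OS API): the app still receives a strong unique password, but access to that password is gated by a device unlock, so the effective factor becomes "possession of the unlocked device".

```python
from typing import Optional

# Illustrative model of a device password vault; all names are hypothetical.
class DeviceVault:
    """Caches per-site passwords, released only after a device unlock."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self._unlocked = False

    def save(self, site: str, password: str) -> None:
        self._store[site] = password

    def unlock_with_biometric(self, biometric_ok: bool) -> None:
        # The biometric check happens on-device; the site never sees it.
        self._unlocked = biometric_ok

    def autofill(self, site: str) -> Optional[str]:
        # The strong password is released only while the vault is unlocked,
        # so the site's nominal security is "downgraded" to the device factor.
        if self._unlocked:
            return self._store.get(site)
        return None

vault = DeviceVault()
vault.save("bank.example", "correct-horse-battery-staple-9!")
assert vault.autofill("bank.example") is None   # locked: nothing is released
vault.unlock_with_biometric(True)
assert vault.autofill("bank.example") is not None
```

The point of the model is that the site's password complexity requirements are satisfied without the user ever typing the password, which is exactly the usability-for-nominal-security trade discussed above.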

The ideal, of course, would be to raise the bar for security design in apps and sites generally, such that security was based on risk criteria and heuristics, and not (for example) on arbitrary time-based re-auth checks. Unfortunately, though, there are many dumb organizations in the world, and many of these decisions are ultimately made by people who are unable or unwilling to consider the net security impact of their bad policies, or who are blocked from building better systems. Most organizations today are "dumb" in this respect, and this is compounded by standards which mandate a level of nominal security (eg: time-based authentication expiration) which makes "good" security effectively impossible, even for otherwise knowledgeable organizations. Thus, people will continue to downgrade the nominal security in the world to mitigate these bad policy decisions, with the tacit acceptance from the industry that this is the best we can do within the limitations imposed by the business reality of decision making.
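To make "risk criteria and heuristics" concrete, here is a minimal sketch of what risk-based re-authentication could look like, in contrast to a blanket time-based expiry. All the signal names, weights, and the threshold are illustrative assumptions, not any real system's policy; the structural point is that time since last auth is one weak signal among many, not a hard cutoff.

```python
from dataclasses import dataclass

@dataclass
class SessionContext:
    known_device: bool       # device previously used by this account
    usual_location: bool     # request location matches recent history
    sensitive_action: bool   # eg: changing payout details
    minutes_since_auth: int  # recency of last full authentication

def risk_score(ctx: SessionContext) -> float:
    """Aggregate simple risk signals into a score in [0, 1]."""
    score = 0.0
    if not ctx.known_device:
        score += 0.4
    if not ctx.usual_location:
        score += 0.2
    if ctx.sensitive_action:
        score += 0.3
    # Staleness contributes at most 0.1 (saturating over ~30 days),
    # rather than acting as an absolute expiry.
    score += min(ctx.minutes_since_auth / (60 * 24 * 30), 0.1)
    return min(score, 1.0)

def requires_step_up(ctx: SessionContext, threshold: float = 0.5) -> bool:
    """Demand re-authentication only when aggregate risk crosses a bar."""
    return risk_score(ctx) >= threshold

# A routine read on a known device stays frictionless, even days later...
routine = SessionContext(True, True, False, minutes_since_auth=60 * 24 * 10)
# ...while a sensitive action from an unknown device triggers step-up auth.
risky = SessionContext(False, True, True, minutes_since_auth=5)
assert not requires_step_up(routine)
assert requires_step_up(risky)
```

The design choice worth noting: the low-friction path is the default, and friction is spent only where the signals say it buys real protection, which is the "smart security" proportionality argued for above.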

It's a messy world; we just do the best we can within it.


Sunday, February 18, 2024

The Genius of FB's Motto

Why "Move Fast and Break Things" is insightful, and how many companies still don't get it

Note: I have never worked for FB/Meta (got an offer once, but ended up going to Amazon instead), so I don't have any specific insight. I'm sure there are books, interviews, etc., but the following is my take. I like to think I might have some indirect insight, since the mantra was purportedly based on observing what made startups successful, and I've had some experience with that. See: https://en.wikipedia.org/wiki/Meta_Platforms#History

If you look inside a lot of larger companies, you'll find a lot of process, a lot of meetings, substantial overhead to getting anything off the ground, and a general top-down organizational directive to "not break anything" and "do everything possible to make sure nothing has bugs". I think this stems from how typical management addresses problems in general: if something breaks, it's seen as a failure or deficiency in the process [of producing products and services], and it can and should be addressed by improving the "process". This philosophy leads to the above, but it's not the only factor. For example, over time critical people move on, and that can lead to systems which everyone is afraid to touch for fear of "breaking something" (which, per the organizational directives, is the worst thing you can do). These factors create an environment of fear, where carefully following "the process" is an individual's shield against blame when something goes wrong. After all, deficiencies in the process are not anyone's fault, and as long as the process is continually improved, the products will continue to get better and have fewer deficiencies over time. That aggregate way of thinking is really what leads to the state described.

I describe that not to be overly critical: for many people in those organizations, this is an unequivocal good thing. Managers love process: it's measurable, it has metrics and dashboards, you can do schedule-based product planning with regular releases, you can objectively measure success against KPIs, etc. It can also be good for ICs, especially those who aspire to a steady and predictable job, where they follow and optimize their work product for the process (which is usually much easier than optimizing for actual product success in a market, for example). Executives love metrics and predictable schedules, managers love process, and it's far easier to hire and retain "line workers" than creatives, especially passionate ones. As long as the theory holds (ie: that optimal process leads to optimal business results), this strategy is perceived as optimal for many larger organizations.

It's also, incidentally, why smaller companies can crush larger established companies in markets. The tech boom proved this out, and some people noticed. Hence, Facebook's so-called hacker mentality was enshrined.

"Move fast" is generally more straightforward for people to grasp: the idea is to bias toward action rather than talking about something, particularly when the cost of trying and failing is low (this is related to the "fail fast" mantra). For software development, this tends to mean there's significantly less value in doing a complex design than in building a prototype: the former takes a lot of work and can diverge significantly from the finished product, while the latter provides real knowledge and lessons with less overall inefficiency. "Move fast" also encapsulates the idea that you want engineers to be empowered to fix things directly, rather than go through layers of approvals and process (eg: Jira) to reach a better incremental product state. Most companies have some corporate value which aligns with this concept.

"Break things" is more controversial; here's my take. This is a direct rebuke of the "put process and gating in place to prevent bugs" philosophy, which otherwise negates the ability to "move fast". Moreover, though, this is also an open invitation to risk product instability in the name of general improvement. It is an acknowledgement that development velocity is fundamentally more valuable to an organization than the pursuit of "perfection". It is also an acknowledgement of the fundamental business risk of having product infrastructure which nobody is willing to touch (for fear of breaking it), and "cover" to try to make it better, even at the expense of stability. It is the knowing acceptance that to create something better, it can be necessary to rebuild that thing, and in the process new bugs might be introduced, and that's okay.

It's genius to put that in writing, even though it might be obvious in terms of the end goal: it's basically an insight and acknowledgement that developer velocity wins, and then a codification of the principles which are fundamentally necessary to optimize for developer velocity. It's hard to overstate how valuable that insight was and continues to be in the industry.

Why the mantra evolved to add "with stable infrastructure"

I think this evolution makes sense, as an acknowledgement of a few additional things, all very relevant to a larger company (ie: one which has grown past the "build to survive" phase, and into the "also maintain your products" phase):

  • You need your products to continue to function in the market, at least in terms of "core" functionality
  • You need your internal platforms to function, otherwise you cannot maintain internal velocity
  • You want stable foundations to build on, to sustain (or continue to increase) velocity as you expand product scope

I think the first two are obvious, so let me focus on the third point as it pertains to development. Scaling development resources linearly with code size doesn't work well, because of the overhead of product maintenance and inter-person communication. Generally you want to raise the level of abstraction involved in producing and maintaining functionality, such that you can "do more with less". However, this is not generally possible unless you have reliable "infrastructure" (at the code level) to build on top of, with high confidence that the resulting product code will be robust (at least insofar as it relies on that infrastructure). This, fundamentally, allows scaling development resources linearly with product functionality (not code size), which is a much more attainable goal.
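One way to see the communication-overhead part of that argument numerically: the number of potential pairwise coordination paths in a team grows quadratically with headcount (this is the classic n-choose-2 observation). A toy calculation, purely for illustration:

```python
def coordination_paths(n_people: int) -> int:
    """Potential pairwise communication channels in a team of n people."""
    return n_people * (n_people - 1) // 2

# Doubling the team far more than doubles the coordination surface,
# which is why linear headcount scaling hits diminishing returns.
for n in (5, 10, 20, 40):
    print(f"{n:>3} people -> {coordination_paths(n):>4} channels")
```

Stable shared infrastructure attacks exactly this: teams coordinate against a reliable interface instead of against each other, so product functionality can grow without the coordination surface growing with it.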

Most successful companies get to this point in their evolution (ie: where they would otherwise see diminishing returns from internal resource scaling due to overhead). The smart ones recognize the problem, and shift to building stable infrastructure as a priority (while still moving fast and breaking things, generally), so as to be able to continue to scale product value efficiently. The ones with less insightful leadership end up churning with rewrites and/or a lack of code reusability, scramble to fix compounding bugs, struggle with code duplication and legacy tech debt, etc. This continues to be a challenge for even many otherwise good companies, and the genius of FB/Meta (imho) is recognizing this and trying to enshrine the right approach into their culture.

That's my take, anyway, fwiw.