
Assorted links for Friday, July 5:
Assorted links for Thursday, July 4:
Assorted links for Wednesday, July 3:
SLICK can help us locate metric and performance data regarding the reliability of a specific service just by knowing its name. It does this by building an index of onboarded services that link to dashboards with standard visualizations to analyze and assess the service reliability. So, with a single click, it becomes possible to know whether a service currently meets or doesn’t meet user expectations. We can then start asking why.
Assorted links for Tuesday, July 2:
Key DevSecOps metrics:
- Number of security vulnerabilities over time
- Compliance with security policies
According to the RMIT researchers, “Brick kilns worldwide consume 375 million tons (~340 million metric tons) of coal in combustion annually, which is equivalent to 675 million tons of CO2 emissions (~612 million metric tons).” This exceeds the combined annual carbon dioxide emissions of 130 million passenger vehicles in the US.
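(A quick sanity check on that comparison, using the EPA rule of thumb of roughly 4.6 metric tons of CO2 per typical US passenger vehicle per year: 130 million vehicles × 4.6 t ≈ 600 million metric tons, just under the ~612 million metric tons attributed to brick kilns.)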
In the new paper, titled “Scalable MatMul-free Language Modeling,” the researchers describe creating a custom 2.7 billion parameter model without using MatMul ([matrix multiplication]) that features similar performance to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU that was accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU’s power draw). The implication is that a more efficient FPGA “paves the way for the development of more efficient and hardware-friendly architectures,” they write.
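The core trick, as I read the paper, is constraining weights to the ternary set {-1, 0, +1}, so the multiply half of every multiply-accumulate disappears. A toy sketch of a MatMul-free dense layer (names and layout are mine, not the paper's):

#include <cstddef>
#include <cstdint>
#include <vector>

// Toy MatMul-free dense layer: with weights restricted to {-1, 0, +1},
// every output element is a signed sum of inputs, so no multiplications
// are performed at all.
std::vector<float> ternary_matvec(const std::vector<int8_t>& W,  // rows*cols ternary weights
                                  const std::vector<float>& x,
                                  std::size_t rows, std::size_t cols) {
    std::vector<float> y(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c) {
            int8_t w = W[r * cols + c];
            if (w == 1)       y[r] += x[c];
            else if (w == -1) y[r] -= x[c];
            // w == 0 contributes nothing
        }
    return y;
}

The paper builds a full architecture out of components along these lines; the sketch only shows why ternary weights turn a matrix product into signed sums.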
We implemented a concurrency limiter within PlayAPI that prioritizes user-initiated requests over prefetch requests without physically sharding the two request handlers. This mechanism uses the partitioning functionality of the open source Netflix/concurrency-limits Java library.
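The library itself is Java (Netflix/concurrency-limits); a sketch of the partitioning idea, one shared budget with a slice reserved for the higher-priority class, could look like this (all names and numbers are mine, not the library's API, and the real library's limits are adaptive rather than static):

#include <mutex>

// Sketch only: one shared concurrency budget with a slice reserved for
// the higher-priority partition, so prefetch can never starve
// user-initiated traffic.
class PartitionedLimiter {
public:
    PartitionedLimiter(int limit, double userShare)
        : limit_(limit),
          prefetchCap_(limit - static_cast<int>(limit * userShare)) {}

    // Returns true if the request may proceed; pair with release().
    bool tryAcquire(bool userInitiated) {
        std::lock_guard<std::mutex> g(mu_);
        if (userInflight_ + prefetchInflight_ >= limit_) return false;
        if (!userInitiated && prefetchInflight_ >= prefetchCap_) return false;
        (userInitiated ? userInflight_ : prefetchInflight_)++;
        return true;
    }

    void release(bool userInitiated) {
        std::lock_guard<std::mutex> g(mu_);
        (userInitiated ? userInflight_ : prefetchInflight_)--;
    }

private:
    std::mutex mu_;
    const int limit_;
    const int prefetchCap_;  // most slots prefetch may ever occupy
    int userInflight_ = 0;
    int prefetchInflight_ = 0;
};

With a limit of 100 and a 70% user share, prefetch may occupy at most 30 slots, while user-initiated requests can always claim at least 70.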
Assorted links for Monday, July 1:
Most engineers reach for atomic operations in an attempt to produce some lock-free mechanism. Furthermore, programmers enjoy the intellectual puzzle of using atomic operations. Both of these lead to clever implementations which are almost always ill-advised and often incorrect.
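A hedged illustration of the failure mode being described (my example, not the author's): each operation below is individually atomic, but the check-then-act sequence is not, so the "lock-free" version on top is broken.

#include <atomic>

std::atomic<int> balance{100};

// Broken: each load and store is individually atomic, but the
// check-then-act sequence is not. Two threads can both see
// balance >= amount and both withdraw, driving the balance negative.
bool withdraw_broken(int amount) {
    if (balance.load() >= amount) {
        balance.store(balance.load() - amount);  // lost-update race
        return true;
    }
    return false;
}

// Correct: a compare-and-swap loop commits the debit only if the
// balance is still the value that was checked, retrying otherwise.
bool withdraw(int amount) {
    int cur = balance.load();
    while (cur >= amount) {
        if (balance.compare_exchange_weak(cur, cur - amount))
            return true;  // on failure, cur is refreshed and we retry
    }
    return false;
}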
sched_ext allows you to write and run your custom process scheduler optimized for your target workloads and hardware architectures using BPF programs.
We’ve streamlined our investigations through a combination of heuristic-based retrieval and large language model (LLM)-based ranking to provide AI-assisted root cause analysis. During backtesting, this system achieved promising results: 42% accuracy in identifying root causes, at the time an investigation is created, for issues related to our web monorepo.
Assorted links for Friday, June 28:
IncludeOS is a minimal unikernel operating system for C++ services running in the cloud and on real hardware. Starting a program with
#include <os>
will include a tiny operating system into your service during link-time.
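A minimal service might look like the following sketch (recent IncludeOS releases accept a plain main() as the entry point, while older ones used a Service::start() hook, so treat this as approximate):

#include <os>
#include <iostream>

// Linking this against IncludeOS yields a bootable image rather than an
// ordinary executable: the operating system comes in at link time.
int main() {
    std::cout << "Service booted, OS included\n";
}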
Assorted links for Thursday, June 27:
Assorted links for Wednesday, June 26:
In this section, we generalize the techniques we developed for binary search to static B-trees and accelerate them further using SIMD instructions. In particular, we develop two new implicit data structures:
- The first is based on the memory layout of a B-tree, and, depending on the array size, it is up to 8x faster than std::lower_bound while using the same space as the array and only requiring a permutation of its elements.
- The second is based on the memory layout of a B+ tree, and it is up to 15x faster than std::lower_bound while using just 6-7% more memory — or 6-7% of the memory if we can keep the original sorted array.
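The per-node search is where SIMD pays off. A sketch of the rank computation (assuming 16 sorted int32 keys per node padded with INT_MAX, AVX2, and compilation with -mavx2; the article's real implementation layers prefetching and layout tricks on top):

#include <immintrin.h>
#include <cstdint>

// Count how many keys in the node are < x. Because the keys are sorted,
// this is both the lower_bound position within the node and the index
// of the child to descend into.
int node_rank(const int32_t* keys, int32_t x) {
    __m256i vx = _mm256_set1_epi32(x);
    __m256i lo = _mm256_loadu_si256((const __m256i*)keys);
    __m256i hi = _mm256_loadu_si256((const __m256i*)(keys + 8));
    // lanes become all-ones where x > key, i.e. key < x
    __m256i lt_lo = _mm256_cmpgt_epi32(vx, lo);
    __m256i lt_hi = _mm256_cmpgt_epi32(vx, hi);
    unsigned mask = _mm256_movemask_ps(_mm256_castsi256_ps(lt_lo))
                  | (_mm256_movemask_ps(_mm256_castsi256_ps(lt_hi)) << 8);
    return __builtin_popcount(mask);
}

Since the node is sorted, the comparison mask is a contiguous run of ones, so a single popcount recovers the rank without any branching.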
Assorted links for Tuesday, June 25:
Pluvicto and Lutathera are both built around small protein sequences, known as peptides. These peptides specifically bind to target receptors on cancer cells—PSMA in the case of prostate cancer (Pluvicto) and somatostatin receptors in the case of neuroendocrine tumors (Lutathera)—and deliver radiation through the decay of unstable lutetium.
Administered via infusion into the bloodstream, these drugs circulate throughout the body until they firmly attach to the surfaces of tumor cells they encounter. Anchored at these target sites, the lutetium isotope then releases two types of radiation that aid in cancer treatment. The primary emission consists of beta particles, high-energy electrons capable of penetrating tumors and surrounding cells, tearing into DNA and causing damage that ultimately triggers cell death.
Back in 2019, after various speculation-based CPU vulnerabilities began coming to light, Amazon engineers proposed process-local memory allocations for hiding KVM secrets. They were striving for an alternative mitigation for vulnerabilities like L1TF, essentially by providing memory regions for kernel allocations that are out of the view, and reach, of other kernel code. This week, after five years of ongoing Linux kernel improvements, Amazon engineers laid out a new proposal for MM-local memory allocations to deal with current and future speculation-based cross-process attacks.
Assorted links for Monday, June 24:
We introduce a novel framework, Video Annotator (VA), which leverages active learning techniques and zero-shot capabilities of large vision-language models to guide users to focus their efforts on progressively harder examples, enhancing the model’s sample efficiency and keeping costs low.
VA seamlessly integrates model building into the data annotation process, facilitating user validation of the model before deployment, thereby helping to build trust and foster a sense of ownership. VA also supports a continuous annotation process, allowing users to rapidly deploy models, monitor their quality in production, and swiftly fix any edge cases by annotating a few more examples and deploying a new model version.
Parameter vulnerability factor (PVF) is a novel metric we’ve introduced with the aim of standardizing the quantification of AI model vulnerability to parameter corruptions.
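Concretely, PVF can be estimated by fault injection; in my notation rather than the paper's:

\mathrm{PVF} \approx N_{\mathrm{incorrect}} / N_{\mathrm{injections}}

where N_injections counts inference runs with a random bit flip injected into the parameter (or parameter group) under study, and N_incorrect counts the runs whose output deviates from the fault-free output.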
…[T]he researchers focus on what they call semantic entropy. This considers all the statistically likely answers generated by the LLM and determines how many of them are semantically equivalent. If a large number all have the same meaning, then the LLM is likely uncertain about phrasing but has the right answer. If not, then it is presumably in a situation where it would be prone to confabulation and should be prevented from doing so.
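In symbols (my notation for the general recipe, not necessarily the paper's exact estimator): sample several answers a to a prompt x, cluster them into semantic-equivalence classes C_k, and compute the entropy over clusters rather than over strings:

\mathrm{SE}(x) = -\sum_k p(C_k \mid x) \log p(C_k \mid x), \qquad p(C_k \mid x) = \sum_{a \in C_k} p(a \mid x)

Low semantic entropy means the samples agree in meaning even if the wording differs; high semantic entropy is the signal that the model is likely to confabulate.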