
Assorted links for Friday, July 5:
Assorted links for Thursday, July 4:
Assorted links for Wednesday, July 3:
SLICK can help us locate metric and performance data regarding the reliability of a specific service just by knowing its name. It does this by building an index of onboarded services that link to dashboards with standard visualizations to analyze and assess the service reliability. So, with a single click, it becomes possible to know whether a service currently meets or doesn’t meet user expectations. We can then start asking why.
Assorted links for Tuesday, July 2:
Key DevSecOps metrics:
- Number of security vulnerabilities over time
- Compliance with security policies
According to the RMIT researchers, “Brick kilns worldwide consume 375 million tons (~340 million metric tons) of coal in combustion annually, which is equivalent to 675 million tons of CO2 emissions (~612 million metric tons).” This exceeds the combined annual carbon dioxide emissions of 130 million passenger vehicles in the US.
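(A quick sanity check on that comparison, using the EPA rule of thumb of roughly 4.6 metric tons of CO2 per typical US passenger vehicle per year: 130 million vehicles × 4.6 t ≈ 600 million metric tons, just under the ~612 million metric tons attributed to brick kilns.)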
In the new paper, titled “Scalable MatMul-free Language Modeling,” the researchers describe creating a custom 2.7 billion parameter model without using MatMul ([matrix multiplication]) that features similar performance to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU that was accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU’s power draw). The implication is that a more efficient FPGA “paves the way for the development of more efficient and hardware-friendly architectures,” they write.
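The core trick, as I read the paper, is constraining weights to the ternary set {-1, 0, +1}, so the multiply half of every multiply-accumulate disappears. A toy sketch of a MatMul-free dense layer (names and layout are mine, not the paper's):

#include <cstddef>
#include <cstdint>
#include <vector>

// Toy MatMul-free dense layer: with weights restricted to {-1, 0, +1},
// every output element is a signed sum of inputs, so no multiplications
// are performed at all.
std::vector<float> ternary_matvec(const std::vector<int8_t>& W,  // rows*cols ternary weights
                                  const std::vector<float>& x,
                                  std::size_t rows, std::size_t cols) {
    std::vector<float> y(rows, 0.0f);
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c) {
            int8_t w = W[r * cols + c];
            if (w == 1)       y[r] += x[c];
            else if (w == -1) y[r] -= x[c];
            // w == 0 contributes nothing
        }
    return y;
}

The paper builds a full architecture out of components along these lines; the sketch only shows why ternary weights turn a matrix product into signed sums.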
We implemented a concurrency limiter within PlayAPI that prioritizes user-initiated requests over prefetch requests without physically sharding the two request handlers. This mechanism uses the partitioning functionality of the open source Netflix/concurrency-limits Java library.
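The library itself is Java (Netflix/concurrency-limits); a sketch of the partitioning idea, one shared budget with a slice reserved for the higher-priority class, could look like this (all names and numbers are mine, not the library's API, and the real library's limits are adaptive rather than static):

#include <mutex>

// Sketch only: one shared concurrency budget with a slice reserved for
// the higher-priority partition, so prefetch can never starve
// user-initiated traffic.
class PartitionedLimiter {
public:
    PartitionedLimiter(int limit, double userShare)
        : limit_(limit),
          prefetchCap_(limit - static_cast<int>(limit * userShare)) {}

    // Returns true if the request may proceed; pair with release().
    bool tryAcquire(bool userInitiated) {
        std::lock_guard<std::mutex> g(mu_);
        if (userInflight_ + prefetchInflight_ >= limit_) return false;
        if (!userInitiated && prefetchInflight_ >= prefetchCap_) return false;
        (userInitiated ? userInflight_ : prefetchInflight_)++;
        return true;
    }

    void release(bool userInitiated) {
        std::lock_guard<std::mutex> g(mu_);
        (userInitiated ? userInflight_ : prefetchInflight_)--;
    }

private:
    std::mutex mu_;
    const int limit_;
    const int prefetchCap_;  // most slots prefetch may ever occupy
    int userInflight_ = 0;
    int prefetchInflight_ = 0;
};

With a limit of 100 and a 70% user share, prefetch may occupy at most 30 slots, while user-initiated requests can always claim at least 70.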
Assorted links for Monday, July 1:
Most engineers reach for atomic operations in an attempt to produce some lock-free mechanism. Furthermore, programmers enjoy the intellectual puzzle of using atomic operations. Both of these lead to clever implementations which are almost always ill-advised and often incorrect.
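A hedged illustration of the failure mode being described (my example, not the author's): each operation below is individually atomic, but the check-then-act sequence is not, so the "lock-free" version on top is broken.

#include <atomic>

std::atomic<int> balance{100};

// Broken: each load and store is individually atomic, but the
// check-then-act sequence is not. Two threads can both see
// balance >= amount and both withdraw, driving the balance negative.
bool withdraw_broken(int amount) {
    if (balance.load() >= amount) {
        balance.store(balance.load() - amount);  // lost-update race
        return true;
    }
    return false;
}

// Correct: a compare-and-swap loop commits the debit only if the
// balance is still the value that was checked, retrying otherwise.
bool withdraw(int amount) {
    int cur = balance.load();
    while (cur >= amount) {
        if (balance.compare_exchange_weak(cur, cur - amount))
            return true;  // on failure, cur is refreshed and we retry
    }
    return false;
}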
sched_ext allows you to write and run your custom process scheduler optimized for your target workloads and hardware architectures using BPF programs.
We’ve streamlined our investigations through a combination of heuristic-based retrieval and large language model (LLM)-based ranking to provide AI-assisted root cause analysis. During backtesting, this system achieved promising results: 42% accuracy in identifying root causes, at the time an investigation is created, for issues related to our web monorepo.
Assorted links for Friday, June 28:
IncludeOS is a minimal unikernel operating system for C++ services running in the cloud and on real hardware. Starting a program with
#include <os>
will include a tiny operating system into your service during link-time.
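A minimal service might look like the following sketch (recent IncludeOS releases accept a plain main() as the entry point, while older ones used a Service::start() hook, so treat this as approximate):

#include <os>
#include <iostream>

// Linking this against IncludeOS yields a bootable image rather than an
// ordinary executable: the operating system comes in at link time.
int main() {
    std::cout << "Service booted, OS included\n";
}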
Assorted links for Thursday, June 27:
Assorted links for Wednesday, June 26:
In this section, we generalize the techniques we developed for binary search to static B-trees and accelerate them further using SIMD instructions. In particular, we develop two new implicit data structures:
- The first is based on the memory layout of a B-tree, and, depending on the array size, it is up to 8x faster than std::lower_bound while using the same space as the array and only requiring a permutation of its elements.
- The second is based on the memory layout of a B+ tree, and it is up to 15x faster than std::lower_bound while using just 6-7% more memory — or 6-7% of the memory if we can keep the original sorted array.
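The per-node search is where SIMD pays off. A sketch of the rank computation (assuming 16 sorted int32 keys per node padded with INT_MAX, AVX2, and compilation with -mavx2; the article's real implementation layers prefetching and layout tricks on top):

#include <immintrin.h>
#include <cstdint>

// Count how many keys in the node are < x. Because the keys are sorted,
// this is both the lower_bound position within the node and the index
// of the child to descend into.
int node_rank(const int32_t* keys, int32_t x) {
    __m256i vx = _mm256_set1_epi32(x);
    __m256i lo = _mm256_loadu_si256((const __m256i*)keys);
    __m256i hi = _mm256_loadu_si256((const __m256i*)(keys + 8));
    // lanes become all-ones where x > key, i.e. key < x
    __m256i lt_lo = _mm256_cmpgt_epi32(vx, lo);
    __m256i lt_hi = _mm256_cmpgt_epi32(vx, hi);
    unsigned mask = _mm256_movemask_ps(_mm256_castsi256_ps(lt_lo))
                  | (_mm256_movemask_ps(_mm256_castsi256_ps(lt_hi)) << 8);
    return __builtin_popcount(mask);
}

Since the node is sorted, the comparison mask is a contiguous run of ones, so a single popcount recovers the rank without any branching.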
Assorted links for Tuesday, June 25:
Pluvicto and Lutathera are both built around small protein sequences, known as peptides. These peptides specifically bind to target receptors on cancer cells—PSMA in the case of prostate cancer (Pluvicto) and somatostatin receptors in the case of neuroendocrine tumors (Lutathera)—and deliver radiation through the decay of unstable lutetium.
Administered via infusion into the bloodstream, these drugs circulate throughout the body until they firmly attach to the surfaces of tumor cells they encounter. Anchored at these target sites, the lutetium isotope then releases two types of radiation that aid in cancer treatment. The primary emission consists of beta particles, high-energy electrons capable of penetrating tumors and surrounding cells, tearing into DNA and causing damage that ultimately triggers cell death.
Back in 2019, after various speculation-based CPU vulnerabilities began coming to light, Amazon engineers proposed process-local memory allocations for hiding KVM secrets. They were striving for an alternative mitigation for vulnerabilities like L1TF, essentially by providing memory regions for kernel allocations that are out of the view, and reach, of other kernel code. This week, after five years of ongoing Linux kernel improvements, Amazon engineers laid out a new proposal for MM-local memory allocations to deal with current and future speculation-based cross-process attacks.
Assorted links for Monday, June 24:
We introduce a novel framework, Video Annotator (VA), which leverages active learning techniques and zero-shot capabilities of large vision-language models to guide users to focus their efforts on progressively harder examples, enhancing the model’s sample efficiency and keeping costs low.
VA seamlessly integrates model building into the data annotation process, facilitating user validation of the model before deployment, thereby helping to build trust and foster a sense of ownership. VA also supports a continuous annotation process, allowing users to rapidly deploy models, monitor their quality in production, and swiftly fix any edge cases by annotating a few more examples and deploying a new model version.
Parameter vulnerability factor (PVF) is a novel metric we’ve introduced with the aim of standardizing the quantification of AI model vulnerability to parameter corruptions.
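Concretely, PVF can be estimated by fault injection; in my notation rather than the paper's:

\mathrm{PVF} \approx N_{\mathrm{incorrect}} / N_{\mathrm{injections}}

where N_injections counts inference runs with a random bit flip injected into the parameter (or parameter group) under study, and N_incorrect counts the runs whose output deviates from the fault-free output.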
…[T]he researchers focus on what they call semantic entropy. This considers all the statistically likely answers generated by the LLM and determines how many of them are semantically equivalent. If a large number all have the same meaning, then the LLM is likely uncertain about phrasing but has the right answer. If not, then it is presumably in a situation where it would be prone to confabulation and should be prevented from doing so.
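In symbols (my notation for the general recipe, not necessarily the paper's exact estimator): sample several answers a to a prompt x, cluster them into semantic-equivalence classes C_k, and compute the entropy over clusters rather than over strings:

\mathrm{SE}(x) = -\sum_k p(C_k \mid x) \log p(C_k \mid x), \qquad p(C_k \mid x) = \sum_{a \in C_k} p(a \mid x)

Low semantic entropy means the samples agree in meaning even if the wording differs; high semantic entropy is the signal that the model is likely to confabulate.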