We are excited to announce the General Availability (GA) of the native JSON data type and JSON
aggregates – JSON_OBJECTAGG & JSON_ARRAYAGG. You can use the JSON data type and JSON
aggregates to integrate and work with JSON documents more efficiently in the database.
This functionality is available in Azure SQL Database and in Azure SQL Managed Instance with
the Always-up-to-date update policy.
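As a rough sketch of the semantics (table and rows are made up for illustration): JSON_OBJECTAGG folds key/value rows into a single JSON object, and JSON_ARRAYAGG folds a column of values into a single JSON array. The Python below mimics that behavior outside the database:

```python
import json

# Hypothetical (key, value) rows a query might return.
rows = [("name", "Contoso"), ("tier", "premium"), ("seats", 50)]

# JSON_OBJECTAGG: key/value rows -> one JSON object.
obj = json.dumps(dict(rows))

# JSON_ARRAYAGG: a column of values -> one JSON array.
arr = json.dumps([value for _, value in rows])

print(obj)  # {"name": "Contoso", "tier": "premium", "seats": 50}
print(arr)  # ["Contoso", "premium", 50]
```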
We are excited to announce the addition of the Microsoft.Extensions.AI.Evaluation.Safety
package to the Microsoft.Extensions.AI.Evaluation libraries! This new package provides
evaluators that help you detect harmful or sensitive content — such as hate speech, violence,
copyrighted material, insecure code, and more — within AI-generated content in your
Intelligent Applications.
Azure Cosmos DB now includes Sharded DiskANN, a capability optimized for large-scale
multitenant apps that splits a DiskANN index into smaller, more performant pieces.
The Data API is a RESTful HTTPS interface that enables developers to interact with their
MongoDB data hosted in vCore-based Azure Cosmos DB directly from their applications—no
drivers or complex query logic required. It offers a simple, efficient way to connect your
MongoDB data with web apps, mobile apps, and tools like Power BI.
Today, we’re excited to announce the preview of Per Partition Automatic Failover (PPAF) for
Azure Cosmos DB, a significant improvement to our single-region write accounts that boosts
availability and resilience.
Managing and understanding large-scale data ecosystems is a significant challenge for many
organizations, requiring innovative solutions to efficiently safeguard user data. Meta’s
vast and diverse systems make it particularly challenging to comprehend the structure, meaning,
and context of its data at scale.
To address these challenges, we made substantial investments in advanced data understanding
technologies, as part of our Privacy Aware Infrastructure (PAI). Specifically, we have
adopted a “shift-left” approach, integrating data schematization and annotations early in
the product development process. We also created a universal privacy taxonomy, a
standardized framework providing a common semantic vocabulary for data privacy management
across Meta’s products that ensures quality data understanding and provides developers with
reusable and efficient compliance tooling.
Storing and querying text embeddings in a database might seem challenging, but with the right
schema design it’s not only possible, it’s powerful. Whether you’re building AI-powered
search, semantic filtering, or recommendation features, embeddings, and thus vectors, are now
a first-class data type. So how do you model them well inside a database like SQL Server and
Azure SQL?
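One common design, sketched here outside any database, is to store each embedding as a fixed-length float vector and rank rows by cosine similarity. All vectors below are toy values; real embeddings are 768- or 1536-dimensional model outputs:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings".
doc = [0.1, 0.3, 0.5, 0.1]
query = [0.1, 0.3, 0.5, 0.1]

print(cosine_similarity(doc, query))  # ~1.0 for identical vectors
```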
T-SQL Analyzer is a free, open-source, cross-platform command-line tool for identifying and
reporting anti-patterns and design issues in SQL Server T-SQL scripts.
Roughly speaking, the cost of a system scales with its (short-term) peak traffic, but for
most applications the value the system generates scales with the (long-term) average traffic.
The gap between “paying for peak” and “earning on average” is critical to understanding how the
economics of large-scale cloud systems differ from those of traditional single-tenant systems.
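A back-of-the-envelope illustration with made-up numbers, not figures from the article:

```python
# A single tenant must provision for its own peak but earns on its average.
peak_rps = 1000        # capacity you pay for
average_rps = 100      # traffic you earn on

utilization = average_rps / peak_rps
print(f"single-tenant utilization: {utilization:.0%}")  # 10%

# If 10 such tenants share a pool and their peaks mostly don't coincide,
# the pooled peak sits far below the sum of the individual peaks.
pooled_peak = 2500     # assumed: peaks are largely uncorrelated
pooled_average = 10 * average_rps
print(f"pooled utilization: {pooled_average / pooled_peak:.0%}")  # 40%
```

The shared pool pays for much less idle headroom per unit of earned traffic, which is the economic gap the excerpt describes.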
This is the story of how team members across NuGet, Visual Studio, and .NET embarked on a
journey to fully rewrite the NuGet Restore algorithm to achieve break-through scale and
performance. Written from the perspective of several team members, this entry provides a deep
dive into the internals of NuGet, as well as strategies to identify and address performance
issues. We hope that you enjoy it!
In a previous blog post, we described how Netflix uses eBPF to capture TCP flow logs at scale
for enhanced cloud network insights. In this post, we delve deeper into how Netflix solved a
core problem: accurately attributing flow IP addresses to workload identities.
In this post I first briefly describe the standard lifetimes available in the .NET DI
container. I then outline the three hypothetical lifetimes described in the podcast.
Finally, I show how you could implement one of these lifetimes in practice. In the next post
I show a possible implementation for the remaining lifetime.
In today’s data-driven world, delivering precise and contextually relevant search results is
critical. SQL Server and Azure SQL Database now enable this through Hybrid Search—a technique
that combines traditional full-text search with modern vector similarity search. This allows
developers to build intelligent, AI-powered search experiences directly inside the database
engine.
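One common way to combine the two result lists (an assumption for illustration, not a claim about the engine's internals) is Reciprocal Rank Fusion, which scores each document by the reciprocal of its rank in every list:

```python
def rrf(rankings, k=60):
    # Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank).
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical best-first result lists from each search mode.
fulltext_hits = ["d3", "d1", "d7"]
vector_hits = ["d1", "d4", "d3"]

print(rrf([fulltext_hits, vector_hits]))  # ['d1', 'd3', 'd4', 'd7']
```

Documents ranked well by both modes (here `d1` and `d3`) float to the top even though neither list alone put them in the same order.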
In this work, we significantly further the understanding of real-world cache workloads by
collecting production traces from 153 in-memory cache clusters at Twitter, sifting through
over 80 TB of data, and sometimes interpreting the workloads in the context of the business
logic behind them.
[Y]ou should assume your data is corrupt from the time a write is issued until after a flush or
force unit access write completes. However, most programs use system calls to write data. This
article looks at the guarantees provided by the Linux file APIs. It seems like this should be
simple: a program calls write() and after it completes, the data is durable. However,
write() only copies data from the application into the kernel’s cache in memory. To force
the data to be durable you need to use some additional mechanism.
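A minimal Python sketch of that extra step; the same idea applies via `fsync(2)` from C. The file name here is a temporary placeholder:

```python
import os
import tempfile

def durable_write(path, data):
    # write() alone only copies data into the kernel's page cache; fsync()
    # blocks until the data (and file metadata) reach stable storage.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, data)   # buffered in the page cache, not yet durable
        os.fsync(fd)         # forced out to the device
    finally:
        os.close(fd)

fd0, path = tempfile.mkstemp()
os.close(fd0)
durable_write(path, b"important record\n")
print(open(path, "rb").read())  # b'important record\n'
os.unlink(path)
```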
Traefik bills itself as “The Cloud Native Edge Router”; buzzwords aside, it is yet another
reverse proxy and load balancer. What really makes Traefik different from Nginx, HAProxy, and
the like is the automatic and dynamic configurability it provides out of the box. And the most
prominent part of it is probably its ability to do automatic service discovery.
Which is better, Rust or Go—and does that question even make sense? Which language should you
choose for your next project in 2025, and why? How does Rust compare with Go in areas like
performance, simplicity, safety, features, scale, and concurrency?
Twine is our homegrown cluster management system, which has been running in production for the
past decade. A cluster management system allocates workloads to machines and manages the life
cycle of machines, containers, and workloads. Kubernetes is a prominent example of an open
source cluster management system. Twine has helped convert our infrastructure from a
collection of siloed pools of customized machines dedicated to individual workloads to a
large-scale ubiquitous shared infrastructure in which any machine can run any workload.
The purpose of this document is to describe the path data takes from the application down to
the storage, concentrating on places where data is buffered, and to then provide best
practices for ensuring data is committed to stable storage so it is not lost along the way in
the case of an adverse event. The main focus is on the C programming language, though the
system calls mentioned should translate fairly easily to most other languages.
As one of the underlying engines, Uber Money fulfills some of the most important aspects of
people’s engagement in the Uber experience. A system like this should not only be robust, but
should also be highly available with zero tolerance for downtime, in keeping with our success
mantra: “To collect and disburse on time, accurately, and in compliance”.
As we expand into multiple lines of business and strategize what’s next, the engineers at Uber
Money also thrive on building the next generation of our Payments Platform, which extends
Uber’s growth. In this blog, we introduce you to this platform and provide insights into our
learnings, including migrating hundreds of millions of customers between two asynchronous
systems while maintaining data consistency, with a goal of zero impact on our users.
Within AWS, a common pattern is to split the system into services that are responsible for
executing customer requests (the data plane), and services that are responsible for managing
and vending customer configuration (the control plane). In this article, I discuss a number
of different ways the data plane and the control plane interact with each other to avoid
system overload. In many of these architectures the larger data plane fleet calls the smaller
control plane fleet, but I also want to share the success we’ve had at Amazon when we put the
smaller fleet in control.
As I’ve lamented previously, the documentation for xperf (Windows Performance Toolkit) is a
bit light. The names of the columns in the summary tables can be exquisitely subtle, and I
have never found any documentation for them. But, I’ve talked to the xperf authors, and I’ve
used xperf a lot, and I’ve done some experiments, and here I share some more results, this
time for the Disk Usage summary table.
In just 20 years, software engineering has shifted from architecting monoliths with a single
database and centralized state to microservices where everything is distributed across
multiple containers, servers, data centers, and even continents. Distributing things solves
scaling concerns, but introduces a whole new world of problems, many of which were previously
solved by monoliths.
FioSynth is a benchmark tool used to automate the execution of storage workload suites and to
parse results. It contains a base set of block level storage workloads, synthesized from
production I/O traces, that simulate a diverse range of Facebook production services. It is
useful for predicting how a storage device will perform in realistic production environments
and for assisting with performance tuning.
Project Teleport removes the cost of download and decompression by SMB mounting pre-expanded
layers from the Azure Container Registry to Teleport-enabled Azure container hosts.
In July 2020 I went on a color-scheme vision quest. This led to some research on various color
spaces and their utility, some investigation into the styling guidelines outlined by the
base16 project, and the color utilities that ship within the GNU Emacs text editor. This
article will be a whirlwind tour of things you can do to individual colors, and at the end how
I put these blocks together.
In this article I will demonstrate that while hardware has changed dramatically over the past
decade, software APIs have not, or at least not enough. Riddled with memory copies, memory
allocations, overly optimistic read ahead caching and all sorts of expensive operations,
legacy APIs prevent us from making the most of our modern devices.
[eBPF and io_uring] may look evolutionary, but they are revolutionary in the sense that they
will — we bet — completely change the way applications work with and think about the Linux
Kernel.
I thought it would be helpful to write a guide to dev tools outside of Google for the
ex-Googler, written with an eye toward pragmatism and practicality. No doubt many ex-Googlers
wish they could simply clone the Google internal environment to their new company, but you
can’t boil the ocean. Here is my take on where you should start and a general path I think
ex-Googlers can take to find the tools that will make them - and their new teams - as
productive as possible.
There have been recent attempts to enrich large-scale data stores, such as HBase and BigTable,
with transactional support. Not surprisingly, inspired by traditional database management
systems, serializability is usually compromised for the benefit of efficiency. For example,
Google Percolator implements lock-based snapshot isolation on top of BigTable. We show in
this paper that this compromise is not necessary in lock-free implementations of transactional
support. We introduce write-snapshot isolation, a novel isolation level that has performance
comparable to that of snapshot isolation, and yet provides serializability.
This thesis presents the first implementation-independent specifications of existing ANSI
isolation levels and a number of levels that are widely used in commercial systems, e.g.,
Cursor Stability, Snapshot Isolation. It also specifies a variety of guarantees for
predicate-based operations in an implementation-independent manner. Two new levels are defined
that provide useful consistency guarantees to application writers; one is the weakest level
that ensures consistent reads, while the other captures some useful consistency properties
provided by pessimistic implementations.
This post is about gaining intuition for Write Skew, and, by extension, Snapshot Isolation.
Snapshot Isolation is billed as a transaction isolation level that offers a good mix between
performance and correctness, but the precise meaning of “correctness” here is often vague. In
this post I want to break down and capture exactly when the thing called “write skew” can
happen.
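A toy simulation of the anomaly, using the standard textbook “on-call doctors” invariant (at least one doctor must stay on call); the scenario and names are illustrative, not from the post:

```python
# Under snapshot isolation, each transaction reads a consistent snapshot
# taken at its start and the two write sets are disjoint, so both commits
# are accepted -- yet together they break the invariant.
db = {"alice": "on-call", "bob": "on-call"}

snapshot_a = dict(db)  # txn A starts: sees both doctors on call
snapshot_b = dict(db)  # txn B starts: sees both doctors on call

def try_go_off_call(snapshot, me):
    # Going off call looks safe if someone ELSE is on call in MY snapshot.
    others = [d for d, s in snapshot.items() if d != me and s == "on-call"]
    return "off-call" if others else "on-call"

db["alice"] = try_go_off_call(snapshot_a, "alice")  # A commits
db["bob"] = try_go_off_call(snapshot_b, "bob")      # B commits

print(db)  # {'alice': 'off-call', 'bob': 'off-call'} -- invariant violated
```

Neither transaction wrote a row the other read-then-wrote, so there is no write-write conflict for snapshot isolation to detect; the conflict is between each writer's read set and the other's write set, which is exactly write skew.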
Here are some observations on how parsers can be constructed in a way that makes it easier to
recover from parse errors, produce multiple diagnostics in one pass, and provide partial
results for further analysis even in the face of errors, providing a better experience for
user-driven command line tools and interactive environments.
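A minimal sketch of the idea, using a made-up comma-separated-integers grammar in which the comma serves as the recovery (synchronization) point:

```python
def parse_int_list(text):
    # Instead of raising on the first bad token, record a diagnostic, skip
    # to the next comma, and keep going: one pass yields every error plus a
    # partial result usable for further analysis.
    values, diagnostics = [], []
    for index, token in enumerate(text.split(",")):
        token = token.strip()
        try:
            values.append(int(token))
        except ValueError:
            diagnostics.append(f"item {index}: expected integer, got {token!r}")
    return values, diagnostics

values, diagnostics = parse_int_list("1, 2, oops, 4, ?!")
print(values)       # [1, 2, 4] -- partial result despite two errors
print(diagnostics)  # one diagnostic per bad token, all from a single pass
```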
Build systems are awesome, terrifying – and unloved. They are used by every developer around
the world, but are rarely the object of study. In this paper, we offer a systematic, and
executable, framework for developing and comparing build systems, viewing them as related
points in a landscape rather than as isolated phenomena. By teasing apart existing build
systems, we can recombine their components, allowing us to prototype new build systems with
desired properties.
In this paper we introduce a new set of codes for erasure coding called Local Reconstruction
Codes (LRC). LRC reduces the number of erasure coding fragments that need to be read when
reconstructing data fragments that are offline, while still keeping the storage overhead
low.
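The “local” part can be illustrated with a toy XOR parity per group. This is a simplification: the paper's actual codes also include global parities for multi-failure protection.

```python
def xor_bytes(fragments):
    # XOR a list of equal-length byte strings together.
    out = bytearray(len(fragments[0]))
    for frag in fragments:
        for i, b in enumerate(frag):
            out[i] ^= b
    return bytes(out)

# Six data fragments split into two local groups, each with its own parity.
group1 = [b"AAAA", b"BBBB", b"CCCC"]
group2 = [b"DDDD", b"EEEE", b"FFFF"]
local_p1 = xor_bytes(group1)
local_p2 = xor_bytes(group2)

# Fragment b"BBBB" goes offline: rebuild it from its local group only --
# three reads, instead of reading across all six data fragments.
rebuilt = xor_bytes([group1[0], group1[2], local_p1])
print(rebuilt == b"BBBB")  # True
```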
Software dependencies carry with them serious risks that are too often overlooked. The shift
to easy, fine-grained software reuse has happened so quickly that we do not yet understand the
best practices for choosing and using dependencies effectively, or even for deciding when they
are appropriate and when not. My purpose in writing this article is to raise awareness of the
risks and encourage more investigation of solutions.