March, 2024

Introducing DBRX: A New State-of-the-Art Open LLM by Databricks

databricks

Comments

145

Building Meta’s GenAI Infrastructure

Engineering at Meta

Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads. We use this cluster design for Llama 3 training. We are strongly committed to open compute and open source.

Trending Sources

Introducing Tableflow

Confluent

Seamlessly integrate Apache Kafka data into your lakehouse as Apache Iceberg tables, bridging the operational and analytical divide, with Tableflow. Read more in our blog post.

133

World Backup Day Is So 2023 – How About World Data Resilience Day?

Dataversity

Instead of celebrating World Backup Day 2024 for accomplishing another year of successful backups, I recommend using it to look forward to a year of testing recovery. Instead of starting data protection strategies by planning backups, organizations should flip their mindset and start by planning recovery: What data needs to be recovered first? What systems […]

Cloudera’s RHEL-volution: Powering the Cloud with Red Hat

Cloudera Blog

As enterprise AI technologies rapidly reshape our digital environment, the foundation of your cloud infrastructure is more critical than ever. That’s why Cloudera and Red Hat, renowned for their open-source solutions, have teamed up to bring Red Hat Enterprise Linux (RHEL) to Cloudera on public cloud as the operating system for all of our public cloud platform images.

Cloud 115

The New AI Era: Networking for AI and AI for Networking*

Arista Networks

As we all recover from NVIDIA’s exhilarating GTC 2024 in San Jose last week, AI state-of-the-art news seems fast and furious. NVIDIA’s latest Blackwell GPU announcement and Meta’s blog validating Ethernet for their pair of clusters with 24,000 GPUs to train on their Llama 3 large language model (LLM) made the headlines. Networking has come a long way, accelerating pervasive compute, storage, and AI workloads for the next era of AI.

Announcing DBRX: A new standard for efficient open source LLMs

databricks

Databricks’ mission is to deliver data intelligence to every enterprise by allowing organizations to understand and use their unique data to build their.

144

More Trending

Best Practices for Confluent Schema Registry

Confluent

Learn the best practices for using Confluent Schema Registry, including using schema IDs, understanding subjects and versions, using data contracts, pre-registering schemas, and more.

111

Four New Apache Cassandra 5.0 Features to Be Excited About

Dataversity

With the recent beta release of Apache Cassandra 5.0, now is a great time for teams to give it a spin and discover 5.0’s most interesting and anticipated new capabilities. As I’ve poked around with the new beta, here are four features introduced with open-source Cassandra 5.0 that developer teams should be excited about: 1. Vector […]

Education 115

A Look Ahead at the Gartner Data & Analytics Summit

Cloudera Blog

As we enter into a new month, the Cloudera team is getting ready to head off to the Gartner Data & Analytics Summit in Orlando, Florida for one of the most important events of the year for Chief Data Analytics Officers (CDAOs) and the field of data and analytics. We’re at a crucial point in time where the excitement and potential surrounding AI has elevated the importance of improving access to the mission-critical data that helps organizations implement it at scale.

Creative Ways to Surf Your Data Using Virtual and Augmented Reality

TDAN

Organizations often struggle with finding nuggets of information buried within their data to achieve their business goals. Technology sometimes comes along to offer some interesting solutions that can bridge that gap for teams that practice good data management hygiene.

Lilac Joins Databricks to Simplify Unstructured Data Evaluation for Generative AI

databricks

Today, we are thrilled to announce that Lilac is joining Databricks. Lilac is a scalable, user-friendly tool for data scientists to search, cluster.

142

Making messaging interoperability with third parties safe for users in Europe

Engineering at Meta

To comply with a new EU law, the Digital Markets Act (DMA), which comes into force on March 7th, we’ve made major changes to WhatsApp and Messenger to enable interoperability with third-party messaging services. We’re sharing how we enabled third-party interoperability (interop) while maintaining end-to-end encryption (E2EE) and other privacy guarantees in our services as far as possible.

Protocol 135

Article: Relational Data at the Edge: How Cloudflare Operates Distributed PostgreSQL Clusters

InfoQ Articles

Explore Cloudflare's distributed PostgreSQL clusters and learn how a cross-region architecture ensures resilience. Discover how data storage and access at the edge deliver massive performance gains by reducing location-sensitive latency and why architecting for degraded states is much harder than for failure states.

Future-Proof Your Cyber Risk Management with These Top Trends in 2024 (Part II)

Dataversity

As shared in part one of this installment, the global marketplace faces an increasingly destructive cyber risk landscape each year, and 2024 is set to confirm this trend. The cost of data breaches alone is expected to reach $5 trillion, a growth of 11% from 2023. As technology advances, attackers continue to develop new, more sophisticated methods […]

Education 109

Setting Up Kafka Multi-Tenancy 

DoorDash Engineering

Real-time event processing is a critical component of a distributed system’s scalability. At DoorDash, we rely on message queue systems based on Kafka to handle billions of real-time events. One of the challenges we face, however, is how to properly validate the system before going live. Traditionally, an isolated environment such as staging is used to validate new features.

Data-Centric: How Big Things Get Done (in IT)

TDAN

I read “How Big Things Get Done” when it first came out about six months ago.[1] I liked it then. But recently, I read another review of it, and another coin dropped.

Education 105

Delivering the Next Generation of Consumer Experiences: Databricks and Adobe Announce Strategic Partnership

databricks

By Steve Sobel, Global Industry Leader, Communications, Media & Entertainment. Today, Databricks and Adobe are excited to announce a strategic partnership focused.

Media 135

Bringing HDR photo support to Instagram and Threads

Engineering at Meta

Meta’s family of apps serves trillions of image download requests every day. And if you’re into high-quality images, you’ve probably noticed that Instagram and Threads have added support for high dynamic range (HDR) photos. Now people on Threads and Instagram can upload and share images that are more true-to-life, with the full color and range their device is capable of capturing.

Media 105

Article: The Hidden Cost of Using Managed Databases

InfoQ Articles

The rising popularity of managed relational databases brings hidden costs, and informed decisions are crucial for optimal use. This article shows the importance of monitoring service expenses, revising default settings, and understanding operational constraints, considering limitations like reduced flexibility and observability.

The Cool Kids Corner: Neurodiversity in Your Team

Dataversity

Hello! I’m Mark Horseman, and welcome to The Cool Kids Corner. This is my monthly check-in to share with you the people and ideas I encounter as a data evangelist with DATAVERSITY. (Last month, we discussed the importance of communication.) This month, we’re talking about neurodiversity in your data team. The term neurodiversity was first […]

How to make architecture decisions

Ben Morris

Knowing what decisions to make and when to make them can be something of a fine art.

100

Data Architecture and Strategy in the AI Era

Cloudera Blog

At a time when AI is exploding in popularity and finding its way into nearly every facet of business operations, data has arguably never been more valuable. More recently, that value has been made clear by the emergence of AI-powered technologies like generative AI (GenAI) and the use of Large Language Models (LLMs). But, even with the backdrop of an AI-dominated future, many organizations still find themselves struggling with everything from managing data volumes and complexity to security conc

Databricks invests in Mistral AI and integrates Mistral AI’s models into the Databricks Data Intelligence Platform

databricks

Sharing a belief that open source solutions will foster innovation and transparency in generative AI development, Databricks has announced a partnership and participation.

133

Logarithm: A logging engine for AI training workflows and services

Engineering at Meta

Systems and application logs play a key role in operations, observability, and debugging workflows at Meta. Logarithm is a hosted, serverless, multitenant service, used only internally at Meta, that consumes and indexes these logs and provides an interactive query interface to retrieve and view logs. In this post, we present the design behind Logarithm, and show how it powers AI training debugging use cases.

Article: AWS Lambda Under the Hood

InfoQ Articles

Mike Danilov, a senior principal engineer at AWS, presented on AWS Lambda and what is under the hood during QCon San Francisco 2023. This article summarizes the talk, which starts with an introduction to Lambda that outlines the key concepts and fundamentals of the service, then uses them as the basis for a deep dive into how the system works.

Mastering Microsoft SQL Server: Analyzing and Optimizing Complex Queries

Dataversity

In the realm of database management, particularly with Microsoft SQL Server, understanding and optimizing complex queries is crucial for maintaining system performance and efficiency. As databases grow in size and complexity, the queries used to retrieve, update, or manipulate data can become increasingly intricate, potentially leading to slower response times and decreased application performance.

Server 105

Confluent Cloud for Apache Flink Is Now Generally Available

Confluent

Confluent Cloud's serverless Flink offering is now available on all major clouds, offering a unified, managed platform for real-time data processing.

Cloud 93

Navigating the Network: The Quest for Innocence in a World of Complexity

Arista Networks

Welcome to the digital age, where the marvels of self-driving cars and sophisticated AI like ChatGPT grace our everyday lives. Yet, amidst these advancements, a battleground often goes unnoticed, hidden within the layers of our network infrastructures. It's a world where network teams are the unsung heroes, tirelessly working behind the scenes to keep our digital lifelines seamless and uninterrupted.

Implementing LLM Guardrails for Safe and Responsible Generative AI Deployment on Databricks

databricks

Let’s explore a common scenario: your team is eager to leverage open source LLMs to build chatbots for customer support interactions.

119

Better video for mobile RTC with AV1 and HD

Engineering at Meta

At Meta, we support real-time communication (RTC) for billions of people through our apps, including Messenger, Instagram, and WhatsApp. We’ve seen significant benefits by adopting the AV1 codec for RTC. Here’s how we are improving the RTC video quality for our apps with tools like the AV1 codec, the challenges we face, and how we mitigate those challenges.

Bandwidth 104

Article: How to Use Rust Procedural Macros to Replace Panic with syn’s Fold

InfoQ Articles

In this article, we show how you can write advanced macros to step through Rust code and modify it. Using the standard tooling available in the syn crate, we first show how to change the occurrence of a panic into an Err. Then we go a step beyond and use the Fold trait to recursively step through the entire function, automatically executing a change in every applicable location.

Improving ETAs with Multi-Task Models, Deep Learning, and Probabilistic Forecasts

DoorDash Engineering

The DoorDash ETA team is committed to providing an accurate and reliable estimated time of arrival (ETA) as a cornerstone DoorDash consumer experience. We want to ensure that every customer can trust our ETAs, ensuring a high-quality experience in which their food arrives on time every time. With more than 2 billion orders annually, our dynamic engineering challenge is to improve and maintain accuracy at scale while managing a variety of conditions within diverse delivery and merchant scenarios.

A Closer Look at The Next Phase of Cloudera’s Hybrid Data Lakehouse

Cloudera Blog

Artificial Intelligence (AI) is primed to reshape the way just about every business operates. Cloudera research projected that more than one third (36%) of organizations in the U.S. are in the early stages of exploring the potential for AI implementation. But even with its rise, AI is still a struggle for some enterprises. AI, and any analytics for that matter, are only as good as the data upon which they are based.

Real-Time Data Streaming for Smart Warehouses

Confluent

Leverage Confluent Data Streaming Platform to bring real time to your smart warehouse, powering greater IoT automation, efficiency and cost savings.

IoT 84

Introducing the Databricks AI Security Framework (DASF)

databricks

We are excited to announce the release of the Databricks AI Security Framework (DASF) version 1.0 whitepaper! The framework is designed to improve.

116

The Rise of Generative AI in Insurance

Dataversity

The global market for artificial intelligence (AI) in insurance is predicted to reach nearly $80 billion by 2032, according to Precedence Research. This growth is being driven by the increased adoption of AI within insurance companies, enhancing their operational efficiency, risk management, and customer engagement. Despite widespread integration of AI in the industry today, its full […]