January 2024


Building Trust in the Digital Age: The Role of Data Verification

Dataversity

Data has famously been referred to as the “new oil,” powering the fifth industrial revolution. As our reliance on data-intensive sectors like finance, healthcare, and the Internet of Things (IoT) grows, the question of trust becomes paramount. Trust is a multifaceted issue when dealing with data and events, and one core component is data verification. […]


LLM Training and Inference with Intel(R) Gaudi(R) 2 AI Accelerators

databricks

At Databricks, we want to help our customers build and deploy generative AI applications on their own data without sacrificing data privacy or.


Trending Sources


Article: Cloud-Computing in the Post-Serverless Era: Current Trends and Beyond

InfoQ Articles

Discover the evolution of cloud-computing in the post-serverless era, with a shift towards hyper-specialized vertical multi-cloud services and a trend moving from Infrastructure as Code to Composition as Code. Microservices are being redefined in the cloud landscape and upcoming cloud services are set to be rich in constructs.

Cloud 131

The AI Playbook: Providing Important Reminders to Data Professionals

TDAN

Eric Siegel’s “The AI Playbook” serves as a crucial guide, offering important insights for data professionals and their internal customers on effectively leveraging AI within business operations.


Architecture governance is a spectrum: exploring styles of enterprise architecture

Ben Morris

There is a spectrum of different styles of architecture governance, from the tightly structured and framework-driven, through to the deliberate absence of formal architecture.


Improving machine learning iteration speed with faster application build and packaging

Engineering at Meta

Slow build times and inefficiencies in packaging and distributing execution files were costing our ML/AI engineers a significant amount of time while working on our training stack. By addressing these issues head-on, we were able to reduce this overhead by double-digit percentages. In the fast-paced world of AI/ML development, it’s crucial to ensure that our infrastructure can keep up with the increasing demands and needs of our ML engineers, whose workflows include checking out code, writing c


Why Data Intelligence Is Imperative to Achieve a Clean Energy Future

Dataversity

We should be further along with the consumer adoption of renewable energy. Solar energy has long since emerged as the lowest cost per kWh power source, making it a key puzzle piece in our transition to clean energy. Added to that, the Inflation Reduction Act of 2022 was passed to significantly incentivize private investment into clean energy […]

Energy 123

More Trending


Article: Architecting with Java Persistence: Patterns and Strategies

InfoQ Articles

Explore a spectrum of Java persistence patterns, from data-oriented to domain-centric. Delve into Driver, Mapper, DAO, Active Record, and Repository for robust architectural foundations.

130

Confluent Partner Awards 2023

Confluent

We’re celebrating excellence across the Data Streaming ecosystem. In this blog post, we announce the global and regional categories that Confluent will recognize across its 2023 Partner of the Year award winners.

92

Staying in the Zone: How DoorDash used a service mesh to manage data transfer, reducing hops and cloud spend

DoorDash Engineering

There have been many benefits gained through DoorDash’s evolution from a monolithic application architecture to one that is based on cells and microservices. The new architecture has reduced the time required for development, test, and deployment and at the same time has improved scalability and resiliency for end-users including merchants, Dashers, and consumers.

Cloud 84

Lazy is the new fast: How Lazy Imports and Cinder accelerate machine learning at Meta

Engineering at Meta

At Meta, the quest for faster model training has yielded an exciting milestone: the adoption of Lazy Imports and the Python Cinder runtime. The outcome? Up to 40 percent time to first batch (TTFB) improvements, along with a 20 percent reduction in Jupyter kernel startup times. This advancement facilitates swifter experimentation capabilities and elevates the ML developer experience (DevX).


Data Under Siege? Combatting the Weaponization of Information in the Digital Age

Dataversity

The emergence of generative AI marks a pivotal shift in the digital landscape, profoundly impacting our ability to discern reality from fabrication. This technology, capable of producing highly convincing and realistic content such as news articles, social media posts, images, and videos, blurs the line between what’s authentic and what’s engineered.

Media 119

Serving Quantized LLMs on NVIDIA H100 Tensor Core GPUs

databricks

Quantization is a technique for making machine learning models smaller and faster. We quantize Llama2-70B-Chat, producing an equivalent-quality model that generates 2.2x more.

126

Article: Maximizing the Utility of Large Language Models (LLMs) through Prompting

InfoQ Articles

In this article, authors Numa Dhamani and Maggie Engler discuss how prompt engineering techniques can help use the large language models (LLMs) more effectively to achieve better results. Prompting techniques discussed include few-shot, chain-of-thought, self-consistency, and tree-of-thoughts prompting.


Turbo-Charging Confluent Cloud To Be 10x Faster Than Apache Kafka®

Confluent

Confluent Cloud is now 10x faster than Apache Kafka. Read our latency benchmarking results, the innovations behind-the-scenes, and the lessons we learned.

Cloud 89

Cassandra Unleashed: How We Enhanced Cassandra Fleet’s Efficiency and Performance

DoorDash Engineering

In the realm of distributed databases, Apache Cassandra stands out as a significant player. It offers a blend of robust scalability and high availability without compromising on performance. However, Cassandra also is notorious for being hard to tune for performance and for the pitfalls that can arise during that process. The system’s expansive flexibility, while a key strength, also means that effectively harnessing its full capabilities often involves navigating a complex maze of configu


How Meta is advancing GenAI

Engineering at Meta

What’s going on with generative AI (GenAI) at Meta? And what does the future have in store? In this episode of the Meta Tech Podcast, Meta engineer Pascal Hartig ( @passy ) speaks with Devi Parikh, an AI research director at Meta. They cover a wide range of topics, including the history and future of GenAI and the most interesting research papers that have come out recently.


Why Organizations Are Transitioning from OpenAI to Fine-Tuned Open-Source Models

Dataversity

In the rapidly evolving generative AI landscape, OpenAI has revolutionized the way developers build prototypes, create demos, and achieve remarkable results with large language models (LLMs). However, when it’s time to put LLMs into production, organizations are increasingly moving away from commercial LLMs like OpenAI’s in favor of fine-tuned open-source models.

Education 119

Welcome to the Data Intelligence Platform: Databricks + Einblick

databricks

At Databricks, we believe that AI will change the way that enterprises interact with their data. That’s why today, we're excited to welcome t.

117

Article: How Much Architecture Is “Enough?”: Balancing the MVP and MVA Helps You Make Better Decisions

InfoQ Articles

The Minimum Viable Architecture (MVA) is the architectural complement to a Minimum Viable Product (MVP). The MVA and MVP must evolve together for a product to be successful. As new features are delivered to customers, corresponding incremental improvements need to be made in the architecture. Also, the architecture should not get too far ahead of what is needed for the product.

94

Introducing the New Fully Managed BigQuery Sink V2 Connector for Confluent Cloud: Streamlined Data Ingestion and Cost-Efficiency

Confluent

The new fully managed BigQuery Sink V2 connector for Confluent Cloud offers streamlined data ingestion and cost-efficiency. Learn about the Google-recommended Storage Write API and OAuth 2.0 support.

Cloud 78

Meeting DoorDash Growth with a Self-Service Logistics Configuration Platform 

DoorDash Engineering

DoorDash has grown from executing simple restaurant deliveries to working with a wide variety of businesses, ranging from grocery and retail to parcels and pet supplies. Each business faces its own set of constraints as it strives to meet its goals. Our logistics teams — which range across a number of functions, including Dashers, assignment, payment processes, and time estimations — seek to achieve these goals by tuning a variety of configurations for each use case and type of business.


5 Technologies That Will Help You Comply with GDPR

TDAN

GDPR stands for General Data Protection Regulation. It is a European Union regulation, enforceable since 2018, that protects the privacy and personal data of people in the EU. It applies to all companies that process personal data of people living in the EU, regardless of the company’s location.


IoT Data Governance: Taming the Deluge in Connected Environments

Dataversity

The Internet of Things (IoT) has rapidly redefined many aspects of our lives, permeating everywhere from our jobs to our homes and every space in between. However, the sheer volume and complexity of data generated by an ever-growing network of connected devices presents unprecedented challenges. This article, which is infused with insights from leading experts, aims to demystify […]

IoT 119

Introducing AI Model Sharing with Databricks

databricks

Today, we're excited to announce that AI model sharing is available in both Databricks Delta Sharing and on the Databricks Marketplace. With Delta.

114

Article: Understanding Architectures for Multi-Region Data Residency

InfoQ Articles

This article focuses on implementing data residency strategies for a positive stakeholder experience. It underscores the need to diversify data locations, driven by motivations like disaster recovery and geo-redundancy. The core principle is data distribution, ensuring specific sets reside in distinct regions without overlap, a practice termed data residency.

Cloud 92

The Show Must Go On: How Network Automation with Confluent Helps BT Stream TV Signals Across the UK

Confluent

Learn how BT uses Confluent to broadcast 95% of the TV signals in the UK, including BBC 1, Radio 4, and the World Service.

Network 78

The 2024 Mixpanel Benchmarks Report is here

Mixpanel

Our 2024 Benchmarks Report explores what average and best-in-class digital product growth looks like in your industry. Get the report now. What is a “great” user acquisition rate for our product? Are we keeping up with industry average retention? What should our marketing performance goals be? All of these questions have one thing in common: reassurance.


The Currency of Information: Measuring the Value of Data (Part Four)

TDAN

Data professionals often talk about the importance of managing data and information as organizational assets, but what does this mean? What is the actual business value of data and information? How can this value be measured? How do we manage data and information as assets?


2024: When IT And AI Collide

Dataversity

Stressed to the limit and buried under busy work, IT teams were told to “do more with less” in 2023. That meant that despite more shadow IT, more security vulnerabilities, and more questions, these tech pros were equipped with the same resources or fewer. Now, as we look at a new year with new opportunities, […]


Databricks SQL Year in Review (Part I): AI-optimized Performance and Serverless Compute

databricks

This is part 1 of a blog series where we look back at the major areas of progress for Databricks SQL in 2023.

115

Article: Breaking Changes Are Broken

InfoQ Articles

In this article, we address the most contentious and misinterpreted parts of the SemVer standard, i.e., backward compatibility and breaking changes. With the proliferation of SaaS APIs for Generative AI continuing, now is a good time for a retrospection on what constitutes a breaking change and how you can trade off backward compatibility and upgradability with modernization and iterability.

Cloud 89

Announcing the 2024 Data Streaming Startup Challenge Semifinalists

Confluent

The finalists and winner of the Data Streaming Startup Challenge will be announced at Kafka Summit London.

75

D2C231: Cloud Repatriation: Can Workloads Ever Come Home Again?

Packet Pushers

Cloud repatriation: Is it a good idea? Guest Marino Wijay, an OSI and networking open source advocate, joins hosts Ethan Banks and Ned Bellavance to discuss the recent interest in cloud repatriation. They cover the intricacies of moving workloads from the cloud back to on-premises or edge environments, and question if it is possible to.

Cloud 52

Data Professional Introspective: Demystifying Data Culture

TDAN

The term “data culture” is frequently used to describe a normative view about how an organization functions (or more precisely, should function) with respect to its data. The term is not particularly well defined, and the notions held about this term can vary significantly.


Ask a Data Ethicist: Why Does Data Ethics Matter?

Dataversity

Whenever I give a talk, I always share how much I love Q&A. It’s a real joy to hear what people are curious about and provide resources or share insightful lived experiences as a consultant in the data ethics space. In this line of work, it’s usually not about having tidy, easy answers or the […]


Boost your data & AI skills with our latest offerings: Databricks Academy Labs and Blended Learning

databricks

Databricks launches hands-on labs solution and cohort-based learning. From the data + AI experts, today, we're announcing two unique ways that practitioners can.

113