Introducing Apache Kafka 3.7
Confluent
FEBRUARY 27, 2024
Apache Kafka 3.7 introduces updates to the Consumer rebalance protocol, an official Apache Kafka Docker image, JBOD support in KRaft-based clusters, and more!
Engineering at Meta
FEBRUARY 6, 2024
We’ve open sourced DotSlash, a tool that makes large executables available in source control with a negligible impact on repository size, thus avoiding I/O-heavy clone operations. With DotSlash, a set of platform-specific executables is replaced with a single script containing descriptors for the supported platforms. DotSlash handles transparently fetching, decompressing, and verifying the appropriate remote artifact for the current operating system and CPU.
databricks
FEBRUARY 1, 2024
As Chief Scientist (Neural Networks) at Databricks, I lead our research team toward the goal of giving everyone the ability to build and.
DoorDash Engineering
FEBRUARY 27, 2024
We reviewed the architecture of our global search at DoorDash in early 2022 and concluded that our rapid growth meant within three years we wouldn’t be able to scale the system efficiently, particularly as global search shifted from store-only to a hybrid item-and-store search experience. Our analysis identified Elasticsearch as our architecture’s primary bottleneck.
Dataversity
FEBRUARY 8, 2024
In the fast-paced landscape of 2023, organizations embraced artificial intelligence (AI) and its related technologies, experiencing a surge in diverse AI applications. According to data from McKinsey, there was a significant 55% adoption rate of AI across global industries by employees. However, as we step into 2024, organizations recognize that while AI is critical for competitiveness and […] The post Three Ways AI Will Change in the New Year appeared first on DATAVERSITY.
InfoQ Articles
FEBRUARY 5, 2024
Understand idempotence in AWS serverless, tackling challenges from at-least-once delivery. Implement and automate with AWS Lambda, emphasizing early planning for consistent outcomes. Use tools like Lambda Powertools and prioritize testing for reliability.
Confluent
FEBRUARY 7, 2024
Confluent has hired many Noteable employees to help make application development easier for both Kafka and Flink developers.
databricks
FEBRUARY 29, 2024
Special thanks to Phillip Jones, Senior Product Manager, and Harshal Brahmbhatt, Systems Engineer from Cloudflare for their contributions to this blog. Organizations across.
TDAN
FEBRUARY 21, 2024
Many Data Governance or Data Quality programs focus on “critical data elements,” but what are they and what are some key features to document for them? A critical data element is any data element in your organization that has a high impact on your organization’s ability to execute its business strategy.
Dataversity
FEBRUARY 12, 2024
In last month’s column, I asked readers to send in their “big questions” when it comes to data and AI. This month’s question more than answered that call! It encompasses the enormous areas of trust in AI tools and explainability. How can we know if an AI tool is delivering an ethical result if we have […] The post Ask a Data Ethicist: Can We Trust Unexplainable AI?
InfoQ Articles
FEBRUARY 7, 2024
Spring Framework 6.1 and Spring Boot 3.2 run on Java 21. They make concurrent programming simpler and more efficient with virtual threads, as well as improving reactive programming and Kotlin coroutines. For “Scale to Zero” startup time reduction, the OpenJDK project CRaC received initial support, while the existing GraalVM Native Image integration got faster through a GraalVM release.
Confluent
FEBRUARY 14, 2024
Confluent Platform 7.6 brings upgrades of existing clusters from ZooKeeper to KRaft, compaction support for Tiered Storage, OAuth (early access), improvements to the Oracle CDC premium connector, and more.
Engineering at Meta
FEBRUARY 20, 2024
We’ve partnered with Voltron Data and the Arrow community to align and converge Apache Arrow with Velox, Meta’s open source execution engine. Apache Arrow 15 includes three new format layouts developed through this partnership: StringView, ListView, and Run-End-Encoding (REE). This new convergence helps Meta and the larger community build data management systems that are unified, more efficient, and composable.
databricks
FEBRUARY 21, 2024
We are excited to announce the upcoming general availability of Azure Private Link support for Databricks SQL (DBSQL) Serverless, planned in April 2024.
Cloudera Blog
FEBRUARY 7, 2024
How enterprise-grade data management creates better and more efficient care. In the last few years, the acceptance of telehealth has become more widespread as patients and providers found they could maintain continuity through phone and video collaboration, instead of in-person visits. In many cases, a level of care that once required a drive to the clinic or hospital could be delivered over a mobile phone or laptop, with no travel and no waiting room.
Dataversity
FEBRUARY 28, 2024
In our increasingly digital world, organizations recognize the importance of securing their data. As cloud-based technologies proliferate, the need for a robust identity and access management (IAM) strategy is more critical than ever. IAM serves as the gatekeeper to an organization’s sensitive information, ensuring that only authorized individuals have an appropriate level of access.
InfoQ Articles
FEBRUARY 6, 2024
Companies are now looking to grow and more effectively manage DevOps with platform engineering and site reliability engineering roles. No one has these roles perfectly carved out right now — there’s just too much to do and not enough people to do it — but knowing where these three disciplines do and don’t overlap will help organizations evolve and take advantage when they are ready.
Confluent
FEBRUARY 6, 2024
Confluent enables real-time, reliable, scalable, and secure communication between IoT devices, applications, and backend systems. Streamline data processing and unlock analytics to boost productivity and time to market while lowering infrastructure costs.
Engineering at Meta
FEBRUARY 12, 2024
By now you’re already aware that Python 3.12 has been released. But did you know that several of its new features were developed by Meta? Meta engineer Pascal Hartig (@passy) is joined on the Meta Tech Podcast by Itamar Oren and Carl Meyer, two software engineers at Meta, to discuss their teams’ contributions to the latest Python release, including new hooks that allow for custom JITs like Cinder, Immortal Objects, improvements to the type system, faster comprehensions, and more.
databricks
FEBRUARY 14, 2024
For the past two years, Databricks has collaborated with leading consulting partners to build innovative solutions for industry, migration, and data and AI.
Mixpanel
FEBRUARY 2, 2024
One of the biggest challenges of building a product is that your users often don’t know what features they want or need. They just know what outcome they want to achieve by using the product. Even when users think they know what they want, they may not always be right. It’s only after they’ve made a feature request, you’ve shipped it, and they’ve tried it out that they realize, “Oh wait, that’s not the outcome I expected.
Dataversity
FEBRUARY 5, 2024
Hello! I’m Mark Horseman, and welcome to The Cool Kids Corner. This is my monthly check-in to share with you the people and ideas I encounter as the data evangelist with DATAVERSITY. (Read last month’s column here.) This month, we’re talking about communication. Communication is the cornerstone of socializing anything you do with data, whether that’s […] The post The Cool Kids Corner: CLEAR Communication appeared first on DATAVERSITY.
Ben Morris
FEBRUARY 3, 2024
Despite growing excitement about the potential for AI-driven agents, there are a lot of problems to solve before we can build agent-based architectures on any scale…
TDAN
FEBRUARY 21, 2024
Hands down one of the most frequent observations when walking the data factory at different clients is the excessive use of spreadsheets for data collection and purification. These spreadsheets are part of a critical data enrichment process for getting reports out the door on time.
Engineering at Meta
FEBRUARY 26, 2024
Andres Suarez and Michael Bolin, two software engineers at Meta, join Pascal Hartig (@passy) on the Meta Tech Podcast to discuss the ins and outs of DotSlash, a new open source tool from Meta. DotSlash takes the pain out of distributing binaries and toolchains to developers. Instead of committing large, platform-specific executables to a repository, DotSlash combines a fast Rust program with a JSON manifest prefixed with a #!
databricks
FEBRUARY 27, 2024
Introduction: Apache Spark™ Structured Streaming is a popular open-source stream processing platform that provides scalability and fault tolerance, built on top of the S.
InfoQ Articles
FEBRUARY 15, 2024
This article explores how generative AI affects fraud detection by reducing false positives and dynamically adapting to changing fraud patterns. This combination offers a potent preventive solution when integrated with machine learning. The efficacy and scalability of fraud prevention initiatives are enhanced by this innovative approach.
Dataversity
FEBRUARY 1, 2024
Artificial intelligence is the top investment area for CIOs in 2024. IT leaders see in generative AI an opportunity to accelerate innovation, improve employee productivity, and gain competitive advantage. Unfortunately, investing in AI is not cheap. CIOs will need to find significant budget to make traction in their AI roadmap and we believe IT asset […] The post IT leaders Need to Invest in AI – Could ITAM and FinOps Be the Solution?
Cloudera Blog
FEBRUARY 8, 2024
Overview: This blog post describes support for materialized views for the Iceberg table format. Apache Iceberg is a high-performance open table format for petabyte-scale analytic datasets. It has been designed and developed as an open community standard to ensure compatibility across languages and implementations. It brings the reliability and simplicity of SQL tables to big data while enabling engines like Hive, Impala, Spark, Trino, Flink, and Presto to work with the same tables at the same
TDAN
FEBRUARY 7, 2024
Organizations are drowning in a sea of data, facing challenges that range from inconsistent quality to inefficient and ineffective management. It’s easy to complain about the state of your data, but a more productive tactic involves taking actionable steps to address these issues.
Confluent
FEBRUARY 21, 2024
Discover how to build resilient data pipelines with Confluent Data Portal. Learn essential strategies for isolating upstream systems and empowering downstream consumers.
databricks
FEBRUARY 8, 2024
At Databricks, we've upheld principles of responsible development throughout our long-standing history of building innovative data and AI products. We are committed to.
InfoQ Articles
FEBRUARY 29, 2024
As an engineering manager, it is your responsibility to help facilitate creative thinking skills among the development team, but that's easier said than done. This article provides advice on how you can help amplify the creative thinking skills of your software development colleagues. We examine how different levels of creativity influence creativity, and strategies to encourage it.
DoorDash Engineering
FEBRUARY 13, 2024
Business Policy Experiments Using Fractional Factorial Designs. At DoorDash, we constantly strive to improve our experimentation processes by addressing four key dimensions, including velocity to increase how many experiments we can conduct, toil to minimize our launch and analysis efforts, rigor to ensure a sound experimental design and robustly efficient analyses, and efficiency to reduce costs associated with our experimentation efforts.
Cloudera Blog
FEBRUARY 15, 2024
It’s hard to believe it’s been 15 years since the global financial crisis of 2007/2008. While this might be a blast from the past we’d rather leave in the proverbial rear-view mirror, in March of 2023 we were back to the future with the collapse of Silicon Valley Bank (SVB), the largest US bank to fail since 2008. While there are clear reasons SVB collapsed, which can be reviewed here, my purpose in this post isn’t to rehash the past but to present some of the regulatory and compliance c
Dataversity
FEBRUARY 21, 2024
The rapid adoption of artificial intelligence and machine learning (AI/ML) over the past year has transformed just about everything – ushering in a new era of innovation and growth the world has never seen. The same goes for data storage, where the technologies’ impact will be transformative, enabling greater business agility that companies need to […] The post 7 Ways AI Will Transform Data Storage appeared first on DATAVERSITY.
Confluent
FEBRUARY 13, 2024
Confluent Migration Accelerator, a new program in partnership with the Confluent partner ecosystem to jump-start organizations' data streaming journeys.