Episodes

  • Nuts and Bolts of Apache Kafka
    2024/06/09
    Topics, Partitions, and APIs oh my! This episode we're getting further into how Apache Kafka works and its use cases. Also, Allen is staying dry, Joe goes for broke, and Michael (eventually) gets on the right page.

    The full show notes are available on the website at https://www.codingblocks.net/episode236

    News

    • Thanks for the reviews!
      • angingjellies and Nick Brooker
    • Please leave us a review! (/review)
    • Atlanta Dev Con is coming up, on September 7th, 2024 (www.atldevcon.com)

    Kafka Topics

    • They are partitioned - this means they are distributed (or can be) across multiple Kafka brokers into "buckets"
    • New events written to Kafka are appended to partitions
      • The distribution of data across brokers is what allows Kafka to scale so well, as data can be written to and read from many brokers simultaneously
    • Events with the same key are written to the same partition as the original event
    • Kafka guarantees that events within a partition are always read in the order that they were written
    • For fault tolerance and high availability, topics can be replicated…even across regions and data centers
      • NOTE: If you're using a cloud provider, know that this can be very costly as you pay for inbound and outbound traffic across regions and availability zones
    • Typical replication configurations for production setups are 3 replicas

    Kafka APIs

    • Admin API - used for managing and inspecting topics, brokers, and other Kafka objects
    • Producer API - used to write events to Kafka topics
    • Consumer API - used to read data from Kafka topics
    • Kafka Streams API - the ability to implement stream processing applications/microservices
      • Some of the key functionality includes functions for transformations, stateful operations like aggregations, joins, windowing, and more
      • In the Kafka Streams world, these transformations and aggregations are typically written to other topics (in from one topic, out to one or more other topics)
    • Kafka Connect API - allows for the use of reusable import and export connectors that usually connect external systems
      • These connectors allow you to gather data from an external system (like a database using CDC) and write that data to Kafka. Then you could have another connector that could push that data to another system OR it could be used for transforming data in your streams application
      • These connectors are referred to as Sources and Sinks in the connector portfolio (confluent.io)
        • Source - gets data from an external system and writes it to a Kafka topic
        • Sink - pushes data to an external system from a Kafka topic

    Use Cases

    • Message queue - usually talking about replacing something like ActiveMQ or RabbitMQ
      • Message brokers are often used for responsive types of processing, decoupling systems, etc. - Kafka is usually a great alternative that scales, generally has faster throughput, and offers more functionality
    • Website activity tracking - this was one of the very first use cases for Kafka - the ability to rebuild user actions by recording all the user activities as events
      • How and why Kafka was developed (LinkedIn)
      • Typically different activity types would be written to different topics - like web page interactions to one topic and searches to another
    • Metrics - aggregating statistics from distributed applications
    • Log aggregation - some use Kafka for storage of event logs rather than using something like HDFS or a file server or cloud storage - but why?
      • Because using Kafka for the event storage abstracts away the events from the files
    • Stream processing - taking events in and further enriching those events and publishing them to new topics
    • Event sourcing - using Kafka to store state changes from an application that are replayed to rebuild the current state of an object or system
    • Commit log - using Kafka as an external commit log is a way to synchronize data between distributed systems, or to help rebuild the state in a failed system

    https://youtu.be/IuUDRU9-HRk

    Tip of the Week

    • Rémi Gallego is a music producer who makes music under a variety of names like The Algorithm and Boucle Infini; almost all of it is instrumental Synthwave with a hard-rock edge. They also make a lot of video game music, including 2 of my favorite game soundtracks of all time: "The Last Spell" and "Hell is for Demons" (YouTube)
    • Did you know that the Kubernetes-focused TUI we've raved about before can be used to look up information about other things as well, like :helm and :events? Events is particularly useful for figuring out mysteries. You can see all the "resources" available to you with "?". You might be surprised at everything you see (pop-eye, x-ray, and monitoring)
    • WarpStream is an S3 backed, API compliant Kafka Alternative. Thanks MikeRg! (warpstream.com)
    • Cloudflare's trillion message Kafka setup, thanks MikeRg! (blog.bytebytego.com)
    • Want the power and flexibility of jq, but for yaml? Try yq! (gitbook.io)
    • Zenith shows graphical metrics in the terminal for your *nix system and is written in Rust, thanks MikeRg! (github.com)
    • 8 Big (O)Notation Every Developer should Know ...
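The "same key, same partition" and "ordered within a partition" points under Kafka Topics can be pictured with a small in-memory sketch. This is only an illustration under stated assumptions: `NUM_PARTITIONS`, `partition_for`, and `append_events` are hypothetical names, and CRC32 stands in for whatever hash a real Kafka client uses, so the partition numbers here will not match a real cluster.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical partition count for this sketch


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition with a stable hash (CRC32 here, not Kafka's own partitioner)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def append_events(events):
    """Append (key, value) events to per-partition lists, preserving arrival order."""
    partitions = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key)].append((key, value))
    return partitions


events = [
    ("user-1", "login"),
    ("user-2", "login"),
    ("user-1", "search"),
    ("user-1", "logout"),
]
partitions = append_events(events)

# Every "user-1" event lands in one partition, in the order it was written.
user1_events = [v for k, v in partitions[partition_for("user-1")] if k == "user-1"]
print(user1_events)  # ['login', 'search', 'logout']
```

Ordering is only preserved per partition in this sketch, which mirrors the guarantee described above: nothing is promised about the relative order of events that land in different partitions.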
    1 hr 37 min
  • Intro to Apache Kafka
    2024/05/26
    We finally start talking about Apache Kafka! Also, Allen is getting acquainted with Aesop, Outlaw is killing clusters, and Joe is paying attention in drama class.

    The full show notes are available on the website at https://www.codingblocks.net/episode235

    News

    • Atlanta Dev Con is coming up, on September 7th, 2024 (www.atldevcon.com)

    Intro to Apache Kafka

    • What is it?
      • Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
    • Core capabilities
      • High throughput - Deliver messages at network-limited throughput using a cluster of machines with latencies as low as 2ms.
      • Scalable - Scale production clusters up to a thousand brokers, trillions of messages per day, petabytes of data, and hundreds of thousands of partitions. Elastically expand and contract storage and processing.
      • Permanent storage - Store streams of data safely in a distributed, durable, fault-tolerant cluster.
      • High availability - Stretch clusters efficiently over availability zones or connect separate clusters across geographic regions.
    • Ecosystem
      • Built-in stream processing - Process streams of events with joins, aggregations, filters, transformations, and more, using event-time and exactly-once processing.
      • Connect to almost anything - Kafka’s out-of-the-box Connect interface integrates with hundreds of event sources and event sinks including Postgres, JMS, Elasticsearch, AWS S3, and more.
      • Client libraries - Read, write, and process streams of events in a vast array of programming languages.
      • Large ecosystem of open source tools - Leverage a vast array of community-driven tooling.
    • Trust and Ease of Use
      • Mission critical - Support mission-critical use cases with guaranteed ordering, zero message loss, and efficient exactly-once processing.
      • Trusted by thousands of organizations - Thousands of organizations use Kafka, from internet giants to car manufacturers to stock exchanges. More than 5 million unique lifetime downloads.
      • Vast user community - Kafka is one of the five most active projects of the Apache Software Foundation, with hundreds of meetups around the world.
    • What is it?
      • Getting data in real-time from event sources like databases, sensors, mobile devices, cloud services, applications, etc. in the form of streams of events.
      • Those events are stored "durably" (in Kafka) for processing, either in real-time or retrospectively, and then routed to various destinations depending on your needs.
      • It's this continuous flow and processing of data that is known as "streaming data"
    • How can it be used? (some examples)
      • Processing payments and financial transactions in real time
      • Tracking automobiles and shipments in real time for logistical purposes
      • Capturing and analyzing sensor data from IoT devices or other equipment
      • Connecting and sharing data from different divisions in a company
    • Apache Kafka as an event streaming platform?
      • It contains three key capabilities that make it a complete streaming platform
        • Can publish and subscribe to streams of events
        • Can store streams of events durably and reliably for as long as necessary (infinitely if you have the storage)
        • Can process streams of events in real-time or retrospectively
      • Can be deployed to bare metal, virtual machines or to containers on-prem or in the cloud
      • Can be run self-managed or via various cloud providers as a managed service
    • How does Kafka work?
      • A distributed system that's composed of servers and clients that communicate using a highly performant TCP protocol
      • Servers
        • Kafka runs as a cluster of one or more servers that can span multiple data centers or cloud regions
        • Brokers - the portion of the servers that form the storage layer
        • Kafka Connect - these are servers that constantly import and export data from existing systems in your infrastructure such as relational databases
        • Kafka clusters are highly scalable and fault-tolerant
      • Clients
        • Allow you to write distributed applications that read, write, and process streams of events in parallel, and that are fault-tolerant and scale
        • These clients are available in many programming languages - both the ones provided by the core platform as well as 3rd party clients
    • Concepts
      • Events
        • It's a record of something that happened - also called a "record" in the documentation
          • Has a key
          • Has a value
          • Has an event timestamp
          • Can have additional metadata
      • Producers and Consumers
        • Producers - these are the client applications that publish/write events to Kafka
        • Consumers - these are the client applications that read/subscribe to events from Kafka
        • Producers and consumers are completely decoupled from each other
      • Topics
        • Events are stored in topics
        • Topics are like folders on a file system - events would be the equivalent of files within that folder
        • Topics are multi-producer and multi-subscriber
          • There can be zero, one or many producers or subscribers to a topic that ...
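The concepts above (events with a key, value, timestamp, and optional metadata; topics that store events; producers and consumers that never talk to each other directly) can be mocked up in a few lines. This is a toy, single-process sketch and not the Kafka client API: `Event`, `Topic`, and `Consumer` are hypothetical names, and each consumer simply remembers its own read position (an offset) into the topic's append-only list.

```python
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    """A record of something that happened."""
    key: str
    value: str
    timestamp: float = field(default_factory=time.time)
    headers: dict = field(default_factory=dict)  # optional metadata


class Topic:
    """Append-only list of events; reading never removes anything."""

    def __init__(self, name: str):
        self.name = name
        self._log = []

    def publish(self, event: Event) -> int:
        """Append an event and return its position (offset) in the log."""
        self._log.append(event)
        return len(self._log) - 1

    def read_from(self, offset: int) -> list:
        return self._log[offset:]


class Consumer:
    """Tracks its own position, so consumers do not affect each other."""

    def __init__(self, topic: Topic):
        self.topic = topic
        self.offset = 0

    def poll(self) -> list:
        events = self.topic.read_from(self.offset)
        self.offset += len(events)
        return events


payments = Topic("payments")
fast, slow = Consumer(payments), Consumer(payments)

payments.publish(Event("order-1", "paid"))
payments.publish(Event("order-2", "paid"))
first_batch = fast.poll()            # fast consumer sees both events

payments.publish(Event("order-1", "refunded"))
second_batch = fast.poll()           # only the new event
catch_up = slow.poll()               # slow consumer still gets all three
```

Because the topic keeps every event and each consumer owns its position, a late or restarted consumer can read everything from the start, which is the decoupling described above.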
    2 hr 5 min
  • StackOverflow AI Disagreements, Kotlin Coroutines and More
    2024/05/13
    https://www.codingblocks.net/episode234

    Reviews

    • iTunes: ivan.kuchin

    News

    • Atlanta Dev Con September 7th, 2024 https://www.atldevcon.com/

    Topics

    • People trying to remove their answers from StackOverflow to not allow OpenAI to use their answers without permission/recognition? https://www.tomshardware.com/tech-industry/artificial-intelligence/stack-overflow-bans-users-en-masse-for-rebelling-against-openai-partnership-users-banned-for-deleting-answers-to-prevent-them-being-used-to-train-chatgpt
    • Obfuscate data dumps with PostgreSQL https://github.com/GreenmaskIO/greenmask/
    • Kotlin Coroutines
      • https://kotlinlang.org/docs/coroutines-overview.html
      • https://kotlinlang.org/docs/coroutine-context-and-dispatchers.html#dispatchers-and-threads
      • Reminded Outlaw of the Cloudflare Workers we mentioned a while back https://developers.cloudflare.com/workers/
    • Please leave us a review! https://www.codingblocks.net/review
    • You can control if YouTube keeps track of your history (at least that you can see)
    • 100 Things You Didn't Know About Kubernetes https://www.devopsinside.com/100-things-you-didnt-know-about-kubernetes-part-1/
    • Do the IDE AIs really make you more productive?

    Random Bits

    • Tesla Las Vegas Loop https://www.lvcva.com/vegas-loop/
    • What actually happens when you overfill the oil in a vehicle? https://www.youtube.com/watch?v=VaTbfvzNbxQ
    • Fisker Ocean totalled after a $900 door ding...really https://jalopnik.com/fisker-ocean-totaled-over-910-door-ding-after-insurer-1851451187
    • A Ford Mustang painted with the blackest black paint available https://youtu.be/Ll27OkWuE1g

    Tip of the Week

    • Docker Blog is pretty excellent https://www.docker.com/blog/
    • Car Research
      • Car reliability information https://www.truedelta.com/
      • Actual problems logged with car models by year https://www.carcomplaints.com/
      • Great search engine for finding cars and more metadata about the listing like how long the car has been listed https://caredge.com/
    • Making the most of wood sheet goods by using cut lists https://www.opticutter.com/cut-list-optimizer
    • Docker's chicken-n-egg problem
      • Use a multi-stage Dockerfile where an earlier stage has the tools you need
      • Manually dearmor a PGP public key (Hint: it's the opposite of: https://superuser.com/questions/764465/how-to-ascii-armor-my-public-key-without-installing-gpg)
    • Download the file using the server suggested name
      • With wget ... --content-disposition https://man7.org/linux/man-pages/man1/wget.1.html
      • With curl ... -JO
        • -J, --remote-header-name
        • -O, --remote-name
        • https://curl.se/docs/manpage.html#-J
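For the "manually dearmor a PGP public key" tip, here is a rough sketch of what undoing ASCII armor could look like without gpg. It assumes the usual armor layout (a BEGIN line, optional "Key: value" headers ending at a blank line, base64 data, a checksum line starting with "=", and an END line). The `dearmor` function is a hypothetical helper, it does not verify the checksum, and the sample below wraps made-up bytes rather than a real key.

```python
import base64


def dearmor(armored: str) -> bytes:
    """Return the binary bytes inside an ASCII-armored block (checksum is NOT verified)."""
    lines = [line.strip() for line in armored.strip().splitlines()]
    if not lines[0].startswith("-----BEGIN ") or not lines[-1].startswith("-----END "):
        raise ValueError("not an ASCII-armored block")
    body = lines[1:-1]
    # Optional armor headers (e.g. "Version: ...") end at the first blank line.
    if "" in body:
        body = body[body.index("") + 1:]
    # A line starting with "=" is the checksum line, not key data.
    data_lines = [line for line in body if line and not line.startswith("=")]
    return base64.b64decode("".join(data_lines))


# Made-up payload standing in for real key material.
payload = b"\x99\x01not a real key"
sample = "\n".join([
    "-----BEGIN PGP PUBLIC KEY BLOCK-----",
    "",
    base64.b64encode(payload).decode("ascii"),
    "=AAAA",  # placeholder checksum line; this sketch ignores it
    "-----END PGP PUBLIC KEY BLOCK-----",
])
binary_key = dearmor(sample)
```

In a Dockerfile stage that has Python but not gpg, the returned bytes could then be written to a file in binary mode; whether that fits your image is a judgment call.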
    1 hr 42 min
  • Llama 3 is Here, Spending Time on Environmental Setup and More
    2024/04/28

    Full episode show notes can be found at:

    https://www.codingblocks.net/episode233

    1 hr 34 min
  • Ktor, Logging Ideas, and Plugin Safety
    2024/04/14

    Picture, if you will, a nondescript office space, where time seems to stand still as programmers gather around a water cooler. Here, in the twilight of the workday, they exchange eerie tales of programming glitches, security breaches, and asynchronous calls. Welcome to the Programming Zone, where reality blurs and (silent) keystrokes echo in the depths of the unknown. Also, Allen is ready to boom, Outlaw is not happy about these category choices, and Joe takes the easy (but not longest) road.

    The full show notes are available on the website at https://www.codingblocks.net/episode232

    News

    • Thanks for the reviews! Want to help us out? Leave a review! (/reviews)
      • ivan.kuchin, Nick Brooker, Szymon, JT, Scott Harden
    • Text replacements are tricky: replacing links to "twitter.com" with "x.com" enabled a wave of domain spoofing attacks. (arstechnica.com)

    Around the Water Cooler

    • Ktor is an asynchronous web framework based on Kotlin, but can it compete with Spring? (ktor.io)
    • docker init is a great tool for getting started, but how much can you expect from a scaffolding tool? (docs.docker.com)
    • Logging, how much is too much? What if we could go back in time?
    • Boomer Hour: Let's talk about GChat UX
    • What do you know about browser extensions?
      • ViolentMonkey is a modern remake of the infamous GreaseMonkey, but can you trust it? (chromewebstore.google.com)
    • Can you trust any extensions?
      • XZ Utils backdoor timeline, wow (arstechnica.com)
    • Bookmarklets still rock! (freecodecamp.org)
    • Silent Key Tester for mechanical keyboards, you can specify a wide variety of switches (thockking.com)
      • Joe's preferences:
        • Durock Shrimp Silent T1
        • Tactile Gazzew Boba U4 Silent
        • Linear Kailh Silent Brown
        • Linear Lichicx Lucy Silent
        • Linear WS Wuque Studio Gray Silent
        • Tactile WS Wuque Studio
        • White Silent - Linear
        • Tactile Kailh Silent Pink
        • Linear Cherry MX Silent Red

    Tip of the Week

    • Feeling nostalgic for the original GameBoy or GameBoy Color? GBStudio is a one-stop shop for making games. It's open-source and fully featured. You can do the art, music, and programming all in one tool and it's thoughtfully laid out and well-documented. Bonus…your games will work in GameBoy emulators AND you can even produce your own working physical copies. (If you don't want the high-level tools you can go old skool with "GBDK" too) (gbstudio.dev)
    • If you're going to do something, why not script it? If you're going to script it, save it for next time!
    • Dave's Garage is a YouTube channel that does deep dives into Windows internals, cool electronics projects, and everything in between! (YouTube)

    1 hr 39 min
  • Importance of Data Structures, Bad Documentation and Comments and More
    2024/04/01

    Full show notes at:
    https://www.codingblocks.net/episode231

    1 hr 41 min
  • Decorating your Home Office
    2024/03/18

    This time we are missing the "ocks", but we hope you enjoy this off...ice topic chat about personalizing our workspaces. Also, Joe had to put a quarter in the jar, and Outlaw needs a cookie.

    The full show notes are available on the website at https://www.codingblocks.net/episode230

    News

    Thank you for the review Szymon! Want to leave us a review?

    Decorating your Home Office

    • Joe's Uplift Desk Review
    • Mounting monitors, is there any other way?
    • To grommet or not to grommet?
    • How many keys do you want on your keyboard?
    • Wired vs Wireless
    • About that "fn" key…
    • Reddit for inspiration?
    • Office-Appropriate Art
      • Paintings
      • Prints / Silk Screens / Photography
      • Sculptures
      • Book Cases
      • There's a story for Outlaw about this print: https://www.johndyerbaizley.com/product/four-horsemen-full-color-ap

    Tip of the Week

    • If you have a car, you should consider getting a Mirror Dash Cam. It's a front and rear camera system that replaces your rearview mirror with a touchscreen. Impress all your friends with your recording, zoom, night vision, parking assistance, GPS, and 24/7 recording and monitoring. (Amazon)
    • Be careful about exercising after you give blood, else you might end up needing it back! (redcrossblood.org)

    1 hr 21 min
  • Multi-Value, Spatial, and Event Store Databases
    2024/03/04
    We are mixing it up on you again, no Outlaw this week, but we can offer you some talk of exotic databases. Also, Joe pronounces everything correctly and Allen leaves you with a riddle.

    The full show notes are available on the website at https://www.codingblocks.net/episode229

    News

    • Thanks for the reviews!
      • ivan.kuchin (has taken the lead!), Yoondoggy, cykoduck, nehoraigold
    • Want to help us out? Leave a review! (reviews)

    Multivalue DBMS

    • Popular: 86. Adabas, 87. UniData/UniVerse, 147. JBase
    • Similar to RDBMS - store data in tables
      • Store multiple values to a particular record's attribute
      • Some RDBMS's can do this as well, BUT it's typically an exception to the rule when you'd store an array on an attribute
      • In a MultiValue DBMS - that's how you SHOULD do it
      • Part of the reason it's done this way is these database systems are not optimized for JOINS
    • Looked at the Adabas and UniData sites - the primary selling points seem to be rapid application development / ease of learning and getting up to speed as well as data modeling that closely mirrors your application data structures
      • I BELIEVE it's a schema on write (docs.rocketsoftware.com)
      • Supposed to be very performant as you access the data the way your application expects it
      • Per the docs, it's easy to maintain (Wikipedia)

    Spatial DBMS

    • Popular: 29. PostGIS, 59. Aerospike, 136. SpatiaLite
    • Provides the ability to efficiently store, modify, and query spatial data - data that appears in a geometrical space (maps, polygons, etc)
    • Generally have custom data types for storing the spatial data
    • Indices that allow for quick retrieval of spatial data about other spatial data
    • Also allow for performing spatial-specific operations on data, such as computing distances, merging or intersecting objects or even calculating areas
    • Geospatial data is a subset of spatial data - it represents places / spatial data on the Earth's surface
    • Spatio-temporal data is another variation - spatial data combined with timestamps
    • PostGIS - basically a plugin for PostgreSQL that allows for storing of spatial data
      • Additionally supports raster data - data for things like weather and elevation
      • If you want to learn how to use it and understand the data and what's stored (postgis.net)
      • Spatial data types are: point, line, polygon, and more…basically shapes
      • Rather than using b-tree indexes for sorting data for fast retrieval, spatial indexes use bounding boxes - rectangles that identify what is contained within them
        • Typically accomplished with R-Tree and Quadtree implementations
      • RedFin - a real estate competitor to realtor.com and others, uses PostgreSQL / PostGIS
    • Quite a bit of software supports OpenGIS, so it may be a good place to start if you're interested in storing/querying spatial data

    Event Stores

    • Popular: 178. EventStoreDB, 336. IBM DB2 Event Store, 338. NEventStore
    • Used for implementing the concept of Event Sourcing
      • Event Sourcing - an application/data store where the current state of an object is obtained by "replaying" all the events that got it to its current state
      • This contrasts with RDBMS's in that relational databases typically store the current state of an object - historical state CAN be stored, but that's an implementation detail that has to be implemented, such as temporal tables in SQL Server or "history tables"
    • Only support adding new events and querying the order of events
    • Not allowed to update or delete an event
    • For performance reasons, many Event Store databases support snapshots for holding materialized states at points in time
    • EventStoreDB - https://www.eventstore.com/eventstoredb
      • Defined as an "immutable log"
      • Features: guaranteed writes, concurrency model, granulated stream and stream APIs
      • Many client interfaces: .NET, Java, Go, Node, Rust, and Python
      • Runs on just about all OSes - Windows, Mac, Linux
      • Highly available - can run in a cluster
      • Optimistic concurrency checks that will return an error if a check fails
      • "Projections" allow you to generate new events based off "interesting" occurrences in your existing data
        • For example, you are looking for how many Twitter users said "happy" within 5 minutes of the word "foo coffee shop" and within 2 minutes of saying "London".
      • Highly performant - 15k writes and 50k reads per second

    Resources we like

    • Database Rankings (db-engines.com)

    Tip of the Week

    • If your internet connection is good, but your cell phone service is bad, then you might want to consider Ooma. Ooma sells devices that plug into your network or connect wirelessly and provide a phone number and a phone jack so you can hook up an old-school home telephone. We've been using it for about a week now with no problems and it's been a breeze to set up. The devices range from $99 to $129 and there's a monthly "premier" plan you can buy with nifty features like a secondary phone line, advanced call blocking, and call forwarding. (ooma.com)
    • Why use "git reset --hard" when you can "git stash -u" instead? Reset is destructive, but stashing keeps your changes just in case you need them. ...
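The Event Stores notes above (append-only events, state obtained by replaying, snapshots for performance, optimistic concurrency checks that return an error) can be sketched in plain Python. This is a toy in-memory model and not the EventStoreDB client API: `EventStream`, `ConcurrencyError`, `apply_event`, `replay`, and the bank-balance events are all hypothetical names picked for illustration.

```python
class ConcurrencyError(Exception):
    """Raised when a writer's expected version does not match the stream."""


class EventStream:
    """Append-only stream: events can be added and read, never updated or deleted."""

    def __init__(self):
        self._events = []

    @property
    def version(self) -> int:
        return len(self._events)

    def append(self, event, expected_version: int) -> None:
        # Optimistic concurrency: reject the write if someone else appended first.
        if expected_version != self.version:
            raise ConcurrencyError(
                f"expected version {expected_version}, stream is at {self.version}"
            )
        self._events.append(event)

    def read(self, from_version: int = 0) -> tuple:
        return tuple(self._events[from_version:])


def apply_event(balance: int, event) -> int:
    """Fold one (kind, amount) event into the current balance."""
    kind, amount = event
    if kind == "deposited":
        return balance + amount
    if kind == "withdrew":
        return balance - amount
    raise ValueError(f"unknown event kind: {kind}")


def replay(stream: EventStream, snapshot=(0, 0)) -> int:
    """Rebuild state from a (version, balance) snapshot plus all later events."""
    version, balance = snapshot
    for event in stream.read(version):
        balance = apply_event(balance, event)
    return balance


account = EventStream()
account.append(("deposited", 100), expected_version=0)
account.append(("withdrew", 30), expected_version=1)
balance = replay(account)                      # full replay from the beginning

snapshot = (account.version, balance)          # materialized state at version 2
account.append(("deposited", 5), expected_version=2)
balance_from_snapshot = replay(account, snapshot)  # only replays the newest event

try:
    account.append(("withdrew", 1), expected_version=2)  # stale writer
    stale_write_rejected = False
except ConcurrencyError:
    stale_write_rejected = True
```

The snapshot only saves work; replaying from version 0 gives the same balance, which is what keeps the event log the source of truth in this sketch.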
    1 hr 7 min