Episodes

  • Statistics 101 at work | Saturdata
    2026/03/15

    What if your A/B test needed 67 years to reach statistical significance? Sam found out the hard way. Join Sam and Shifra as they demystify statistical testing for the real world of data work, where the stakes are lower, the data is messier, and your stakeholders definitely do not know what a p-value is.

    We talk about:

    • P-values, null hypotheses, and why 0.05 was basically made up
    • Type 1 and type 2 errors through the lens of job interviews
    • When A/B testing actually makes sense (hint: you need more than 10 visitors a day)
    • T-tests, chi-square, ANOVA, and F1 scores explained without the jargon
    • Why a suspiciously high model accuracy is actually a red flag
    • The difference between statistical significance and practical significance
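The sample-size point above can be made concrete with a rough calculation. This is a minimal sketch of the standard normal-approximation formula for comparing two conversion rates. The rates, the daily traffic, and the function names are illustrative assumptions, not figures from the episode.

```python
import math
from statistics import NormalDist


def visitors_per_arm(p_base, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH arm of a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p_base * (1 - p_base) + p_variant * (1 - p_variant)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_variant - p_base) ** 2)


def days_to_finish(n_per_arm, daily_visitors, arms=2):
    """Days of traffic needed to fill every arm at a given daily volume."""
    return math.ceil(n_per_arm * arms / daily_visitors)


# Hypothetical scenario: 5% baseline, hoping to detect a lift to 6%, 10 visitors a day.
n = visitors_per_arm(0.05, 0.06)
days = days_to_finish(n, daily_visitors=10)
print(f"{n} visitors per arm, about {days / 365:.1f} years at 10 visitors a day")
```

Smaller expected differences or lower traffic push the duration up quickly, which is how a test can end up needing years rather than weeks.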

    Chapters:

    0:00 - The 67-year A/B test

    0:22 - Welcome to everyone's favorite hobby

    1:37 - Knowing how to interpret tests (not run them)

    2:27 - Is the analysis actually important to the business?

    3:37 - P-values refresher: what they are and aren't telling you

    6:07 - Why a raw p-value isn't enough

    7:40 - Null vs. alternative hypotheses explained

    10:16 - Type one and type two errors (a.k.a. the costly mix-ups)

    15:06 - Lift: measuring if your marketing actually did anything

    18:53 - When you already have all the data, statistics isn't the tool

    20:57 - Sample size, statistical significance, and the 67-year problem revisited

    24:04 - Common A/B test types: t-tests, chi-square, and ANOVAs

    26:44 - F1 scores, confusion matrices, and picking the right metric

    29:19 - Central limit theorem and the magic number 30

    31:31 - We never prove things — we just reject the null

    34:51 - Premortems and deciding if a project is even worth doing

    35:52 - When n is too small vs. too big (and why both are a problem)

    38:00 - Effect size: the stat that doesn't care how big your sample is

    41:39 - Regression, slope, and explaining it to real humans

    47:07 - Spend your time on the right things, not the fanciest model

    52:33 - Wrap-up and big takeaways
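To illustrate the accuracy red flag and the F1 chapter, here is a small sketch with made-up labels: a model that never predicts the rare class scores 95% accuracy but an F1 of zero. The data and helper names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced data: 5 positives out of 100, and a model that always says "no".
actual = [1] * 5 + [0] * 95
predicted = [0] * 100

tp, fp, fn, tn = confusion_counts(actual, predicted)
accuracy = (tp + tn) / len(actual)
f1 = f1_score(tp, fp, fn)
print(f"accuracy={accuracy:.2f}, f1={f1:.2f}")
```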

    53 min
  • Inconsistent colors are sabotaging your charts
    2026/03/16

    You know what's worse than color-coding your data groups?

    Switching the colors between charts. Shifra breaks down how inconsistent color use creates a false "design language" that misleads your audience. If group A is yellow in one chart, it better be yellow in the next one too! 🎨

    #shorts #saturdata #data #dataviz #charttips #designlanguage #datavisualization

    1 min
  • Stop chasing every customer: your top 20% are doing all the heavy lifting
    2026/03/13

    Ever wonder how companies figure out which customers actually matter most? Sam breaks down how lift models rank your entire customer base by purchase probability so you can zero in on the top 20% driving 80% of your revenue.

    Stop spreading your efforts thin and start focusing where it counts!
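A minimal sketch of the ranking idea described here, assuming you already have a purchase-probability score per customer from some model. The scores, outcomes, and function name are hypothetical; lift is computed as the purchase rate in the top-scored segment divided by the overall purchase rate.

```python
def top_segment_lift(scores, purchased, fraction=0.2):
    """Rank customers by score and compare the top segment's purchase rate to the overall rate."""
    overall_rate = sum(purchased) / len(purchased)
    if overall_rate == 0:
        raise ValueError("lift is undefined when nobody purchased")
    ranked = sorted(zip(scores, purchased), key=lambda pair: pair[0], reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))  # size of the top segment
    top = ranked[:cutoff]
    top_rate = sum(bought for _, bought in top) / len(top)
    return top_rate / overall_rate


# Hypothetical model scores and observed purchases (1 = bought) for ten customers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
purchased = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

lift = top_segment_lift(scores, purchased)
print(f"Top 20% buy at {lift:.1f}x the overall rate")
```

A lift well above 1 in the top segment suggests the scores are concentrating likely buyers there; a lift near 1 means the ranking is doing little better than picking customers at random.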

    #reels #saturdata #data #liftmodel #datascience #analytics #8020rule

    1 min
  • Why your SQL costs more than you think | Saturdata
    2026/03/07

    Think you know SQL? Sam and Shifra break down what separates a query writer from a true data thinker, from basic selects all the way to distributed systems, query plans, and the four pillars of production-ready code. Plus: why your data provider's incentives are working against you, how a 1,400-line monolith hid millions in overstated revenue, and the one approach that will save you from silent, soul-crushing data failures.

    🌊 Check out the deep dive here: https://youtu.be/ayNKmcIELEo

    We talk about:

    - Three levels of SQL (and a secret fourth)

    - What they don't teach you in school: data fluency, granularity, and audit patterns

    - Why SELECT DISTINCT and ORDER BY are secretly expensive

    - WORM vs. WARO: taking cost at write time vs. read time

    - Idempotency, query plans, and writing SQL like a query engine
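As an illustration of the idempotency topic, here is one common pattern sketched with Python's built-in `sqlite3`: replace a whole day's partition inside a single transaction, so rerunning the load leaves the table unchanged. The table, columns, and `load_day` helper are hypothetical and not taken from the episode.

```python
import sqlite3


def load_day(conn, day, rows):
    """Idempotent load: delete the day's rows, then insert them again, in one transaction."""
    with conn:  # commits on success, rolls back if anything raises
        conn.execute("DELETE FROM daily_revenue WHERE day = ?", (day,))
        conn.executemany(
            "INSERT INTO daily_revenue (day, customer_id, amount) VALUES (?, ?, ?)",
            [(day, customer_id, amount) for customer_id, amount in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_revenue (day TEXT, customer_id INTEGER, amount REAL)")

rows = [(1, 20.0), (2, 35.5)]
load_day(conn, "2026-03-07", rows)
load_day(conn, "2026-03-07", rows)  # a rerun must not double-count revenue

total = conn.execute("SELECT COUNT(*), SUM(amount) FROM daily_revenue").fetchone()
print(total)
```

A plain append-only insert would double the revenue on the second run, which is the kind of silent overstatement a retry or backfill can introduce.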

    Follow Saturdata, your favorite weekend data podcast:

    Spotify: https://open.spotify.com/show/5QolhKm1jDZzVuHO0S9ZBo?si=910efb23833f4fc1

    LinkedIn: https://www.linkedin.com/company/saturdata

    Instagram: @SaturdataPod

    #Saturdata #SQL #DataEngineering #DataAnalytics #QueryOptimization

    Chapters:

    0:00 - Intro: Squeal and Sasquatch

    2:34 - The three levels of SQL: Basic, intermediate, and advanced

    6:47 - When advanced SQL isn't really SQL anymore

    9:12 - The 1,400-line monolith horror story

    13:00 - The four "ilities": Readability, maintainability, observability, and explainability

    15:34 - Write once, read many: Taking the cost at write time vs. read time

    18:04 - SQL as analyst vs. SQL as architect

    22:52 - Platform-dependent code and why your cloud provider is not your friend

    25:31 - What school never taught you: Data fluency, granularity, and knowing your tables

    35:09 - Why all of this matters: Defending your query like a lawyer

    41:49 - What's holding people back: Select star, distinct, and the monolith trap

    46:30 - Idempotency, query plans, and thinking like an engine

    54 min
  • Data skills nobody taught you | Saturdata
    2026/02/28

    Your SQL is great. But can you actually ship? Sam and Shifra kick off Season 1 (since Saturdata is zero-indexed) by covering the unsung skills that separate someone who writes queries from someone who builds things: terminal literacy, dependency management, Git, notebooks, and why UV might be Python's best friend right now. Plus, a deep dive into Marimo, the notebook tool that fixes everything you hate about Jupyter.

    🌊 Check out the deep dive here: https://youtu.be/SWcLulIhVkg

    We talk about:

    - Terminal basics and why middle-school-level literacy is all you need

    - Virtual environments, dependency conflicts, and why UV is the move

    - Git fundamentals: staging, committing, and why humans still need to be in the loop

    - What's wrong with Jupyter notebooks (and whose fault it is)

    - Marimo: reactive notebooks, SQL cells, and dashboards all in one

    Chapters:

    0:00 - Welcome back (and why season two is called season one)

    0:47 - Auxiliary data skills: the glue holding everything together

    2:08 - How this season works: yap episodes + deep dives

    4:01 - What auxiliary skills actually are (bash, docker, makefiles, oh my)

    5:43 - Terminal literacy: your middle school diploma starts here

    10:12 - Finding your terminal on Mac, Windows, and the WSL escape hatch

    11:52 - Essential commands every data person should know

    15:46 - Notebooks: an educational tool that accidentally went to production

    19:55 - The biggest problems with traditional Jupyter notebooks

    23:37 - Virtual environments and dependency conflicts, explained

    26:30 - UV: the one tool to rule them all

    30:24 - Git: version control, branching, and why humans still need to be in the loop

    38:15 - Marimo: the next-generation notebook that fixes everything

    47:28 - Closing thoughts

    51 min
  • AP statistics really thought we'd be checking math at home
    2026/03/12

    Shifra's favorite AP Stats memory? A free response question that assumed every student would casually verify study results at home, complete with test stats and standard deviations.

    Clearly written by armchair professors who thought high schoolers would just kick back and double-check the math for fun. Spoiler: nobody's doing that 😂

    #reels #saturdata #data #statistics #APstats #mathhumor #studentlife

    1 min
  • The technical skills aren't what get you promoted
    2026/03/11

    Sam keeps it real: the craziest model with the best significance won't get you recognized.

    What will? Delivering fast, communicating clearly, and understanding what actually moves the needle for the business.

    The "soft" skills you think don't matter? They're the whole game. 🎯

    #reels #saturdata #data #datascience #careertips #softskills #analytics

    1 min
  • P-values don't tell you the whole story
    2026/03/10

    Your result might be statistically significant, but does it actually matter? Shifra breaks down why effect size (hello, Cohen's d!) is the real MVP. Something can be significant at the scale of nanometers and still be a total snooze.

    Don't skip the "how much do we care?" step! 📏
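For reference, a minimal sketch of Cohen's d for two independent groups, using the pooled standard deviation. The sample values and the function name are made up for illustration.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups, using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical measurements: the means differ by 1 and the pooled standard deviation is 2.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(f"Cohen's d = {d:.2f}")
```

Because the difference is expressed in standard-deviation units, d describes how big an effect is on the same scale at any sample size, which is the "how much do we care?" question a p-value alone does not answer.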

    #reels #saturdata #data #statistics #effectsize #datascience #pvalue

    1 min