When Clean Data Is Actually Dirty

Summary

“Cleaning” data is often treated as a harmless preprocessing step.

Delete missing rows.

Fill gaps with the mean.

Move forward.

But cleaning is not neutral.

It is a modeling decision that can change:

  • The estimand
  • The sampling mechanism
  • The bias–variance trade-off

In this episode, we examine the statistical dangers of deletion and simple imputation — and why naïve cleaning can quietly corrupt inference.
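
As a rough illustration of that claim (not taken from the episode), the sketch below simulates a variable whose missingness depends on another, fully observed variable. The data-generating process, the missingness rates, and the function name `simulate` are all illustrative assumptions. Under those assumptions, deleting incomplete rows shifts the estimated mean, and filling gaps with the mean shrinks the spread.

```python
import random
import statistics


def simulate(n=20_000, seed=0):
    """Compare the full data with listwise deletion and mean imputation."""
    rng = random.Random(seed)

    # x is always observed; y depends on x plus noise.
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]

    # Assumed missingness rule: y is dropped with probability 0.8 when x > 0
    # and with probability 0.1 otherwise, so missingness depends on x.
    observed = [
        yi for xi, yi in zip(x, y)
        if rng.random() > (0.8 if xi > 0 else 0.1)
    ]

    # Mean imputation: every missing y is replaced by the observed mean.
    fill = statistics.fmean(observed)
    imputed = observed + [fill] * (n - len(observed))

    return {
        "true_mean": statistics.fmean(y),
        "deleted_mean": statistics.fmean(observed),
        "true_sd": statistics.pstdev(y),
        "imputed_sd": statistics.pstdev(imputed),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this setup the rows that survive deletion come mostly from cases with low x, so the mean after deletion describes a different population than the one originally sampled. The imputed column has the same mean as the deleted one but a smaller standard deviation, which would make downstream uncertainty estimates look tighter than they should.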
