Python Bytes - #473 A clean room rewrite?

Topics covered in this episode:
Watch on YouTube

About the show

Sponsored by us! Support our work through:

Michael #1: chardet, AI, and licensing

  • Thanks Ian Lessing
  • Wow, where to start?
  • A bit of legal precedent research.
  • Chardet dispute shows how AI will kill software licensing, argues Bruce Perens on the Register
  • Also see this GitHub issue.
  • Dan Blanchard, maintainer of a Python character encoding detection library called chardet, released a new version of the library under a new software license. (LGPL → MIT)
  • Dan is allowed to make this change because v7 is a complete “clean room” rewrite using AI
  • BTW, v7 is WAY better:
    • The result is a 48x increase in detection speed for a project that lives in the hot loops of many projects. That will lead to noticeable performance increases for literally millions of users (the package gets ~130M downloads per month).
    • It paves a path towards inclusion in the standard library (assuming they don’t institute policies against using AI tools).
    • Thread-safe detect() and detect_all() with no measurable overhead; scales on free-threaded Python 3.13t+
  • An individual claiming to be Mark Pilgrim, the original creator of the library, opened an issue in the project's GitHub repo arguing that Blanchard had no right to change the software license, citing the LGPL requirement that the license remain unchanged.
  • The issue argues that a 'complete rewrite' is irrelevant, since the maintainers had ample exposure to the originally licensed code (i.e. this is not a 'clean room' implementation).
  • Blanchard disagreed, citing how version 7.0.0 and 6.0.0 compare when subjected to JPlag, a library for detecting plagiarism.
  • Blanchard told The Register he had wanted to get chardet added to the Python standard library for more than a decade, since it’s a core dependency of most Python projects.
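
The thread-safety point above is easiest to picture as a thread pool fanning detection out over many byte strings. What follows is a minimal sketch, not code from the chardet project: detect_many and demo are hypothetical names made up for illustration, and demo assumes that chardet is installed and that detect() still returns a dictionary with the long-standing "encoding" and "confidence" keys.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Iterable, List


def detect_many(
    blobs: Iterable[bytes],
    detect: Callable[[bytes], Any],
    max_workers: int = 4,
) -> List[Any]:
    """Run `detect` over every blob on a thread pool, keeping input order.

    `detect` is passed in so that any thread-safe detector can be used;
    the show notes say chardet's detect() qualifies as of v7.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the inputs were given.
        return list(pool.map(detect, blobs))


def demo() -> None:
    """Hypothetical usage; needs the third-party chardet package."""
    import chardet  # assumption: installed via `pip install chardet`

    samples = [
        "naïve café".encode("utf-8"),
        "Grüße aus Köln".encode("latin-1"),
    ]
    for result in detect_many(samples, chardet.detect):
        # Assumed result keys: "encoding" and "confidence".
        print(result["encoding"], result["confidence"])
```

Calling demo() prints one guess per sample. On a regular (GIL) build the threads mostly take turns, so any scaling benefit would show up on the free-threaded builds the notes mention.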

Brian #2: refined-github

  • Suggested by Matthias Schöttle
  • A browser plugin that improves the GitHub experience
  • A sampling
    • Adds a build/CI status icon next to the repo’s name.
    • Adds a link back to the PR that ran the workflow.
    • Enables tab and shift tab for indentation in comment fields.
    • Auto-resizes comment fields to fit their content and no longer show scroll bars.
    • Highlights the most useful comment in issues.
    • Changes the default sort order of issues/PRs to Recently updated.
  • But really, it’s a huge list of improvements

Michael #3: pgdog: PostgreSQL connection pooler, load balancer and database sharder

  • PgDog is a proxy for scaling PostgreSQL.
  • It supports connection pooling, load balancing queries and sharding entire databases.
  • Written in Rust, PgDog is fast, secure and can manage thousands of connections on commodity hardware.
  • Features
    • PgDog is an application layer load balancer for PostgreSQL
    • Health Checks: PgDog maintains a real-time list of healthy hosts. When a database fails a health check, it's removed from the active rotation and queries are re-routed to other replicas
    • Single Endpoint: PgDog can detect writes (e.g. INSERT, UPDATE, CREATE TABLE, etc.) and send them to the primary, leaving the replicas to serve reads
    • Failover: PgDog monitors Postgres replication state and can automatically redirect writes to a different database if a replica is promoted
    • Sharding: PgDog is able to manage databases with multiple shards
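
The "Single Endpoint" and "Health Checks" ideas above can be illustrated with a toy router. This is a sketch under stated assumptions and not PgDog's implementation (PgDog is written in Rust, and a bare keyword check like this one is only a stand-in for real query inspection): ToyRouter, is_write and the host names are hypothetical, and the keyword check misclassifies cases such as a write inside a WITH clause.

```python
import itertools
import re

# Assumption: a statement is a write if it starts with one of these keywords.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP", "TRUNCATE"}


def is_write(sql: str) -> bool:
    """Classify a statement by its first keyword (deliberately naive)."""
    match = re.match(r"\s*(\w+)", sql)
    return bool(match) and match.group(1).upper() in WRITE_KEYWORDS


class ToyRouter:
    """Send writes to the primary and round-robin reads over healthy replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        self.healthy = set(self.replicas)
        self._counter = itertools.count()

    def mark_unhealthy(self, host):
        # A failed health check takes the host out of rotation.
        self.healthy.discard(host)

    def mark_healthy(self, host):
        if host in self.replicas:
            self.healthy.add(host)

    def route(self, sql):
        if is_write(sql):
            return self.primary
        candidates = [r for r in self.replicas if r in self.healthy]
        if not candidates:
            # No healthy replica left: fall back to the primary for reads.
            return self.primary
        return candidates[next(self._counter) % len(candidates)]


router = ToyRouter("primary", ["replica-1", "replica-2"])
print(router.route("INSERT INTO t VALUES (1)"))  # primary
print(router.route("SELECT * FROM t"))           # replica-1
```

The application only ever talks to the router, which is the point of a single endpoint: which host serves a query is decided per statement, and a replica that fails a health check simply stops being a candidate.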

Brian #4: Agentic Engineering Patterns

Extras

Brian:

Michael:

Joke: Ergonomic keyboard

Also pretty good and related:

Links

Connect with the hosts

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, want an artisanal, hand-crafted digest of every week's show notes in email form? Add your name and email to our friends of the show list; we'll never share it.

Talk Python To Me - #540: Modern Python monorepo with uv and prek

Monorepos -- you've heard the talks, you've read the blog posts, maybe you've seen a few tantalizing glimpses into how Google or Meta organize their massive codebases. But it's often in the abstract and behind closed doors. What if you could crack open a real, production monorepo, one with over a million lines of Python and over 100 sub-packages, and actually see how it's built, step by step, using modern tools and standards? That's exactly what Apache Airflow gives us.

On this episode, I sit down with Jarek Potiuk and Amogh Desai, two of Airflow's top contributors, to go inside one of the largest open-source Python monorepos in the world and learn how they manage it with uv, pyproject.toml, and the latest packaging standards, so you can apply those same patterns to your own projects.

Episode sponsors

Agentic AI Course
Python in Production
Talk Python Courses

Guests
Amogh Desai: github.com
Jarek's GitHub: github.com

definition of a monorepo: monorepo.tools
airflow: airflow.apache.org
Activity: github.com
OpenAI: airflowsummit.org
Part 1. Pains of big modular Python projects: medium.com
Part 2. Modern Python packaging standards and tools for monorepos: medium.com
Part 3. Monorepo on steroids - modular prek hooks: medium.com
Part 4. Shared “static” libraries in Airflow monorepo: medium.com
PEP-440: peps.python.org
PEP-517: peps.python.org
PEP-518: peps.python.org
PEP-566: peps.python.org
PEP-561: peps.python.org
PEP-660: peps.python.org
PEP-621: peps.python.org
PEP-685: peps.python.org
PEP-723: peps.python.org
PEP-735: peps.python.org
uv: docs.astral.sh
uv workspaces: blobs.talkpython.fm
prek.j178.dev: prek.j178.dev
your presentation at FOSDEM26: fosdem.org
Tallyman: github.com

Watch this episode on YouTube: youtube.com
Episode #540 deep-dive: talkpython.fm/540
Episode transcripts: talkpython.fm

Theme Song: Developer Rap
🥁 Served in a Flask 🎸: talkpython.fm/flasksong

---== Don't be a stranger ==---
YouTube: youtube.com/@talkpython

Bluesky: @talkpython.fm
Mastodon: @talkpython@fosstodon.org
X.com: @talkpython

Michael on Bluesky: @mkennedy.codes
Michael on Mastodon: @mkennedy@fosstodon.org
Michael on X.com: @mkennedy

Big Technology Podcast - AI Backlash Intensifies, Nvidia GTC Preview, Meta’s Embarrassing Delay

Ranjan Roy from Margins is back for our weekly discussion of the latest tech news. We cover: 1) Backlash against AI & specifically Sam Altman's comments about AI as a utility 2) Is this because people are worried about AI taking their jobs? 3) NBC poll shows AI is one of the least popular things in the U.S. 4) YouGov poll shows broadly negative feelings toward AI 5) Pew finds datacenters are very unpopular 6) Consequences of AI's unpopularity 7) Nvidia GTC preview: A rallying cry for AI 8) Could Jensen Huang be the guy that turns this around? 9) Amazon's AI code is messing things up 10) McKinsey's AI tool hacked 11) Meta can't get its act together with Avocado delayed 12) Should Meta's AI use Google's Gemini tech

---

Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice.

Want a discount for Big Technology on Substack + Discord? Here’s 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b

Learn more about your ad choices. Visit megaphone.fm/adchoices

The Government Huddle with Brian Chidester - 203: The One with the Data Resilience SME

Mark Bentkower, Principal Technologist, Americas at Veeam Software joins the show to unpack what data resilience really means for public sector organizations. Together we explore the shift from traditional compliance checklists to a resilience-first mindset built on zero trust, automation, and cross-functional alignment. He also shares insights into how agencies can move beyond siloed operations and what separates organizations stuck in reactive mode from those building true operational resilience. Finally we dive into Veeam’s Data Resilience Maturity Model (DRMM) developed with McKinsey and discuss how agencies can benchmark their posture, align people, process, and technology, and make a business case for modernization.


The Stack Overflow Podcast - Open source for awkward robots

Ryan is joined by Jan Liphardt, CEO and co-founder of OpenMind, to chat about the rapidly evolving world of humanoid robotics and what it means for humans, why OpenMind is building an open source operating system for robots that processes logic in natural language, and how putting Asimov’s Laws on the blockchain might be the key to robotics guardrails.

Episode notes: 

OpenMind’s OM1 is an open source OS for robots that allows robots to perceive, adapt, and act within human environments. 

Connect with Jan on LinkedIn and GitHub.

This week’s shoutout goes to user Sean, who won a Lifejacket badge for their answer to Creating the simplest HTML toggle button?.

TRANSCRIPT

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Code Story: Insights from Startup Tech Leaders - S12 Bonus: Martina Zrnec, Stacklist

Martina Zrnec is located in Croatia and grew up playing basketball, spending every minute she could on the court. And when I say every minute, I mean it - she would even skip the last few hours of school and hit the court for some practice. Eventually, her mother decided for her that she should not pursue it professionally, and should focus on her schooling. Outside of tech, she's married with 2 kids. She notes that she is not just a coding person - she likes to socialize! She plays piano, and as a family, they spend a lot of time outside, biking, playing sports and being in nature.

Martina's co-founder, Kyle, had this idea that he wanted to create - a platform that allowed people to organize the products, services and experiences they love into stacks. He found Martina on a freelancing platform, and they instantly connected on the idea - and got to building.

This is the creation story of Stacklist.

Support this podcast at — https://redcircle.com/codestory/donations

Advertising Inquiries: https://redcircle.com/brands

Privacy & Opt-Out: https://redcircle.com/privacy

Lex Fridman Podcast - #493 – Jeff Kaplan: World of Warcraft, Overwatch, Blizzard, and Future of Gaming

Jeff Kaplan is a legendary Blizzard game designer of World of Warcraft and Overwatch, now preparing to launch a new game, The Legend of California, from his new studio Kintsugiyama – available to wishlist on Steam today, with alpha later in March.
Thank you for listening ❤ Check out our sponsors: https://lexfridman.com/sponsors/ep493-sc
See below for timestamps, and to give feedback, submit questions, contact Lex, etc.

CONTACT LEX:
Feedback – give feedback to Lex: https://lexfridman.com/survey
AMA – submit questions, videos or call-in: https://lexfridman.com/ama
Hiring – join our team: https://lexfridman.com/hiring
Other – other ways to get in touch: https://lexfridman.com/contact

EPISODE LINKS:
The Legend of California (Steam Page): https://store.steampowered.com/app/2550530/The_Legend_of_California
Jeff’s Game Studio: https://www.kintsugiyama.com/

SPONSORS:
To support this podcast, check out our sponsors & get discounts:
Fin: AI agent for customer service.
Go to https://fin.ai/lex
Blitzy: AI agent for large enterprise codebases.
Go to https://blitzy.com/lex
BetterHelp: Online therapy and counseling.
Go to https://betterhelp.com/lex
Shopify: Sell stuff online.
Go to https://shopify.com/lex
CodeRabbit: AI-powered code reviews.
Go to https://coderabbit.ai/lex
Perplexity: AI-powered answer engine.
Go to https://perplexity.ai/

OUTLINE:
(00:00) – Introduction
(02:24) – Sponsors, Comments, and Reflections
(10:47) – Early games: Pac-Man, Zork, Doom, Quake
(25:12) – Writing career – 170 rejection letters
(40:45) – EverQuest obsession
(53:43) – Getting hired at Blizzard
(1:09:11) – Lowest point in Jeff’s life
(1:15:16) – One of Us
(1:19:33) – Early Blizzard culture
(1:39:15) – Building World of Warcraft
(1:56:59) – How WoW changed video games
(2:14:21) – Single-player vs Multi-player
(2:35:15) – How Blizzard made great video games
(3:01:04) – Online toxicity
(3:08:38) – Why Titan failed
(3:25:48) – Overwatch in six weeks
(3:52:46) – Best Overwatch heroes
(4:01:16) – The challenge of matchmaking
(4:04:40) – Rust
(4:15:01) – Why Jeff left Blizzard
(4:37:14) – Diablo IV
(4:38:42) – Getting back to making video games
(4:47:38) – The Legend of California
(5:01:23) – Greatest video game of all time
(5:09:30) – AI and future of video games

PODCAST LINKS:
– Podcast Website: https://lexfridman.com/podcast
– Apple Podcasts: https://apple.co/2lwqZIr
– Spotify: https://spoti.fi/2nEwCF8
– RSS: https://lexfridman.com/feed/podcast/
– Podcast Playlist: https://www.youtube.com/playlist?list=PLrAXtmErZgOdP_8GztsuKi9nrraNbKKp4
– Clips Channel: https://www.youtube.com/lexclips

Big Technology Podcast - AI’s Unpopularity + Competing With ChatGPT — With Olivia Moore

Olivia Moore is an AI partner at Andreessen Horowitz. Moore joins Big Technology Podcast to discuss whether startups still have a real shot at competing with the biggest AI chatbots as ChatGPT, Claude, and Gemini grow more capable. Tune in to hear why she believes the AI economy will be more distributed than many expect, where startups can still win, and how agentic products like OpenClaw could reshape software and work. We also cover AI’s image and video app shakeout, chatbot memory, AI companions, enterprise adoption, and what happens to incumbents as every company is pushed to become AI-native. Hit play for a sharp conversation about where value in the AI economy is actually headed.

Code Story: Insights from Startup Tech Leaders - Founder Chats – Vadim Dedov

Today, we are dropping another episode in our "chats" series, specifically on the Founder side, hearing from those scaling the companies themselves.

In this episode, we are talking with Vadim Dedov, CEO at Catchers. Vadim is going to walk us through what problem he wanted to solve with Catchers, and how his product development journey took him through architectural decisions, product optimization, team building and more.

Questions

  • Before we talk about Catchers, I’d love to understand you a bit better.
  • What experiences or responsibilities earlier in your life shaped how you think about work, systems, and accountability today?
  • What problem were you dealing with before Catchers existed? Not as a product idea yet, but as a real operational pain you kept running into.
  • At what point did you realise this couldn’t be solved with people, spreadsheets, or manual coordination anymore and that technology was the only way forward?
  • How did Catchers actually start taking shape as a product? What was the very first version you built, and what did “good enough” mean in a business where mistakes affect people’s income and compliance?
  • How long did it take to get to something usable, and what constraints defined your MVP?
  • Looking back, what were the most important trade-offs you made early on?
  • Things you consciously postponed or simplified, knowing they might come back later.
  • Let’s zoom in on the product itself. What is the core product insight behind Catchers — the thing you believe differentiates it from a typical HR or staffing platform?
  • How did your thinking about architecture evolve as scale increased? Was there a moment when you had to stop moving fast and redesign parts of the system properly?
  • How did you approach building your core team around such a complex, operations-heavy product? What qualities mattered most in the people you trusted with this system?
  • Can you share a decision that didn’t go as planned and how you and your team dealt with the consequences?
  • When you step back and look at what you’ve built today, what are you most proud of, not in terms of features, but in terms of reliability, impact, or how the system holds under pressure?
  • As you look ahead, how do automation and AI change the way you think about workforce platforms — and what advice would you give to someone building infrastructure-heavy products today?
